From 032baaacb8771a3c2c8798e98973896a2005458b Mon Sep 17 00:00:00 2001 From: Andreas Stadelmeier Date: Wed, 27 Mar 2024 01:54:08 +0100 Subject: [PATCH] Cleanup Unify. Add explanation to adopt rule and add lessdot markers --- unify.tex | 156 ++++++++++++++++++++---------------------------------- 1 file changed, 57 insertions(+), 99 deletions(-) diff --git a/unify.tex b/unify.tex index 76a7076..508a187 100644 --- a/unify.tex +++ b/unify.tex @@ -6,17 +6,6 @@ % the algorithm only removes wildcards, never adds them \section{Unify}\label{sec:unify} -The wildcard placeholders are used for intermediat types. -It is not possible to create all super types of a type. -The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}. -%thats not true. it can spawn X^T_T2.List where T and T2 are types and we need to choose one inbetween them -Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java. - -Any type can be inserted into wildcard placeholders. -Normal placeholders have to contain types, which are well-formed under the supplied environment. -% It is important that the algorithm also works for any subset of constraints -%TODO: The unify algorithm can do any operation on wildcard placeholders the same as on normal ones. -%TODO: Unify could make way more substitutions for wtvs especially in step 2 \subsection{Description} The \unify{} algorithm tries to find a solution for a set of constraints like @@ -82,20 +71,25 @@ We define two types as equal if they are mutual subtypes of each other. The equality relation on Capture constraints is not reflexive. -$(\type{T} \lessdotCC \type{S}) \neq (\type{T} \lessdotCC \type{S})$ eventhough it's the same constraint. +A capture constraint is never equal to another capture constraint even when structurally the same +($\type{T} \lessdotCC \type{S} \neq \type{T} \lessdotCC \type{S}$). +An implementation of the algorithm has to take this into account. +All constraints are stored in a set and there are no dublicates of subtype constraints in a constraint set. +Capture constraints however have to be stored as a list or have an unique number assigned +so that duplicates don't get automatically discarded. -Capture conversion is done during the \unify{} algorithm. -\unify{} has to make two promises to ensure soundness of our type inference algorithm. -Capture conversion can only be applied at capture constraints. -Free variables are not allowed to leave their scope. -This is ensured by type variables which are not allowed to be assigned type holding free variables. +% \subsection{Capture Conversion} +% % TODO: Describe Capture conversion +% Capture conversion is done during the \unify{} algorithm. +% \unify{} has to make two promises to ensure soundness of our type inference algorithm. +% Capture conversion can only be applied at capture constraints. +% Free variables are not allowed to leave their scope. +% This is ensured by type variables which are not allowed to be assigned type holding free variables. \subsection{Algorithm} \newcommand{\gtype}[1]{\type{#1}} %\newcommand{\tw}[1]{\tv{#1}_?} -The \unify{} algorithm computes the type solution. - \begin{description} \item[Input:] An environment $\Delta'$ and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \doteq \type{T} \ldots}$ @@ -108,21 +102,6 @@ The input constraints must be of the following format: $\ntype{N}, \ntype{S}$ & $::=$ & $\wctype{\overline{\wildcard{X}{U}{L}}}{C}{\ol{T}} $ & Class Type \\ \end{tabular}\\[0.5em] -\noindent -Additional requirements: -\begin{itemize} - - \item All types have to be well-formed: $\wcNtype{\Delta}{N} \in C \implies \Delta_{in} \vdash \wcNtype{\Delta}{N} \ \ok$ -\item Naming scheme of every wildcard environment has to be the same. -%TODO: We need this so that wildcard substitutions get the correct name. also the Equals rule needs this condition -%Example: -Although alpha renaming of wildcards inside a type is allowed by the type system the \unify{} algorithm never does it. -Renaming wildcards leads to additional problems in the substitution rules and in the result containing substitutions with renamed wildcards. -For the \rulename{Equals} to work properly it is adviced to name all wildcards in a specific scheme. -For example by numbering them according to their appereance inside the type parameters -(e.g. $\wctype{\rwildcard{1}, \rwildcard{2}}{Pair}{\rwildcard{1}, \rwildcard{2}}$). -\end{itemize} - \item[Output:] Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$ \end{description} @@ -142,28 +121,18 @@ The \unify{} algorithm internally uses the following data types: The $\wtv{a}$ type variables are flagged as wildcard type variables. These type variables can be substituted by a wildcard or a type with free wildcard variables. -As long as a type variable is flagged with as $\wtv{a}$ it will only be used by the subst-wc rule in step 1. -In step 2 all of the wildcard flags are dismissed. -The output therefore never contains these flags. - -\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule). -Capture conversion removes a types bounding environment $\Delta$. -Type variables used in its type parameters are now bound to a global scope and not locally anymore. +As long as a type variable is flagged as $\wtv{a}$ it can be used by the \rulename{Subst-WC} rule in step 1. With \texttt{C} being class names and \texttt{A} being wildcard names. The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$ and a name $\mathtt{X}$. -The \rulename{Normalize} rule eliminates wildcards. % TODO +The \rulename{Tame} rule eliminates wildcards. %TODO This is done by setting the upper and lower bound to the same value. -\unify{} generates wildcards with the \rulename{\generalizeRule} rule. -It is important to generate new wildcards in a standardized fashion. -When having two constraints $\type{T} \lessdot \tv{a}$ and $\type{T} \lessdot \tv{b}$, -then after applying the \rulename{\generalizeRule} rule the freshly generated constraints are -$\tv{a} \doteq \type{T'}, \tv{b} \doteq \type{T'}$. -Both type variables get assigned the same type. -This is necessary for the \rulename{Equals} rule to work properly. +\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule). +Capture conversion removes a types bounding environment $\Delta$. +Type variables used in its type parameters are now bound to a global scope and not locally anymore. \begin{figure} \begin{center} @@ -182,7 +151,7 @@ This is necessary for the \rulename{Equals} rule to work properly. \text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset \end{array}$\\ \\ - \rulename{Remove} & + \rulename{Normalize} & $\begin{array}[c]{l} \wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\ \hline @@ -579,14 +548,14 @@ $ \begin{array}[c]{@{}ll} \begin{array}[c]{l} \wildcardEnv \vdash C \cup \, \set{ - \tv{b} \lessdot \tv{a}, -\tv{a} \lessdot \type{N}, \tv{b} \lessdot \type{N'}} \\ + \tv{b} \lessdot_1 \tv{a}, +\tv{a} \lessdot_2 \type{N}, \tv{b} \lessdot_3 \type{N'}} \\ \hline \vspace*{-0.4cm}\\ \wildcardEnv \vdash C \cup \, \set{ \tv{b} \lessdot \type{N}, - \tv{b} \lessdot \tv{a}, -\tv{a} \lessdot \type{N} , \tv{b} \lessdot \type{N'} + \tv{b} \lessdot_1 \tv{a}, +\tv{a} \lessdot_2 \type{N} , \tv{b} \lessdot_3 \type{N'} } \end{array} \end{array} @@ -643,35 +612,29 @@ C \cup \set{ \tv{a} \lessdot \wctype{\Delta'}{C}{[\ol{U}/\ol{X}]\ol{S}}} \caption{Constraint reduce rules}\label{fig:reduce-rules} \end{figure} -The new constraint generated by the adopt rule may be eliminated by the match rule. -The adopt rule still needs to be applied only once per constraint. - -Wildcards consist out of three parts. -A name, a scope and a upper and lower bound. - -% The \unify{} algorithm from \cite{plue09_1} substitutes type variables with wildcards. -% A constraint $\wctype{\wildcard{X}{\type{Object}}{\bot}}{C}{\rwildcard{X}} \lessdot \exptype{C}{\tv{a}}$ -% has no solution. -% Replacing the type variable $\tv{a}$ with the wildcard $\rwildcard{X}$ is not correct! -% The wildcard $\rwildcard{X}$ cannot leave its scope and the type $\exptype{C}{\rwildcard{X}}$ -% is considered invalid. - -Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ -is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc. -Also two of those boxes do not necessarily contain the same type. -But there are situations where it is possible to assume that. -For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$ -is a pair of two boxes, which always contain the same type. -Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same. +Capture constraints are treated like regular subtype constraints. +All transformations for subtype constraints work for capture constraints aswell. +For clarification Subtype constraints are marked with a number. +Subtype constraints holding the same number keep their respective type. +Newly created subtype constraints are always regular subtype constrains unless stated otherwise. +The \rulename{Adopt} rule for example takes multiple subtype constraints and adds a new one. +Having the constraints +$\ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdot \type{Object}$ +would lead to +$\wtv{b} \lessdot \type{String}, \ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdotCC \type{Object}$ +after applying \rulename{Adopt}. +The new generated constraint $\wtv{b} \lessdot \type{String}$ is a normal subtype constraint. +The type placeholders which are annotated as wildcard placeholders also stay wildcard placeholders. +The only rule that replaces wildcard type placeholders with regular placeholders is the \rulename{Normalize} rule. +The algorithm holds two sets. +The input constraints and a wildcard environment $\wildcardEnv{}$ keeping free variables. The algorithm starts with an empty wildcard environment $\wildcardEnv{}$. -Only the reduce rule adds wildcards to that environment. +Only the \rulename{Capture} rule adds wildcards to that environment. And every added wildcard gets a fresh unique name. This ensures the wildcard environment $\wildcardEnv{}$ never gets the same wildcard twice. -When a new type is generated by the \unify{} algorithm it always has the form -$\wctype{\ol{\rwildcard{A}}}{C}{\ol{\rwildcard{A}}}$. \textbf{Step 1:} Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively. @@ -1007,28 +970,8 @@ We have to consider both possibilities. \\[1em] \noindent -\textbf{Step 3} -\textbf{(Eliminate Wildcard Variables):} -If no more rules in step 2 are applicable \unify{} has to eliminate all wildcard variables and $\lessdotCC$ constraints by applying the \rulename{Remove} rule -and start over at step 1. -If $C$ does not contain any wildcard variables the algorithm proceeds with step 4. -\begin{center} - \fbox{ - \begin{tabular}[t]{l@{~}l} -\rulename{Remove-Cons} & $ - \begin{array}[c]{l} - \wildcardEnv \vdash C \cup \set{\type{S} \lessdotCC \type{T} } \\ - \hline - \vspace*{-0.4cm}\\ - \wildcardEnv \vdash C \cup \set{\type{S} \lessdot \type{T} } - \end{array} - $ -\end{tabular}} -\end{center} - -\noindent -\textbf{Step 4:} -We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 6. +\textbf{Step 3:} +We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 4. \begin{figure} \begin{center} @@ -1237,3 +1180,18 @@ But this renders additional problems: \item Capture Converted variables are not allowed to leave their scope \item \unify{} generates type substitution which cannot be translated to Java types. \end{itemize} + + +\subsection{Completeness} +It is not possible to create all super types of a type. +The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}. +%thats not true. it can spawn X^T_T2.List where T and T2 are types and we need to choose one inbetween them +Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java. + + +\subsection{Implementation} +%List this under implementation details +Every constraint that is not in solved form and is not able to be processed by any of the rules accounts as a error constraint and renders the constraint set unsolvable + +The new constraint generated by the adopt rule may be eliminated by the match rule. +The adopt rule still needs to be applied only once per constraint.