Cleanup Unify. Add explanation to adopt rule and add lessdot markers

This commit is contained in:
Andreas Stadelmeier 2024-03-27 01:54:08 +01:00
parent e93a762441
commit 032baaacb8

156
unify.tex
View File

@ -6,17 +6,6 @@
% the algorithm only removes wildcards, never adds them
\section{Unify}\label{sec:unify}
The wildcard placeholders are used for intermediat types.
It is not possible to create all super types of a type.
The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}.
%thats not true. it can spawn X^T_T2.List<X> where T and T2 are types and we need to choose one inbetween them
Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java.
Any type can be inserted into wildcard placeholders.
Normal placeholders have to contain types, which are well-formed under the supplied environment.
% It is important that the algorithm also works for any subset of constraints
%TODO: The unify algorithm can do any operation on wildcard placeholders the same as on normal ones.
%TODO: Unify could make way more substitutions for wtvs especially in step 2
\subsection{Description}
The \unify{} algorithm tries to find a solution for a set of constraints like
@ -82,20 +71,25 @@ We define two types as equal if they are mutual subtypes of each other.
The equality relation on Capture constraints is not reflexive.
$(\type{T} \lessdotCC \type{S}) \neq (\type{T} \lessdotCC \type{S})$ eventhough it's the same constraint.
A capture constraint is never equal to another capture constraint even when structurally the same
($\type{T} \lessdotCC \type{S} \neq \type{T} \lessdotCC \type{S}$).
An implementation of the algorithm has to take this into account.
All constraints are stored in a set and there are no dublicates of subtype constraints in a constraint set.
Capture constraints however have to be stored as a list or have an unique number assigned
so that duplicates don't get automatically discarded.
Capture conversion is done during the \unify{} algorithm.
\unify{} has to make two promises to ensure soundness of our type inference algorithm.
Capture conversion can only be applied at capture constraints.
Free variables are not allowed to leave their scope.
This is ensured by type variables which are not allowed to be assigned type holding free variables.
% \subsection{Capture Conversion}
% % TODO: Describe Capture conversion
% Capture conversion is done during the \unify{} algorithm.
% \unify{} has to make two promises to ensure soundness of our type inference algorithm.
% Capture conversion can only be applied at capture constraints.
% Free variables are not allowed to leave their scope.
% This is ensured by type variables which are not allowed to be assigned type holding free variables.
\subsection{Algorithm}
\newcommand{\gtype}[1]{\type{#1}}
%\newcommand{\tw}[1]{\tv{#1}_?}
The \unify{} algorithm computes the type solution.
\begin{description}
\item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \doteq \type{T} \ldots}$
@ -108,21 +102,6 @@ The input constraints must be of the following format:
$\ntype{N}, \ntype{S}$ & $::=$ & $\wctype{\overline{\wildcard{X}{U}{L}}}{C}{\ol{T}} $ & Class Type \\
\end{tabular}\\[0.5em]
\noindent
Additional requirements:
\begin{itemize}
\item All types have to be well-formed: $\wcNtype{\Delta}{N} \in C \implies \Delta_{in} \vdash \wcNtype{\Delta}{N} \ \ok$
\item Naming scheme of every wildcard environment has to be the same.
%TODO: We need this so that wildcard substitutions get the correct name. also the Equals rule needs this condition
%Example:
Although alpha renaming of wildcards inside a type is allowed by the type system the \unify{} algorithm never does it.
Renaming wildcards leads to additional problems in the substitution rules and in the result containing substitutions with renamed wildcards.
For the \rulename{Equals} to work properly it is adviced to name all wildcards in a specific scheme.
For example by numbering them according to their appereance inside the type parameters
(e.g. $\wctype{\rwildcard{1}, \rwildcard{2}}{Pair}{\rwildcard{1}, \rwildcard{2}}$).
\end{itemize}
\item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description}
@ -142,28 +121,18 @@ The \unify{} algorithm internally uses the following data types:
The $\wtv{a}$ type variables are flagged as wildcard type variables.
These type variables can be substituted by a wildcard or a type with free wildcard variables.
As long as a type variable is flagged with as $\wtv{a}$ it will only be used by the subst-wc rule in step 1.
In step 2 all of the wildcard flags are dismissed.
The output therefore never contains these flags.
\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule).
Capture conversion removes a types bounding environment $\Delta$.
Type variables used in its type parameters are now bound to a global scope and not locally anymore.
As long as a type variable is flagged as $\wtv{a}$ it can be used by the \rulename{Subst-WC} rule in step 1.
With \texttt{C} being class names and \texttt{A} being wildcard names.
The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$
and a name $\mathtt{X}$.
The \rulename{Normalize} rule eliminates wildcards. % TODO
The \rulename{Tame} rule eliminates wildcards. %TODO
This is done by setting the upper and lower bound to the same value.
\unify{} generates wildcards with the \rulename{\generalizeRule} rule.
It is important to generate new wildcards in a standardized fashion.
When having two constraints $\type{T} \lessdot \tv{a}$ and $\type{T} \lessdot \tv{b}$,
then after applying the \rulename{\generalizeRule} rule the freshly generated constraints are
$\tv{a} \doteq \type{T'}, \tv{b} \doteq \type{T'}$.
Both type variables get assigned the same type.
This is necessary for the \rulename{Equals} rule to work properly.
\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule).
Capture conversion removes a types bounding environment $\Delta$.
Type variables used in its type parameters are now bound to a global scope and not locally anymore.
\begin{figure}
\begin{center}
@ -182,7 +151,7 @@ This is necessary for the \rulename{Equals} rule to work properly.
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\
\\
\rulename{Remove} &
\rulename{Normalize} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
@ -579,14 +548,14 @@ $
\begin{array}[c]{@{}ll}
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \, \set{
\tv{b} \lessdot \tv{a},
\tv{a} \lessdot \type{N}, \tv{b} \lessdot \type{N'}} \\
\tv{b} \lessdot_1 \tv{a},
\tv{a} \lessdot_2 \type{N}, \tv{b} \lessdot_3 \type{N'}} \\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \vdash C \cup \, \set{
\tv{b} \lessdot \type{N},
\tv{b} \lessdot \tv{a},
\tv{a} \lessdot \type{N} , \tv{b} \lessdot \type{N'}
\tv{b} \lessdot_1 \tv{a},
\tv{a} \lessdot_2 \type{N} , \tv{b} \lessdot_3 \type{N'}
}
\end{array}
\end{array}
@ -643,35 +612,29 @@ C \cup \set{ \tv{a} \lessdot \wctype{\Delta'}{C}{[\ol{U}/\ol{X}]\ol{S}}}
\caption{Constraint reduce rules}\label{fig:reduce-rules}
\end{figure}
The new constraint generated by the adopt rule may be eliminated by the match rule.
The adopt rule still needs to be applied only once per constraint.
Wildcards consist out of three parts.
A name, a scope and a upper and lower bound.
% The \unify{} algorithm from \cite{plue09_1} substitutes type variables with wildcards.
% A constraint $\wctype{\wildcard{X}{\type{Object}}{\bot}}{C}{\rwildcard{X}} \lessdot \exptype{C}{\tv{a}}$
% has no solution.
% Replacing the type variable $\tv{a}$ with the wildcard $\rwildcard{X}$ is not correct!
% The wildcard $\rwildcard{X}$ cannot leave its scope and the type $\exptype{C}{\rwildcard{X}}$
% is considered invalid.
Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$
is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc.
Also two of those boxes do not necessarily contain the same type.
But there are situations where it is possible to assume that.
For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$
is a pair of two boxes, which always contain the same type.
Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.
Capture constraints are treated like regular subtype constraints.
All transformations for subtype constraints work for capture constraints aswell.
For clarification Subtype constraints are marked with a number.
Subtype constraints holding the same number keep their respective type.
Newly created subtype constraints are always regular subtype constrains unless stated otherwise.
The \rulename{Adopt} rule for example takes multiple subtype constraints and adds a new one.
Having the constraints
$\ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdot \type{Object}$
would lead to
$\wtv{b} \lessdot \type{String}, \ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdotCC \type{Object}$
after applying \rulename{Adopt}.
The new generated constraint $\wtv{b} \lessdot \type{String}$ is a normal subtype constraint.
The type placeholders which are annotated as wildcard placeholders also stay wildcard placeholders.
The only rule that replaces wildcard type placeholders with regular placeholders is the \rulename{Normalize} rule.
The algorithm holds two sets.
The input constraints and a wildcard environment $\wildcardEnv{}$ keeping free variables.
The algorithm starts with an empty wildcard environment $\wildcardEnv{}$.
Only the reduce rule adds wildcards to that environment.
Only the \rulename{Capture} rule adds wildcards to that environment.
And every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice.
When a new type is generated by the \unify{} algorithm it always has the form
$\wctype{\ol{\rwildcard{A}}}{C}{\ol{\rwildcard{A}}}$.
\textbf{Step 1:}
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively.
@ -1007,28 +970,8 @@ We have to consider both possibilities.
\\[1em]
\noindent
\textbf{Step 3}
\textbf{(Eliminate Wildcard Variables):}
If no more rules in step 2 are applicable \unify{} has to eliminate all wildcard variables and $\lessdotCC$ constraints by applying the \rulename{Remove} rule
and start over at step 1.
If $C$ does not contain any wildcard variables the algorithm proceeds with step 4.
\begin{center}
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Remove-Cons} & $
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\type{S} \lessdotCC \type{T} } \\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \vdash C \cup \set{\type{S} \lessdot \type{T} }
\end{array}
$
\end{tabular}}
\end{center}
\noindent
\textbf{Step 4:}
We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 6.
\textbf{Step 3:}
We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 4.
\begin{figure}
\begin{center}
@ -1237,3 +1180,18 @@ But this renders additional problems:
\item Capture Converted variables are not allowed to leave their scope
\item \unify{} generates type substitution which cannot be translated to Java types.
\end{itemize}
\subsection{Completeness}
It is not possible to create all super types of a type.
The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}.
%thats not true. it can spawn X^T_T2.List<X> where T and T2 are types and we need to choose one inbetween them
Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java.
\subsection{Implementation}
%List this under implementation details
Every constraint that is not in solved form and is not able to be processed by any of the rules accounts as a error constraint and renders the constraint set unsolvable
The new constraint generated by the adopt rule may be eliminated by the match rule.
The adopt rule still needs to be applied only once per constraint.