Compare commits

...

2 Commits

Author SHA1 Message Date
Andreas Stadelmeier
032baaacb8 Cleanup Unify. Add explanation to adopt rule and add lessdot markers 2024-03-27 01:54:08 +01:00
Andreas Stadelmeier
e93a762441 Add Wildcard Environment to intro Unify example 2024-03-27 01:53:23 +01:00
3 changed files with 74 additions and 103 deletions

View File

@ -287,10 +287,10 @@ $\begin{array}{c}
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a} \wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a}
\\ \\
\hline \hline
\textit{Capture Conversion:}\ \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a} \textit{Capture Conversion:}\ \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a}
\\ \\
\hline \hline
\textit{Solution:}\ \wtv{a} \doteq \rwildcard{X} \implies \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\rwildcard{X}}, \, \type{String} \lessdot \rwildcard{X} \textit{Solution:}\ \wtv{a} \doteq \rwildcard{Y} \implies \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}, \, \type{String} \lessdot \rwildcard{Y}
\end{array} \end{array}
$ $
\end{center} \end{center}
@ -304,7 +304,11 @@ which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X
The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}). The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}).
Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$ Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$
which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{X}$ is satisfied because $\rwildcard{X}$ has $\type{String}$ as lower bound. The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm.
\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied
because $\rwildcard{Y}$ has $\type{String}$ as lower bound.
For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints: For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
@ -696,3 +700,11 @@ But afterwards a capture conversion is applied, which can generate different typ
\item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Y}}$ \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Y}}$
\item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Z}}$ \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Z}}$
\end{itemize} \end{itemize}
Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$
is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc.
Also two of those boxes do not necessarily contain the same type.
But there are situations where it is possible to assume that.
For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$
is a pair of two boxes, which always contain the same type.
Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.

View File

@ -20,6 +20,7 @@
\newcommand{\tifj}{\texttt{TamedFJ}} \newcommand{\tifj}{\texttt{TamedFJ}}
\newcommand{\wcSep}{\vdash}
\newcommand\mv[1]{{\tt #1}} \newcommand\mv[1]{{\tt #1}}
\newcommand{\ol}[1]{\overline{\tt #1}} \newcommand{\ol}[1]{\overline{\tt #1}}
\newcommand{\exptype}[2]{\mathtt{#1 \texttt{<} #2 \texttt{>} }} \newcommand{\exptype}[2]{\mathtt{#1 \texttt{<} #2 \texttt{>} }}

156
unify.tex
View File

@ -6,17 +6,6 @@
% the algorithm only removes wildcards, never adds them % the algorithm only removes wildcards, never adds them
\section{Unify}\label{sec:unify} \section{Unify}\label{sec:unify}
The wildcard placeholders are used for intermediat types.
It is not possible to create all super types of a type.
The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}.
%thats not true. it can spawn X^T_T2.List<X> where T and T2 are types and we need to choose one inbetween them
Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java.
Any type can be inserted into wildcard placeholders.
Normal placeholders have to contain types, which are well-formed under the supplied environment.
% It is important that the algorithm also works for any subset of constraints
%TODO: The unify algorithm can do any operation on wildcard placeholders the same as on normal ones.
%TODO: Unify could make way more substitutions for wtvs especially in step 2
\subsection{Description} \subsection{Description}
The \unify{} algorithm tries to find a solution for a set of constraints like The \unify{} algorithm tries to find a solution for a set of constraints like
@ -82,20 +71,25 @@ We define two types as equal if they are mutual subtypes of each other.
The equality relation on Capture constraints is not reflexive. The equality relation on Capture constraints is not reflexive.
$(\type{T} \lessdotCC \type{S}) \neq (\type{T} \lessdotCC \type{S})$ eventhough it's the same constraint. A capture constraint is never equal to another capture constraint even when structurally the same
($\type{T} \lessdotCC \type{S} \neq \type{T} \lessdotCC \type{S}$).
An implementation of the algorithm has to take this into account.
All constraints are stored in a set and there are no dublicates of subtype constraints in a constraint set.
Capture constraints however have to be stored as a list or have an unique number assigned
so that duplicates don't get automatically discarded.
Capture conversion is done during the \unify{} algorithm. % \subsection{Capture Conversion}
\unify{} has to make two promises to ensure soundness of our type inference algorithm. % % TODO: Describe Capture conversion
Capture conversion can only be applied at capture constraints. % Capture conversion is done during the \unify{} algorithm.
Free variables are not allowed to leave their scope. % \unify{} has to make two promises to ensure soundness of our type inference algorithm.
This is ensured by type variables which are not allowed to be assigned type holding free variables. % Capture conversion can only be applied at capture constraints.
% Free variables are not allowed to leave their scope.
% This is ensured by type variables which are not allowed to be assigned type holding free variables.
\subsection{Algorithm} \subsection{Algorithm}
\newcommand{\gtype}[1]{\type{#1}} \newcommand{\gtype}[1]{\type{#1}}
%\newcommand{\tw}[1]{\tv{#1}_?} %\newcommand{\tw}[1]{\tv{#1}_?}
The \unify{} algorithm computes the type solution.
\begin{description} \begin{description}
\item[Input:] An environment $\Delta'$ \item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \doteq \type{T} \ldots}$ and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \doteq \type{T} \ldots}$
@ -108,21 +102,6 @@ The input constraints must be of the following format:
$\ntype{N}, \ntype{S}$ & $::=$ & $\wctype{\overline{\wildcard{X}{U}{L}}}{C}{\ol{T}} $ & Class Type \\ $\ntype{N}, \ntype{S}$ & $::=$ & $\wctype{\overline{\wildcard{X}{U}{L}}}{C}{\ol{T}} $ & Class Type \\
\end{tabular}\\[0.5em] \end{tabular}\\[0.5em]
\noindent
Additional requirements:
\begin{itemize}
\item All types have to be well-formed: $\wcNtype{\Delta}{N} \in C \implies \Delta_{in} \vdash \wcNtype{\Delta}{N} \ \ok$
\item Naming scheme of every wildcard environment has to be the same.
%TODO: We need this so that wildcard substitutions get the correct name. also the Equals rule needs this condition
%Example:
Although alpha renaming of wildcards inside a type is allowed by the type system the \unify{} algorithm never does it.
Renaming wildcards leads to additional problems in the substitution rules and in the result containing substitutions with renamed wildcards.
For the \rulename{Equals} to work properly it is adviced to name all wildcards in a specific scheme.
For example by numbering them according to their appereance inside the type parameters
(e.g. $\wctype{\rwildcard{1}, \rwildcard{2}}{Pair}{\rwildcard{1}, \rwildcard{2}}$).
\end{itemize}
\item[Output:] \item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$ Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description} \end{description}
@ -142,28 +121,18 @@ The \unify{} algorithm internally uses the following data types:
The $\wtv{a}$ type variables are flagged as wildcard type variables. The $\wtv{a}$ type variables are flagged as wildcard type variables.
These type variables can be substituted by a wildcard or a type with free wildcard variables. These type variables can be substituted by a wildcard or a type with free wildcard variables.
As long as a type variable is flagged with as $\wtv{a}$ it will only be used by the subst-wc rule in step 1. As long as a type variable is flagged as $\wtv{a}$ it can be used by the \rulename{Subst-WC} rule in step 1.
In step 2 all of the wildcard flags are dismissed.
The output therefore never contains these flags.
\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule).
Capture conversion removes a types bounding environment $\Delta$.
Type variables used in its type parameters are now bound to a global scope and not locally anymore.
With \texttt{C} being class names and \texttt{A} being wildcard names. With \texttt{C} being class names and \texttt{A} being wildcard names.
The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$ The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$
and a name $\mathtt{X}$. and a name $\mathtt{X}$.
The \rulename{Normalize} rule eliminates wildcards. % TODO The \rulename{Tame} rule eliminates wildcards. %TODO
This is done by setting the upper and lower bound to the same value. This is done by setting the upper and lower bound to the same value.
\unify{} generates wildcards with the \rulename{\generalizeRule} rule.
It is important to generate new wildcards in a standardized fashion.
When having two constraints $\type{T} \lessdot \tv{a}$ and $\type{T} \lessdot \tv{b}$,
then after applying the \rulename{\generalizeRule} rule the freshly generated constraints are
$\tv{a} \doteq \type{T'}, \tv{b} \doteq \type{T'}$.
Both type variables get assigned the same type.
This is necessary for the \rulename{Equals} rule to work properly.
\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule).
Capture conversion removes a types bounding environment $\Delta$.
Type variables used in its type parameters are now bound to a global scope and not locally anymore.
\begin{figure} \begin{figure}
\begin{center} \begin{center}
@ -182,7 +151,7 @@ This is necessary for the \rulename{Equals} rule to work properly.
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset \text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\ \end{array}$\\
\\ \\
\rulename{Remove} & \rulename{Normalize} &
$\begin{array}[c]{l} $\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\ \wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline \hline
@ -579,14 +548,14 @@ $
\begin{array}[c]{@{}ll} \begin{array}[c]{@{}ll}
\begin{array}[c]{l} \begin{array}[c]{l}
\wildcardEnv \vdash C \cup \, \set{ \wildcardEnv \vdash C \cup \, \set{
\tv{b} \lessdot \tv{a}, \tv{b} \lessdot_1 \tv{a},
\tv{a} \lessdot \type{N}, \tv{b} \lessdot \type{N'}} \\ \tv{a} \lessdot_2 \type{N}, \tv{b} \lessdot_3 \type{N'}} \\
\hline \hline
\vspace*{-0.4cm}\\ \vspace*{-0.4cm}\\
\wildcardEnv \vdash C \cup \, \set{ \wildcardEnv \vdash C \cup \, \set{
\tv{b} \lessdot \type{N}, \tv{b} \lessdot \type{N},
\tv{b} \lessdot \tv{a}, \tv{b} \lessdot_1 \tv{a},
\tv{a} \lessdot \type{N} , \tv{b} \lessdot \type{N'} \tv{a} \lessdot_2 \type{N} , \tv{b} \lessdot_3 \type{N'}
} }
\end{array} \end{array}
\end{array} \end{array}
@ -643,35 +612,29 @@ C \cup \set{ \tv{a} \lessdot \wctype{\Delta'}{C}{[\ol{U}/\ol{X}]\ol{S}}}
\caption{Constraint reduce rules}\label{fig:reduce-rules} \caption{Constraint reduce rules}\label{fig:reduce-rules}
\end{figure} \end{figure}
The new constraint generated by the adopt rule may be eliminated by the match rule. Capture constraints are treated like regular subtype constraints.
The adopt rule still needs to be applied only once per constraint. All transformations for subtype constraints work for capture constraints aswell.
For clarification Subtype constraints are marked with a number.
Wildcards consist out of three parts. Subtype constraints holding the same number keep their respective type.
A name, a scope and a upper and lower bound. Newly created subtype constraints are always regular subtype constrains unless stated otherwise.
The \rulename{Adopt} rule for example takes multiple subtype constraints and adds a new one.
% The \unify{} algorithm from \cite{plue09_1} substitutes type variables with wildcards. Having the constraints
% A constraint $\wctype{\wildcard{X}{\type{Object}}{\bot}}{C}{\rwildcard{X}} \lessdot \exptype{C}{\tv{a}}$ $\ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdot \type{Object}$
% has no solution. would lead to
% Replacing the type variable $\tv{a}$ with the wildcard $\rwildcard{X}$ is not correct! $\wtv{b} \lessdot \type{String}, \ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdotCC \type{Object}$
% The wildcard $\rwildcard{X}$ cannot leave its scope and the type $\exptype{C}{\rwildcard{X}}$ after applying \rulename{Adopt}.
% is considered invalid. The new generated constraint $\wtv{b} \lessdot \type{String}$ is a normal subtype constraint.
The type placeholders which are annotated as wildcard placeholders also stay wildcard placeholders.
Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ The only rule that replaces wildcard type placeholders with regular placeholders is the \rulename{Normalize} rule.
is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc.
Also two of those boxes do not necessarily contain the same type.
But there are situations where it is possible to assume that.
For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$
is a pair of two boxes, which always contain the same type.
Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.
The algorithm holds two sets.
The input constraints and a wildcard environment $\wildcardEnv{}$ keeping free variables.
The algorithm starts with an empty wildcard environment $\wildcardEnv{}$. The algorithm starts with an empty wildcard environment $\wildcardEnv{}$.
Only the reduce rule adds wildcards to that environment. Only the \rulename{Capture} rule adds wildcards to that environment.
And every added wildcard gets a fresh unique name. And every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice. gets the same wildcard twice.
When a new type is generated by the \unify{} algorithm it always has the form
$\wctype{\ol{\rwildcard{A}}}{C}{\ol{\rwildcard{A}}}$.
\textbf{Step 1:} \textbf{Step 1:}
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively. Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively.
@ -1007,28 +970,8 @@ We have to consider both possibilities.
\\[1em] \\[1em]
\noindent \noindent
\textbf{Step 3} \textbf{Step 3:}
\textbf{(Eliminate Wildcard Variables):} We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 4.
If no more rules in step 2 are applicable \unify{} has to eliminate all wildcard variables and $\lessdotCC$ constraints by applying the \rulename{Remove} rule
and start over at step 1.
If $C$ does not contain any wildcard variables the algorithm proceeds with step 4.
\begin{center}
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Remove-Cons} & $
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\type{S} \lessdotCC \type{T} } \\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \vdash C \cup \set{\type{S} \lessdot \type{T} }
\end{array}
$
\end{tabular}}
\end{center}
\noindent
\textbf{Step 4:}
We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 6.
\begin{figure} \begin{figure}
\begin{center} \begin{center}
@ -1237,3 +1180,18 @@ But this renders additional problems:
\item Capture Converted variables are not allowed to leave their scope \item Capture Converted variables are not allowed to leave their scope
\item \unify{} generates type substitution which cannot be translated to Java types. \item \unify{} generates type substitution which cannot be translated to Java types.
\end{itemize} \end{itemize}
\subsection{Completeness}
It is not possible to create all super types of a type.
The General rule only creates the ones expressable by Java syntax, which still are infinitly many in some cases \cite{TamingWildcards}.
%thats not true. it can spawn X^T_T2.List<X> where T and T2 are types and we need to choose one inbetween them
Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java.
\subsection{Implementation}
%List this under implementation details
Every constraint that is not in solved form and is not able to be processed by any of the rules accounts as a error constraint and renders the constraint set unsolvable
The new constraint generated by the adopt rule may be eliminated by the match rule.
The adopt rule still needs to be applied only once per constraint.