Add Global Type Inference introduction

This commit is contained in:
Andreas Stadelmeier 2024-03-05 17:12:56 +01:00
parent 3904304a1d
commit e15d61cdae
2 changed files with 149 additions and 228 deletions

View File

@ -23,37 +23,29 @@ resulting in a correctly typed program (see figure \ref{fig:nested-list-example-
We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
The algorithm is able find the correct type for the method \texttt{m} in the following example:
\begin{verbatim}
<X> Pair<X,X> make(List<X> l)
Boolean compare(Pair<X,X> p)
List<?> b;
Boolean m() = this.compare(this.make(b));
\end{verbatim}
We present a novel approach to deal with existential types and capture conversion during constraint unification.
The algorithm is split in two parts. A constraint generation step and an unification step.
We proof soundness and aim for a good compromise between completeness and time complexity.
Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
\begin{verbatim}
class SuperPair<A,B>{
A a;
B b;
}
class Pair<A,B> extends SuperPair<B,A>{
A a;
B b;
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
% class SuperPair<A,B>{
% A a;
% B b;
% }
% class Pair<A,B> extends SuperPair<B,A>{
% A a;
% B b;
<X> X choose(X a, X b){ return b; }
% <X> X choose(X a, X b){ return b; }
String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
return choose(choose(a,b).value.a,b.value.b);
}
}
\end{verbatim}
% String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
% return choose(choose(a,b).value.a,b.value.b);
% }
% }
% \end{verbatim}
\begin{figure}[tp]
\begin{subfigure}[t]{\linewidth}
@ -165,68 +157,6 @@ List<?> genList() {
% \texttt{List<Object>} is not a valid return type for the method \texttt{genList}.
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\subsection{Global Type Inference}
A global type inference algorithm works on an input with no type annotations at all.
%TODO: Describe global type inference and lateron why it is so hard to
Global type inference means that there are no type annotations at all.
\begin{verbatim}
m(l) {
return l.add(l);
}
\end{verbatim}
$\tv{l} \lessdotCC \exptype{List}{\wtv{x}}, \tv{l} \lessdotCC \wtv{x}$
No capture conversion for methods in the same class:
Given two methods without type annotations like
\begin{verbatim}
// m :: () -> r
m() = new List<String>() :? new List<Integer>();
// id :: (a) -> a
id(a) = a
\end{verbatim}
and a method call \texttt{id(m())} would lead to the constraints:
$\exptype{List}{\type{String}} \lessdot \ntv{r},
\exptype{List}{\type{Integer}} \lessdot \ntv{r},
\ntv{r} \lessdot \ntv{a}$
In this example capture conversion is not applicable,
because the \texttt{id} method is not polymorphic.
The type solution provided by \unify{} for this constraint set is:
$\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}},
\sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$
\begin{verbatim}
List<?> m() = new List<String>() :? new List<Integer>();
List<?> id(List<?> a) = a
\end{verbatim}
The following example has the \texttt{id} method already typed and the method \texttt{m}
extended by a recursive call \texttt{id(m())}:
\begin{verbatim}
<A> List<A> id(List<A> a) = a
m() = new List<String>() :? new List<Integer>() :? id(m());
\end{verbatim}
Now the constraints make use of a $\lessdotCC$ constraint:
$\exptype{List}{\type{String}} \lessdot \ntv{r},
\exptype{List}{\type{Integer}} \lessdot \ntv{r},
\ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$
After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before
we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$.
Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding
$\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
\textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$
is never assigned a type containing free variables.
Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution:
\begin{verbatim}
List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
\end{verbatim}
\subsection{Java Wildcards}
Wildcards are expressed by a \texttt{?} in Java and can be used as parameter types.
Wildcards can be formalized as existential types \cite{WildFJ}.
@ -261,47 +191,125 @@ $\generics{\rwildcard{Y}}\texttt{head}(\exptype{List}{\rwildcard{Y}})$
with $\rwildcard{Y}$ being used as generic parameter \texttt{X}.
The $\rwildcard{Y}$ in $\exptype{List}{\rwildcard{Y}}$ is a free variable now.
%TODO: Read taming wildcards and see if we solve some of the problems presented in section 5 and 6
% Additionally they can hold a upper or lower bound restriction like \texttt{List<? super String>}.
% Our representation of this type is: $\wctype{\wildcard{X}{\type{String}}{\type{Object}}}{List}{\rwildcard{X}}$
% Every wildcard has a name ($\rwildcard{X}$ in this case) and an upper and lower bound (respectively \texttt{Object} and \texttt{String}).
% \subsection{Extensibility} % NOTE: this thing is useless, because final local variables do not need to contain wildcards
% In \cite{WildcardsNeedWitnessProtection} capture converson is made explicit with \texttt{let} statements.
% We chain let statements in a way that emulates Java behaviour. Allowing the example in \cite{aModelForJavaWithWildcards}
% % TODO: Explain the advantage of constraints and how we control the way capture conversion is executed
% But it would be also possible to alter the constraint generation step to include additional features.
% Final variables or effectively final variables could be expressed with a
\subsection{Global Type Inference}
% A global type inference algorithm works on an input with no type annotations at all.
% \begin{verbatim}
% <X> concat(List<X> l1, List<X> l2){...}
% m(l) {
% return l.add(l);
% }
% \end{verbatim}
% \begin{minipage}{0.50\textwidth}
% Java:
% \begin{verbatim}
% final List<?> l = ...;
% concat(l, l);
% \end{verbatim}
% \end{minipage}%
% \begin{minipage}{0.50\textwidth}
% Featherweight Java:
% \begin{verbatim}
% let l : X.List<X> = ... in
% concat(l, l)
% \end{verbatim}
% \end{minipage}
% $\tv{l} \lessdotCC \exptype{List}{\wtv{x}}, \tv{l} \lessdotCC \wtv{x}$
The input to our type inference algorithm is a modified version of the \letfj{} calculus presented in \cite{WildcardsNeedWitnessProtection}.
\fjtype{} generates constraints from an input program and is the first step of the algorithm.
Afterwards \unify{} computes a solution for the given constraint set.
Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$.
Additionally to types constraints can also hold type placeholders $\ntv{a}$ and wildcard type placeholders $\wtv{a}$.
A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}.
\textit{Example:} $\exptype{List}{\tv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\tv{a}$ with the type $\type{String}$.
Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforGFJ}.
The novel capture constraints and wildcard placeholders are needed for method invocations involving wildcards.
%show input and a correct letFJ representation
Even in a full typed program local type inference can be necessary.
%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI
Our algorithm has to find a type solution and a \letfj{} representation for a given input program.
Let's start with an example where all types are already given:
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
<A> List<A> add(List<A> l, A v)
List<? super String> l = ...;
add(l, "String");
\end{lstlisting}
\begin{lstlisting}[style=letfj, caption=\letfj{} representation of \texttt{add(l, "String")}, label=lst:addExampleLet]
let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l in <X>add(l2, "String");
\end{lstlisting}
In \letfj{} there is no local type inference and all type parameters for a method call are mandatory.
If wildcards are involved the so called capture conversion has to be done manually via let statements.
A let statement \emph{opens} an existential type.
In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
a type parameter to \texttt{<X>add(...)}.
%This is a valid Java program where the type parameters for the polymorphic method \texttt{add}
%are determined by local type inference.
Our type inference algorithm has to add let statements if necessary, including the capture types.
Lets have a look at the constraints generated by \fjtype{} for the example in listing \ref{lst:addExample}:
\begin{constraintset}
$\begin{array}{l}
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a}
\\
\hline
\text{Capture Conversion:}\ \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a}
\\
\hline
\text{Solution:}\ \wtv{a} \doteq \rwildcard{X} \implies \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\rwildcard{X}}, \, \type{String} \lessdot \rwildcard{X}
\end{array}
$
\end{constraintset}
%Why do we need the lessdotCC constraints here?
The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}).
Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$
which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{X}$ is satisfied because $\rwildcard{X}$ has $\type{String}$ as lower bound.
% No capture conversion for methods in the same class:
% Given two methods without type annotations like
% \begin{verbatim}
% // m :: () -> r
% m() = new List<String>() :? new List<Integer>();
% // id :: (a) -> a
% id(a) = a
% \end{verbatim}
% and a method call \texttt{id(m())} would lead to the constraints:
% $\exptype{List}{\type{String}} \lessdot \ntv{r},
% \exptype{List}{\type{Integer}} \lessdot \ntv{r},
% \ntv{r} \lessdotCC \ntv{a}$
% In this example capture conversion is not applicable,
% because the \texttt{id} method is not polymorphic.
% The type solution provided by \unify{} for this constraint set is:
% $\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}},
% \sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$
% \begin{verbatim}
% List<?> m() = new List<String>() :? new List<Integer>();
% List<?> id(List<?> a) = a
% \end{verbatim}
The following example has the \texttt{id} method already typed and the method \texttt{m}
extended by a recursive call \texttt{id(m())}:
\begin{verbatim}
<A> List<A> id(List<A> a) = a
m() = new List<String>() :? new List<Integer>() :? id(m());
\end{verbatim}
Now the constraints make use of a $\lessdotCC$ constraint:
$\exptype{List}{\type{String}} \lessdot \ntv{r},
\exptype{List}{\type{Integer}} \lessdot \ntv{r},
\ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$
After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before
we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$.
Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding
$\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
\textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$
is never assigned a type containing free variables.
Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution:
\begin{verbatim}
List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
\end{verbatim}
\subsection{Challenges}\label{challenges}
%TODO: Wildcard subtyping is infinite see \cite{TamingWildcards}
One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
A wildcard in the Java syntax has no name and is bound to its enclosing type.
\texttt{List<List<?>>} equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking intermediate types like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$
$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking \emph{intermediate types} like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$
or $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$ can emerge, which have no equivalent in the Java syntax.
Our type inference algorithm uses existential types internally but spawns type solutions compatible with Java.
@ -330,19 +338,11 @@ Knowing that the type \texttt{String} is a subtype of any type the wildcard \tex
it is safe to pass \texttt{"String"} for the first parameter of the function.
\begin{verbatim}
<A> List<A> add(A a, List<A> la) {}
<A> List<A> add(List<A> l, A v) {}
List<? super String> list = ...;
add("String", list);
List<? super String> list = ...;
add("String", list);
\end{verbatim}
The constraints representing this code snippet are:
\begin{displaymath}
\type{String} \lessdotCC \wtv{a},\,
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}
\end{displaymath}
Here $\sigma(\tv{a}) = \rwildcard{X}$ is a valid solution.
\end{example}
\item \begin{example}\label{intro-example2}
@ -350,117 +350,36 @@ This example displays an incorrect Java program.
The method call to \texttt{concat} with two wildcard lists is unsound.
Each list could be of a different kind and therefore the \texttt{concat} cannot succeed.
\begin{verbatim}
<A> List<A> concat(List<A> l1, List<A> l2) {}
<A> List<A> concat(List<A> l1, List<A> l2) { ... }
List<?> list = ... ;
concat(list, list);
List<?> list = ... ;
concat(list, list);
\end{verbatim}
The constraints for this example are:
$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\
\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
%$\exptype{List}{?} \lessdot \exptype{List}{\tv{a}},
%\exptype{List}{?} \lessdot \exptype{List}{\tv{a}}$
Remember that the given constraints cannot have a valid solution.
%This is due to the fact that the wildcard variables cannot leave their scope.
In this example the \unify{} algorithm should not replace $\tv{a}$ with the captured wildcard $\rwildcard{X}$.
% $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\
% \wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
\end{example}
\item
\unify{} can only remove wildcards, but never add wildcards to an existing type.
Java subtyping is invariant.
The type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ cannot be a subtype of $\exptype{List}{\tv{a}}$
% \item
% \unify{} morphs a constraint set into a correct type solution
% gradually assigning types to type placeholders during that process.
% Solved constraints are removed and never considered again.
% In the following example \unify{} solves the constraint generated by the expression
% \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$.
% \begin{verbatim}
% anyList() = new List<String>() :? new List<Integer>()
\unify{} morphs a constraint set into a correct type solution
gradually assigning types to type placeholders during that process.
Solved constraints are removed and never considered again.
In the following example \unify{} solves the constraint generated by the expression
\texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$.
\begin{verbatim}
List<?> l = ...;
m2() {
l.add(l.head());
}
\end{verbatim}
The type for \texttt{l} can be any kind of list, but it has to be a invariant one.
Assigning a \texttt{List<?>} for \texttt{l} is unsound, because the type list hiding behind
\texttt{List<?>} could be a different one for the \texttt{add} call than the \texttt{head} method call.
An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
is solved by removing the wildcard $\rwildcard{X}$ if possible.
% add(anyList(), anyList().head());
% \end{verbatim}
% The type for \texttt{l} can be any kind of list, but it has to be a invariant one.
% Assigning a \texttt{List<?>} for \texttt{l} is unsound, because the type list hiding behind
% \texttt{List<?>} could be a different one for the \texttt{add} call than the \texttt{head} method call.
% An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
% is solved by removing the wildcard $\rwildcard{X}$ if possible.
\end{itemize}
The \unify{} algorithm only sees the constraints with no information about the program they originated from.
The main challenge was to find an algorithm which computes $\sigma(\tv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.
% Solution: A type variable can only take one type and not a wildcard type.
% A wildcard type is only treated like a wildcard while his definition is in scope (during the reduce rule)
% \subsection{Capture Conversion}
% \begin{verbatim}
% <A> List<A> add(A a, List<A> la) {}
% List<? super String> list = ...;
% add("String", list);
% \end{verbatim}
% The constraints representig this code snippet are:
% $\type{String} \lessdot \tv{a},
% \wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\tv{a}}$
% Under the hood the typechecker has to find a replacement for the generic method parameter \texttt{A}.
% Java allows a method \texttt{<A> List<A> add(A a, List<A> la)} to be called with
% \texttt{String} and \texttt{List<? super String>}.
% A naive approach would be to treat wildcards like any other type and replace \texttt{A}
% with \texttt{? super String}.
% Generating the method type \texttt{List<? super String> add(? super String a, List<? super String> la)}.
% This does not work for a method like
% \texttt{<A> void merge(List<A> l1, List<A> l2)}.
% It is crucial for this method to be called with two lists of the same type.
% Substituting a wildcard for the generic \texttt{A} leads to
% \texttt{void merge(List<?> l1, List<?> l2)},
% which is unsound.
% Capture conversion utilizes the fact that instantiated classes always have an actual type.
% \texttt{new List<String>()} and \texttt{new List<Object>()} are
% valid instances.
% \texttt{new List<? super String>()} is not.
% Every type $\wcNtype{\Delta}{N}$ contains an underlying instance of a type $\type{N}$.
% Knowing this fact allows to make additional assumptions.
% \wildFJ{} type rules allow for generic parameters to be replaced by wildcard types.
% This is possible because wildcards are split into two parts.
% Their declaration at the environment part $\Delta$ of a type $\wcNtype{\Delta}{N}$
% and their actual use in the body of the type $\type{N}$.
% Replacing the generic \texttt{A} in the type \texttt{List<A>}
% by a wildcard $\type{X}$ will not generate the type \texttt{List<?>}.
% \texttt{List<?>} is the equivalent of $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\type{X}}$.
% The generated type here is $\exptype{List}{\type{X}}$, which has different subtype properties.
% There is no information about what the type $\exptype{List}{\type{X}}$ actually is.
% It has itself as a subtype and for example a type like $\exptype{ArrayList}{\type{X}}$.
% A method call for example only works with values, which are correct instances of classes.
% Capture conversion automatically converts wildcard type to a concrete class type,
% which then can be used as a parameter for a method call.
% In \wildFJ{} this is done via let statements.
% Written in FJ syntax: $\type{N}$ is a initialized type and
% $\wcNtype{\Delta}{N}$ is a type containing wildcards $\Delta$.
% It is crucial for the wildcard names to never be used twice.
% Otherwise the members of a list with type $\wctype{\type{X}}{List}{\type{X}}$ and
% a list with type $\wctype{\type{X}}{List}{\type{X}}$ would be the same. $\type{X}$ in fact.
% The let statement adds wildcards to the environment.
% Those wildcards are not allowed to escape the scope of the let statement.
% Which is a problem for our type inference algorithm.
% The capture converted version of the type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ is
% $\exptype{Box}{\rwildcard{X}}$.
% The captured type is only used as an intermediate type during a method call.
The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.

View File

@ -13,6 +13,7 @@
}
\lstdefinestyle{fgj}{backgroundcolor=\color{lime!20}}
\lstdefinestyle{tfgj}{backgroundcolor=\color{lightgray!20}}
\lstdefinestyle{letfj}{backgroundcolor=\color{lightgray!20}}
\newcommand\mv[1]{{\tt #1}}
@ -98,6 +99,7 @@
\newcommand{\localVarAssumption}{\ensuremath{\mathtt{\eta}}}
\newcommand{\expandLB}{\textit{expandLB}}
\newcommand{\letfj}{\emph{LetFJ}}
\newcommand{\tph}{\text{tph}}
\def\exptypett#1#2{\texttt{#1}\textrm{{\tt <#2}}\textrm{{\tt >}}\xspace}