From e15d61cdae5c0bec2346f9c4c30b45fc8d24dd30 Mon Sep 17 00:00:00 2001 From: Andreas Stadelmeier Date: Tue, 5 Mar 2024 17:12:56 +0100 Subject: [PATCH] Add Global Type Inference introduction --- introduction.tex | 375 +++++++++++++++++++---------------------------- prolog.tex | 2 + 2 files changed, 149 insertions(+), 228 deletions(-) diff --git a/introduction.tex b/introduction.tex index b07d6a8..ea9a85e 100644 --- a/introduction.tex +++ b/introduction.tex @@ -23,37 +23,29 @@ resulting in a correctly typed program (see figure \ref{fig:nested-list-example- We support capture conversion and Java style method calls. This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}. -The algorithm is able find the correct type for the method \texttt{m} in the following example: -\begin{verbatim} - Pair make(List l) - Boolean compare(Pair p) - - List b; - Boolean m() = this.compare(this.make(b)); -\end{verbatim} We present a novel approach to deal with existential types and capture conversion during constraint unification. The algorithm is split in two parts. A constraint generation step and an unification step. We proof soundness and aim for a good compromise between completeness and time complexity. -Our algorithm finds a correct type solution for the following example, where the Java local type inference fails: -\begin{verbatim} -class SuperPair{ - A a; - B b; -} -class Pair extends SuperPair{ - A a; - B b; +% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails: +% \begin{verbatim} +% class SuperPair{ +% A a; +% B b; +% } +% class Pair extends SuperPair{ +% A a; +% B b; - X choose(X a, X b){ return b; } +% X choose(X a, X b){ return b; } - String m(List> a, List> b){ - return choose(choose(a,b).value.a,b.value.b); - } -} -\end{verbatim} +% String m(List> a, List> b){ +% return choose(choose(a,b).value.a,b.value.b); +% } +% } +% \end{verbatim} \begin{figure}[tp] \begin{subfigure}[t]{\linewidth} @@ -165,68 +157,6 @@ List genList() { % \texttt{List} is not a valid return type for the method \texttt{genList}. % The type inference algorithm has to find the correct type involving wildcards (\texttt{List}). -\subsection{Global Type Inference} -A global type inference algorithm works on an input with no type annotations at all. -%TODO: Describe global type inference and lateron why it is so hard to -Global type inference means that there are no type annotations at all. -\begin{verbatim} -m(l) { - return l.add(l); -} -\end{verbatim} - -$\tv{l} \lessdotCC \exptype{List}{\wtv{x}}, \tv{l} \lessdotCC \wtv{x}$ - - -No capture conversion for methods in the same class: -Given two methods without type annotations like -\begin{verbatim} -// m :: () -> r -m() = new List() :? new List(); - -// id :: (a) -> a -id(a) = a -\end{verbatim} - -and a method call \texttt{id(m())} would lead to the constraints: -$\exptype{List}{\type{String}} \lessdot \ntv{r}, -\exptype{List}{\type{Integer}} \lessdot \ntv{r}, -\ntv{r} \lessdot \ntv{a}$ - -In this example capture conversion is not applicable, -because the \texttt{id} method is not polymorphic. -The type solution provided by \unify{} for this constraint set is: -$\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}, -\sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ -\begin{verbatim} -List m() = new List() :? new List(); -List id(List a) = a -\end{verbatim} - -The following example has the \texttt{id} method already typed and the method \texttt{m} -extended by a recursive call \texttt{id(m())}: -\begin{verbatim} - List id(List a) = a - -m() = new List() :? new List() :? id(m()); -\end{verbatim} -Now the constraints make use of a $\lessdotCC$ constraint: -$\exptype{List}{\type{String}} \lessdot \ntv{r}, -\exptype{List}{\type{Integer}} \lessdot \ntv{r}, -\ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$ - -After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before -we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$. -Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding -$\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. -\textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$ -is never assigned a type containing free variables. -Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution: -\begin{verbatim} -List m() = new List() :? new List() :? id(m()); -\end{verbatim} - - \subsection{Java Wildcards} Wildcards are expressed by a \texttt{?} in Java and can be used as parameter types. Wildcards can be formalized as existential types \cite{WildFJ}. @@ -261,47 +191,125 @@ $\generics{\rwildcard{Y}}\texttt{head}(\exptype{List}{\rwildcard{Y}})$ with $\rwildcard{Y}$ being used as generic parameter \texttt{X}. The $\rwildcard{Y}$ in $\exptype{List}{\rwildcard{Y}}$ is a free variable now. -%TODO: Read taming wildcards and see if we solve some of the problems presented in section 5 and 6 - -% Additionally they can hold a upper or lower bound restriction like \texttt{List}. -% Our representation of this type is: $\wctype{\wildcard{X}{\type{String}}{\type{Object}}}{List}{\rwildcard{X}}$ -% Every wildcard has a name ($\rwildcard{X}$ in this case) and an upper and lower bound (respectively \texttt{Object} and \texttt{String}). - -% \subsection{Extensibility} % NOTE: this thing is useless, because final local variables do not need to contain wildcards -% In \cite{WildcardsNeedWitnessProtection} capture converson is made explicit with \texttt{let} statements. -% We chain let statements in a way that emulates Java behaviour. Allowing the example in \cite{aModelForJavaWithWildcards} -% % TODO: Explain the advantage of constraints and how we control the way capture conversion is executed -% But it would be also possible to alter the constraint generation step to include additional features. -% Final variables or effectively final variables could be expressed with a +\subsection{Global Type Inference} +% A global type inference algorithm works on an input with no type annotations at all. % \begin{verbatim} -% concat(List l1, List l2){...} +% m(l) { +% return l.add(l); +% } % \end{verbatim} -% \begin{minipage}{0.50\textwidth} -% Java: -% \begin{verbatim} -% final List l = ...; -% concat(l, l); -% \end{verbatim} -% \end{minipage}% -% \begin{minipage}{0.50\textwidth} -% Featherweight Java: -% \begin{verbatim} -% let l : X.List = ... in -% concat(l, l) -% \end{verbatim} -% \end{minipage} +% $\tv{l} \lessdotCC \exptype{List}{\wtv{x}}, \tv{l} \lessdotCC \wtv{x}$ +The input to our type inference algorithm is a modified version of the \letfj{} calculus presented in \cite{WildcardsNeedWitnessProtection}. +\fjtype{} generates constraints from an input program and is the first step of the algorithm. +Afterwards \unify{} computes a solution for the given constraint set. +Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$. +Additionally to types constraints can also hold type placeholders $\ntv{a}$ and wildcard type placeholders $\wtv{a}$. +A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}. +\textit{Example:} $\exptype{List}{\tv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\tv{a}$ with the type $\type{String}$. +Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforGFJ}. +The novel capture constraints and wildcard placeholders are needed for method invocations involving wildcards. + +%show input and a correct letFJ representation +Even in a full typed program local type inference can be necessary. +%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI +Our algorithm has to find a type solution and a \letfj{} representation for a given input program. +Let's start with an example where all types are already given: +\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample] + List add(List l, A v) + +List l = ...; +add(l, "String"); +\end{lstlisting} +\begin{lstlisting}[style=letfj, caption=\letfj{} representation of \texttt{add(l, "String")}, label=lst:addExampleLet] + let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l in add(l2, "String"); +\end{lstlisting} +In \letfj{} there is no local type inference and all type parameters for a method call are mandatory. +If wildcards are involved the so called capture conversion has to be done manually via let statements. +A let statement \emph{opens} an existential type. +In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$ +becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as +a type parameter to \texttt{add(...)}. +%This is a valid Java program where the type parameters for the polymorphic method \texttt{add} +%are determined by local type inference. +Our type inference algorithm has to add let statements if necessary, including the capture types. + +Lets have a look at the constraints generated by \fjtype{} for the example in listing \ref{lst:addExample}: +\begin{constraintset} +$\begin{array}{l} +\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a} +\\ +\hline +\text{Capture Conversion:}\ \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a} +\\ +\hline +\text{Solution:}\ \wtv{a} \doteq \rwildcard{X} \implies \exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\rwildcard{X}}, \, \type{String} \lessdot \rwildcard{X} +\end{array} +$ +\end{constraintset} +%Why do we need the lessdotCC constraints here? +The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}). +Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$ +which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. +\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{X}$ is satisfied because $\rwildcard{X}$ has $\type{String}$ as lower bound. + +% No capture conversion for methods in the same class: +% Given two methods without type annotations like +% \begin{verbatim} +% // m :: () -> r +% m() = new List() :? new List(); + +% // id :: (a) -> a +% id(a) = a +% \end{verbatim} + +% and a method call \texttt{id(m())} would lead to the constraints: +% $\exptype{List}{\type{String}} \lessdot \ntv{r}, +% \exptype{List}{\type{Integer}} \lessdot \ntv{r}, +% \ntv{r} \lessdotCC \ntv{a}$ + +% In this example capture conversion is not applicable, +% because the \texttt{id} method is not polymorphic. +% The type solution provided by \unify{} for this constraint set is: +% $\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}, +% \sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ +% \begin{verbatim} +% List m() = new List() :? new List(); +% List id(List a) = a +% \end{verbatim} + +The following example has the \texttt{id} method already typed and the method \texttt{m} +extended by a recursive call \texttt{id(m())}: +\begin{verbatim} + List id(List a) = a + +m() = new List() :? new List() :? id(m()); +\end{verbatim} +Now the constraints make use of a $\lessdotCC$ constraint: +$\exptype{List}{\type{String}} \lessdot \ntv{r}, +\exptype{List}{\type{Integer}} \lessdot \ntv{r}, +\ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$ + +After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before +we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$. +Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding +$\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. +\textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$ +is never assigned a type containing free variables. +Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution: +\begin{verbatim} +List m() = new List() :? new List() :? id(m()); +\end{verbatim} \subsection{Challenges}\label{challenges} %TODO: Wildcard subtyping is infinite see \cite{TamingWildcards} - One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}. A wildcard in the Java syntax has no name and is bound to its enclosing type. -\texttt{List>} equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$. -During type checking intermediate types like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ +$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$. +During type checking \emph{intermediate types} like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ or $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$ can emerge, which have no equivalent in the Java syntax. Our type inference algorithm uses existential types internally but spawns type solutions compatible with Java. @@ -330,19 +338,11 @@ Knowing that the type \texttt{String} is a subtype of any type the wildcard \tex it is safe to pass \texttt{"String"} for the first parameter of the function. \begin{verbatim} - List add(A a, List la) {} + List add(List l, A v) {} - List list = ...; - add("String", list); +List list = ...; +add("String", list); \end{verbatim} -The constraints representing this code snippet are: - -\begin{displaymath} - \type{String} \lessdotCC \wtv{a},\, -\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}} -\end{displaymath} - -Here $\sigma(\tv{a}) = \rwildcard{X}$ is a valid solution. \end{example} \item \begin{example}\label{intro-example2} @@ -350,117 +350,36 @@ This example displays an incorrect Java program. The method call to \texttt{concat} with two wildcard lists is unsound. Each list could be of a different kind and therefore the \texttt{concat} cannot succeed. \begin{verbatim} - List concat(List l1, List l2) {} + List concat(List l1, List l2) { ... } - List list = ... ; - concat(list, list); +List list = ... ; +concat(list, list); \end{verbatim} -The constraints for this example are: -$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\ -\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ -%$\exptype{List}{?} \lessdot \exptype{List}{\tv{a}}, -%\exptype{List}{?} \lessdot \exptype{List}{\tv{a}}$ -Remember that the given constraints cannot have a valid solution. -%This is due to the fact that the wildcard variables cannot leave their scope. -In this example the \unify{} algorithm should not replace $\tv{a}$ with the captured wildcard $\rwildcard{X}$. +% $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\ +% \wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ \end{example} -\item -\unify{} can only remove wildcards, but never add wildcards to an existing type. -Java subtyping is invariant. -The type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ cannot be a subtype of $\exptype{List}{\tv{a}}$ +% \item +% \unify{} morphs a constraint set into a correct type solution +% gradually assigning types to type placeholders during that process. +% Solved constraints are removed and never considered again. +% In the following example \unify{} solves the constraint generated by the expression +% \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$. +% \begin{verbatim} +% anyList() = new List() :? new List() -\unify{} morphs a constraint set into a correct type solution -gradually assigning types to type placeholders during that process. -Solved constraints are removed and never considered again. -In the following example \unify{} solves the constraint generated by the expression -\texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$. -\begin{verbatim} -List l = ...; - -m2() { - l.add(l.head()); -} -\end{verbatim} -The type for \texttt{l} can be any kind of list, but it has to be a invariant one. -Assigning a \texttt{List} for \texttt{l} is unsound, because the type list hiding behind -\texttt{List} could be a different one for the \texttt{add} call than the \texttt{head} method call. -An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ -is solved by removing the wildcard $\rwildcard{X}$ if possible. +% add(anyList(), anyList().head()); +% \end{verbatim} +% The type for \texttt{l} can be any kind of list, but it has to be a invariant one. +% Assigning a \texttt{List} for \texttt{l} is unsound, because the type list hiding behind +% \texttt{List} could be a different one for the \texttt{add} call than the \texttt{head} method call. +% An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ +% is solved by removing the wildcard $\rwildcard{X}$ if possible. \end{itemize} The \unify{} algorithm only sees the constraints with no information about the program they originated from. -The main challenge was to find an algorithm which computes $\sigma(\tv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}. -% Solution: A type variable can only take one type and not a wildcard type. - % A wildcard type is only treated like a wildcard while his definition is in scope (during the reduce rule) - - -% \subsection{Capture Conversion} -% \begin{verbatim} -% List add(A a, List la) {} - -% List list = ...; - -% add("String", list); -% \end{verbatim} -% The constraints representig this code snippet are: -% $\type{String} \lessdot \tv{a}, -% \wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\tv{a}}$ -% Under the hood the typechecker has to find a replacement for the generic method parameter \texttt{A}. - -% Java allows a method \texttt{ List add(A a, List la)} to be called with -% \texttt{String} and \texttt{List}. -% A naive approach would be to treat wildcards like any other type and replace \texttt{A} -% with \texttt{? super String}. -% Generating the method type \texttt{List add(? super String a, List la)}. - -% This does not work for a method like -% \texttt{ void merge(List l1, List l2)}. -% It is crucial for this method to be called with two lists of the same type. -% Substituting a wildcard for the generic \texttt{A} leads to -% \texttt{void merge(List l1, List l2)}, -% which is unsound. - -% Capture conversion utilizes the fact that instantiated classes always have an actual type. -% \texttt{new List()} and \texttt{new List()} are -% valid instances. -% \texttt{new List()} is not. -% Every type $\wcNtype{\Delta}{N}$ contains an underlying instance of a type $\type{N}$. -% Knowing this fact allows to make additional assumptions. - -% \wildFJ{} type rules allow for generic parameters to be replaced by wildcard types. -% This is possible because wildcards are split into two parts. -% Their declaration at the environment part $\Delta$ of a type $\wcNtype{\Delta}{N}$ -% and their actual use in the body of the type $\type{N}$. -% Replacing the generic \texttt{A} in the type \texttt{List} -% by a wildcard $\type{X}$ will not generate the type \texttt{List}. -% \texttt{List} is the equivalent of $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\type{X}}$. -% The generated type here is $\exptype{List}{\type{X}}$, which has different subtype properties. -% There is no information about what the type $\exptype{List}{\type{X}}$ actually is. -% It has itself as a subtype and for example a type like $\exptype{ArrayList}{\type{X}}$. - -% A method call for example only works with values, which are correct instances of classes. -% Capture conversion automatically converts wildcard type to a concrete class type, -% which then can be used as a parameter for a method call. - -% In \wildFJ{} this is done via let statements. - -% Written in FJ syntax: $\type{N}$ is a initialized type and -% $\wcNtype{\Delta}{N}$ is a type containing wildcards $\Delta$. - -% It is crucial for the wildcard names to never be used twice. -% Otherwise the members of a list with type $\wctype{\type{X}}{List}{\type{X}}$ and -% a list with type $\wctype{\type{X}}{List}{\type{X}}$ would be the same. $\type{X}$ in fact. - -% The let statement adds wildcards to the environment. -% Those wildcards are not allowed to escape the scope of the let statement. -% Which is a problem for our type inference algorithm. - -% The capture converted version of the type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ is -% $\exptype{Box}{\rwildcard{X}}$. -% The captured type is only used as an intermediate type during a method call. - +The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}. diff --git a/prolog.tex b/prolog.tex index fc60133..60a6401 100755 --- a/prolog.tex +++ b/prolog.tex @@ -13,6 +13,7 @@ } \lstdefinestyle{fgj}{backgroundcolor=\color{lime!20}} \lstdefinestyle{tfgj}{backgroundcolor=\color{lightgray!20}} +\lstdefinestyle{letfj}{backgroundcolor=\color{lightgray!20}} \newcommand\mv[1]{{\tt #1}} @@ -98,6 +99,7 @@ \newcommand{\localVarAssumption}{\ensuremath{\mathtt{\eta}}} \newcommand{\expandLB}{\textit{expandLB}} +\newcommand{\letfj}{\emph{LetFJ}} \newcommand{\tph}{\text{tph}} \def\exptypett#1#2{\texttt{#1}\textrm{{\tt <#2}}\textrm{{\tt >}}\xspace}