From 7e1ef7b3c19b86f781b47f5747c47a7140ce1fbc Mon Sep 17 00:00:00 2001 From: Peter Thiemann Date: Wed, 29 May 2024 14:31:13 +0200 Subject: [PATCH] intro --- introduction.tex | 142 +++++++++++++++++++++++++++-------------------- 1 file changed, 82 insertions(+), 60 deletions(-) diff --git a/introduction.tex b/introduction.tex index 38d06e8..5b6ca9e 100644 --- a/introduction.tex +++ b/introduction.tex @@ -387,6 +387,10 @@ let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = \section{Global Type Inference Algorithm} \label{sec:glob-type-infer} +This section gives an overview of the global type inference algorithm +with examples and identifies the challenges that we had to overcome in +the design of the algorithm. + \begin{figure}[h] \begin{minipage}{0.49\textwidth} \begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample] @@ -428,25 +432,30 @@ let l2:(*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{ % \end{description} %Our algorithm is an extension of the \emph{Global Type Inference for Featherweight Generic Java}\cite{TIforFGJ} algorithm. -Listings \ref{lst:addExample}, \ref{lst:addExampleLet}, \ref{lst:addExampleCons}, and -\ref{lst:addExampleSolution} showcase our global type inference algorithm step by step. -In this example we know that the type of the variable \texttt{l} is an existential type and has to undergo a capture conversion +Listings~\ref{lst:addExample}, \ref{lst:addExampleLet}, \ref{lst:addExampleCons}, and +\ref{lst:addExampleSolution} showcase the global type inference algorithm step by step. +In the Java code in Listing~\ref{lst:addExample}, the type of variable +\texttt{l} is an existential type so that it has to undergo a capture conversion before being passed to a method call. -This is done by converting the program to A-Normal form \ref{lst:addExampleLet}, +To this end we convert the program to A-Normal form (Listing~\ref{lst:addExampleLet}), which introduces a let statement defining a new variable \texttt{v}. -Afterwards unknown types are replaced by type placeholders ($\tv{v}$ for the type of \texttt{v}) and constraints are generated (see \ref{lst:addExampleCons}). -These constraints mirror the type rules of our \TamedFJ{} calculus. +As the type of \texttt{v} is unknown, the algorithm puts a type +placeholder $\tv{v}$ for the type of \texttt{v} before generating constraints (see Listing~\ref{lst:addExampleCons}). +These constraints mirror the typing rules of the \TamedFJ{} calculus (section~\ref{sec:tifj}). % During the constraint generation step the type of the variable \texttt{v} is unknown % and given the type placeholder $\tv{v}$. -The methodcall to \texttt{add} spawns the constraint $\tv{v} \lessdotCC \exptype{List}{\wtv{a}}$. -Here we introduce a capture constraint ($\lessdotCC$) %a new type of subtype constraint -expressing that the left side of the constraint is subject to a capture conversion. -Now our unification algorithm \unify{} (defined in chapter \ref{sec:unify}) is used to solve these constraints. -In the starting set of constraints no type is assigned to $\tv{v}$ yet. -During the course of \unify{} more and more types emerge. -As soon as a concrete type for $\tv{v}$ is given \unify{} can conduct a capture conversion if needed. +The call to \texttt{add} generates the \emph{capture constraint} $\tv{v} \lessdotCC \exptype{List}{\wtv{a}}$. +This constraint ($\lessdotCC$) is a kind of subtype constraint, which additionally +expresses that the left side of the constraint is subject to a capture conversion. +Next, the unification algorithm \unify{} (section~\ref{sec:unify}) solves the constraints. +Initially, no type is assigned to $\tv{v}$. +% During the course of \unify{} more and more types emerge. +As soon as a concrete type for $\tv{v}$ is appears during constraint +solving, \unify{} can conduct a capture conversion if needed. %The constraints where this is possible are marked as capture constraints. -In this example $\tv{v}$ will be set to $\wctype{\wildcard{X}{\type{String}}{\bot}}{List}{\rwildcard{X}}$ leaving us with the following constraints: +In this example, $\tv{v}$ will be set to +$\wctype{\wildcard{X}{\type{String}}{\bot}}{List}{\rwildcard{X}}$ +leaving us with the following constraints: \begin{displaymath} \prftree[r]{Capture}{ @@ -460,17 +469,20 @@ In this example $\tv{v}$ will be set to $\wctype{\wildcard{X}{\type{String}}{\bo %which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ to $(\exptype{C}{\rwildcard{X}} \lessdot \type{T})$ The constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$ allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. -The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm. -Leaving us with the solution $\exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}$, $\type{String} \lessdot \rwildcard{Y}$ -The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied +The captured wildcard $\rwildcard{X}$ gets a fresh name \texttt{Y}, which is stored in the wildcard environment of the \unify{} algorithm. +Substituting \texttt{Y} for $\wtv{a}$ yields the constraints almost +solved: $\exptype{List}{\rwildcard{Y}} \lessdot +\exptype{List}{\rwildcard{Y}}$, $\type{String} \lessdot +\rwildcard{Y}$. +The first constraint is obviously satisfied and $\type{String} \lessdot \rwildcard{Y}$ is satisfied because $\rwildcard{Y}$ has $\type{String}$ as lower bound. -A correct Featherweight Java program including all type annotations and an explicit capture conversion via let statement is shown in listing \ref{lst:addExampleSolution}. -This program can be deducted from the type solution of our \unify{} algorithm presented in chapter \ref{sec:unify}. +A correct Featherweight Java program including all type annotations and an explicit capture conversion via let statement is shown in Listing \ref{lst:addExampleSolution}. +This program can be deduced from the solution of the \unify{} algorithm presented in Section~\ref{sec:unify}. In the body of the let statement the type $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$ -becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as -a type parameter to method call \texttt{add(v, "String")}. +becomes $\exptype{List}{\rwildcard{X}}$ and the capture converted wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ can be used as +a type parameter of the call \texttt{add(v, "String")}. % The input to our type inference algorithm is a modified version of the calculus in \cite{WildcardsNeedWitnessProtection} (see chapter \ref{sec:tifj}). % First \fjtype{} (see section \ref{chapter:constraintGeneration}) generates constraints @@ -496,32 +508,39 @@ a type parameter to method call \texttt{add(v, "String")}. % do not substitute free type variables -Global Type inference for Featherweight Java with generics but without wildcards is solved already. -But adding Wildcards to the calculus creates new problems we did not foresee -and which have not been recognized by an existing type unification algorithm for Java with wildcards \ref{plue09_1}. +Global type inference for Featherweight Generic Java without wildcards +has been considered elsewher \cite{TIforFGJ}. +Adding wildcards to the calculus created new problems, which have not +been recognized by existing work on type unification for Java with wildcards~\cite{plue09_1}. % what is the problem? % Java does invisible capture conversion steps -Java Wildcard types are represented as existential types and have to be opened before they can be used. -This can either be done implicitly (\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or explicitly via let statements (\cite{WildcardsNeedWitnessProtection}). -For all of those variations it is vital to know the argument types with which a method is called. -But our input program does not contain any type annotations. -We do not know where an existential type will emerge and where a capture conversion is necessary. -This has to be figured out during the type inference algorithm. -We detected three main challenges related to Java Wildcards and Global Type Inference: +Java's wildcard types are represented as existential types that have to be opened before they can be used. +Opening can either be done implicitly +(\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or +explicitly via a let statement (\cite{WildcardsNeedWitnessProtection}). +For all variations it is vital to know the argument types with which a method is called. +In the absence of type annotations, +we do not know where an existential type will emerge and where a capture conversion is necessary. +Rather, the type inference algorithm has to figure out the placement +of existentials. +We identified three main challenges related to Java wildcards and global type inference. \begin{enumerate} \item \label{challenge:1} -One challenge is to design the algorithm in a way that it finds a correct solution for the program shown in listing \ref{lst:addExample} -and rejects the program in listing \ref{lst:concatError}. +One challenge is to design the algorithm in a way that it finds a +correct solution for programs like the one shown in Listing~\ref{lst:addExample} +and rejects programs like the one in Listing~\ref{lst:concatError}. The first one is a valid Java program, because the type \texttt{List} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$ -which is used as the generic method parameter for the call to \texttt{add} as shown in listing \ref{lst:addExampleLet}. -Knowing that the type \texttt{String} is a subtype of the free variable $\rwildcard{X}$ +which is used as the generic method parameter for the call to \texttt{add} as shown in Listing~\ref{lst:addExampleLet}. +Because we know that the type \texttt{String} is a subtype of the free variable $\rwildcard{X}$, it is safe to pass \texttt{"String"} for the first parameter of the function. -The second program shown in listing \ref{lst:concatError} is incorrect. +The second program shown in Listing~\ref{lst:concatError} is incorrect. The method call to \texttt{concat} with two wildcard lists is unsound. -Each list could be of a different kind and therefore the \texttt{concat} cannot succeed. -The problem gets apparent when we try to write the \texttt{concat} method call in our \TamedFJ{} calculus (listing \ref{lst:concatTamedFJ}): +The element types of the lists may be unrelated, therefore the call to \texttt{concat} cannot succeed. +The problem gets apparent if we try to write the \texttt{concat} +method call in the \TamedFJ{} calculus +(Listing~\ref{lst:concatTamedFJ}): \texttt{l1'} and \texttt{l2'} are two different lists inside the body of the let statements, namely $\exptype{List}{\rwildcard{X}}$ and $\exptype{List}{\rwildcard{Y}}$. For the method call \texttt{concat(x1, x2)} no replacement for the generic \texttt{A} @@ -533,33 +552,33 @@ $\exptype{List}{\type{A}} <: \exptype{List}{\type{X}}, % Capture Conversion during Unify. \item -\unify{} morphs a constraint set into a correct type solution +\unify{} transforms a constraint set into a correct type solution by gradually assigning types to type placeholders during that process. Solved constraints are removed and never considered again. -In the following example \unify{} solves the constraint generated by the expression +In the following example, \unify{} solves the constraint generated by the expression \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$. \begin{verbatim} anyList() = new List() :? new List() add(anyList(), anyList().head()); \end{verbatim} -The type for \texttt{l} can be any kind of list, but it has to be a invariant one. +The type for \texttt{l} can be an arbitrary list, but it has to be a invariant one. Assigning a \texttt{List} for \texttt{l} is unsound, because the type list hiding behind \texttt{List} could be a different one for the \texttt{add} call than the \texttt{head} method call. An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ is solved by removing the wildcard $\rwildcard{X}$ if possible. -this problem is solved by ANF transformation +This problem is solved by ANF transformation \item % This problem is solved by assuming everything is a wildcard and lateron erasing excessive wildcards % this is solved by having wildcards bound to a type. But this makes it necessary to remove wildcards lateron otherwise Unify would have to backtrack -The program in listing \ref{shuffleExample} shows a challenge involving wildcards and subtyping. -The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has the type +The program in Listing~\ref{shuffleExample} exhibits a challenge involving wildcards and subtyping. +The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ representing a list of unknown lists. -Whereas $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of -$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ representing a list of lists, all of the same type $\rwildcard{X}$, -and can savely be passed to \texttt{shuffle}. +However, $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of +$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ which represents a list of lists, all of the same type $\rwildcard{X}$, +and can be passed safely to \texttt{shuffle}. This behavior can also be explained by looking at the types and their capture converted versions: \begin{center} \begin{tabular}{l | l | l} @@ -619,7 +638,7 @@ shuffle(l2d); // Valid % \end{center} % \end{constraintset} -Given such a program our type inference algorithm has to allow the call \texttt{shuffle(l2d)} and +Given such a program the type inference algorithm has to allow the call \texttt{shuffle(l2d)} and decline the call to \texttt{shuffle(l)}. % The method call \texttt{shuffle(l)} is invalid however, @@ -628,10 +647,10 @@ decline the call to \texttt{shuffle(l)}. % There is no solution for the subtype constraint: % $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$ -\item \label{challenge3} \textbf{Free Variables cannot leaver their scope}: +\item \label{challenge3} \textbf{Free variables cannot leave their scope}: Let's assume we have a variable \texttt{ls} with type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ %When calling the \texttt{id} function with an element of this list we have to apply capture conversion. -and the following input program: +and the following program: \noindent \begin{minipage}{0.62\textwidth} @@ -648,14 +667,15 @@ let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = ls.get(0) in id(x) % the variable z has to Take the Java program in listing \ref{lst:mapExample} for example. -It maps every element of a list -$\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$ -to a polymorphic \texttt{id} method. -We want to use our \unify{} algorithm to determine the correct type for the -variable \expr{l2}. +It uses map to apply a polymorphic method \texttt{id} to every element of a list +$\expr{l} : +\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$. + +How do we get the \unify{} algorithm to determine the correct type for the +variable \expr{l2}? Although we do not specify constraint generation for language constructs like lambda expressions used in this example, -we can imagine that the constraints have to look something like this: +we can imagine that the constraints have to look like in Listing~\ref{lst:mapExampleCons}. \begin{minipage}{0.45\textwidth} \begin{lstlisting}[caption=List Map Example,label=lst:mapExample] @@ -682,16 +702,18 @@ stem from the body of the lambda expression The T-Let rule prevents us from using free variables created by the method call to \expr{id} to be used in the return type $\tv{z}$. -But this has to be signaled to the \unify{} algorithm, which does not know about the origin and context of +But this restriction has to be communicated to the \unify{} algorithm, +which does not know about the origin and context of the constraints. If we naively substitute $\sigma(\tv{z}) = \exptype{List}{\rwildcard{A}}$ -the return type of the \texttt{map} function would be the type $\exptype{List}{\exptype{List}{\rwildcard{A}}}$. -This type solution is unsound. +the return type of the \texttt{map} function would be the type +$\exptype{List}{\exptype{List}{\rwildcard{A}}}$, which would be unsound. The type of \expr{l2} is the same as the one of \expr{l}: $\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$ \textbf{Solution:} -We solve this by distinguishing between wildcard placeholders and normal placeholders. +We solve this issue by distinguishing between wildcard placeholders +and normal placeholders and introducing them as needed. $\ntv{z}$ is a normal placeholder and is not allowed to contain free variables. \end{enumerate}