This commit is contained in:
Peter Thiemann 2024-05-29 14:31:13 +02:00
parent 8b0e77cf50
commit 7e1ef7b3c1

View File

@ -387,6 +387,10 @@ let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) =
\section{Global Type Inference Algorithm}
\label{sec:glob-type-infer}
This section gives an overview of the global type inference algorithm
with examples and identifies the challenges that we had to overcome in
the design of the algorithm.
\begin{figure}[h]
\begin{minipage}{0.49\textwidth}
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
@ -428,25 +432,30 @@ let l2:(*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{
% \end{description}
%Our algorithm is an extension of the \emph{Global Type Inference for Featherweight Generic Java}\cite{TIforFGJ} algorithm.
Listings \ref{lst:addExample}, \ref{lst:addExampleLet}, \ref{lst:addExampleCons}, and
\ref{lst:addExampleSolution} showcase our global type inference algorithm step by step.
In this example we know that the type of the variable \texttt{l} is an existential type and has to undergo a capture conversion
Listings~\ref{lst:addExample}, \ref{lst:addExampleLet}, \ref{lst:addExampleCons}, and
\ref{lst:addExampleSolution} showcase the global type inference algorithm step by step.
In the Java code in Listing~\ref{lst:addExample}, the type of variable
\texttt{l} is an existential type so that it has to undergo a capture conversion
before being passed to a method call.
This is done by converting the program to A-Normal form \ref{lst:addExampleLet},
To this end we convert the program to A-Normal form (Listing~\ref{lst:addExampleLet}),
which introduces a let statement defining a new variable \texttt{v}.
Afterwards unknown types are replaced by type placeholders ($\tv{v}$ for the type of \texttt{v}) and constraints are generated (see \ref{lst:addExampleCons}).
These constraints mirror the type rules of our \TamedFJ{} calculus.
As the type of \texttt{v} is unknown, the algorithm puts a type
placeholder $\tv{v}$ for the type of \texttt{v} before generating constraints (see Listing~\ref{lst:addExampleCons}).
These constraints mirror the typing rules of the \TamedFJ{} calculus (section~\ref{sec:tifj}).
% During the constraint generation step the type of the variable \texttt{v} is unknown
% and given the type placeholder $\tv{v}$.
The methodcall to \texttt{add} spawns the constraint $\tv{v} \lessdotCC \exptype{List}{\wtv{a}}$.
Here we introduce a capture constraint ($\lessdotCC$) %a new type of subtype constraint
expressing that the left side of the constraint is subject to a capture conversion.
Now our unification algorithm \unify{} (defined in chapter \ref{sec:unify}) is used to solve these constraints.
In the starting set of constraints no type is assigned to $\tv{v}$ yet.
During the course of \unify{} more and more types emerge.
As soon as a concrete type for $\tv{v}$ is given \unify{} can conduct a capture conversion if needed.
The call to \texttt{add} generates the \emph{capture constraint} $\tv{v} \lessdotCC \exptype{List}{\wtv{a}}$.
This constraint ($\lessdotCC$) is a kind of subtype constraint, which additionally
expresses that the left side of the constraint is subject to a capture conversion.
Next, the unification algorithm \unify{} (section~\ref{sec:unify}) solves the constraints.
Initially, no type is assigned to $\tv{v}$.
% During the course of \unify{} more and more types emerge.
As soon as a concrete type for $\tv{v}$ is appears during constraint
solving, \unify{} can conduct a capture conversion if needed.
%The constraints where this is possible are marked as capture constraints.
In this example $\tv{v}$ will be set to $\wctype{\wildcard{X}{\type{String}}{\bot}}{List}{\rwildcard{X}}$ leaving us with the following constraints:
In this example, $\tv{v}$ will be set to
$\wctype{\wildcard{X}{\type{String}}{\bot}}{List}{\rwildcard{X}}$
leaving us with the following constraints:
\begin{displaymath}
\prftree[r]{Capture}{
@ -460,17 +469,20 @@ In this example $\tv{v}$ will be set to $\wctype{\wildcard{X}{\type{String}}{\bo
%which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ to $(\exptype{C}{\rwildcard{X}} \lessdot \type{T})$
The constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$
allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm.
Leaving us with the solution $\exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}$, $\type{String} \lessdot \rwildcard{Y}$
The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied
The captured wildcard $\rwildcard{X}$ gets a fresh name \texttt{Y}, which is stored in the wildcard environment of the \unify{} algorithm.
Substituting \texttt{Y} for $\wtv{a}$ yields the constraints almost
solved: $\exptype{List}{\rwildcard{Y}} \lessdot
\exptype{List}{\rwildcard{Y}}$, $\type{String} \lessdot
\rwildcard{Y}$.
The first constraint is obviously satisfied and $\type{String} \lessdot \rwildcard{Y}$ is satisfied
because $\rwildcard{Y}$ has $\type{String}$ as lower bound.
A correct Featherweight Java program including all type annotations and an explicit capture conversion via let statement is shown in listing \ref{lst:addExampleSolution}.
This program can be deducted from the type solution of our \unify{} algorithm presented in chapter \ref{sec:unify}.
A correct Featherweight Java program including all type annotations and an explicit capture conversion via let statement is shown in Listing \ref{lst:addExampleSolution}.
This program can be deduced from the solution of the \unify{} algorithm presented in Section~\ref{sec:unify}.
In the body of the let statement the type $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
a type parameter to method call \texttt{<X>add(v, "String")}.
becomes $\exptype{List}{\rwildcard{X}}$ and the capture converted wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ can be used as
a type parameter of the call \texttt{<X>add(v, "String")}.
% The input to our type inference algorithm is a modified version of the calculus in \cite{WildcardsNeedWitnessProtection} (see chapter \ref{sec:tifj}).
% First \fjtype{} (see section \ref{chapter:constraintGeneration}) generates constraints
@ -496,32 +508,39 @@ a type parameter to method call \texttt{<X>add(v, "String")}.
% do not substitute free type variables
Global Type inference for Featherweight Java with generics but without wildcards is solved already.
But adding Wildcards to the calculus creates new problems we did not foresee
and which have not been recognized by an existing type unification algorithm for Java with wildcards \ref{plue09_1}.
Global type inference for Featherweight Generic Java without wildcards
has been considered elsewher \cite{TIforFGJ}.
Adding wildcards to the calculus created new problems, which have not
been recognized by existing work on type unification for Java with wildcards~\cite{plue09_1}.
% what is the problem?
% Java does invisible capture conversion steps
Java Wildcard types are represented as existential types and have to be opened before they can be used.
This can either be done implicitly (\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or explicitly via let statements (\cite{WildcardsNeedWitnessProtection}).
For all of those variations it is vital to know the argument types with which a method is called.
But our input program does not contain any type annotations.
We do not know where an existential type will emerge and where a capture conversion is necessary.
This has to be figured out during the type inference algorithm.
We detected three main challenges related to Java Wildcards and Global Type Inference:
Java's wildcard types are represented as existential types that have to be opened before they can be used.
Opening can either be done implicitly
(\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or
explicitly via a let statement (\cite{WildcardsNeedWitnessProtection}).
For all variations it is vital to know the argument types with which a method is called.
In the absence of type annotations,
we do not know where an existential type will emerge and where a capture conversion is necessary.
Rather, the type inference algorithm has to figure out the placement
of existentials.
We identified three main challenges related to Java wildcards and global type inference.
\begin{enumerate}
\item \label{challenge:1}
One challenge is to design the algorithm in a way that it finds a correct solution for the program shown in listing \ref{lst:addExample}
and rejects the program in listing \ref{lst:concatError}.
One challenge is to design the algorithm in a way that it finds a
correct solution for programs like the one shown in Listing~\ref{lst:addExample}
and rejects programs like the one in Listing~\ref{lst:concatError}.
The first one is a valid Java program,
because the type \texttt{List<? super String>} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$
which is used as the generic method parameter for the call to \texttt{add} as shown in listing \ref{lst:addExampleLet}.
Knowing that the type \texttt{String} is a subtype of the free variable $\rwildcard{X}$
which is used as the generic method parameter for the call to \texttt{add} as shown in Listing~\ref{lst:addExampleLet}.
Because we know that the type \texttt{String} is a subtype of the free variable $\rwildcard{X}$,
it is safe to pass \texttt{"String"} for the first parameter of the function.
The second program shown in listing \ref{lst:concatError} is incorrect.
The second program shown in Listing~\ref{lst:concatError} is incorrect.
The method call to \texttt{concat} with two wildcard lists is unsound.
Each list could be of a different kind and therefore the \texttt{concat} cannot succeed.
The problem gets apparent when we try to write the \texttt{concat} method call in our \TamedFJ{} calculus (listing \ref{lst:concatTamedFJ}):
The element types of the lists may be unrelated, therefore the call to \texttt{concat} cannot succeed.
The problem gets apparent if we try to write the \texttt{concat}
method call in the \TamedFJ{} calculus
(Listing~\ref{lst:concatTamedFJ}):
\texttt{l1'} and \texttt{l2'} are two different lists inside the body of the let statements, namely
$\exptype{List}{\rwildcard{X}}$ and $\exptype{List}{\rwildcard{Y}}$.
For the method call \texttt{concat(x1, x2)} no replacement for the generic \texttt{A}
@ -533,33 +552,33 @@ $\exptype{List}{\type{A}} <: \exptype{List}{\type{X}},
% Capture Conversion during Unify.
\item
\unify{} morphs a constraint set into a correct type solution
\unify{} transforms a constraint set into a correct type solution by
gradually assigning types to type placeholders during that process.
Solved constraints are removed and never considered again.
In the following example \unify{} solves the constraint generated by the expression
In the following example, \unify{} solves the constraint generated by the expression
\texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$.
\begin{verbatim}
anyList() = new List<String>() :? new List<Integer>()
add(anyList(), anyList().head());
\end{verbatim}
The type for \texttt{l} can be any kind of list, but it has to be a invariant one.
The type for \texttt{l} can be an arbitrary list, but it has to be a invariant one.
Assigning a \texttt{List<?>} for \texttt{l} is unsound, because the type list hiding behind
\texttt{List<?>} could be a different one for the \texttt{add} call than the \texttt{head} method call.
An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
is solved by removing the wildcard $\rwildcard{X}$ if possible.
this problem is solved by ANF transformation
This problem is solved by ANF transformation
\item
% This problem is solved by assuming everything is a wildcard and lateron erasing excessive wildcards
% this is solved by having wildcards bound to a type. But this makes it necessary to remove wildcards lateron otherwise Unify would have to backtrack
The program in listing \ref{shuffleExample} shows a challenge involving wildcards and subtyping.
The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has the type
The program in Listing~\ref{shuffleExample} exhibits a challenge involving wildcards and subtyping.
The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ representing a list of unknown lists.
Whereas $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of
$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ representing a list of lists, all of the same type $\rwildcard{X}$,
and can savely be passed to \texttt{shuffle}.
However, $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of
$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ which represents a list of lists, all of the same type $\rwildcard{X}$,
and can be passed safely to \texttt{shuffle}.
This behavior can also be explained by looking at the types and their capture converted versions:
\begin{center}
\begin{tabular}{l | l | l}
@ -619,7 +638,7 @@ shuffle(l2d); // Valid
% \end{center}
% \end{constraintset}
Given such a program our type inference algorithm has to allow the call \texttt{shuffle(l2d)} and
Given such a program the type inference algorithm has to allow the call \texttt{shuffle(l2d)} and
decline the call to \texttt{shuffle(l)}.
% The method call \texttt{shuffle(l)} is invalid however,
@ -628,10 +647,10 @@ decline the call to \texttt{shuffle(l)}.
% There is no solution for the subtype constraint:
% $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
\item \label{challenge3} \textbf{Free Variables cannot leaver their scope}:
\item \label{challenge3} \textbf{Free variables cannot leave their scope}:
Let's assume we have a variable \texttt{ls} with type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$
%When calling the \texttt{id} function with an element of this list we have to apply capture conversion.
and the following input program:
and the following program:
\noindent
\begin{minipage}{0.62\textwidth}
@ -648,14 +667,15 @@ let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = ls.get(0) in id(x)
% the variable z has to
Take the Java program in listing \ref{lst:mapExample} for example.
It maps every element of a list
$\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
to a polymorphic \texttt{id} method.
We want to use our \unify{} algorithm to determine the correct type for the
variable \expr{l2}.
It uses map to apply a polymorphic method \texttt{id} to every element of a list
$\expr{l} :
\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$.
How do we get the \unify{} algorithm to determine the correct type for the
variable \expr{l2}?
Although we do not specify constraint generation for language constructs like
lambda expressions used in this example,
we can imagine that the constraints have to look something like this:
we can imagine that the constraints have to look like in Listing~\ref{lst:mapExampleCons}.
\begin{minipage}{0.45\textwidth}
\begin{lstlisting}[caption=List Map Example,label=lst:mapExample]
@ -682,16 +702,18 @@ stem from the body of the lambda expression
The T-Let rule prevents us from using free variables created by the method call to \expr{id}
to be used in the return type $\tv{z}$.
But this has to be signaled to the \unify{} algorithm, which does not know about the origin and context of
But this restriction has to be communicated to the \unify{} algorithm,
which does not know about the origin and context of
the constraints.
If we naively substitute $\sigma(\tv{z}) = \exptype{List}{\rwildcard{A}}$
the return type of the \texttt{map} function would be the type $\exptype{List}{\exptype{List}{\rwildcard{A}}}$.
This type solution is unsound.
the return type of the \texttt{map} function would be the type
$\exptype{List}{\exptype{List}{\rwildcard{A}}}$, which would be unsound.
The type of \expr{l2} is the same as the one of \expr{l}:
$\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
\textbf{Solution:}
We solve this by distinguishing between wildcard placeholders and normal placeholders.
We solve this issue by distinguishing between wildcard placeholders
and normal placeholders and introducing them as needed.
$\ntv{z}$ is a normal placeholder and is not allowed to contain free variables.
\end{enumerate}