Ecoop2024_TIforWildFJ/introduction.tex

736 lines
34 KiB
TeX
Raw Normal View History

2024-02-28 17:51:41 +01:00
%TODO:
% -Explain <c
% - Kapitel 1.2 + 1.3 ist noch zu komplex
% - Constraints und Unifikation erklären, bevor Unifikation erwähnt wird
\section{Type Inference for Java}
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm.
The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}.
Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
2024-03-13 18:51:01 +01:00
The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support.
%a modified version of the \unify{} algorithm presented in \cite{plue09_1}.
The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}.
The \fjtype{} algorithm calculates constraints based on this intermediate representation,
which are then solved by the \unify{} algorithm
resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}).
2024-02-07 10:28:28 +01:00
\begin{itemize}
\item
We introduce the language \tifj{} (chapter \ref{sec:tifj}).
A Featherweight Java derivative including Generics, Wildcards and Type Inference.
\item
2024-02-11 21:59:57 +01:00
We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
\item
2024-02-11 21:59:57 +01:00
We present a novel approach to deal with existential types and capture conversion during constraint unification.
\item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part.
\item
2024-02-11 21:59:57 +01:00
We proof soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
2024-03-05 17:12:56 +01:00
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
% class SuperPair<A,B>{
% A a;
% B b;
% }
% class Pair<A,B> extends SuperPair<B,A>{
% A a;
% B b;
2024-02-11 21:59:57 +01:00
2024-03-05 17:12:56 +01:00
% <X> X choose(X a, X b){ return b; }
2024-02-11 21:59:57 +01:00
2024-03-05 17:12:56 +01:00
% String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
% return choose(choose(a,b).value.a,b.value.b);
% }
% }
% \end{verbatim}
2024-02-11 21:59:57 +01:00
\begin{figure}%[tp]
2024-02-06 18:04:31 +01:00
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(l, la, lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:nested-list-example-typeless}
\end{subfigure}
~
% \begin{subfigure}[t]{\linewidth}
% \begin{lstlisting}[style=tfgj]
% class List<A> {
% List<A> add(A v) { ... }
% }
% class Example {
% m(l, la, lb){
% return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = {
% let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in {
% let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = {
% let xa = la in xa.add(1)
% } in x1.add(p1)
% } in {
% let p2 = {
% let xb = lb in xb.add("str")
% } in x2.add(p2)
% };
% }
% }
% \end{lstlisting}
% \caption{Featherweight Java Representation}
% \label{fig:nested-list-example-let}
% \end{subfigure}
% ~
2024-02-06 18:04:31 +01:00
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=tfgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(List<List<? extends Object>> l, List<Integer> la, List<String> lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java Representation}
\label{fig:nested-list-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
\begin{figure}%[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
genList() {
if( ... ) {
return new List<String>();
} else {
return new List<Integer>();
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:intro-example-typeless}
\end{subfigure}
~
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
List<?> genList() {
if( ... ) {
return new List<String>();
} else {
return new List<Integer>();
}
}
\end{lstlisting}
\caption{Correct type}
\label{fig:intro-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
2024-01-31 18:05:22 +01:00
\end{figure}
% \subsection{Wildcards}
% Java subtyping involving generics is invariant.
% For example \texttt{List<String>} is not a subtype of \texttt{List<Object>}.
% %Wildcards introduce variance by allowing \texttt{List<String>} to be a subtype of \texttt{List<?>}.
% \texttt{List<Object>} is not a valid return type for the method \texttt{genList}.
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\subsection{Java Wildcards}
Wildcards are expressed by a \texttt{?} in Java and can be used as type parameters.
Wildcards add variance to Java type parameters.
2024-02-14 19:19:03 +01:00
Generally Java has invariant subtyping for polymorphic types.
A \texttt{List<String>} is not a subtype of \texttt{List<Object>} for example
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}.
2024-02-14 19:19:03 +01:00
Wildcards can be formalized as existential types \cite{WildFJ}.
\texttt{List<? extends Object>} and \texttt{List<? super String>} are both wildcard types
denoted in our syntax by $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$ and
$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$.
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound at the same time,
and a type they are bound to.
In this case the name is $\rwildcard{X}$ and it's bound to the the type \texttt{List}.
2024-02-11 21:59:57 +01:00
Those properties are needed to formalize capture conversion.
Polymorphic method calls need to be wraped in a process which \textit{opens} existential types \cite{addingWildcardsToJava}.
In Java this is done implicitly in a process called capture conversion.
%show input and a correct letFJ representation
%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI
\begin{figure}[h]
\begin{subfigure}[t]{0.47\textwidth}
\centering
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
<A> List<A> add(List<A> l, A v) ...
List<? super String> l = ...;
add(l, "String");
\end{lstlisting}
\end{subfigure}\hfill
\begin{subfigure}[t]{0.47\textwidth}
\centering
\begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet]
<A> List<A> add(List<A> l, A v)
let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l
in <X>add(l2, "String");
\end{lstlisting}
\end{subfigure}
\end{figure}
The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI}
to determine the type parameters to the \texttt{add} method call.
In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}).
If wildcards are involved the so called capture conversion has to be done manually via let statements.
%A let statement \emph{opens} an existential type.
In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
a type parameter to \texttt{<X>add(...)}.
%This is a valid Java program where the type parameters for the polymorphic method \texttt{add}
%are determined by local type inference.
One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
A wildcard in the Java syntax has no name and is bound to its enclosing type:
$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking \emph{intermediate types}
%like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$
%or $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$
can emerge, which have no equivalent in the Java syntax.
%Our type inference algorithm uses existential types internally but spawns type solutions compatible with Java.
Example: % This program is not typable with the Type Inference algorithm from Plümicke
\begin{lstlisting}[style=java,label=shuffleExample,caption=Intermediate Types Example]
class List<X> extends Object {...}
class List2D<X> extends List<List<X>> {...}
<X> void shuffle(List<List<X>> list) {...}
List<List<?>> l = ...;
List2D<?> l2d = ...;
shuffle(l); // Error
shuffle(l2d); // Valid
\end{lstlisting}
Java is using local type inference to allow method invocations which are not describable with regular Java types.
The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
\begin{lstlisting}[style=letfj]
let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
\end{lstlisting}
2024-03-05 17:12:56 +01:00
\subsection{Global Type Inference}
\begin{description}
\item[input] \tifj{} program
\item[output] type solution
\item[postcondition] the type solution applied to the input must yield a valid \letfj{} program
\end{description}
The input to our type inference algorithm is a modified version of the \letfj{}\cite{WildcardsNeedWitnessProtection} calculus (see chapter \ref{sec:tifj}).
First \fjtype{} generates constraints
and afterwards \unify{} computes a solution for the given constraint set.
2024-03-05 17:12:56 +01:00
Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$.
\textit{Note:} a type $\type{T}$ can either be a named type, a type placeholder or a wildcard type placeholder.
2024-03-05 17:12:56 +01:00
A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}.
\textit{Example:} $\exptype{List}{\ntv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\ntv{a}$ with the type $\type{String}$.
Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforFGJ}.
2024-03-05 17:12:56 +01:00
The novel capture constraints and wildcard placeholders are needed for method invocations involving wildcards.
\begin{recap}\textbf{TI for FGJ without Wildcards:}
\TFGJ{} generates subtype constraints $(\type{T} \lessdot \type{T})$ consisting of named types and type placeholders.
For example the method invocation \texttt{concat(l, new Object())} generates the constraints
$\tv{l} \lessdot \exptype{List}{\tv{a}}, \type{Object} \lessdot \tv{a}$.
Subtyping without the use of wildcards is invariant \cite{FJ}:
Therefore the only possible solution for the type placeholder $\tv{a}$ is $\tv{a} \doteq \type{Object}$. % in Java and Featherweight Java.
The correct type for the variable \texttt{l} is $\exptype{List}{\type{Object}}$ (or a direct subtype).
%- usually the type of e must be subtypes of the method parameters
%- in case of a polymorphic method: type placeholders resemble type parameters
The type inference algorithm for Featherweight Generic Java \cite{TIforFGJ} (called \TFGJ{}) is complete and sound:
It is able to find a type solution for a Featherweight Generic Java program, which has no type annotations at all,
if there is any.
It's only restriction is that no polymorphic recursion is allowed.
\end{recap}
%
2024-03-05 17:12:56 +01:00
Lets have a look at the constraints generated by \fjtype{} for the example in listing \ref{lst:addExample}:
\begin{constraintset}
\begin{center}
$\begin{array}{c}
2024-03-05 17:12:56 +01:00
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a}
\\
\hline
\textit{Capture Conversion:}\ \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a}
2024-03-05 17:12:56 +01:00
\\
\hline
\textit{Solution:}\ \wtv{a} \doteq \rwildcard{Y} \implies \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}, \, \type{String} \lessdot \rwildcard{Y}
2024-03-05 17:12:56 +01:00
\end{array}
$
\end{center}
2024-03-05 17:12:56 +01:00
\end{constraintset}
%
Capture Constraints $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ allow for a capture conversion,
which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ to $(\exptype{C}{\rwildcard{X}} \lessdot \type{T})$
%These constraints are used at places where a capture conversion via let statement can be added.
2024-03-05 17:12:56 +01:00
%Why do we need the lessdotCC constraints here?
The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}).
Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$
which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm.
\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied
because $\rwildcard{Y}$ has $\type{String}$ as lower bound.
2024-03-05 17:12:56 +01:00
For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
\begin{constraintset}
\begin{center}
$
\begin{array}{l}
\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Capture Conversion:}\
\exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
\end{array}
$
\end{center}
\end{constraintset}
The method call \texttt{shuffle(l)} is invalid however,
because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
There is no solution for the subtype constraint:
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
2024-03-05 17:12:56 +01:00
% No capture conversion for methods in the same class:
% Given two methods without type annotations like
2024-02-11 21:59:57 +01:00
% \begin{verbatim}
2024-03-05 17:12:56 +01:00
% // m :: () -> r
% m() = new List<String>() :? new List<Integer>();
% // id :: (a) -> a
% id(a) = a
2024-02-11 21:59:57 +01:00
% \end{verbatim}
2024-03-05 17:12:56 +01:00
% and a method call \texttt{id(m())} would lead to the constraints:
% $\exptype{List}{\type{String}} \lessdot \ntv{r},
% \exptype{List}{\type{Integer}} \lessdot \ntv{r},
% \ntv{r} \lessdotCC \ntv{a}$
% In this example capture conversion is not applicable,
% because the \texttt{id} method is not polymorphic.
% The type solution provided by \unify{} for this constraint set is:
% $\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}},
% \sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$
2024-02-11 21:59:57 +01:00
% \begin{verbatim}
2024-03-05 17:12:56 +01:00
% List<?> m() = new List<String>() :? new List<Integer>();
% List<?> id(List<?> a) = a
2024-02-11 21:59:57 +01:00
% \end{verbatim}
% The following example has the \texttt{id} method already typed and the method \texttt{m}
% extended by a recursive call \texttt{id(m())}:
% \begin{verbatim}
% <A> List<A> id(List<A> a) = a
2024-03-05 17:12:56 +01:00
% m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim}
% Now the constraints make use of a $\lessdotCC$ constraint:
% $\exptype{List}{\type{String}} \lessdot \ntv{r},
% \exptype{List}{\type{Integer}} \lessdot \ntv{r},
% \ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$
% After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before
% we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$.
% Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding
% $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
% \textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$
% is never assigned a type containing free variables.
% Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution:
% \begin{verbatim}
% List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim}
2024-02-11 21:59:57 +01:00
\subsection{\TamedFJ{} and \letfj{}}
%LetFJ -> Output language!
%TamedFJ -> ANF transformed input langauge
%Input language only described here. It is standard Featherweight Java
% we use the transformation to proof soundness. this could also be moved to the end.
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
The input to our algorithm is a typeless version of Featherweight Java.
Methods are declared without parameter or return types.
We still keep type annotations for fields and generic class parameters.
This is a design choice by us,
as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm.
The method call naturally has no generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
The syntax is described in figure \ref{fig:inputSyntax}.
\begin{figure}
$
\begin{array}{lrcl}
\hline
\text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
\text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
\text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
\text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
\text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
\text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{x}) \{ \texttt{return}\ t; \} \\
\text{Terms} & t & ::= & x \\
& & \ \ | & \texttt{new} \ \type{C}(\overline{t})\\
& & \ \ | & t.f\\
& & \ \ | & t.\texttt{m}(\overline{t})\\
& & \ \ | & t \elvis{} t\\
\text{Variable contexts} & \Gamma & ::= & \overline{x:\type{T}}\\
\hline
\end{array}
$
\caption{Input Syntax}\label{fig:inputSyntax}
\end{figure}
The output is a valid Featherweight Java program.
We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection}
calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements.
Our output syntax is shown in figure \ref{fig:outputSyntax}
which is actually a subset of \letfj{}, because we omit \texttt{null} types.
\begin{figure}
$
\begin{array}{lrcl}
\hline
\text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
\text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
\text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
\text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
\text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
\text{Method declarations} & \texttt{M} & ::= & \generics{\overline{\type{X} \triangleleft \type{N}}}\type{T}\ \texttt{m}(\overline{\type{T}\ \expr{x}}) = \expr{e} \\
\text{Terms} & \expr{e} & ::= & \expr{x} \\
& & \ \ | & \texttt{new} \ \exptype{C}{\ol{T}}(\overline{\expr{e}})\\
& & \ \ | & \expr{e}.f\\
& & \ \ | & \expr{e}.\texttt{m}\generics{\ol{T}}(\overline{\expr{e}})\\
& & \ \ | & \texttt{let}\ \expr{x} : \wcNtype{\Delta}{N} = \expr{e} \ \texttt{in} \ \expr{e}\\
& & \ \ | & \expr{e} \elvis{} \expr{e}\\
\text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
\hline
\end{array}
$
\caption{Output Syntax}\label{fig:outputSyntax}
\end{figure}
The output of our type inference algorithm is a valid \letfj{} program.
Before generating constraints the input is transformed to \TamedFJ{}.
After adding the missing type annotations the resulting program is a valid \letfj{} program.
%This is shown in chapter \ref{sec:soundness}
Capture conversion is only needed for wildcard types,
but we don't know which expressions will spawn wildcard types because there are no type annotations yet.
We preemptively enclose every expression in a let statement before using it as a method argument.
We need the let statements to deal with possible Wildcard types.
The syntax used in our examples is the standard Featherweight Java syntax.
2024-02-11 21:59:57 +01:00
\subsection{Challenges}\label{challenges}
2024-02-11 21:59:57 +01:00
%TODO: Wildcard subtyping is infinite see \cite{TamingWildcards}
2024-02-06 18:04:31 +01:00
The introduction of wildcards adds additional challenges.
% we cannot replace every type variable with a wildcard
Type variables can also be used as type parameters, for example
$\exptype{List}{String} \lessdot \exptype{List}{\tv{a}}$.
A problem arises when replacing type variables with wildcards.
% Wildcards are not reflexive.
% ( on the equals property ), every wildcard has to be capture converted when leaving its scope
% do not substitute free type variables
Lets have a look at two examples:
\begin{itemize}
\item \begin{example} \label{intro-example1}
The first one is a valid Java program.
The type \texttt{List<? super String>} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$
which is used as the generic method parameter \texttt{A}.
Java uses capture conversion to replace the generic \texttt{A} by a capture converted version of the
\texttt{? super String} wildcard.
Knowing that the type \texttt{String} is a subtype of any type the wildcard \texttt{? super String} can inherit
it is safe to pass \texttt{"String"} for the first parameter of the function.
\begin{verbatim}
2024-03-05 17:12:56 +01:00
<A> List<A> add(List<A> l, A v) {}
2024-03-05 17:12:56 +01:00
List<? super String> list = ...;
add(list, "String");
\end{verbatim}
\end{example}
\item \begin{example}\label{intro-example2}
This example displays an incorrect Java program.
The method call to \texttt{concat} with two wildcard lists is unsound.
Each list could be of a different kind and therefore the \texttt{concat} cannot succeed.
\begin{verbatim}
2024-03-05 17:12:56 +01:00
<A> List<A> concat(List<A> l1, List<A> l2) { ... }
2024-03-05 17:12:56 +01:00
List<?> list = ... ;
concat(list, list);
\end{verbatim}
2024-03-05 17:12:56 +01:00
% $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\
% \wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
\end{example}
2024-03-05 17:12:56 +01:00
% \item
% \unify{} morphs a constraint set into a correct type solution
% gradually assigning types to type placeholders during that process.
% Solved constraints are removed and never considered again.
% In the following example \unify{} solves the constraint generated by the expression
% \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$.
% \begin{verbatim}
% anyList() = new List<String>() :? new List<Integer>()
2024-02-14 19:19:03 +01:00
2024-03-05 17:12:56 +01:00
% add(anyList(), anyList().head());
% \end{verbatim}
% The type for \texttt{l} can be any kind of list, but it has to be a invariant one.
% Assigning a \texttt{List<?>} for \texttt{l} is unsound, because the type list hiding behind
% \texttt{List<?>} could be a different one for the \texttt{add} call than the \texttt{head} method call.
% An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
% is solved by removing the wildcard $\rwildcard{X}$ if possible.
2024-02-14 19:19:03 +01:00
\item \textbf{No principal method type:}
We tried to skip capture conversion and the capture constraints entirely.
But \letfj{}'s type system does not imply a principal typing for methods \cite{principalTypes}.
The problem is that a principal type of a method should have the most general parameter types and the most specific return type.
\begin{lstlisting}[caption=Return type depends on argument types,label=principalTypeExample]
2024-03-27 01:55:19 +01:00
class SpecialPair<X, Y extends X> extends Pair<X,Y>{}
<X,Y> Pair<X,Y> id(Pair<X,Y> in){
return in;
}
<X, Y extends X> void receive(Pair<X,Y> in){}
Pair<?, ?> l;
SpecialPair<?,?> lSpecial;
id(l); // this has to work
receive(id(lSpecial)); // and this too
\end{lstlisting}
By means of subtyping we are able to create types like
$\wctype{\rwildcard{X}, \wildcard{Y}{\rwildcard{X}}{\bot}}{Pair}{\rwildcard{X}, \rwildcard{Y}}$
as a direct supertype of \texttt{SpecialPair<?,?>},
which is now compatible with the \texttt{receive} method.
The call \texttt{receive(id(lSpecial))} is correct whereas the return type of the \texttt{id} method
does not imply so.
If we look at this in the context of global type inference and assume that there are no type annotations for \texttt{l} and \texttt{lSpecial}.
We can neither apply capture conversion during the constraint generation step, because the parameter types are not known yet
and we also can't assume a most generic type for the parameter types, because the the needed return type is not known either.
Take the example in figure \ref{fig:principalTI}.
As soon as the type of $\tv{lSpecial}$ is resolved \unify{} can determine the type of $\tv{id}$
and then solve the constraints $\tv{id} \lessdotCC \exptype{Pair}{\wtv{e}, \wtv{f}}, \wtv{f} \lessdot \wtv{e}$.
%TODO: explain how type variables are named after their respective variables and methods
Capture Conversion cannot be applied before the argument type ($\tv{lSpecial}$ in this case) is known.
Trying to give $\tv{lSpecial}$ a most general type like \texttt{Pair<?,?>} won't work either (see listing \ref{principalTypeExample}).
\begin{figure}
\begin{minipage}{0.40\textwidth}
\begin{lstlisting}[style=tfgj]
// l and lSpecial are untyped
id(l);
receive(id(lSpecial));
\end{lstlisting}
\end{minipage}%
\hfill
\begin{minipage}{0.59\textwidth}
\begin{constraintset}
$
\tv{l} \lessdotCC \exptype{Pair}{\wtv{a}, \wtv{b}}, \\
\tv{lSpecial} \lessdotCC \exptype{Pair}{\wtv{c}, \wtv{d}},
\exptype{Pair}{\wtv{c}, \wtv{d}} \lessdot \tv{id}, \\
\tv{id} \lessdotCC \exptype{Pair}{\wtv{e}, \wtv{f}},
\wtv{f} \lessdot \wtv{e}
$
\end{constraintset}
\end{minipage}
\caption{Principal Type problem}\label{fig:principalTI}
\end{figure}
% TODO: make Unify to resolve C<X> <. a as a =. X.C<X>
% A method has infinte possibilities of being called and there is no most general type.
% P<X,Y> m(C<X> c, C<Y> c2)
% depending on
% A.P<A,A> <. A,B.P<A,B>
% % During the method call it is not sure what kind of return type is needed from the method.
% % The method P<A,B> make(A a, B b)
% % The type A,B.P<A,B> cannot be a subtype of P<X,X>
% % But if A^X_X and B^X_X
% % Could a constraint C<X> <. a? be expanded to a? = X^u?_l?.C<X>
2024-02-14 19:19:03 +01:00
\end{itemize}
%TODO: Move this part. or move the third challenge some underneath.
The \unify{} algorithm only sees the constraints with no information about the program they originated from.
2024-03-05 17:12:56 +01:00
The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.
%TODO
% The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards}
% and \cite{WildcardsNeedWitnessProtection}.
% \subsection{Capture Conversion}
% The \texttt{let} statements in \TamedFJ{} act as capture conversion for wildcard types.
% \begin{figure}
% \begin{minipage}{0.45\textwidth}
% \begin{lstlisting}[style=tfgj]
% <X> List<X> clone(List<X> l);
% example(p){
% return clone(p);
% }
% \end{lstlisting}
% \end{minipage}%
% \hfill
% \begin{minipage}{0.5\textwidth}
% \begin{lstlisting}[style=letfj]
% <X> List<X> clone(List<X> l);
% (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) example((*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) p) =
% let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = p in
% clone(x) : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*);
% \end{lstlisting}
% \end{minipage}
% \caption{Type inference adding capture conversion}\label{fig:addingLetExample}
% \end{figure}
% Figure \ref{fig:addingLetExample} shows a let statement getting added to the typed output.
% The method \texttt{clone} cannot be called with the type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$.
% After a capture conversion \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$ with $\rwildcard{X}$ being a free variable.
% Afterwards we have to find a supertype of $\exptype{List}{\rwildcard{X}}$, which does not contain free variables
% ($\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in this case).
% During the constraint generation step most types are not known yet and are represented by a type placeholder.
% During a methodcall like the one in the \texttt{example} method in figure \ref{fig:ccExample} the type of the parameter \texttt{p}
% is not known yet.
% The type \texttt{List<?>} would be one possibility as a parameter type for \texttt{p}.
% To make wildcards work for our type inference algorithm \unify{} has to apply capture conversions if necessary.
% The type placeholder $\tv{r}$ is the return type of the \texttt{example} method.
% One possible type solution is $\tv{p} \doteq \tv{r} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$,
% which leads to:
% \begin{verbatim}
% List<?> example(List<?> p){
% return clone(p);
% }
% \end{verbatim}
% But by substituting $\tv{p} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in the constraint
% $\tv{p} \lessdotCC \exptype{List}{\wtv{x}}$ leads to
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{x}}$.
% To make this typing possible we have to introduce a capture conversion via a let statement:
% $\texttt{return}\ (\texttt{let}\ \texttt{x} : \wctype{\rwildcard{X}}{List}{\rwildcard{X}} = \texttt{p}\ \texttt{in} \
% \texttt{clone}\generics{\rwildcard{X}}(x) : \wctype{\rwildcard{X}}{List}{\rwildcard{X}})$
% Inside the let statement the variable \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$
% This spawns additional problems.
% \begin{figure}
% \begin{minipage}{0.45\textwidth}
% \begin{verbatim}
% <X> List<X> clone(List<X> l){...}
% example(p){
% return clone(p);
% }
% \end{verbatim}
% \end{minipage}%
% \hfill
% \begin{minipage}{0.35\textwidth}
% \begin{constraintset}
% \textbf{Constraints:}\\
% $
% \tv{p} \lessdotCC \exptype{List}{\wtv{x}}, \\
% \tv{p} \lessdot \tv{r}, \\
% \tv{p} \lessdot \type{Object},
% \tv{r} \lessdot \type{Object}
% $
% \end{constraintset}
% \end{minipage}
% \caption{Type inference example}\label{fig:ccExample}
% \end{figure}
% In addition with free variables this leads to unwanted behaviour.
% Take the constraint
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ for example.
% After a capture conversion from $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ to $\exptype{List}{\rwildcard{Y}}$ and a substitution $\wtv{a} \doteq \rwildcard{Y}$
% we get
% $\exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}$.
% Which is correct if we apply capture conversion to the left side:
% $\exptype{List}{\rwildcard{X}} <: \exptype{List}{\rwildcard{X}}$
% If the input constraints did not intend for this constraint to undergo a capture conversion then \unify{} would produce an invalid
% type solution due to:
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \nless: \exptype{List}{\rwildcard{X}}$
% The reason for this is the \texttt{S-Exists} rule's premise
% $\text{dom}(\Delta') \cap \text{fv}(\exptype{List}{\rwildcard{X}}) = \emptyset$.
% Capture constraints cannot be stored in a set.
% $\wtv{a} \lessdotCC \wtv{b}$ is not the same as $\wtv{a} \lessdotCC \wtv{b}$.
% Both constraints will end up the same after a substitution for both placeholders $\tv{a}$ and $\tv{b}$.
% But afterwards a capture conversion is applied, which can generate different types on the left sides.
% \begin{itemize}
% \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Y}}$
% \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Z}}$
% \end{itemize}
%
% Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$
% is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc.
% Also two of those boxes do not necessarily contain the same type.
% But there are situations where it is possible to assume that.
% For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$
% is a pair of two boxes, which always contain the same type.
% Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.