Ecoop2024_TIforWildFJ/introduction.tex

926 lines
41 KiB
TeX
Raw Normal View History

2024-02-28 17:51:41 +01:00
%TODO:
% -Explain <c
% - Kapitel 1.2 + 1.3 ist noch zu komplex
% - Constraints und Unifikation erklären, bevor Unifikation erwähnt wird
2024-04-11 18:02:25 +02:00
\section{Motivation}
2024-04-17 12:59:31 +02:00
Java already uses type inference when it comes to method invocations, local variables or lambda expressions.
The crucial part for all of this use cases is the surrounding context.
The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program]
<T> List<T> emptyList() { ... }
List<String> ls = emptyList();
\end{lstlisting}
2024-04-19 11:47:21 +02:00
Local type inference is based on matching and not unification \cite{javaTIisBroken}.
2024-04-17 16:45:24 +02:00
When calling the \texttt{emptyList} method without context its return value will be set to a \texttt{List<Object>}.
The following Java code snippet is incorrect, because \texttt{emptyList()} returns
2024-04-21 13:09:13 +02:00
a \texttt{List<Object>} instead of the required \texttt{List<List<String>>}.
2024-04-17 12:59:31 +02:00
\begin{verbatim}
2024-04-17 16:45:24 +02:00
emptyList().add(new List<String>())
.get(0)
.get(0);
2024-04-17 12:59:31 +02:00
\end{verbatim}
2024-04-17 16:45:24 +02:00
%List<String> <. A
%List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints.
% this makes matching impossible
2024-04-19 11:47:21 +02:00
The return type of the first \texttt{get} method is based on the return type of
\texttt{emptyList},
but the second call to \texttt{get} is only possible if \texttt{emptyList} returns
a list of lists.
The local type inference algorithm based on matching cannot produce this solution.
Here a type inference algorithm based on unification is needed.
2024-04-17 16:45:24 +02:00
\begin{verbatim}
this.<List<String>>emptyList().add(new List<String>())
.get(0)
.get(0);
\end{verbatim}
2024-04-24 08:33:58 +02:00
$
2024-04-24 17:30:15 +02:00
\begin{array}[b]{c}
\begin{array}[b]{c}
\texttt{emptyList()} : \exptype{List}{\exptype{List}{\type{String}}}
2024-04-24 08:33:58 +02:00
\\
\hline
2024-04-24 17:30:15 +02:00
\texttt{emptyList().get()} : \exptype{List}{\type{String}}
\end{array} \rulenameAfter{T-Call}
\quad \generics{\type{X}}\exptype{List}{\type{X}} \to \type{X} \in \mtypeEnvironment{}(\texttt{get})1
\\
\hline
\texttt{emptyList().get().get()} : \type{String}
\end{array} \rulenameAfter{T-Call}
2024-04-24 08:33:58 +02:00
$
2024-04-21 13:09:13 +02:00
%motivate constraints:
To solve this example we will create constraints
$
\exptype{List}{\tv{a}} \lessdot \tv{b},
\tv{b} \lessdot \exptype{List}{\tv{c}},
\tv{c} \lessdot \exptype{List}{\tv{d}}
$ TODO
2024-04-19 11:47:21 +02:00
Local type inference \cite{javaTIisBroken} is able to find a type substitution $\sigma$ satisfying
$\overline{\type{A} <: \sigma(\type{F}) }, \sigma(\type{R}) <: \type{E}$.
It is important that $\overline{\type{A}}$ and $\type{E}$ are given types and do not contain
type placeholders already used in $\overline{\type{F}}$.
2024-04-16 19:27:56 +02:00
2024-04-19 11:47:21 +02:00
The type parameters are not allowed on the left side of $A <: F$
We can generate method calls where this is the case. The second call to \texttt{get}.
2024-04-16 19:27:56 +02:00
2024-04-19 11:47:21 +02:00
% Local type inference cannot deal with type inference during the algorithm.
% If the left side contains unknown type parameters.
% \begin{verbatim}
% import java.util.ArrayList;
% import java.util.stream.*;
% class Test {
% void test(){
% var s = new ArrayList<String>().stream().map(i -> 1);
% receive(s);
% }
% void receive(Stream<Object> l){}
% }
% \end{verbatim}
% TypeError:
% \begin{verbatim}
% void test(){
% var l = new ArrayList<String>();
% l.add("hi");
% var s = l.stream().map(i -> 1).collect(Collectors.toList());
% var s2 = l.stream().map(i -> "String").collect(Collectors.toList());
% receive(s, s2);
% }
% <A> void receive(List<A> l, List<A> l2){}
% \end{verbatim}
% Correct:
% \begin{verbatim}
% void test(){
% var l = new ArrayList<String>();
% l.add("hi");
% List<Object> s = l.stream().map(i -> 1).collect(Collectors.toList());
% List<Object> s2 = l.stream().map(i -> "String").collect(Collectors.toList());
% receive(s, s2);
% }
% <A> void receive(List<A> l, List<A> l2){}
% \end{verbatim}
% Error:
% \begin{verbatim}
% rrr(this.emptyBox().set(this.<Integer>emptyBox()).set(this.<String>emptyBox()));
% \end{verbatim}
% Correct:
% \begin{verbatim}
% rrr(this.<Box<?>>emptyBox().set(this.<Integer>emptyBox()).set(this.<String>emptyBox()));
% \end{verbatim}
2024-04-17 12:59:31 +02:00
2024-04-17 00:11:16 +02:00
% \begin{recap}{Java Local Type Inference}
% Local type inference is able to solve constraints of the form
% T <. b, b <. T where T are given types
% \end{recap}
% Local Type inference cannot infer F-Bounded types (TODO: we can, right?)
\section{Type Inference for Java}
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm.
The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}.
Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
2024-03-13 18:51:01 +01:00
The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support.
%a modified version of the \unify{} algorithm presented in \cite{plue09_1}.
The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}.
The \fjtype{} algorithm calculates constraints based on this intermediate representation,
which are then solved by the \unify{} algorithm
resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}).
2024-02-07 10:28:28 +01:00
\begin{itemize}
\item
We introduce the language \tifj{} (chapter \ref{sec:tifj}).
A Featherweight Java derivative including Generics, Wildcards and Type Inference.
\item
2024-02-11 21:59:57 +01:00
We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
\item
2024-02-11 21:59:57 +01:00
We present a novel approach to deal with existential types and capture conversion during constraint unification.
\item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part.
\item
2024-02-11 21:59:57 +01:00
We proof soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
2024-03-05 17:12:56 +01:00
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
% class SuperPair<A,B>{
% A a;
% B b;
% }
% class Pair<A,B> extends SuperPair<B,A>{
% A a;
% B b;
2024-02-11 21:59:57 +01:00
2024-03-05 17:12:56 +01:00
% <X> X choose(X a, X b){ return b; }
2024-02-11 21:59:57 +01:00
2024-03-05 17:12:56 +01:00
% String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
% return choose(choose(a,b).value.a,b.value.b);
% }
% }
% \end{verbatim}
2024-02-11 21:59:57 +01:00
\begin{figure}%[tp]
2024-02-06 18:04:31 +01:00
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(l, la, lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
2024-04-05 12:24:42 +02:00
\caption{Java method missing argument and return types}
2024-02-06 18:04:31 +01:00
\label{fig:nested-list-example-typeless}
\end{subfigure}
~
% \begin{subfigure}[t]{\linewidth}
% \begin{lstlisting}[style=tfgj]
% class List<A> {
% List<A> add(A v) { ... }
% }
% class Example {
% m(l, la, lb){
% return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = {
% let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in {
% let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = {
% let xa = la in xa.add(1)
% } in x1.add(p1)
% } in {
% let p2 = {
% let xb = lb in xb.add("str")
% } in x2.add(p2)
% };
% }
% }
% \end{lstlisting}
% \caption{Featherweight Java Representation}
% \label{fig:nested-list-example-let}
% \end{subfigure}
% ~
2024-02-06 18:04:31 +01:00
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=tfgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(List<List<? extends Object>> l, List<Integer> la, List<String> lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java Representation}
\label{fig:nested-list-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
\begin{figure}%[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
genList() {
if( ... ) {
return new List().add(1);
} else {
return new List().add("String");
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:intro-example-typeless}
\end{subfigure}
~
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
List<?> genList() {
if( ... ) {
return new Box<Integer>(1);
} else {
return new Box<String>("String");
}
}
\end{lstlisting}
\caption{Correct type}
\label{fig:intro-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
2024-01-31 18:05:22 +01:00
\end{figure}
% \subsection{Wildcards}
% Java subtyping involving generics is invariant.
% For example \texttt{List<String>} is not a subtype of \texttt{List<Object>}.
% %Wildcards introduce variance by allowing \texttt{List<String>} to be a subtype of \texttt{List<?>}.
% \texttt{List<Object>} is not a valid return type for the method \texttt{genList}.
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\subsection{Java Wildcards}
Wildcards are expressed by a \texttt{?} in Java and can be used as type parameters.
Wildcards add variance to Java type parameters.
2024-02-14 19:19:03 +01:00
Generally Java has invariant subtyping for polymorphic types.
A \texttt{List<String>} is not a subtype of \texttt{List<Object>} for example
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}.
2024-02-14 19:19:03 +01:00
Wildcards can be formalized as existential types \cite{WildFJ}.
\texttt{List<? extends Object>} and \texttt{List<? super String>} are both wildcard types
denoted in our syntax by $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$ and
$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$.
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound at the same time,
and a type they are bound to.
In this case the name is $\rwildcard{X}$ and it's bound to the the type \texttt{List}.
2024-02-11 21:59:57 +01:00
Those properties are needed to formalize capture conversion.
Polymorphic method calls need to be wraped in a process which \textit{opens} existential types \cite{addingWildcardsToJava}.
In Java this is done implicitly in a process called capture conversion.
%show input and a correct letFJ representation
%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI
\begin{figure}[h]
\begin{subfigure}[t]{0.47\textwidth}
\centering
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
<A> List<A> add(List<A> l, A v) ...
List<? super String> l = ...;
add(l, "String");
\end{lstlisting}
\end{subfigure}\hfill
\begin{subfigure}[t]{0.47\textwidth}
\centering
\begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet]
<A> List<A> add(List<A> l, A v)
let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l
in <X>add(l2, "String");
\end{lstlisting}
\end{subfigure}
\end{figure}
The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI}
to determine the type parameters to the \texttt{add} method call.
In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}).
If wildcards are involved the so called capture conversion has to be done manually via let statements.
%A let statement \emph{opens} an existential type.
In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
a type parameter to \texttt{<X>add(...)}.
%This is a valid Java program where the type parameters for the polymorphic method \texttt{add}
%are determined by local type inference.
One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
A wildcard in the Java syntax has no name and is bound to its enclosing type:
$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking \emph{intermediate types}
%like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$
%or $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$
can emerge, which have no equivalent in the Java syntax.
%Our type inference algorithm uses existential types internally but spawns type solutions compatible with Java.
Example: % This program is not typable with the Type Inference algorithm from Plümicke
\begin{lstlisting}[style=java,label=shuffleExample,caption=Intermediate Types Example]
class List<X> extends Object {...}
class List2D<X> extends List<List<X>> {...}
<X> void shuffle(List<List<X>> list) {...}
List<List<?>> l = ...;
List2D<?> l2d = ...;
shuffle(l); // Error
shuffle(l2d); // Valid
\end{lstlisting}
Java is using local type inference to allow method invocations which are not describable with regular Java types.
The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
\begin{lstlisting}[style=letfj]
let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
\end{lstlisting}
2024-03-05 17:12:56 +01:00
\subsection{Global Type Inference}
\begin{description}
\item[input] \tifj{} program
\item[output] type solution
\item[postcondition] the type solution applied to the input must yield a valid \letfj{} program
\end{description}
The input to our type inference algorithm is a modified version of the \letfj{}\cite{WildcardsNeedWitnessProtection} calculus (see chapter \ref{sec:tifj}).
First \fjtype{} generates constraints
and afterwards \unify{} computes a solution for the given constraint set.
2024-03-05 17:12:56 +01:00
Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$.
\textit{Note:} a type $\type{T}$ can either be a named type, a type placeholder or a wildcard type placeholder.
2024-03-05 17:12:56 +01:00
A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}.
\textit{Example:} $\exptype{List}{\ntv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\ntv{a}$ with the type $\type{String}$.
Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforFGJ}.
2024-03-05 17:12:56 +01:00
The novel capture constraints and wildcard placeholders are needed for method invocations involving wildcards.
\begin{recap}\textbf{TI for FGJ without Wildcards:}
\TFGJ{} generates subtype constraints $(\type{T} \lessdot \type{T})$ consisting of named types and type placeholders.
For example the method invocation \texttt{concat(l, new Object())} generates the constraints
$\tv{l} \lessdot \exptype{List}{\tv{a}}, \type{Object} \lessdot \tv{a}$.
Subtyping without the use of wildcards is invariant \cite{FJ}:
Therefore the only possible solution for the type placeholder $\tv{a}$ is $\tv{a} \doteq \type{Object}$. % in Java and Featherweight Java.
The correct type for the variable \texttt{l} is $\exptype{List}{\type{Object}}$ (or a direct subtype).
%- usually the type of e must be subtypes of the method parameters
%- in case of a polymorphic method: type placeholders resemble type parameters
The type inference algorithm for Featherweight Generic Java \cite{TIforFGJ} (called \TFGJ{}) is complete and sound:
It is able to find a type solution for a Featherweight Generic Java program, which has no type annotations at all,
if there is any.
It's only restriction is that no polymorphic recursion is allowed.
\end{recap}
%
2024-03-05 17:12:56 +01:00
Lets have a look at the constraints generated by \fjtype{} for the example in listing \ref{lst:addExample}:
\begin{constraintset}
\begin{center}
$\begin{array}{c}
2024-03-05 17:12:56 +01:00
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a}
\\
\hline
\textit{Capture Conversion:}\ \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a}
2024-03-05 17:12:56 +01:00
\\
\hline
\textit{Solution:}\ \wtv{a} \doteq \rwildcard{Y} \implies \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}, \, \type{String} \lessdot \rwildcard{Y}
2024-03-05 17:12:56 +01:00
\end{array}
$
\end{center}
2024-03-05 17:12:56 +01:00
\end{constraintset}
%
Capture Constraints $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ allow for a capture conversion,
which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ to $(\exptype{C}{\rwildcard{X}} \lessdot \type{T})$
%These constraints are used at places where a capture conversion via let statement can be added.
2024-03-05 17:12:56 +01:00
%Why do we need the lessdotCC constraints here?
The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}).
Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$
which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm.
\textit{Note:} The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied
because $\rwildcard{Y}$ has $\type{String}$ as lower bound.
2024-03-05 17:12:56 +01:00
For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
\begin{constraintset}
\begin{center}
$
\begin{array}{l}
\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Capture Conversion:}\
\exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
\end{array}
$
\end{center}
\end{constraintset}
The method call \texttt{shuffle(l)} is invalid however,
because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
There is no solution for the subtype constraint:
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
2024-03-05 17:12:56 +01:00
% No capture conversion for methods in the same class:
% Given two methods without type annotations like
2024-02-11 21:59:57 +01:00
% \begin{verbatim}
2024-03-05 17:12:56 +01:00
% // m :: () -> r
% m() = new List<String>() :? new List<Integer>();
% // id :: (a) -> a
% id(a) = a
2024-02-11 21:59:57 +01:00
% \end{verbatim}
2024-03-05 17:12:56 +01:00
% and a method call \texttt{id(m())} would lead to the constraints:
% $\exptype{List}{\type{String}} \lessdot \ntv{r},
% \exptype{List}{\type{Integer}} \lessdot \ntv{r},
% \ntv{r} \lessdotCC \ntv{a}$
% In this example capture conversion is not applicable,
% because the \texttt{id} method is not polymorphic.
% The type solution provided by \unify{} for this constraint set is:
% $\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}},
% \sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$
2024-02-11 21:59:57 +01:00
% \begin{verbatim}
2024-03-05 17:12:56 +01:00
% List<?> m() = new List<String>() :? new List<Integer>();
% List<?> id(List<?> a) = a
2024-02-11 21:59:57 +01:00
% \end{verbatim}
% The following example has the \texttt{id} method already typed and the method \texttt{m}
% extended by a recursive call \texttt{id(m())}:
% \begin{verbatim}
% <A> List<A> id(List<A> a) = a
2024-03-05 17:12:56 +01:00
% m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim}
% Now the constraints make use of a $\lessdotCC$ constraint:
% $\exptype{List}{\type{String}} \lessdot \ntv{r},
% \exptype{List}{\type{Integer}} \lessdot \ntv{r},
% \ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$
% After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before
% we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$.
% Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding
% $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$.
% \textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$
% is never assigned a type containing free variables.
% Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution:
% \begin{verbatim}
% List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim}
2024-02-11 21:59:57 +01:00
\subsection{\TamedFJ{} and \letfj{}}
%LetFJ -> Output language!
%TamedFJ -> ANF transformed input langauge
%Input language only described here. It is standard Featherweight Java
% we use the transformation to proof soundness. this could also be moved to the end.
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
The input to our algorithm is a typeless version of Featherweight Java.
Methods are declared without parameter or return types.
We still keep type annotations for fields and generic class parameters.
This is a design choice by us,
as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm.
The method call naturally has no generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
2024-04-17 00:11:16 +02:00
The syntax is described in figure \ref{fig:outputSyntax} with optional type annotations highlighted in yellow.
It is possible to exclude all optional types.
The output is a valid Featherweight Java program.
We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection}
calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements.
2024-04-17 00:11:16 +02:00
% Our output syntax is shown in figure \ref{fig:outputSyntax}
% which is actually a subset of \letfj{}, because we omit \texttt{null} types.
2024-04-17 00:11:16 +02:00
\newcommand{\highlight}[1]{\begingroup\fboxsep=0pt\colorbox{yellow}{$\displaystyle #1$}\endgroup}
\begin{figure}
$
\begin{array}{lrcl}
\hline
\text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
\text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
\text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
\text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
\text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
2024-04-17 17:13:35 +02:00
\text{Method declarations} & \texttt{M} & ::= & \highlight{\generics{\overline{\type{X} \triangleleft \type{N}}}}\ \highlight{\type{T}}\ \texttt{m}(\overline{\highlight{\type{T}}\ \expr{x}}) \{ \texttt{return} \ \expr{e}; \} \\
\text{Terms} & \expr{e} & ::= & \expr{x} \\
2024-04-17 00:11:16 +02:00
& & \ \ | & \texttt{new} \ \type{C}\highlight{\generics{\ol{T}}}(\overline{\expr{e}})\\
& & \ \ | & \expr{e}.f\\
2024-04-17 00:11:16 +02:00
& & \ \ | & \expr{e}.\texttt{m}\highlight{\generics{\ol{T}}}(\overline{\expr{e}})\\
& & \ \ | & \texttt{let}\ \expr{x} \highlight{: \wcNtype{\Delta}{N}} = \expr{e} \ \texttt{in} \ \expr{e}\\
& & \ \ | & \expr{e} \elvis{} \expr{e}\\
\text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
\hline
\end{array}
$
2024-04-17 00:11:16 +02:00
\caption{Input Syntax with optional type annotations}\label{fig:outputSyntax}
\end{figure}
The output of our type inference algorithm is a valid \letfj{} program.
2024-04-05 12:24:42 +02:00
Before generating constraints the input is transformed to \TamedFJ{} (see section \ref{sec:anfTransformation}).
After adding the missing type annotations the resulting program is a valid \letfj{} program.
%This is shown in chapter \ref{sec:soundness}
Capture conversion is only needed for wildcard types,
but we don't know which expressions will spawn wildcard types because there are no type annotations yet.
We preemptively enclose every expression in a let statement before using it as a method argument.
We need the let statements to deal with possible Wildcard types.
The syntax used in our examples is the standard Featherweight Java syntax.
2024-02-11 21:59:57 +01:00
\subsection{Challenges}\label{challenges}
2024-02-11 21:59:57 +01:00
%TODO: Wildcard subtyping is infinite see \cite{TamingWildcards}
2024-02-06 18:04:31 +01:00
The introduction of wildcards adds additional challenges.
% we cannot replace every type variable with a wildcard
Type variables can also be used as type parameters, for example
$\exptype{List}{String} \lessdot \exptype{List}{\tv{a}}$.
A problem arises when replacing type variables with wildcards.
% Wildcards are not reflexive.
% ( on the equals property ), every wildcard has to be capture converted when leaving its scope
% do not substitute free type variables
Lets have a look at two examples:
\begin{itemize}
\item \begin{example} \label{intro-example1}
The first one is a valid Java program.
The type \texttt{List<? super String>} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$
which is used as the generic method parameter \texttt{A}.
Java uses capture conversion to replace the generic \texttt{A} by a capture converted version of the
\texttt{? super String} wildcard.
Knowing that the type \texttt{String} is a subtype of any type the wildcard \texttt{? super String} can inherit
it is safe to pass \texttt{"String"} for the first parameter of the function.
\begin{verbatim}
2024-03-05 17:12:56 +01:00
<A> List<A> add(List<A> l, A v) {}
2024-03-05 17:12:56 +01:00
List<? super String> list = ...;
add(list, "String");
\end{verbatim}
\end{example}
\item \begin{example}\label{intro-example2}
This example displays an incorrect Java program.
The method call to \texttt{concat} with two wildcard lists is unsound.
Each list could be of a different kind and therefore the \texttt{concat} cannot succeed.
\begin{verbatim}
2024-03-05 17:12:56 +01:00
<A> List<A> concat(List<A> l1, List<A> l2) { ... }
List<?> getList() { ... }
concat(getList(), getList());
\end{verbatim}
The \letfj{} representation in listing \ref{letfjConcatFail} clarifies this.
Inside the let statement the variables hold the types
$\set{ \texttt{x1} : \exptype{List}{\rwildcard{X}}, \texttt{x2} : \exptype{List}{\rwildcard{Y}}}$.
For the method call \texttt{concat(x1, x2)} no replacement for the generic \texttt{A}
exists to satisfy
$\exptype{List}{\type{A}} <: \exptype{List}{\type{X}},
\exptype{List}{\type{A}} <: \exptype{List}{\type{Y}}$.
\begin{lstlisting}[style=letfj,caption=Incorrect method call,label=letfjConcatFail]
let x1 : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = getList() in
let x2 : (*@$\wctype{\rwildcard{Y}}{List}{\rwildcard{Y}}$@*) = getList() in
concat(x1, x2)
\end{lstlisting}
2024-03-05 17:12:56 +01:00
% $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\
% \wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
\end{example}
2024-03-05 17:12:56 +01:00
% \item
% \unify{} morphs a constraint set into a correct type solution
% gradually assigning types to type placeholders during that process.
% Solved constraints are removed and never considered again.
% In the following example \unify{} solves the constraint generated by the expression
% \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$.
% \begin{verbatim}
% anyList() = new List<String>() :? new List<Integer>()
2024-02-14 19:19:03 +01:00
2024-03-05 17:12:56 +01:00
% add(anyList(), anyList().head());
% \end{verbatim}
% The type for \texttt{l} can be any kind of list, but it has to be a invariant one.
% Assigning a \texttt{List<?>} for \texttt{l} is unsound, because the type list hiding behind
% \texttt{List<?>} could be a different one for the \texttt{add} call than the \texttt{head} method call.
% An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
% is solved by removing the wildcard $\rwildcard{X}$ if possible.
2024-02-14 19:19:03 +01:00
\item \textbf{Capture Conversion during \unify{}:}
The return type of a generic method call depends on its argument types.
A method like \texttt{String trim(String s)} will always return a \type{String} type.
However the return type of \texttt{<A> A head(List<A> l)} is a generic variable \texttt{A} and only shows
its actual type when the type of the argument list \texttt{l} is known.
The same goes for capture conversion, which can only be applied for a method call
when the argument types are given.
At the start of our global type inference algorithm no types are assigned yet.
During the course of the \unify{} algorithm more and more types emerge.
As soon as enough type information is given \unify{} can conduct a capture conversion if needed.
The constraints where this is possible are marked as capture constraints.
\item \textbf{Free Variables cannot leaver their scope}:
\begin{example}
Take the following Java program for example.
It maps every element of a list
$\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
to a polymorphic \texttt{id} method.
We want to use our \unify{} algorithm to determine the correct type for the
variable \expr{l2}.
Although we do not specify constraint generation for language constructs like
lambda expressions used in this example,
we can imagine that the constraints have to look something like this:
\begin{minipage}{0.45\textwidth}
\begin{lstlisting}{java}
<X> List<X> id(List<X> l){ ... }
List<List<?>> ls;
l2 = l.map(x -> id(x));
\end{lstlisting}\end{minipage}
\hfill
\begin{minipage}{0.45\textwidth}
\begin{constraintset}
$
\begin{array}{l}
\wctype{\rwildcard{A}}{List}{\rwildcard{A}} \lessdotCC \exptype{List}{\wtv{x}}, \\
\exptype{List}{\wtv{x}} \lessdot \tv{z}, \\
\exptype{List}{\tv{z}} \lessdot \tv{l2}
\end{array}
$
\end{constraintset}
\end{minipage}
The constraints
$\wctype{\rwildcard{A}}{List}{\rwildcard{A}} \lessdotCC \exptype{List}{\wtv{x}},
\exptype{List}{\wtv{x}} \lessdot \tv{z}$
stem from the body of the lambda expression
\texttt{shuffle(x)}.
\textit{For clarification:} This method call would be represented as the following expression in \letfj{}:
\texttt{let x1 :$\wctype{\rwildcard{A}}{List}{\rwildcard{A}}$ = x in id(x) :$\tv{z}$}
The T-Let rule prevents us from using free variables created by the method call to \expr{id}
to be used in the return type $\tv{z}$.
But this has to be signaled to the \unify{} algorithm, which does not know about the origin and context of
the constraints.
If we naively substitute $\sigma(\tv{z}) = \exptype{List}{\rwildcard{A}}$
the return type of the \texttt{map} function would be the type $\exptype{List}{\exptype{List}{\rwildcard{A}}}$.
This type solution is unsound.
The type of \expr{l2} is the same as the one of \expr{l}:
$\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
\textbf{Solution:}
We solve this by distinguishing between wildcard placeholders and normal placeholders.
$\ntv{z}$ is a normal placeholder and is not allowed to contain free variables.
\end{example}
\end{itemize}
%TODO: Move this part. or move the third challenge some underneath.
The \unify{} algorithm only sees the constraints with no information about the program they originated from.
The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.
\section{Discussion Pair Example}
\begin{verbatim}
<X> Pair<X,X> make(List<X> l){ ... }
<X> boolean compare(Pair<X,X> p) { ... }
List<?> l;
Pair<?,?> p;
compare(make(l)); // Valid
compare(p); // Error
\end{verbatim}
Our type inference algorithm is not able to solve this example.
When we convert this to \TamedFJ{} and generate constraints we end up with:
\begin{lstlisting}[style=tamedfj]
2024-04-10 11:27:53 +02:00
let m = let x = l in make(x) in compare(m)
\end{lstlisting}
2024-04-10 11:27:53 +02:00
\begin{constraintset}$
\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \ntv{x},
\ntv{x} \lessdotCC \exptype{List}{\wtv{a}}
\exptype{Pair}{\wtv{a}, \wtv{a}} \lessdot \ntv{m}, %% TODO: Mark this constraint
\ntv{m} \lessdotCC \exptype{Pair}{\wtv{b}, \wtv{b}}
$\end{constraintset}
$\ntv{x}$ will get the type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ and
from the constraint
$\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
\unify{} deducts $\wtv{a} \doteq \rwildcard{X}$ leading to
$\exptype{Pair}{\rwildcard{X}, \rwildcard{X}} \lessdot \ntv{m}$.
Finding a supertype to $\exptype{Pair}{\rwildcard{X}, \rwildcard{X}}$ is the crucial part.
2024-04-10 12:59:43 +02:00
The correct substition for $\ntv{m}$ would be $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$.
2024-04-10 17:40:49 +02:00
But this leads to additional branching inside the \unify{} algorithm and increases runtime.
%We refrain from using that type, because it is not denotable with Java syntax.
%Types used for normal type placeholders should be expressable Java types. % They are not!
2024-04-10 15:35:27 +02:00
The prefered way of dealing with this example in our opinion would be the addition of a multi-let statement to the syntax.
\begin{lstlisting}[style=letfj]
2024-04-10 17:40:49 +02:00
let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = l, m = make(x) in compare(make(x))
\end{lstlisting}
2024-04-10 17:40:49 +02:00
We can make it work with a special rule in the \unify{}.
But this will only help in this specific example and not generally solve the issue.
A type $\exptype{Pair}{\rwildcard{X}, \rwildcard{X}}$ has atleast two immediate supertypes:
$\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$ and
$\wctype{\rwildcard{X}, \rwildcard{Y}}{Pair}{\rwildcard{X}, \rwildcard{Y}}$.
Imagine a type $\exptype{Triple}{\rwildcard{X},\rwildcard{X},\rwildcard{X}}$ already has
% TODO: how many supertypes are there?
%X.Triple<X,X,X> <: X,Y.Triple<X,Y,X> <:
%X,Y,Z.Triple<X,Y,Z>
% TODO example:
\begin{lstlisting}[style=java]
<X> Triple<X,X,X> sameL(List<X> l)
<X,Y> Triple<X,Y,Y> sameP(Pair<X,Y> l)
<X,Y> void triple(Triple<X,Y,Y> tr){}
Pair<?,?> p ...
List<?> l ...
make(t) { return t; }
triple(make(sameP(p)));
triple(make(sameL(l)));
\end{lstlisting}
\begin{constraintset}
$
\exptype{Triple}{\rwildcard{X}, \rwildcard{X}, \rwildcard{X}} \lessdot \ntv{t},
\ntv{t} \lessdotCC \exptype{Triple}{\wtv{a}, \wtv{b}, \wtv{b}}, \\
(\textit{This constraint is added later: } \exptype{Triple}{\rwildcard{X}, \rwildcard{Y}, \rwildcard{Y}} \lessdot \ntv{t})
$
% Triple<X,X,X> <. t
% ( Triple<X,Y,Y> <. t ) <- this constraint is added later
% t <. Triple<a?, b?, b?>
% t =. x.Triple<X,X,X>
\end{constraintset}
2024-04-08 22:51:06 +02:00
%TODO
% The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards}
% and \cite{WildcardsNeedWitnessProtection}.
% \subsection{Capture Conversion}
% The \texttt{let} statements in \TamedFJ{} act as capture conversion for wildcard types.
% \begin{figure}
% \begin{minipage}{0.45\textwidth}
% \begin{lstlisting}[style=tfgj]
% <X> List<X> clone(List<X> l);
% example(p){
% return clone(p);
% }
% \end{lstlisting}
% \end{minipage}%
% \hfill
% \begin{minipage}{0.5\textwidth}
% \begin{lstlisting}[style=letfj]
% <X> List<X> clone(List<X> l);
% (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) example((*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) p) =
% let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = p in
% clone(x) : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*);
% \end{lstlisting}
% \end{minipage}
% \caption{Type inference adding capture conversion}\label{fig:addingLetExample}
% \end{figure}
% Figure \ref{fig:addingLetExample} shows a let statement getting added to the typed output.
% The method \texttt{clone} cannot be called with the type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$.
% After a capture conversion \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$ with $\rwildcard{X}$ being a free variable.
% Afterwards we have to find a supertype of $\exptype{List}{\rwildcard{X}}$, which does not contain free variables
% ($\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in this case).
% During the constraint generation step most types are not known yet and are represented by a type placeholder.
% During a methodcall like the one in the \texttt{example} method in figure \ref{fig:ccExample} the type of the parameter \texttt{p}
% is not known yet.
% The type \texttt{List<?>} would be one possibility as a parameter type for \texttt{p}.
% To make wildcards work for our type inference algorithm \unify{} has to apply capture conversions if necessary.
% The type placeholder $\tv{r}$ is the return type of the \texttt{example} method.
% One possible type solution is $\tv{p} \doteq \tv{r} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$,
% which leads to:
% \begin{verbatim}
% List<?> example(List<?> p){
% return clone(p);
% }
% \end{verbatim}
% But by substituting $\tv{p} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in the constraint
% $\tv{p} \lessdotCC \exptype{List}{\wtv{x}}$ leads to
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{x}}$.
% To make this typing possible we have to introduce a capture conversion via a let statement:
% $\texttt{return}\ (\texttt{let}\ \texttt{x} : \wctype{\rwildcard{X}}{List}{\rwildcard{X}} = \texttt{p}\ \texttt{in} \
% \texttt{clone}\generics{\rwildcard{X}}(x) : \wctype{\rwildcard{X}}{List}{\rwildcard{X}})$
% Inside the let statement the variable \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$
% This spawns additional problems.
% \begin{figure}
% \begin{minipage}{0.45\textwidth}
% \begin{verbatim}
% <X> List<X> clone(List<X> l){...}
% example(p){
% return clone(p);
% }
% \end{verbatim}
% \end{minipage}%
% \hfill
% \begin{minipage}{0.35\textwidth}
% \begin{constraintset}
% \textbf{Constraints:}\\
% $
% \tv{p} \lessdotCC \exptype{List}{\wtv{x}}, \\
% \tv{p} \lessdot \tv{r}, \\
% \tv{p} \lessdot \type{Object},
% \tv{r} \lessdot \type{Object}
% $
% \end{constraintset}
% \end{minipage}
% \caption{Type inference example}\label{fig:ccExample}
% \end{figure}
% In addition with free variables this leads to unwanted behaviour.
% Take the constraint
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ for example.
% After a capture conversion from $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ to $\exptype{List}{\rwildcard{Y}}$ and a substitution $\wtv{a} \doteq \rwildcard{Y}$
% we get
% $\exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}$.
% Which is correct if we apply capture conversion to the left side:
% $\exptype{List}{\rwildcard{X}} <: \exptype{List}{\rwildcard{X}}$
% If the input constraints did not intend for this constraint to undergo a capture conversion then \unify{} would produce an invalid
% type solution due to:
% $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \nless: \exptype{List}{\rwildcard{X}}$
% The reason for this is the \texttt{S-Exists} rule's premise
% $\text{dom}(\Delta') \cap \text{fv}(\exptype{List}{\rwildcard{X}}) = \emptyset$.
% Capture constraints cannot be stored in a set.
% $\wtv{a} \lessdotCC \wtv{b}$ is not the same as $\wtv{a} \lessdotCC \wtv{b}$.
% Both constraints will end up the same after a substitution for both placeholders $\tv{a}$ and $\tv{b}$.
% But afterwards a capture conversion is applied, which can generate different types on the left sides.
% \begin{itemize}
% \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Y}}$
% \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Z}}$
% \end{itemize}
%
% Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$
% is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc.
% Also two of those boxes do not necessarily contain the same type.
% But there are situations where it is possible to assume that.
% For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$
% is a pair of two boxes, which always contain the same type.
% Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.