This commit is contained in:
Peter Thiemann 2024-05-29 11:01:30 +02:00
parent d2f58b2489
commit 8428ebdc70

View File

@ -59,8 +59,10 @@ is scope for a global type inference algorithm that infers
method signatures even if no type information (except class headers
and types of field variables) is given.
We present such a global type inference algorithm for Featherweight
Generic Java with wildcards. The same approach can also be used for
regular Java programs.
Generic Java with wildcards, a Java core calculus with generics and
wildcards in the tradition of Featherweight Java \cite{FJ} and its
extensions \cite{WildFJ,WildcardsNeedWitnessProtection}. Our approach can also be used for regular Java programs,
but we limit the formal presentation to this core calculus.
%The goal is to find a correct typing for a given Java program.
Our algorithm enables programmers to write Java code with only a
@ -69,15 +71,17 @@ variables), it can fill in missing type annotations in partially
type-annotated programs, and it can suggest better (more general)
types for ordinary, typed Java programs.
Listing~\ref{lst:intro-example-typeless} shows an example input of a
method implementation without any type annotation. Our algorithm
infers the type annotations in Listing~\ref{lst:intro-example-typed}:
it adds the type arguments to \texttt{List} at object creation and it
infers the most specific return type \texttt{List<?>}, which is a
wildcard type. Our algorithm is the first to infer wildcard types that
comes with a soundness proof.
Previous work on global type inference for Java either does not
consider wildcards or it simplifies the problem by not modeling key
features of Java wildcards.
\textbf{TO BE CONTINUED}
We propose a global type inference algorithm for Java supporting Wildcards and proven sound.
Global type inference allows input programs without any type annotations.
A programmer could write Java code without stressing about types and type annotations which
are infered and inserted by our algorithm.
%This leads to better types (Poster referenz)
%The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in listing \ref{lst:intro-example-typeless}.
%In this case our algorithm would also be able to propose a solution including wildcards as shown in listing \ref{lst:intro-example-typed}.
@ -86,7 +90,7 @@ are infered and inserted by our algorithm.
%The last step to create a type inference algorithm compatible to the Java type system.
\begin{figure}
\begin{figure}[tp]
\begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genList() {
@ -113,57 +117,52 @@ List<?> genList() {
\end{figure}
\subsection{Comparision to similar Type Inference Algorithms}
To outline the contributions in this paper we will list the advantages and improvements to smiliar type inference algorithms:
\begin{description}
\item[Global Type Inference for Featherweight Java] \cite{TIforFGJ} is a predecessor to our algorithm.
The type inference algorithm presented here supports Java Wildcards.
% Proven sound on type rules of Featherweight Java, which are also proven to produce sound programs
% implication rules that follow the subtyping rules directly. Easy to understand soundness proof
% capture conversion is needed
\textit{Example:} The type inference algorithm for Generic Featherweight Java produces \texttt{Object} as the return type of the
\texttt{genBox} method in listing \ref{lst:intro-example-typeless}
whereas our type inference algorithm will infer the type solution shown in listing \ref{lst:intro-example-typed}
involving a wildcard type.
\item[Type Unification for Java with Wildcards]
An existing unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but exposes some errors when it comes to method invocations.
Especially the challenges shown in chapter \ref{challenges} are handled incorrectly.
The main reasons are that Plümickes algorithm only supports types which are expressible in Java syntax
and its soundness is proven towards a self-defined subtype ordering, but never against a complete type system.
It appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
We resolve this by denoting Java wildcards as existential types
and introducing a second kind of subtype constraint. %, the current state of the art
Global Type Inference for Featherweight Java \cite{TIforFGJ} is a
sound, but incomplete inference algorithm for Featherweight Generic
Java. It does not support wildcards, which means that it infers the
return type \texttt{Object} for the method \texttt{genList} in
Listing~\ref{lst:intro-example-typeless}. Like our algorithm, their
algorithm restricts to monomorphic recursion, which leads to
incompleteness.
A type unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but it does not handle capture conversion correctly because it only supports types which are expressible in Java syntax (more details in
section~\ref{challenges}).
Moreover, it appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
We resolve this problem by modeling Java wildcards as existential
types \cite{WildFJ,WildcardsNeedWitnessProtection}, which also serve
as the basis of our soundness proof.
% forces us to define a new kind of subtype constraint. %, the current state of the art
%and is able to deal with types that are not directly denotable in Java.
Additionally the soundness of our algorithm is proven using a Featherweight Java calculus \cite{WildFJ}.
%The algorithm presented in this paper is able to solve all those challenges correctly
%and it's correctness is proven using a Featherweight Java calculus \cite{WildFJ}.
%But they are all correctly solved by our new type inference algorithm presented in this paper.
\item[Java Type Inference]
Standard Java provides type inference in a restricted form % namely {Local Type Inference}.
which only works for local environments where the surrounding context has known types.
But our global type inference algorithm is able to work on input programs which do not hold any type annotations at all.
We will show the different capabilities with an example.
In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing
its type parameters.
Java is using a matching algorithm \cite{javaTIisBroken} to replace \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
%The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
%because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample]
\begin{lstlisting}[caption=Part of a valid Java program, style=tfgj, label=lst:tiExample,float=tp]
<T> List<T> emptyList() { ... }
List<String> ls = emptyList();
\end{lstlisting}
%\textit{Local Type Inference limitations:}
Note that local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context its return value will be set to a \texttt{List<Object>}.
The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns
a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit]
Standard Java provides type inference in a restricted form % namely {Local Type Inference}.
which only works for local environments where the surrounding context has known types.
The example in Listing~\ref{lst:tiExample} exhibits the main differences to our global type inference algorithm.
The method call \texttt{emptyList} lacks its type parameters.
Java relies on a matching algorithm \cite{javaTIisBroken} to instantiate \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
This local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context
its return value will be inferred as \texttt{List<Object>}.
Therefore, Java rejects the code snippet in Listing~\ref{lst:tiLimit}:
it infers the type \texttt{List<Object>} for
\texttt{emptyList()} instead of the required
$\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference,
label=lst:tiLimit, float=tp]
emptyList().add(new List<String>())
.get(0)
.get(0); //Typing Error
@ -172,14 +171,12 @@ emptyList().add(new List<String>())
%List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints.
% this makes matching impossible
The second call to \texttt{get} produces a type error, because Java expects \texttt{emptyList} to return
a \texttt{List<Object>}.
The type inference algorithm presented in this paper will correctly replace the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and proof this code snippet correct.
The local type inference algorithm based on matching cannot produce this solution.
Here our type inference algorithm based on unification is needed.
Hence, the second call to \texttt{get} produces a type error.
The type inference algorithm presented in this paper correctly instantiates the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and render this code snippet correct.
\end{description}
% %motivate constraints:
% To solve this example our Type Inference algorithm will create constraints
% $
@ -197,20 +194,20 @@ Here our type inference algorithm based on unification is needed.
% - Easy to implement
% - Capture Conversion support
% - Existential Types support
Our contributions are
We summarize our contributions:
\begin{itemize}
\item
We introduce the language \tifj{} (chapter \ref{sec:tifj}).
A Featherweight Java derivative including Generics, Wildcards and Type Inference.
We introduce the language \tifj{} (section \ref{sec:tifj}),
a Featherweight Java derivative including generics, wildcards, and type inference.
\item
We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
Our algorithm handles existential types in a form which is not
denotable by Java syntax \cite{aModelForJavaWithWildcards}. Thus, we support capture conversion and Java style method calls.
\item
We present a novel approach to deal with existential types and capture conversion during constraint unification.
\item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part.
We present a novel approach to deal with existential types and capture conversion during constraint unification.
% \item The algorithm is split in two parts. A constraint generation step and an unification step.
% Input language and constraint generations can be extended without altering the unification part.
\item
We prove soundness and aim for a good compromise between completeness and time complexity.
We prove soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
@ -240,25 +237,25 @@ We prove soundness and aim for a good compromise between completeness and time c
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\section{Java Wildcards}
\label{sec:java-wildcards}
Java has invariant subtyping for polymorphic types.
As Java is an imperative language, subtyping for generic types is invariant.
%but it incooperates use-site variance via so called wildcard types.
A \texttt{List<String>} is not a subtype of \texttt{List<Object>}
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}.
To make the type system more expressive Java incooperates use-site variance by allowing wildcards (\texttt{?}) in type annotations.
For example a type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type) %with the wildcard \texttt{?} representing a placeholder for any type
Even though \texttt{String} is subtype of \texttt{Object},
a \texttt{List<String>} is not a subtype of \texttt{List<Object>}:
because someone might store an \texttt{Integer} in the list, which is
compatible with \texttt{Object}, but not with \texttt{String} (see Listing~\ref{lst:invarianceExample}).
Invariance is overly restrictive in read-only or write-only
contexts. Hence, Java incooperates use-site variance by allowing wildcards (\texttt{?}) in types.
For example, the type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type)
is a supertype of \texttt{List<String>} and \texttt{List<Object>}.
%The \texttt{?} is a wildcard type which can be replaced by any type as needed.
%
Listing~\ref{lst:wildcardIntro} shows a use of wildcards that renders the assignment \texttt{lo = ls} correct.
The program still does not compile, because the addition of an \texttt{Integer} to \texttt{lo} is still incorrect.
%Class instances in Java are mutable and passed by reference.
Subtyping must be invariant in Java otherwise the integrity of data classes like \texttt{List} cannot be guaranteed.
See listing \ref{lst:invarianceExample} for example
where an \texttt{Integer} would be added to a list of \texttt{String}
if not for the Java type system which rejects the assignment \texttt{lo = ls}.
Listing \ref{lst:wildcardIntro} shows the use of wildcards rendering the assignment \texttt{lo = ls} correct.
The program still does not compile, because now the addition of an Integer to \texttt{lo} is rightfully deemed incorrect by Java.
\begin{figure}
\begin{figure}[tp]
\begin{minipage}{0.48\textwidth}
\begin{lstlisting}[caption=Java Invariance Example,label=lst:invarianceExample]{java}
List<String> ls = ...;
@ -278,14 +275,13 @@ lo.add(new Integer(1)); // error!
\end{minipage}
\end{figure}
Wildcard types are virtual types.
There is no instantiation of a \texttt{List<?>}.
It is a placeholder type which can hold any kind of list like
Wildcard types like \texttt{List<?>} are virtual types, i.e., the run-time
type of an object is always a fully instantiated type like
\texttt{List<String>} or \texttt{List<Object>}.
This type can also change at any given time, for example when multiple threads
share a reference to the same field.
A wildcard \texttt{?} must be considered a different type everytime it is accessed.
Therefore calling the method \texttt{concat} with two wildcard lists in the example in listing \ref{lst:concatError} is incorrect.
The issue is that the run-time type underlying a wildcard type can
change at any time, for example when multiple threads share a reference to the same field.
Hence, a wildcard \texttt{?} must be considered a different type everytime it is accessed.
For that reason, the call to the method \texttt{concat} with two wildcard lists in Listing~\ref{lst:concatError} is rejected.
% The \texttt{concat} method does not create a new list,
% but adds all elements from the second argument to the list given as the first argument.
% This is allowed in Java, because both lists are of the polymorphic type \texttt{List<X>}.
@ -293,50 +289,52 @@ Therefore calling the method \texttt{concat} with two wildcard lists in the exam
% if Java would treat \texttt{?} as a regular type and instantiate the type variable \texttt{X}
% of the \texttt{concat} function with \texttt{?}.
\begin{figure}
\begin{figure}[tp]
\begin{lstlisting}[caption=Wildcard Example with faulty call to a concat method,label=lst:concatError]{java}
<X> List<X> concat(List<X> l1, List<X> l2){
return l1.addAll(l2);
}
List<String> ls = new List<String>();
List<?> l1 = ls;
List<?> l1 = new List<String>("foo");
List<?> l2 = new List<Integer>(1); // List containing Integer
concat(l1, l2); // Error! Would concat two different lists
\end{lstlisting}
\end{figure}
To determine the correctness of method calls involving wildcard types Java's typecheck
makes use of a concept called \textbf{Capture Conversion}.
To determine the correctness of method calls involving wildcard types Java's typechecker
makes use of a concept called \textbf{capture conversion}.
% was designed to make Java wildcards useful.
% - without capture conversion
% - is used to open wildcard types
% -
This process was formalized for Featherweight Java \cite{FJ} by replacing existential types with wildcards and capture conversion with let statements \cite{WildcardsNeedWitnessProtection}.
We propose our own Featherweight Java derivative called \TamedFJ{} defined in chapter \ref{sec:tifj}.
To express the example in listing \ref{lst:wildcardIntro} with our calculus we first have to translate the wildcard types:
\texttt{List<? extends Object>} becomes $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$.
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound,
and a type they are bound to.
In this case the name is $\rwildcard{A}$ with the upper bound $\type{Object}$ and it's bound to the the type \texttt{List}.
Before we can call the \texttt{add} method on this type we have to add a capture conversion via let statement:
One way to formalize this concept is by replacing wildcards with
existential types and modeling capture conversion with suitably
inserted let statements \cite{WildcardsNeedWitnessProtection}.
Our Featherweight Java derivative called \TamedFJ{} is modeled after
Bierhoff's calculus \cite{WildcardsNeedWitnessProtection} (see section~\ref{sec:tifj}).
To express the example in Listing~\ref{lst:wildcardIntro} in our calculus we first translate the wildcard types:
\texttt{List<? extends Object>} becomes
$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$,
where the existentially bound variable \texttt{A} has a lower bound
$\bot$ and an upper bound $\type{Object}$.
Before we can call the \texttt{add} method on this type we perform
capture conversion by inserting a let statement:
\begin{lstlisting}
let v : (*@$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$@*) = lo in v.<A>add(new Integer(1));
\end{lstlisting}
\expr{lo} is assigned to a new variable \expr{v} bearing the type $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$,
but inside the let statement the variable \expr{v} will be treated as $\exptype{List}{\rwildcard{A}}$.
The idea is that every Wildcard type is backed by a concrete type.
By assigning \expr{lo} to a immutable variable \expr{v} we unpack the concrete type $\exptype{List}{\rwildcard{A}}$
that was concealed by \expr{lo}'s existential type.
Here $\rwildcard{A}$ is a fresh variable or a captured wildcard so to say.
The only information we have about $\rwildcard{A}$ is that it is any type inbetween the bounds $\bot$ and $\type{Object}$
The variable \expr{lo} (from Listing~\ref{lst:wildcardIntro}) is
assigned to a new immutable variable \expr{v} with type
$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$,
but inside the let statement the variable \expr{v} will be treated as
$\exptype{List}{\rwildcard{A}}$.
Here $\rwildcard{A}$ is a fresh variable or a captured wildcard.
The only information we have about $\rwildcard{A}$ is that it is a
supertype of $\bot$ and a subtype of $\type{Object}$
It is important to give the captured wildcard type $\rwildcard{A}$ an unique name which is used nowhere else.
With this formalization it gets obvious why the method call to \texttt{concat}
in listing \ref{lst:concatError} is irregular (see \ref{lst:concatTamedFJ}).
\begin{figure}
This approach also clarifies why the method call to \texttt{concat}
in listing \ref{lst:concatError} is rejected (see Listing~\ref{lst:concatTamedFJ}).
\begin{figure}[tp]
\begin{lstlisting}[style=TamedFJ,caption=\TamedFJ{} representation of the concat call from listing \ref{lst:concatError}, label=lst:concatTamedFJ]
let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l1 in
let l2' : (*@$\wctype{\rwildcard{Y}}{List}{\exptype{List}{\rwildcard{Y}}}$@*) = l2 in