This commit is contained in:
Peter Thiemann 2024-05-29 11:01:30 +02:00
parent d2f58b2489
commit 8428ebdc70

View File

@ -59,8 +59,10 @@ is scope for a global type inference algorithm that infers
method signatures even if no type information (except class headers method signatures even if no type information (except class headers
and types of field variables) is given. and types of field variables) is given.
We present such a global type inference algorithm for Featherweight We present such a global type inference algorithm for Featherweight
Generic Java with wildcards. The same approach can also be used for Generic Java with wildcards, a Java core calculus with generics and
regular Java programs. wildcards in the tradition of Featherweight Java \cite{FJ} and its
extensions \cite{WildFJ,WildcardsNeedWitnessProtection}. Our approach can also be used for regular Java programs,
but we limit the formal presentation to this core calculus.
%The goal is to find a correct typing for a given Java program. %The goal is to find a correct typing for a given Java program.
Our algorithm enables programmers to write Java code with only a Our algorithm enables programmers to write Java code with only a
@ -69,15 +71,17 @@ variables), it can fill in missing type annotations in partially
type-annotated programs, and it can suggest better (more general) type-annotated programs, and it can suggest better (more general)
types for ordinary, typed Java programs. types for ordinary, typed Java programs.
Listing~\ref{lst:intro-example-typeless} shows an example input of a
method implementation without any type annotation. Our algorithm
infers the type annotations in Listing~\ref{lst:intro-example-typed}:
it adds the type arguments to \texttt{List} at object creation and it
infers the most specific return type \texttt{List<?>}, which is a
wildcard type. Our algorithm is the first to infer wildcard types that
comes with a soundness proof.
Previous work on global type inference for Java either does not
consider wildcards or it simplifies the problem by not modeling key
features of Java wildcards.
\textbf{TO BE CONTINUED}
We propose a global type inference algorithm for Java supporting Wildcards and proven sound.
Global type inference allows input programs without any type annotations.
A programmer could write Java code without stressing about types and type annotations which
are infered and inserted by our algorithm.
%This leads to better types (Poster referenz)
%The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in listing \ref{lst:intro-example-typeless}. %The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in listing \ref{lst:intro-example-typeless}.
%In this case our algorithm would also be able to propose a solution including wildcards as shown in listing \ref{lst:intro-example-typed}. %In this case our algorithm would also be able to propose a solution including wildcards as shown in listing \ref{lst:intro-example-typed}.
@ -86,7 +90,7 @@ are infered and inserted by our algorithm.
%The last step to create a type inference algorithm compatible to the Java type system. %The last step to create a type inference algorithm compatible to the Java type system.
\begin{figure} \begin{figure}[tp]
\begin{minipage}{0.43\textwidth} \begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type] \begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genList() { genList() {
@ -113,57 +117,52 @@ List<?> genList() {
\end{figure} \end{figure}
\subsection{Comparision to similar Type Inference Algorithms}
To outline the contributions in this paper we will list the advantages and improvements to smiliar type inference algorithms: Global Type Inference for Featherweight Java \cite{TIforFGJ} is a
\begin{description} sound, but incomplete inference algorithm for Featherweight Generic
\item[Global Type Inference for Featherweight Java] \cite{TIforFGJ} is a predecessor to our algorithm. Java. It does not support wildcards, which means that it infers the
The type inference algorithm presented here supports Java Wildcards. return type \texttt{Object} for the method \texttt{genList} in
% Proven sound on type rules of Featherweight Java, which are also proven to produce sound programs Listing~\ref{lst:intro-example-typeless}. Like our algorithm, their
% implication rules that follow the subtyping rules directly. Easy to understand soundness proof algorithm restricts to monomorphic recursion, which leads to
% capture conversion is needed incompleteness.
\textit{Example:} The type inference algorithm for Generic Featherweight Java produces \texttt{Object} as the return type of the
\texttt{genBox} method in listing \ref{lst:intro-example-typeless}
whereas our type inference algorithm will infer the type solution shown in listing \ref{lst:intro-example-typed} A type unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
involving a wildcard type. but it does not handle capture conversion correctly because it only supports types which are expressible in Java syntax (more details in
\item[Type Unification for Java with Wildcards] section~\ref{challenges}).
An existing unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities, Moreover, it appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
but exposes some errors when it comes to method invocations. We resolve this problem by modeling Java wildcards as existential
Especially the challenges shown in chapter \ref{challenges} are handled incorrectly. types \cite{WildFJ,WildcardsNeedWitnessProtection}, which also serve
The main reasons are that Plümickes algorithm only supports types which are expressible in Java syntax as the basis of our soundness proof.
and its soundness is proven towards a self-defined subtype ordering, but never against a complete type system. % forces us to define a new kind of subtype constraint. %, the current state of the art
It appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
We resolve this by denoting Java wildcards as existential types
and introducing a second kind of subtype constraint. %, the current state of the art
%and is able to deal with types that are not directly denotable in Java. %and is able to deal with types that are not directly denotable in Java.
Additionally the soundness of our algorithm is proven using a Featherweight Java calculus \cite{WildFJ}.
%The algorithm presented in this paper is able to solve all those challenges correctly %The algorithm presented in this paper is able to solve all those challenges correctly
%and it's correctness is proven using a Featherweight Java calculus \cite{WildFJ}. %and it's correctness is proven using a Featherweight Java calculus \cite{WildFJ}.
%But they are all correctly solved by our new type inference algorithm presented in this paper. %But they are all correctly solved by our new type inference algorithm presented in this paper.
\item[Java Type Inference]
Standard Java provides type inference in a restricted form % namely {Local Type Inference}. \begin{lstlisting}[caption=Part of a valid Java program, style=tfgj, label=lst:tiExample,float=tp]
which only works for local environments where the surrounding context has known types.
But our global type inference algorithm is able to work on input programs which do not hold any type annotations at all.
We will show the different capabilities with an example.
In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing
its type parameters.
Java is using a matching algorithm \cite{javaTIisBroken} to replace \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
%The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
%because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample]
<T> List<T> emptyList() { ... } <T> List<T> emptyList() { ... }
List<String> ls = emptyList(); List<String> ls = emptyList();
\end{lstlisting} \end{lstlisting}
Standard Java provides type inference in a restricted form % namely {Local Type Inference}.
%\textit{Local Type Inference limitations:} which only works for local environments where the surrounding context has known types.
Note that local type inference depends on the type annotation on the left side of the assignment. The example in Listing~\ref{lst:tiExample} exhibits the main differences to our global type inference algorithm.
When calling the \texttt{emptyList} method without this type context its return value will be set to a \texttt{List<Object>}. The method call \texttt{emptyList} lacks its type parameters.
The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns Java relies on a matching algorithm \cite{javaTIisBroken} to instantiate \texttt{T} with \texttt{String}
a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$. resulting in the correct expected type \texttt{List<String>}.
\begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit] This local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context
its return value will be inferred as \texttt{List<Object>}.
Therefore, Java rejects the code snippet in Listing~\ref{lst:tiLimit}:
it infers the type \texttt{List<Object>} for
\texttt{emptyList()} instead of the required
$\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference,
label=lst:tiLimit, float=tp]
emptyList().add(new List<String>()) emptyList().add(new List<String>())
.get(0) .get(0)
.get(0); //Typing Error .get(0); //Typing Error
@ -172,14 +171,12 @@ emptyList().add(new List<String>())
%List<A> <: List<B>, B <: List<C> %List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints. % B = A and therefore A on the left and right side of constraints.
% this makes matching impossible % this makes matching impossible
The second call to \texttt{get} produces a type error, because Java expects \texttt{emptyList} to return Hence, the second call to \texttt{get} produces a type error.
a \texttt{List<Object>}.
The type inference algorithm presented in this paper will correctly replace the type parameter \texttt{T} of the \texttt{emptyList} The type inference algorithm presented in this paper correctly instantiates the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and proof this code snippet correct. method with \texttt{List<List<String>>} and render this code snippet correct.
The local type inference algorithm based on matching cannot produce this solution.
Here our type inference algorithm based on unification is needed.
\end{description}
% %motivate constraints: % %motivate constraints:
% To solve this example our Type Inference algorithm will create constraints % To solve this example our Type Inference algorithm will create constraints
% $ % $
@ -197,20 +194,20 @@ Here our type inference algorithm based on unification is needed.
% - Easy to implement % - Easy to implement
% - Capture Conversion support % - Capture Conversion support
% - Existential Types support % - Existential Types support
Our contributions are We summarize our contributions:
\begin{itemize} \begin{itemize}
\item \item
We introduce the language \tifj{} (chapter \ref{sec:tifj}). We introduce the language \tifj{} (section \ref{sec:tifj}),
A Featherweight Java derivative including Generics, Wildcards and Type Inference. a Featherweight Java derivative including generics, wildcards, and type inference.
\item \item
We support capture conversion and Java style method calls. Our algorithm handles existential types in a form which is not
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}. denotable by Java syntax \cite{aModelForJavaWithWildcards}. Thus, we support capture conversion and Java style method calls.
\item \item
We present a novel approach to deal with existential types and capture conversion during constraint unification. We present a novel approach to deal with existential types and capture conversion during constraint unification.
\item The algorithm is split in two parts. A constraint generation step and an unification step. % \item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part. % Input language and constraint generations can be extended without altering the unification part.
\item \item
We prove soundness and aim for a good compromise between completeness and time complexity. We prove soundness and aim for a good compromise between completeness and time complexity.
\end{itemize} \end{itemize}
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails: % Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim} % \begin{verbatim}
@ -240,25 +237,25 @@ We prove soundness and aim for a good compromise between completeness and time c
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}). % The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\section{Java Wildcards} \section{Java Wildcards}
\label{sec:java-wildcards}
Java has invariant subtyping for polymorphic types. As Java is an imperative language, subtyping for generic types is invariant.
%but it incooperates use-site variance via so called wildcard types. %but it incooperates use-site variance via so called wildcard types.
A \texttt{List<String>} is not a subtype of \texttt{List<Object>} Even though \texttt{String} is subtype of \texttt{Object},
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}. a \texttt{List<String>} is not a subtype of \texttt{List<Object>}:
To make the type system more expressive Java incooperates use-site variance by allowing wildcards (\texttt{?}) in type annotations. because someone might store an \texttt{Integer} in the list, which is
For example a type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type) %with the wildcard \texttt{?} representing a placeholder for any type compatible with \texttt{Object}, but not with \texttt{String} (see Listing~\ref{lst:invarianceExample}).
Invariance is overly restrictive in read-only or write-only
contexts. Hence, Java incooperates use-site variance by allowing wildcards (\texttt{?}) in types.
For example, the type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type)
is a supertype of \texttt{List<String>} and \texttt{List<Object>}. is a supertype of \texttt{List<String>} and \texttt{List<Object>}.
%The \texttt{?} is a wildcard type which can be replaced by any type as needed. %The \texttt{?} is a wildcard type which can be replaced by any type as needed.
%
Listing~\ref{lst:wildcardIntro} shows a use of wildcards that renders the assignment \texttt{lo = ls} correct.
The program still does not compile, because the addition of an \texttt{Integer} to \texttt{lo} is still incorrect.
%Class instances in Java are mutable and passed by reference. \begin{figure}[tp]
Subtyping must be invariant in Java otherwise the integrity of data classes like \texttt{List} cannot be guaranteed.
See listing \ref{lst:invarianceExample} for example
where an \texttt{Integer} would be added to a list of \texttt{String}
if not for the Java type system which rejects the assignment \texttt{lo = ls}.
Listing \ref{lst:wildcardIntro} shows the use of wildcards rendering the assignment \texttt{lo = ls} correct.
The program still does not compile, because now the addition of an Integer to \texttt{lo} is rightfully deemed incorrect by Java.
\begin{figure}
\begin{minipage}{0.48\textwidth} \begin{minipage}{0.48\textwidth}
\begin{lstlisting}[caption=Java Invariance Example,label=lst:invarianceExample]{java} \begin{lstlisting}[caption=Java Invariance Example,label=lst:invarianceExample]{java}
List<String> ls = ...; List<String> ls = ...;
@ -278,14 +275,13 @@ lo.add(new Integer(1)); // error!
\end{minipage} \end{minipage}
\end{figure} \end{figure}
Wildcard types are virtual types. Wildcard types like \texttt{List<?>} are virtual types, i.e., the run-time
There is no instantiation of a \texttt{List<?>}. type of an object is always a fully instantiated type like
It is a placeholder type which can hold any kind of list like
\texttt{List<String>} or \texttt{List<Object>}. \texttt{List<String>} or \texttt{List<Object>}.
This type can also change at any given time, for example when multiple threads The issue is that the run-time type underlying a wildcard type can
share a reference to the same field. change at any time, for example when multiple threads share a reference to the same field.
A wildcard \texttt{?} must be considered a different type everytime it is accessed. Hence, a wildcard \texttt{?} must be considered a different type everytime it is accessed.
Therefore calling the method \texttt{concat} with two wildcard lists in the example in listing \ref{lst:concatError} is incorrect. For that reason, the call to the method \texttt{concat} with two wildcard lists in Listing~\ref{lst:concatError} is rejected.
% The \texttt{concat} method does not create a new list, % The \texttt{concat} method does not create a new list,
% but adds all elements from the second argument to the list given as the first argument. % but adds all elements from the second argument to the list given as the first argument.
% This is allowed in Java, because both lists are of the polymorphic type \texttt{List<X>}. % This is allowed in Java, because both lists are of the polymorphic type \texttt{List<X>}.
@ -293,50 +289,52 @@ Therefore calling the method \texttt{concat} with two wildcard lists in the exam
% if Java would treat \texttt{?} as a regular type and instantiate the type variable \texttt{X} % if Java would treat \texttt{?} as a regular type and instantiate the type variable \texttt{X}
% of the \texttt{concat} function with \texttt{?}. % of the \texttt{concat} function with \texttt{?}.
\begin{figure} \begin{figure}[tp]
\begin{lstlisting}[caption=Wildcard Example with faulty call to a concat method,label=lst:concatError]{java} \begin{lstlisting}[caption=Wildcard Example with faulty call to a concat method,label=lst:concatError]{java}
<X> List<X> concat(List<X> l1, List<X> l2){ <X> List<X> concat(List<X> l1, List<X> l2){
return l1.addAll(l2); return l1.addAll(l2);
} }
List<String> ls = new List<String>(); List<?> l1 = new List<String>("foo");
List<?> l1 = ls;
List<?> l2 = new List<Integer>(1); // List containing Integer List<?> l2 = new List<Integer>(1); // List containing Integer
concat(l1, l2); // Error! Would concat two different lists concat(l1, l2); // Error! Would concat two different lists
\end{lstlisting} \end{lstlisting}
\end{figure} \end{figure}
To determine the correctness of method calls involving wildcard types Java's typecheck To determine the correctness of method calls involving wildcard types Java's typechecker
makes use of a concept called \textbf{Capture Conversion}. makes use of a concept called \textbf{capture conversion}.
% was designed to make Java wildcards useful. % was designed to make Java wildcards useful.
% - without capture conversion % - without capture conversion
% - is used to open wildcard types % - is used to open wildcard types
% - % -
This process was formalized for Featherweight Java \cite{FJ} by replacing existential types with wildcards and capture conversion with let statements \cite{WildcardsNeedWitnessProtection}. One way to formalize this concept is by replacing wildcards with
We propose our own Featherweight Java derivative called \TamedFJ{} defined in chapter \ref{sec:tifj}. existential types and modeling capture conversion with suitably
To express the example in listing \ref{lst:wildcardIntro} with our calculus we first have to translate the wildcard types: inserted let statements \cite{WildcardsNeedWitnessProtection}.
\texttt{List<? extends Object>} becomes $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$. Our Featherweight Java derivative called \TamedFJ{} is modeled after
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound, Bierhoff's calculus \cite{WildcardsNeedWitnessProtection} (see section~\ref{sec:tifj}).
and a type they are bound to. To express the example in Listing~\ref{lst:wildcardIntro} in our calculus we first translate the wildcard types:
In this case the name is $\rwildcard{A}$ with the upper bound $\type{Object}$ and it's bound to the the type \texttt{List}. \texttt{List<? extends Object>} becomes
Before we can call the \texttt{add} method on this type we have to add a capture conversion via let statement: $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$,
where the existentially bound variable \texttt{A} has a lower bound
$\bot$ and an upper bound $\type{Object}$.
Before we can call the \texttt{add} method on this type we perform
capture conversion by inserting a let statement:
\begin{lstlisting} \begin{lstlisting}
let v : (*@$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$@*) = lo in v.<A>add(new Integer(1)); let v : (*@$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$@*) = lo in v.<A>add(new Integer(1));
\end{lstlisting} \end{lstlisting}
\expr{lo} is assigned to a new variable \expr{v} bearing the type $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$, The variable \expr{lo} (from Listing~\ref{lst:wildcardIntro}) is
but inside the let statement the variable \expr{v} will be treated as $\exptype{List}{\rwildcard{A}}$. assigned to a new immutable variable \expr{v} with type
The idea is that every Wildcard type is backed by a concrete type. $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$,
By assigning \expr{lo} to a immutable variable \expr{v} we unpack the concrete type $\exptype{List}{\rwildcard{A}}$ but inside the let statement the variable \expr{v} will be treated as
that was concealed by \expr{lo}'s existential type. $\exptype{List}{\rwildcard{A}}$.
Here $\rwildcard{A}$ is a fresh variable or a captured wildcard so to say. Here $\rwildcard{A}$ is a fresh variable or a captured wildcard.
The only information we have about $\rwildcard{A}$ is that it is any type inbetween the bounds $\bot$ and $\type{Object}$ The only information we have about $\rwildcard{A}$ is that it is a
supertype of $\bot$ and a subtype of $\type{Object}$
It is important to give the captured wildcard type $\rwildcard{A}$ an unique name which is used nowhere else. It is important to give the captured wildcard type $\rwildcard{A}$ an unique name which is used nowhere else.
With this formalization it gets obvious why the method call to \texttt{concat} This approach also clarifies why the method call to \texttt{concat}
in listing \ref{lst:concatError} is irregular (see \ref{lst:concatTamedFJ}). in listing \ref{lst:concatError} is rejected (see Listing~\ref{lst:concatTamedFJ}).
\begin{figure}[tp]
\begin{figure}
\begin{lstlisting}[style=TamedFJ,caption=\TamedFJ{} representation of the concat call from listing \ref{lst:concatError}, label=lst:concatTamedFJ] \begin{lstlisting}[style=TamedFJ,caption=\TamedFJ{} representation of the concat call from listing \ref{lst:concatError}, label=lst:concatTamedFJ]
let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l1 in let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l1 in
let l2' : (*@$\wctype{\rwildcard{Y}}{List}{\exptype{List}{\rwildcard{Y}}}$@*) = l2 in let l2' : (*@$\wctype{\rwildcard{Y}}{List}{\exptype{List}{\rwildcard{Y}}}$@*) = l2 in