Ecoop2024_TIforWildFJ/introduction.tex
2024-02-07 18:26:41 +01:00

280 lines
12 KiB
TeX

\section{Type Inference for Java}
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
Finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm.
The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}.
Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
The algorithm presented in this paper is a slightly improved version of the one in \cite{TIforFGJ} including wildcard support.
%a modified version of the \unify{} algorithm presented in \cite{plue09_1}.
The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}.
The \fjtype{} algorithm calculates constraints based on this intermediate representation,
which are then solved by the \unify{} algorithm
resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}).
\begin{figure}[tp]
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(l, la, lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:nested-list-example-typeless}
\end{subfigure}
~
% \begin{subfigure}[t]{\linewidth}
% \begin{lstlisting}[style=tfgj]
% class List<A> {
% List<A> add(A v) { ... }
% }
% class Example {
% m(l, la, lb){
% return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = {
% let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in {
% let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = {
% let xa = la in xa.add(1)
% } in x1.add(p1)
% } in {
% let p2 = {
% let xb = lb in xb.add("str")
% } in x2.add(p2)
% };
% }
% }
% \end{lstlisting}
% \caption{Featherweight Java Representation}
% \label{fig:nested-list-example-let}
% \end{subfigure}
% ~
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=tfgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(List<List<? extends Object>> l, List<Integer> la, List<String> lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java Representation}
\label{fig:nested-list-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
%TODO
% The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards}
% and \cite{WildcardsNeedWitnessProtection}.
\begin{figure}[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
genList() {
if( ... ) {
return new List<String>();
} else {
return new List<Integer>();
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:intro-example-typeless}
\end{subfigure}
~
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
List<?> genList() {
if( ... ) {
return new List<String>();
} else {
return new List<Integer>();
}
}
\end{lstlisting}
\caption{Correct type}
\label{fig:intro-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
% \subsection{Wildcards}
% Java subtyping involving generics is invariant.
% For example \texttt{List<String>} is not a subtype of \texttt{List<Object>}.
% %Wildcards introduce variance by allowing \texttt{List<String>} to be a subtype of \texttt{List<?>}.
% \texttt{List<Object>} is not a valid return type for the method \texttt{genList}.
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\subsection{Java Wildcards}
Wildcards add variance to Java type parameters.
In Java a \texttt{List<String>} is not a subtype of \texttt{List<Object>}
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}.
Here wildcards come into play.
$\exptype{List}{String} <: \wctype{\wildcard{X}{\bot}{\type{Object}}}{List}{\rwildcard{X}}$
means \texttt{List<String>} is a subtype of \texttt{List<? extend Object>}.
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound at the same time,
and a type they are bound to.
In this case the name is $\rwildcard{X}$ and it's bound to the the type \texttt{List}.
% Additionally they can hold a upper or lower bound restriction like \texttt{List<? super String>}.
% Our representation of this type is: $\wctype{\wildcard{X}{\type{String}}{\type{Object}}}{List}{\rwildcard{X}}$
% Every wildcard has a name ($\rwildcard{X}$ in this case) and an upper and lower bound (respectively \texttt{Object} and \texttt{String}).
\subsection{Challenges}\label{challenges}
The introduction of wildcards adds additional challenges.
% we cannot replace every type variable with a wildcard
Type variables can also be used as type parameters, for example
$\exptype{List}{String} \lessdot \exptype{List}{\tv{a}}$.
A problem arises when replacing type variables with wildcards.
% Wildcards are not reflexive.
% ( on the equals property ), every wildcard has to be capture converted when leaving its scope
% do not substitute free type variables
Lets have a look at two examples:
\begin{itemize}
\item \begin{example} \label{intro-example1}
The first one is a valid Java program.
The type \texttt{List<? super String>} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$
which is used as the generic method parameter \texttt{A}.
Java uses capture conversion to replace the generic \texttt{A} by a capture converted version of the
\texttt{? super String} wildcard.
Knowing that the type \texttt{String} is a subtype of any type the wildcard \texttt{? super String} can inherit
it is safe to pass \texttt{"String"} for the first parameter of the function.
\begin{verbatim}
<A> List<A> add(A a, List<A> la) {}
List<? super String> list = ...;
add("String", list);
\end{verbatim}
The constraints representing this code snippet are:
\begin{displaymath}
\type{String} \lessdotCC \wtv{a},\,
\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}
\end{displaymath}
Here $\sigma(\tv{a}) = \rwildcard{X}$ is a valid solution.
\end{example}
\item \begin{example}\label{intro-example2}
This example displays an incorrect Java program.
The method call to \texttt{concat} with two wildcard lists is unsound.
Each list could be of a different kind and therefore the \texttt{concat} cannot succeed.
\begin{verbatim}
<A> List<A> concat(List<A> l1, List<A> l2) {}
List<?> list = ... ;
concat(list, list);
\end{verbatim}
The constraints for this example are:
$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\
\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$
%$\exptype{List}{?} \lessdot \exptype{List}{\tv{a}},
%\exptype{List}{?} \lessdot \exptype{List}{\tv{a}}$
Remember that the given constraints cannot have a valid solution.
%This is due to the fact that the wildcard variables cannot leave their scope.
In this example the \unify{} algorithm should not replace $\tv{a}$ with the captured wildcard $\rwildcard{X}$.
\end{example}
\end{itemize}
The \unify{} algorithm only sees the constraints with no information about the program they originated from.
The main challenge was to find an algorithm which computes $\sigma(\tv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.
% Solution: A type variable can only take one type and not a wildcard type.
% A wildcard type is only treated like a wildcard while his definition is in scope (during the reduce rule)
% \subsection{Capture Conversion}
% \begin{verbatim}
% <A> List<A> add(A a, List<A> la) {}
% List<? super String> list = ...;
% add("String", list);
% \end{verbatim}
% The constraints representig this code snippet are:
% $\type{String} \lessdot \tv{a},
% \wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\tv{a}}$
% Under the hood the typechecker has to find a replacement for the generic method parameter \texttt{A}.
% Java allows a method \texttt{<A> List<A> add(A a, List<A> la)} to be called with
% \texttt{String} and \texttt{List<? super String>}.
% A naive approach would be to treat wildcards like any other type and replace \texttt{A}
% with \texttt{? super String}.
% Generating the method type \texttt{List<? super String> add(? super String a, List<? super String> la)}.
% This does not work for a method like
% \texttt{<A> void merge(List<A> l1, List<A> l2)}.
% It is crucial for this method to be called with two lists of the same type.
% Substituting a wildcard for the generic \texttt{A} leads to
% \texttt{void merge(List<?> l1, List<?> l2)},
% which is unsound.
% Capture conversion utilizes the fact that instantiated classes always have an actual type.
% \texttt{new List<String>()} and \texttt{new List<Object>()} are
% valid instances.
% \texttt{new List<? super String>()} is not.
% Every type $\wcNtype{\Delta}{N}$ contains an underlying instance of a type $\type{N}$.
% Knowing this fact allows to make additional assumptions.
% \wildFJ{} type rules allow for generic parameters to be replaced by wildcard types.
% This is possible because wildcards are split into two parts.
% Their declaration at the environment part $\Delta$ of a type $\wcNtype{\Delta}{N}$
% and their actual use in the body of the type $\type{N}$.
% Replacing the generic \texttt{A} in the type \texttt{List<A>}
% by a wildcard $\type{X}$ will not generate the type \texttt{List<?>}.
% \texttt{List<?>} is the equivalent of $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\type{X}}$.
% The generated type here is $\exptype{List}{\type{X}}$, which has different subtype properties.
% There is no information about what the type $\exptype{List}{\type{X}}$ actually is.
% It has itself as a subtype and for example a type like $\exptype{ArrayList}{\type{X}}$.
% A method call for example only works with values, which are correct instances of classes.
% Capture conversion automatically converts wildcard type to a concrete class type,
% which then can be used as a parameter for a method call.
% In \wildFJ{} this is done via let statements.
% Written in FJ syntax: $\type{N}$ is a initialized type and
% $\wcNtype{\Delta}{N}$ is a type containing wildcards $\Delta$.
% It is crucial for the wildcard names to never be used twice.
% Otherwise the members of a list with type $\wctype{\type{X}}{List}{\type{X}}$ and
% a list with type $\wctype{\type{X}}{List}{\type{X}}$ would be the same. $\type{X}$ in fact.
% The let statement adds wildcards to the environment.
% Those wildcards are not allowed to escape the scope of the let statement.
% Which is a problem for our type inference algorithm.
% The capture converted version of the type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ is
% $\exptype{Box}{\rwildcard{X}}$.
% The captured type is only used as an intermediate type during a method call.