Cleanup and intro to Type INference

This commit is contained in:
Andreas Stadelmeier 2024-05-13 00:17:49 +02:00
parent fc508cf331
commit 5f33fa4711

View File

@ -16,8 +16,8 @@ But our global type inference algorithm is able to work on input programs which
We will show the different capabilities with an example.
In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing
its type parameters.
Java will infer \texttt{String} for the parameter \texttt{T} by using the type annotations
\texttt{List<String>} on the left side of the assignment and a matching algorithm \cite{javaTIisBroken}.
Java is using a matching algorithm \cite{javaTIisBroken} to replace \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
%The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
%because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample]
@ -27,24 +27,25 @@ List<String> ls = emptyList();
\end{lstlisting}
%\textit{Local Type Inference limitations:}
When calling the \texttt{emptyList} method without context its return value will be set to a \texttt{List<Object>}.
Note that local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context its return value will be set to a \texttt{List<Object>}.
The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns
a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit]
emptyList().add(new List<String>())
.get(0)
.get(0);
.get(0); //Typing Error
\end{lstlisting}
%List<String> <. A
%List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints.
% this makes matching impossible
The return type of the first \texttt{get} method is based on the return type of
\texttt{emptyList},
but the second call to \texttt{get} is only possible if \texttt{emptyList} returns
a list of lists.
The second call to \texttt{get} produces a type error, because Java expects \texttt{emptyList} to return
a \texttt{List<Object>}.
The type inference algorithm presented in this paper will correctly replace the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and proof this code snippet correct.
The local type inference algorithm based on matching cannot produce this solution.
Here a type inference algorithm based on unification is needed.
Here our type inference algorithm based on unification is needed.
% \begin{verbatim}
% this.<List<String>>emptyList().add(new List<String>())
% .get(0)
@ -74,15 +75,6 @@ $
\tv{c} \lessdot \exptype{List}{\tv{d}}
$
TODO
Local type inference \cite{javaTIisBroken} is able to find a type substitution $\sigma$ satisfying
$\overline{\type{A} <: \sigma(\type{F}) }, \sigma(\type{R}) <: \type{E}$.
It is important that $\overline{\type{A}}$ and $\type{E}$ are given types and do not contain
type placeholders already used in $\overline{\type{F}}$.
The type parameters are not allowed on the left side of $A <: F$
We can generate method calls where this is the case. The second call to \texttt{get}.
% Local type inference cannot deal with type inference during the algorithm.
% If the left side contains unknown type parameters.
@ -143,18 +135,24 @@ We can generate method calls where this is the case. The second call to \texttt{
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm.
or allowing to write typeless Java code which is then type inferred and thereby type checked by our algorithm.
The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}.
Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}.
Our algorithm is also capable of finding solutions involving wildcards as shown in listing \ref{lst:intro-example-typed}.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support.
%a modified version of the \unify{} algorithm presented in \cite{plue09_1}.
The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}.
The \fjtype{} algorithm calculates constraints based on this intermediate representation,
which are then solved by the \unify{} algorithm
resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}).
% Proven sound on type rules of Featherweight Java, which are also proven to produce sound programs
% implication rules that follow the subtyping rules directly. Easy to understand soundness proof
% capture conversion is needed
The type inference algorithm for Generic Featherweight Java \cite{TIforFGJ} produces \texttt{Object} as the return type of the
\texttt{genBox} method in listing \ref{lst:intro-example-typeless}
whereas our type inference algorithm will infer the type solution shown in listing \ref{lst:intro-example-typed}.
An existing unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but exposes some when it comes to method invocations.
Especially the problems shown in chapter \ref{challenges} are handled incorrectly.
Whereas our type inference algorithm is based on a Featherweight Java calculus \cite{WildFJ} and it's proven sound subtyping rules.
%But they are all correctly solved by our new type inference algorithm presented in this paper.
\begin{itemize}
\item
@ -188,101 +186,31 @@ We proof soundness and aim for a good compromise between completeness and time c
% }
% \end{verbatim}
\begin{figure}%[tp]
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(l, la, lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java method missing argument and return types}
\label{fig:nested-list-example-typeless}
\end{subfigure}
~
% \begin{subfigure}[t]{\linewidth}
% \begin{lstlisting}[style=tfgj]
% class List<A> {
% List<A> add(A v) { ... }
% }
% class Example {
% m(l, la, lb){
% return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = {
% let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in {
% let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = {
% let xa = la in xa.add(1)
% } in x1.add(p1)
% } in {
% let p2 = {
% let xb = lb in xb.add("str")
% } in x2.add(p2)
% };
% }
% }
% \end{lstlisting}
% \caption{Featherweight Java Representation}
% \label{fig:nested-list-example-let}
% \end{subfigure}
% ~
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=tfgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(List<List<? extends Object>> l, List<Integer> la, List<String> lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java Representation}
\label{fig:nested-list-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
\begin{figure}%[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
genList() {
\begin{figure}
\begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genBox() {
if( ... ) {
return new List().add(1);
return new Box(1);
} else {
return new List().add("String");
return new Box("Str");
}
}
\end{lstlisting}
\caption{Java method with missing return type}
\label{fig:intro-example-typeless}
\end{subfigure}
~
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
List<?> genList() {
\end{minipage}%
\hfill
\begin{minipage}{0.55\textwidth}
\begin{lstlisting}[style=tfgj,caption=Correct type,label=lst:intro-example-typed]
Box<?> genBox() {
if( ... ) {
return new Box<Integer>(1);
} else {
return new Box<String>("String");
return new Box<String>("Str");
}
}
\end{lstlisting}
\caption{Correct type}
\label{fig:intro-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{minipage}
\end{figure}
% \subsection{Wildcards}
@ -373,30 +301,29 @@ TODO %Explain Java Capture Conversion
%show input and a correct letFJ representation
%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI
\begin{figure}[h]
\begin{subfigure}[t]{0.47\textwidth}
\centering
\begin{minipage}{0.49\textwidth}
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
<A> List<A> add(List<A> l, A v) ...
List<? super String> l = ...;
add(l, "String");
\end{lstlisting}
\end{subfigure}\hfill
\begin{subfigure}[t]{0.47\textwidth}
\centering
\end{minipage}\hfill
\begin{minipage}{0.49\textwidth}
\begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet]
<A> List<A> add(List<A> l, A v)
let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l
in <X>add(l2, "String");
\end{lstlisting}
\end{subfigure}
\end{minipage}
\end{figure}
The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI}
to determine the type parameters to the \texttt{add} method call.
In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}).
If wildcards are involved the so called capture conversion has to be done manually via let statements.
A \TamedFJ{} representation including all type annotations and an explicit capture conversion via let statement is shown in listing \ref{lst:addExampleLet}.
%In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}).
%If wildcards are involved the so called capture conversion has to be done manually via let statements.
%A let statement \emph{opens} an existential type.
In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
@ -574,7 +501,7 @@ $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype
% List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim}
\subsection{\TamedFJ{} and \letfj{}}
\subsection{\TamedFJ{}}
%LetFJ -> Output language!
%TamedFJ -> ANF transformed input langauge
%Input language only described here. It is standard Featherweight Java
@ -582,23 +509,21 @@ $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
The input to our algorithm is a typeless version of Featherweight Java.
Methods are declared without parameter or return types.
We still keep type annotations for fields and generic class parameters.
Method parameters and return types are optional.
We still require type annotations for fields and generic class parameters.
This is a design choice by us,
as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm.
The method call naturally has no generic parameters aswell.
The method call naturally has type inferred generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
The syntax is described in figure \ref{fig:outputSyntax} with optional type annotations highlighted in yellow.
It is possible to exclude all optional types.
The output is a valid Featherweight Java program.
We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection}
calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements.
% Our output syntax is shown in figure \ref{fig:outputSyntax}
% which is actually a subset of \letfj{}, because we omit \texttt{null} types.
% The output is a valid Featherweight Java program.
% We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection}
% calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements.
\newcommand{\highlight}[1]{\begingroup\fboxsep=0pt\colorbox{yellow}{$\displaystyle #1$}\endgroup}
\begin{figure}
@ -624,10 +549,12 @@ $
\caption{Input Syntax with optional type annotations}\label{fig:outputSyntax}
\end{figure}
The output of our type inference algorithm is fully and correctly typed \TamedFJ{} program.
Before generating constraints the input is transformed by an ANF transformation (see section \ref{sec:anfTransformation}).
Constraints are generated on the basis of the program in A-Normal form.
After adding the missing type annotations the resulting program is valid under the typing rules in \cite{WildFJ}.
The output of our type inference algorithm is a valid \letfj{} program.
Before generating constraints the input is transformed to \TamedFJ{} (see section \ref{sec:anfTransformation}).
After adding the missing type annotations the resulting program is a valid \letfj{} program.
%This is shown in chapter \ref{sec:soundness}
Capture conversion is only needed for wildcard types,
but we don't know which expressions will spawn wildcard types because there are no type annotations yet.