Cleanup and intro to Type INference

This commit is contained in:
Andreas Stadelmeier 2024-05-13 00:17:49 +02:00
parent fc508cf331
commit 5f33fa4711

View File

@ -16,8 +16,8 @@ But our global type inference algorithm is able to work on input programs which
We will show the different capabilities with an example. We will show the different capabilities with an example.
In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing
its type parameters. its type parameters.
Java will infer \texttt{String} for the parameter \texttt{T} by using the type annotations Java is using a matching algorithm \cite{javaTIisBroken} to replace \texttt{T} with \texttt{String}
\texttt{List<String>} on the left side of the assignment and a matching algorithm \cite{javaTIisBroken}. resulting in the correct expected type \texttt{List<String>}.
%The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference %The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
%because the expected return type is known. \texttt{List<String>} in this case. %because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample] \begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample]
@ -27,24 +27,25 @@ List<String> ls = emptyList();
\end{lstlisting} \end{lstlisting}
%\textit{Local Type Inference limitations:} %\textit{Local Type Inference limitations:}
When calling the \texttt{emptyList} method without context its return value will be set to a \texttt{List<Object>}. Note that local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context its return value will be set to a \texttt{List<Object>}.
The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns
a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$. a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit] \begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit]
emptyList().add(new List<String>()) emptyList().add(new List<String>())
.get(0) .get(0)
.get(0); .get(0); //Typing Error
\end{lstlisting} \end{lstlisting}
%List<String> <. A %List<String> <. A
%List<A> <: List<B>, B <: List<C> %List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints. % B = A and therefore A on the left and right side of constraints.
% this makes matching impossible % this makes matching impossible
The return type of the first \texttt{get} method is based on the return type of The second call to \texttt{get} produces a type error, because Java expects \texttt{emptyList} to return
\texttt{emptyList}, a \texttt{List<Object>}.
but the second call to \texttt{get} is only possible if \texttt{emptyList} returns The type inference algorithm presented in this paper will correctly replace the type parameter \texttt{T} of the \texttt{emptyList}
a list of lists. method with \texttt{List<List<String>>} and proof this code snippet correct.
The local type inference algorithm based on matching cannot produce this solution. The local type inference algorithm based on matching cannot produce this solution.
Here a type inference algorithm based on unification is needed. Here our type inference algorithm based on unification is needed.
% \begin{verbatim} % \begin{verbatim}
% this.<List<String>>emptyList().add(new List<String>()) % this.<List<String>>emptyList().add(new List<String>())
% .get(0) % .get(0)
@ -74,15 +75,6 @@ $
\tv{c} \lessdot \exptype{List}{\tv{d}} \tv{c} \lessdot \exptype{List}{\tv{d}}
$ $
TODO
Local type inference \cite{javaTIisBroken} is able to find a type substitution $\sigma$ satisfying
$\overline{\type{A} <: \sigma(\type{F}) }, \sigma(\type{R}) <: \type{E}$.
It is important that $\overline{\type{A}}$ and $\type{E}$ are given types and do not contain
type placeholders already used in $\overline{\type{F}}$.
The type parameters are not allowed on the left side of $A <: F$
We can generate method calls where this is the case. The second call to \texttt{get}.
% Local type inference cannot deal with type inference during the algorithm. % Local type inference cannot deal with type inference during the algorithm.
% If the left side contains unknown type parameters. % If the left side contains unknown type parameters.
@ -143,18 +135,24 @@ We can generate method calls where this is the case. The second call to \texttt{
%The goal is to find a correct typing for a given Java program. %The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them, Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones), finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm. or allowing to write typeless Java code which is then type inferred and thereby type checked by our algorithm.
The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}. The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}.
Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}. Our algorithm is also capable of finding solutions involving wildcards as shown in listing \ref{lst:intro-example-typed}.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards. %This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system. %The last step to create a type inference algorithm compatible to the Java type system.
The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support. The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support.
%a modified version of the \unify{} algorithm presented in \cite{plue09_1}. % Proven sound on type rules of Featherweight Java, which are also proven to produce sound programs
The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}. % implication rules that follow the subtyping rules directly. Easy to understand soundness proof
The \fjtype{} algorithm calculates constraints based on this intermediate representation, % capture conversion is needed
which are then solved by the \unify{} algorithm The type inference algorithm for Generic Featherweight Java \cite{TIforFGJ} produces \texttt{Object} as the return type of the
resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}). \texttt{genBox} method in listing \ref{lst:intro-example-typeless}
whereas our type inference algorithm will infer the type solution shown in listing \ref{lst:intro-example-typed}.
An existing unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but exposes some when it comes to method invocations.
Especially the problems shown in chapter \ref{challenges} are handled incorrectly.
Whereas our type inference algorithm is based on a Featherweight Java calculus \cite{WildFJ} and it's proven sound subtyping rules.
%But they are all correctly solved by our new type inference algorithm presented in this paper.
\begin{itemize} \begin{itemize}
\item \item
@ -188,101 +186,31 @@ We proof soundness and aim for a good compromise between completeness and time c
% } % }
% \end{verbatim} % \end{verbatim}
\begin{figure}%[tp]
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example { \begin{figure}
m(l, la, lb){ \begin{minipage}{0.43\textwidth}
return l \begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
.add(la.add(1)) genBox() {
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java method missing argument and return types}
\label{fig:nested-list-example-typeless}
\end{subfigure}
~
% \begin{subfigure}[t]{\linewidth}
% \begin{lstlisting}[style=tfgj]
% class List<A> {
% List<A> add(A v) { ... }
% }
% class Example {
% m(l, la, lb){
% return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = {
% let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in {
% let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = {
% let xa = la in xa.add(1)
% } in x1.add(p1)
% } in {
% let p2 = {
% let xb = lb in xb.add("str")
% } in x2.add(p2)
% };
% }
% }
% \end{lstlisting}
% \caption{Featherweight Java Representation}
% \label{fig:nested-list-example-let}
% \end{subfigure}
% ~
\begin{subfigure}[t]{\linewidth}
\begin{lstlisting}[style=tfgj]
class List<A> {
List<A> add(A v) { ... }
}
class Example {
m(List<List<? extends Object>> l, List<Integer> la, List<String> lb){
return l
.add(la.add(1))
.add(lb.add("str"));
}
}
\end{lstlisting}
\caption{Java Representation}
\label{fig:nested-list-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure}
\begin{figure}%[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
genList() {
if( ... ) { if( ... ) {
return new List().add(1); return new Box(1);
} else { } else {
return new List().add("String"); return new Box("Str");
} }
} }
\end{lstlisting} \end{lstlisting}
\caption{Java method with missing return type} \end{minipage}%
\label{fig:intro-example-typeless} \hfill
\end{subfigure} \begin{minipage}{0.55\textwidth}
~ \begin{lstlisting}[style=tfgj,caption=Correct type,label=lst:intro-example-typed]
\begin{subfigure}[t]{0.49\linewidth} Box<?> genBox() {
\begin{lstlisting}[style=tfgj]
List<?> genList() {
if( ... ) { if( ... ) {
return new Box<Integer>(1); return new Box<Integer>(1);
} else { } else {
return new Box<String>("String"); return new Box<String>("Str");
} }
} }
\end{lstlisting} \end{lstlisting}
\caption{Correct type} \end{minipage}
\label{fig:intro-example-typed}
\end{subfigure}
%\caption{Example code}
%\label{fig:intro-example-code}
\end{figure} \end{figure}
% \subsection{Wildcards} % \subsection{Wildcards}
@ -373,30 +301,29 @@ TODO %Explain Java Capture Conversion
%show input and a correct letFJ representation %show input and a correct letFJ representation
%TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI %TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI
\begin{figure}[h] \begin{figure}[h]
\begin{subfigure}[t]{0.47\textwidth} \begin{minipage}{0.49\textwidth}
\centering
\begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample] \begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample]
<A> List<A> add(List<A> l, A v) ... <A> List<A> add(List<A> l, A v) ...
List<? super String> l = ...; List<? super String> l = ...;
add(l, "String"); add(l, "String");
\end{lstlisting} \end{lstlisting}
\end{subfigure}\hfill \end{minipage}\hfill
\begin{subfigure}[t]{0.47\textwidth} \begin{minipage}{0.49\textwidth}
\centering
\begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet] \begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet]
<A> List<A> add(List<A> l, A v) <A> List<A> add(List<A> l, A v)
let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l
in <X>add(l2, "String"); in <X>add(l2, "String");
\end{lstlisting} \end{lstlisting}
\end{subfigure} \end{minipage}
\end{figure} \end{figure}
The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI} The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI}
to determine the type parameters to the \texttt{add} method call. to determine the type parameters to the \texttt{add} method call.
In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}). A \TamedFJ{} representation including all type annotations and an explicit capture conversion via let statement is shown in listing \ref{lst:addExampleLet}.
If wildcards are involved the so called capture conversion has to be done manually via let statements. %In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}).
%If wildcards are involved the so called capture conversion has to be done manually via let statements.
%A let statement \emph{opens} an existential type. %A let statement \emph{opens} an existential type.
In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$ In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as
@ -574,7 +501,7 @@ $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype
% List<?> m() = new List<String>() :? new List<Integer>() :? id(m()); % List<?> m() = new List<String>() :? new List<Integer>() :? id(m());
% \end{verbatim} % \end{verbatim}
\subsection{\TamedFJ{} and \letfj{}} \subsection{\TamedFJ{}}
%LetFJ -> Output language! %LetFJ -> Output language!
%TamedFJ -> ANF transformed input langauge %TamedFJ -> ANF transformed input langauge
%Input language only described here. It is standard Featherweight Java %Input language only described here. It is standard Featherweight Java
@ -582,23 +509,21 @@ $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion % the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
The input to our algorithm is a typeless version of Featherweight Java. The input to our algorithm is a typeless version of Featherweight Java.
Methods are declared without parameter or return types. Method parameters and return types are optional.
We still keep type annotations for fields and generic class parameters. We still require type annotations for fields and generic class parameters.
This is a design choice by us, This is a design choice by us,
as we consider them as data declarations which are given by the programmer. as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b} % They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters, Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm. which are also inferred by our algorithm.
The method call naturally has no generic parameters aswell. The method call naturally has type inferred generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types. We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
The syntax is described in figure \ref{fig:outputSyntax} with optional type annotations highlighted in yellow. The syntax is described in figure \ref{fig:outputSyntax} with optional type annotations highlighted in yellow.
It is possible to exclude all optional types. It is possible to exclude all optional types.
The output is a valid Featherweight Java program. % The output is a valid Featherweight Java program.
We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection} % We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection}
calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements. % calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements.
% Our output syntax is shown in figure \ref{fig:outputSyntax}
% which is actually a subset of \letfj{}, because we omit \texttt{null} types.
\newcommand{\highlight}[1]{\begingroup\fboxsep=0pt\colorbox{yellow}{$\displaystyle #1$}\endgroup} \newcommand{\highlight}[1]{\begingroup\fboxsep=0pt\colorbox{yellow}{$\displaystyle #1$}\endgroup}
\begin{figure} \begin{figure}
@ -624,10 +549,12 @@ $
\caption{Input Syntax with optional type annotations}\label{fig:outputSyntax} \caption{Input Syntax with optional type annotations}\label{fig:outputSyntax}
\end{figure} \end{figure}
The output of our type inference algorithm is fully and correctly typed \TamedFJ{} program.
Before generating constraints the input is transformed by an ANF transformation (see section \ref{sec:anfTransformation}).
Constraints are generated on the basis of the program in A-Normal form.
After adding the missing type annotations the resulting program is valid under the typing rules in \cite{WildFJ}.
The output of our type inference algorithm is a valid \letfj{} program.
Before generating constraints the input is transformed to \TamedFJ{} (see section \ref{sec:anfTransformation}).
After adding the missing type annotations the resulting program is a valid \letfj{} program.
%This is shown in chapter \ref{sec:soundness} %This is shown in chapter \ref{sec:soundness}
Capture conversion is only needed for wildcard types, Capture conversion is only needed for wildcard types,
but we don't know which expressions will spawn wildcard types because there are no type annotations yet. but we don't know which expressions will spawn wildcard types because there are no type annotations yet.