%TODO: % -Explain } on the left side of the assignment and a matching algorithm \cite{javaTIisBroken}. %The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference %because the expected return type is known. \texttt{List} in this case. \begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample] List emptyList() { ... } List ls = emptyList(); \end{lstlisting} %\textit{Local Type Inference limitations:} When calling the \texttt{emptyList} method without context its return value will be set to a \texttt{List}. The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns a \texttt{List} instead of the required $\exptype{List}{\exptype{List}{String}}$. \begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit] emptyList().add(new List()) .get(0) .get(0); \end{lstlisting} %List <. A %List <: List, B <: List % B = A and therefore A on the left and right side of constraints. % this makes matching impossible The return type of the first \texttt{get} method is based on the return type of \texttt{emptyList}, but the second call to \texttt{get} is only possible if \texttt{emptyList} returns a list of lists. The local type inference algorithm based on matching cannot produce this solution. Here a type inference algorithm based on unification is needed. % \begin{verbatim} % this.>emptyList().add(new List()) % .get(0) % .get(0); % \end{verbatim} % $ % \begin{array}[b]{c} % \begin{array}[b]{c} % \texttt{emptyList()} : \exptype{List}{\exptype{List}{\type{String}}} % \\ % \hline % \texttt{emptyList().get()} : \exptype{List}{\type{String}} % \end{array} \rulenameAfter{T-Call} % \quad \generics{\type{X}}\exptype{List}{\type{X}} \to \type{X} \in \mtypeEnvironment{}(\texttt{get})1 % \\ % \hline % \texttt{emptyList().get().get()} : \type{String} % \end{array} \rulenameAfter{T-Call} % $ %motivate constraints: To solve this example our Type Inference algorithm will create constraints $ \exptype{List}{\tv{a}} \lessdot \tv{b}, \tv{b} \lessdot \exptype{List}{\tv{c}}, \tv{c} \lessdot \exptype{List}{\tv{d}} $ TODO Local type inference \cite{javaTIisBroken} is able to find a type substitution $\sigma$ satisfying $\overline{\type{A} <: \sigma(\type{F}) }, \sigma(\type{R}) <: \type{E}$. It is important that $\overline{\type{A}}$ and $\type{E}$ are given types and do not contain type placeholders already used in $\overline{\type{F}}$. The type parameters are not allowed on the left side of $A <: F$ We can generate method calls where this is the case. The second call to \texttt{get}. % Local type inference cannot deal with type inference during the algorithm. % If the left side contains unknown type parameters. % \begin{verbatim} % import java.util.ArrayList; % import java.util.stream.*; % class Test { % void test(){ % var s = new ArrayList().stream().map(i -> 1); % receive(s); % } % void receive(Stream l){} % } % \end{verbatim} % TypeError: % \begin{verbatim} % void test(){ % var l = new ArrayList(); % l.add("hi"); % var s = l.stream().map(i -> 1).collect(Collectors.toList()); % var s2 = l.stream().map(i -> "String").collect(Collectors.toList()); % receive(s, s2); % } % void receive(List l, List l2){} % \end{verbatim} % Correct: % \begin{verbatim} % void test(){ % var l = new ArrayList(); % l.add("hi"); % List s = l.stream().map(i -> 1).collect(Collectors.toList()); % List s2 = l.stream().map(i -> "String").collect(Collectors.toList()); % receive(s, s2); % } % void receive(List l, List l2){} % \end{verbatim} % Error: % \begin{verbatim} % rrr(this.emptyBox().set(this.emptyBox()).set(this.emptyBox())); % \end{verbatim} % Correct: % \begin{verbatim} % rrr(this.>emptyBox().set(this.emptyBox()).set(this.emptyBox())); % \end{verbatim} % \begin{recap}{Java Local Type Inference} % Local type inference is able to solve constraints of the form % T <. b, b <. T where T are given types % \end{recap} % Local Type inference cannot infer F-Bounded types (TODO: we can, right?) \section{Type Inference for Java} %The goal is to find a correct typing for a given Java program. Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them, finding better type solutions for already typed Java programs (for example more generical ones), or allowing to write typeless Java code which is then type infered and thereby type checked by our algorithm. The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in figure \ref{fig:intro-example-typeless}. Our algorithm is also capable of finding solutions involving wildcards as shown in figure \ref{fig:intro-example-typed}. %This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards. %The last step to create a type inference algorithm compatible to the Java type system. The algorithm presented in this paper is a improved version of the one in \cite{TIforFGJ} including wildcard support. %a modified version of the \unify{} algorithm presented in \cite{plue09_1}. The input to the type inference algorithm is a Featherweight Java program (example in figure \ref{fig:nested-list-example-typeless}) conforming to the syntax shown in figure \ref{fig:syntax}. The \fjtype{} algorithm calculates constraints based on this intermediate representation, which are then solved by the \unify{} algorithm resulting in a correctly typed program (see figure \ref{fig:nested-list-example-typed}). \begin{itemize} \item We introduce the language \tifj{} (chapter \ref{sec:tifj}). A Featherweight Java derivative including Generics, Wildcards and Type Inference. \item We support capture conversion and Java style method calls. This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}. \item We present a novel approach to deal with existential types and capture conversion during constraint unification. \item The algorithm is split in two parts. A constraint generation step and an unification step. Input language and constraint generations can be extended without altering the unification part. \item We proof soundness and aim for a good compromise between completeness and time complexity. \end{itemize} % Our algorithm finds a correct type solution for the following example, where the Java local type inference fails: % \begin{verbatim} % class SuperPair{ % A a; % B b; % } % class Pair extends SuperPair{ % A a; % B b; % X choose(X a, X b){ return b; } % String m(List> a, List> b){ % return choose(choose(a,b).value.a,b.value.b); % } % } % \end{verbatim} \begin{figure}%[tp] \begin{subfigure}[t]{\linewidth} \begin{lstlisting}[style=fgj] class List { List add(A v) { ... } } class Example { m(l, la, lb){ return l .add(la.add(1)) .add(lb.add("str")); } } \end{lstlisting} \caption{Java method missing argument and return types} \label{fig:nested-list-example-typeless} \end{subfigure} ~ % \begin{subfigure}[t]{\linewidth} % \begin{lstlisting}[style=tfgj] % class List { % List add(A v) { ... } % } % class Example { % m(l, la, lb){ % return let r2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = { % let r1 : (*@$\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$@*) = l in { % let p1 : (*@$\exptype{List}{\type{Integer}}$@*) = { % let xa = la in xa.add(1) % } in x1.add(p1) % } in { % let p2 = { % let xb = lb in xb.add("str") % } in x2.add(p2) % }; % } % } % \end{lstlisting} % \caption{Featherweight Java Representation} % \label{fig:nested-list-example-let} % \end{subfigure} % ~ \begin{subfigure}[t]{\linewidth} \begin{lstlisting}[style=tfgj] class List { List add(A v) { ... } } class Example { m(List> l, List la, List lb){ return l .add(la.add(1)) .add(lb.add("str")); } } \end{lstlisting} \caption{Java Representation} \label{fig:nested-list-example-typed} \end{subfigure} %\caption{Example code} %\label{fig:intro-example-code} \end{figure} \begin{figure}%[tp] \begin{subfigure}[t]{0.49\linewidth} \begin{lstlisting}[style=fgj] genList() { if( ... ) { return new List().add(1); } else { return new List().add("String"); } } \end{lstlisting} \caption{Java method with missing return type} \label{fig:intro-example-typeless} \end{subfigure} ~ \begin{subfigure}[t]{0.49\linewidth} \begin{lstlisting}[style=tfgj] List genList() { if( ... ) { return new Box(1); } else { return new Box("String"); } } \end{lstlisting} \caption{Correct type} \label{fig:intro-example-typed} \end{subfigure} %\caption{Example code} %\label{fig:intro-example-code} \end{figure} % \subsection{Wildcards} % Java subtyping involving generics is invariant. % For example \texttt{List} is not a subtype of \texttt{List}. % %Wildcards introduce variance by allowing \texttt{List} to be a subtype of \texttt{List}. % \texttt{List} is not a valid return type for the method \texttt{genList}. % The type inference algorithm has to find the correct type involving wildcards (\texttt{List}). \subsection{Java Wildcards} Java has invariant subtyping for polymorphic types. %but it incooperates use-site variance via so called wildcard types. A \texttt{List} is not a subtype of \texttt{List} even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}. To make the type system more expressive Java incooperates use-site variance by allowing wildcards (\texttt{?}) in type annotations. For example a type \texttt{List} (short for \texttt{List}, with \texttt{?} being a placeholder for any type) %with the wildcard \texttt{?} representing a placeholder for any type is a supertype of \texttt{List} and \texttt{List}. %The \texttt{?} is a wildcard type which can be replaced by any type as needed. %Class instances in Java are mutable and passed by reference. Subtyping must be invariant in Java otherwise the integrity of data classes like \texttt{List} cannot be guaranteed. See listing \ref{lst:invarianceExample} for example where an \texttt{Integer} would be added to a list of \texttt{String} if not for the Java type system which rejects the assignment \texttt{lo = ls}. Listing \ref{lst:wildcardIntro} shows the use of wildcards rendering the assignment \texttt{lo = ls} correct. The program still does not compile, because now the addition of an Integer to \texttt{lo} is rightfully deemed incorrect by Java. % Goal: Motivate the TamedFJ language. Why do we need it (Capture Conversion) This behaviour is emulated by our language \TamedFJ{}, a Featherweight Java \cite{FJ} derivative with added wildcard support and a global type inference feature (syntax definition in section \ref{sec:tifj}). %\TamedFJ{} is basically the language described by \textit{Bierhoff} \cite{WildcardsNeedWitnessProtection} with optional type annotations. Let's have a look at a representation of the \texttt{add} call from the last line in listing \ref{lst:wildcardIntro} with our calculus \TamedFJ{}: %The \texttt{add} call in listing \ref{lst:wildcardIntro} needs to be encased by a \texttt{let} statement in our calculus. %This makes the capture converion explicit. \begin{lstlisting} let v : (*@$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$@*) = lo in v.add(new Integer(1)); \end{lstlisting} The method call is encased in a \texttt{let} statement and \expr{lo} is assigned to a new variable \expr{v} of \linebreak[2] \textbf{Existential Type} $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$. Our calculus uses existential types \cite{WildFJ} to formalize wildcards: \texttt{List} is translated to $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$ and \texttt{List} is expressed as $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$. The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound, and a type they are bound to. In this case the name is $\rwildcard{A}$ and it's bound to the the type \texttt{List}. Inside the \texttt{let} statement the variable \expr{v} has the type $\exptype{List}{\rwildcard{A}}$. This is an explicit version of \linebreak[2] \textbf{Capture Conversion}, which makes use of the fact that a concrete type must be behind every wildcard type. There is no instantiation of a \texttt{List}, but there exists some unknown type $\exptype{List}{\rwildcard{A}}$, with $\rwildcard{A}$ inbetween the bounds $\bot$ (bottom type, subtype of all types) and \texttt{Object}. Inside the body of the let statement \expr{v} is treated as a value with the constant type $\exptype{List}{\rwildcard{A}}$. Existential types enable us to formalize \textit{Capture Conversion}. Polymorphic method calls need to be wraped in a process which \textit{opens} existential types \cite{addingWildcardsToJava}. In Java this is done implicitly in a process called capture conversion. \begin{figure} \begin{minipage}{0.4\textwidth} \begin{lstlisting}[caption=Java Invariance Example,label=lst:invarianceExample]{java} List ls = ...; List lo = ...; lo = ls; // typing error! lo.add(new Integer(1)); \end{lstlisting} \end{minipage} \hfill \begin{minipage}{0.5\textwidth} \begin{lstlisting}[caption=Use-Site Variance Example,label=lst:wildcardIntro]{java} List ls = ...; List lo = ...; lo = ls; // correct lo.add(new Integer(1)); // error! \end{lstlisting} \end{minipage} \end{figure} \begin{recap}{\textbf{Capture Conversion}} TODO %Explain Java Capture Conversion \end{recap} %show input and a correct letFJ representation %TODO: first show local type inference and explain lessdotCC constraints. then show example with global TI \begin{figure}[h] \begin{subfigure}[t]{0.47\textwidth} \centering \begin{lstlisting}[style=tfgj, caption=Valid Java program, label=lst:addExample] List add(List l, A v) ... List l = ...; add(l, "String"); \end{lstlisting} \end{subfigure}\hfill \begin{subfigure}[t]{0.47\textwidth} \centering \begin{lstlisting}[style=letfj, caption=\letfj{} representation, label=lst:addExampleLet] List add(List l, A v) let l2 : (*@$\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$@*) = l in add(l2, "String"); \end{lstlisting} \end{subfigure} \end{figure} The Example in listing \ref{lst:addExample} is a valid Java program. Here Java uses local type inference \cite{JavaLocalTI} to determine the type parameters to the \texttt{add} method call. In \letfj{} there is no local type inference and all type parameters for a method call are mandatory (see listing \ref{lst:addExampleLet}). If wildcards are involved the so called capture conversion has to be done manually via let statements. %A let statement \emph{opens} an existential type. In the body of the let statement the \textit{capture type} $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}}$ becomes $\exptype{List}{\rwildcard{X}}$ and the wildcard $\wildcard{X}{\type{Object}}{\type{String}}$ is free and can be used as a type parameter to \texttt{add(...)}. %This is a valid Java program where the type parameters for the polymorphic method \texttt{add} %are determined by local type inference. One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}. A wildcard in the Java syntax has no name and is bound to its enclosing type: $\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$. During type checking \emph{intermediate types} %like $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ %or $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$ can emerge, which have no equivalent in the Java syntax. %Our type inference algorithm uses existential types internally but spawns type solutions compatible with Java. Example: % This program is not typable with the Type Inference algorithm from Plümicke \begin{lstlisting}[style=java,label=shuffleExample,caption=Intermediate Types Example] class List extends Object {...} class List2D extends List> {...} void shuffle(List> list) {...} List> l = ...; List2D l2d = ...; shuffle(l); // Error shuffle(l2d); // Valid \end{lstlisting} Java is using local type inference to allow method invocations which are not describable with regular Java types. The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$. After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$ and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$: \begin{lstlisting}[style=letfj] let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in shuffle(l2d') \end{lstlisting} \subsection{Global Type Inference} \begin{description} \item[input] \tifj{} program \item[output] type solution \item[postcondition] the type solution applied to the input must yield a valid \letfj{} program \end{description} The input to our type inference algorithm is a modified version of the \letfj{}\cite{WildcardsNeedWitnessProtection} calculus (see chapter \ref{sec:tifj}). First \fjtype{} generates constraints and afterwards \unify{} computes a solution for the given constraint set. Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$. \textit{Note:} a type $\type{T}$ can either be a named type, a type placeholder or a wildcard type placeholder. A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}. \textit{Example:} $\exptype{List}{\ntv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\ntv{a}$ with the type $\type{String}$. Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforFGJ}. The novel capture constraints and wildcard placeholders are needed for method invocations involving wildcards. \begin{recap}\textbf{TI for FGJ without Wildcards:} \TFGJ{} generates subtype constraints $(\type{T} \lessdot \type{T})$ consisting of named types and type placeholders. For example the method invocation \texttt{concat(l, new Object())} generates the constraints $\tv{l} \lessdot \exptype{List}{\tv{a}}, \type{Object} \lessdot \tv{a}$. Subtyping without the use of wildcards is invariant \cite{FJ}: Therefore the only possible solution for the type placeholder $\tv{a}$ is $\tv{a} \doteq \type{Object}$. % in Java and Featherweight Java. The correct type for the variable \texttt{l} is $\exptype{List}{\type{Object}}$ (or a direct subtype). %- usually the type of e must be subtypes of the method parameters %- in case of a polymorphic method: type placeholders resemble type parameters The type inference algorithm for Featherweight Generic Java \cite{TIforFGJ} (called \TFGJ{}) is complete and sound: It is able to find a type solution for a Featherweight Generic Java program, which has no type annotations at all, if there is any. It's only restriction is that no polymorphic recursion is allowed. \end{recap} % Lets have a look at the constraints generated by \fjtype{} for the example in listing \ref{lst:addExample}: \begin{constraintset} \begin{center} $\begin{array}{c} \wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}, \, \type{String} \lessdotCC \wtv{a} \\ \hline \textit{Capture Conversion:}\ \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\wtv{a}}, \, \type{String} \lessdot \wtv{a} \\ \hline \textit{Solution:}\ \wtv{a} \doteq \rwildcard{Y} \implies \wildcard{Y}{\type{Object}}{\type{String}} \wcSep \exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}, \, \type{String} \lessdot \rwildcard{Y} \end{array} $ \end{center} \end{constraintset} % Capture Constraints $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ allow for a capture conversion, which converts a constraint of the form $(\wctype{\rwildcard{X}}{C}{\rwildcard{X}} \lessdotCC \type{T})$ to $(\exptype{C}{\rwildcard{X}} \lessdot \type{T})$ %These constraints are used at places where a capture conversion via let statement can be added. %Why do we need the lessdotCC constraints here? The type of \texttt{l} can be capture converted by a let statement if needed (see listing \ref{lst:addExampleLet}). Therefore we assign the constraint $\wctype{\wildcard{X}{\type{Object}}{\type{String}}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$ which allows \unify{} to do a capture conversion to $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. The captured wildcard $\rwildcard{X}$ gets a fresh name and is stored in the wildcard environment of the \unify{} algorithm. \textit{Note:} The constraint $\type{String} \lessdot \rwildcard{Y}$ is satisfied because $\rwildcard{Y}$ has $\type{String}$ as lower bound. For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints: \begin{constraintset} \begin{center} $ \begin{array}{l} \wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}} \\ \hline \wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}} \\ \hline \textit{Capture Conversion:}\ \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}} \\ \hline \textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}} \end{array} $ \end{center} \end{constraintset} The method call \texttt{shuffle(l)} is invalid however, because \texttt{l} has the type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$. There is no solution for the subtype constraint: $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$ % No capture conversion for methods in the same class: % Given two methods without type annotations like % \begin{verbatim} % // m :: () -> r % m() = new List() :? new List(); % // id :: (a) -> a % id(a) = a % \end{verbatim} % and a method call \texttt{id(m())} would lead to the constraints: % $\exptype{List}{\type{String}} \lessdot \ntv{r}, % \exptype{List}{\type{Integer}} \lessdot \ntv{r}, % \ntv{r} \lessdotCC \ntv{a}$ % In this example capture conversion is not applicable, % because the \texttt{id} method is not polymorphic. % The type solution provided by \unify{} for this constraint set is: % $\sigma(\ntv{r}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}, % \sigma(\ntv{a}) = \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ % \begin{verbatim} % List m() = new List() :? new List(); % List id(List a) = a % \end{verbatim} % The following example has the \texttt{id} method already typed and the method \texttt{m} % extended by a recursive call \texttt{id(m())}: % \begin{verbatim} % List id(List a) = a % m() = new List() :? new List() :? id(m()); % \end{verbatim} % Now the constraints make use of a $\lessdotCC$ constraint: % $\exptype{List}{\type{String}} \lessdot \ntv{r}, % \exptype{List}{\type{Integer}} \lessdot \ntv{r}, % \ntv{r} \lessdotCC \exptype{List}{\wtv{a}}$ % After substituting $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ for $\ntv{r}$ like in the example before % we get the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{a}}$. % Due to the $\lessdotCC$ \unify{} is allowed to perform a capture conversion yielding % $\exptype{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$. % \textit{Note:} The wildcard placeholder $\wtv{a}$ is allowed to hold free variables whereas a normal placeholder like $\ntv{r}$ % is never assigned a type containing free variables. % Therefore \unify{} sets $\wtv{a} \doteq \rwildcard{X}$, completing the constraint set and resulting in the type solution: % \begin{verbatim} % List m() = new List() :? new List() :? id(m()); % \end{verbatim} \subsection{\TamedFJ{} and \letfj{}} %LetFJ -> Output language! %TamedFJ -> ANF transformed input langauge %Input language only described here. It is standard Featherweight Java % we use the transformation to proof soundness. this could also be moved to the end. % the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion The input to our algorithm is a typeless version of Featherweight Java. Methods are declared without parameter or return types. We still keep type annotations for fields and generic class parameters. This is a design choice by us, as we consider them as data declarations which are given by the programmer. % They are inferred in for example \cite{plue14_3b} Note the \texttt{new} expression not requiring generic parameters, which are also inferred by our algorithm. The method call naturally has no generic parameters aswell. We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types. The syntax is described in figure \ref{fig:outputSyntax} with optional type annotations highlighted in yellow. It is possible to exclude all optional types. The output is a valid Featherweight Java program. We use the syntax of the version introduced in \cite{WildcardsNeedWitnessProtection} calling it \letfj{} for that it is a Featherweight Java variant including \texttt{let} statements. % Our output syntax is shown in figure \ref{fig:outputSyntax} % which is actually a subset of \letfj{}, because we omit \texttt{null} types. \newcommand{\highlight}[1]{\begingroup\fboxsep=0pt\colorbox{yellow}{$\displaystyle #1$}\endgroup} \begin{figure} $ \begin{array}{lrcl} \hline \text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\ \text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\ \text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\ \text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\ \text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\ \text{Method declarations} & \texttt{M} & ::= & \highlight{\generics{\overline{\type{X} \triangleleft \type{N}}}}\ \highlight{\type{T}}\ \texttt{m}(\overline{\highlight{\type{T}}\ \expr{x}}) \{ \texttt{return} \ \expr{e}; \} \\ \text{Terms} & \expr{e} & ::= & \expr{x} \\ & & \ \ | & \texttt{new} \ \type{C}\highlight{\generics{\ol{T}}}(\overline{\expr{e}})\\ & & \ \ | & \expr{e}.f\\ & & \ \ | & \expr{e}.\texttt{m}\highlight{\generics{\ol{T}}}(\overline{\expr{e}})\\ & & \ \ | & \texttt{let}\ \expr{x} \highlight{: \wcNtype{\Delta}{N}} = \expr{e} \ \texttt{in} \ \expr{e}\\ & & \ \ | & \expr{e} \elvis{} \expr{e}\\ \text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\ \hline \end{array} $ \caption{Input Syntax with optional type annotations}\label{fig:outputSyntax} \end{figure} The output of our type inference algorithm is a valid \letfj{} program. Before generating constraints the input is transformed to \TamedFJ{} (see section \ref{sec:anfTransformation}). After adding the missing type annotations the resulting program is a valid \letfj{} program. %This is shown in chapter \ref{sec:soundness} Capture conversion is only needed for wildcard types, but we don't know which expressions will spawn wildcard types because there are no type annotations yet. We preemptively enclose every expression in a let statement before using it as a method argument. We need the let statements to deal with possible Wildcard types. The syntax used in our examples is the standard Featherweight Java syntax. \subsection{Challenges}\label{challenges} %TODO: Wildcard subtyping is infinite see \cite{TamingWildcards} The introduction of wildcards adds additional challenges. % we cannot replace every type variable with a wildcard Type variables can also be used as type parameters, for example $\exptype{List}{String} \lessdot \exptype{List}{\tv{a}}$. A problem arises when replacing type variables with wildcards. % Wildcards are not reflexive. % ( on the equals property ), every wildcard has to be capture converted when leaving its scope % do not substitute free type variables Lets have a look at a few challenges: \begin{itemize} \item %Passing wildcard types to a polymorphic method call comes with additional challenges. % TODO: Here the concat example! %- Wildcards are not reflexive %- Wildcards are opaque types. Behind a Java Wildcard is a real type. Wildcard types are virtual types. There is no instantiation of a \texttt{List}. It is a placeholder type which can hold any kind of list like \texttt{List} or \texttt{List}. This type can also change at any given time, for example when multiple threads are using the same field of type \texttt{List}. A wildcard \texttt{?} must be considered a different type everytime it is accessed. Therefore calling the method \texttt{concat} with two wildcard lists in the example in listing \ref{lst:concatError} is incorrect. The \texttt{concat} method does not create a new list, but adds all elements from the second argument to the list given as the first argument. This is allowed in Java, because both lists are of the polymorphic type \texttt{List}. As shown in listing \ref{lst:concatError} this leads to a inconsistent \texttt{List} if Java would treat \texttt{?} as a regular type and instantiate the type variable \texttt{X} of the \texttt{concat} function with \texttt{?}. To enable the use of wildcards in argument types of a method invocation Java uses a process called \textit{Capture Conversion}. \begin{lstlisting}[caption=Concat Example,label=lst:concatError]{java} List concat(List l1, List l2){ return l1.addAll(l2); } List ls = new List(); List l1 = ls; List l2 = new List(1); concat(l1, l2); \end{lstlisting} %Existential types enable us to formalize Java's method call behavior and the so called Capture Conversion. \begin{displaymath} %TODO: \begin{array}{l} \begin{array}{@{}c} \textit{mtype}(\texttt{concat}) = \generics{\type{X}}\ \exptype{List}{\type{X}} \to \exptype{List}{\type{X}} \to \exptype{List}{\type{X}} \\ \texttt{l1} : \wctype{\rwildcard{A}}{List}{\rwildcard{A}}, \texttt{l2} : \wctype{\rwildcard{B}}{List}{\rwildcard{B}} \\ \exptype{List}{\rwildcard{A}} \nless: [{\color{red}?}/\type{X}]\exptype{List}{\type{X}} \quad \quad \exptype{List}{\rwildcard{B}} \nless: [{\color{red}?}/\type{X}]\exptype{List}{\type{X}} \\ \hline \vspace*{-0.3cm}\\ \texttt{concat}(\expr{l1}, \expr{l2}) : {\color{red}\textbf{Errror}} \end{array} \end{array} \end{displaymath} %A method call in Java makes use of the fact that a real type is behind a wildcard type \item \begin{example} \label{intro-example1} The first one is a valid Java program. The type \texttt{List} is \textit{capture converted} to a fresh type variable $\rwildcard{X}$ which is used as the generic method parameter \texttt{A}. Java uses capture conversion to replace the generic \texttt{A} by a capture converted version of the \texttt{? super String} wildcard. Knowing that the type \texttt{String} is a subtype of any type the wildcard \texttt{? super String} can inherit it is safe to pass \texttt{"String"} for the first parameter of the function. \begin{verbatim} List add(List l, A v) {} List list = ...; add(list, "String"); \end{verbatim} \end{example} \item \begin{example}\label{intro-example2} This example displays an incorrect Java program. The method call to \texttt{concat} with two wildcard lists is unsound. Each list could be of a different kind and therefore the \texttt{concat} cannot succeed. \begin{verbatim} List concat(List l1, List l2) { ... } List getList() { ... } concat(getList(), getList()); \end{verbatim} The \letfj{} representation in listing \ref{letfjConcatFail} clarifies this. Inside the let statement the variables hold the types $\set{ \texttt{x1} : \exptype{List}{\rwildcard{X}}, \texttt{x2} : \exptype{List}{\rwildcard{Y}}}$. For the method call \texttt{concat(x1, x2)} no replacement for the generic \texttt{A} exists to satisfy $\exptype{List}{\type{A}} <: \exptype{List}{\type{X}}, \exptype{List}{\type{A}} <: \exptype{List}{\type{Y}}$. \begin{lstlisting}[style=letfj,caption=Incorrect method call,label=letfjConcatFail] let x1 : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = getList() in let x2 : (*@$\wctype{\rwildcard{Y}}{List}{\rwildcard{Y}}$@*) = getList() in concat(x1, x2) \end{lstlisting} % $\wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}, \\ % \wctype{\wildcard{X}{\type{Object}}{\bot}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ \end{example} % \item % \unify{} morphs a constraint set into a correct type solution % gradually assigning types to type placeholders during that process. % Solved constraints are removed and never considered again. % In the following example \unify{} solves the constraint generated by the expression % \texttt{l.add(l.head())} first, which results in $\ntv{l} \lessdot \exptype{List}{\wtv{a}}$. % \begin{verbatim} % anyList() = new List() :? new List() % add(anyList(), anyList().head()); % \end{verbatim} % The type for \texttt{l} can be any kind of list, but it has to be a invariant one. % Assigning a \texttt{List} for \texttt{l} is unsound, because the type list hiding behind % \texttt{List} could be a different one for the \texttt{add} call than the \texttt{head} method call. % An additional constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ % is solved by removing the wildcard $\rwildcard{X}$ if possible. \item \textbf{Capture Conversion during \unify{}:} The return type of a generic method call depends on its argument types. A method like \texttt{String trim(String s)} will always return a \type{String} type. However the return type of \texttt{ A head(List l)} is a generic variable \texttt{A} and only shows its actual type when the type of the argument list \texttt{l} is known. The same goes for capture conversion, which can only be applied for a method call when the argument types are given. At the start of our global type inference algorithm no types are assigned yet. During the course of the \unify{} algorithm more and more types emerge. As soon as enough type information is given \unify{} can conduct a capture conversion if needed. The constraints where this is possible are marked as capture constraints. \item \textbf{Free Variables cannot leaver their scope}: \begin{example} Take the following Java program for example. It maps every element of a list $\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$ to a polymorphic \texttt{id} method. We want to use our \unify{} algorithm to determine the correct type for the variable \expr{l2}. Although we do not specify constraint generation for language constructs like lambda expressions used in this example, we can imagine that the constraints have to look something like this: \begin{minipage}{0.45\textwidth} \begin{lstlisting}{java} List id(List l){ ... } List> ls; l2 = l.map(x -> id(x)); \end{lstlisting}\end{minipage} \hfill \begin{minipage}{0.45\textwidth} \begin{constraintset} $ \begin{array}{l} \wctype{\rwildcard{A}}{List}{\rwildcard{A}} \lessdotCC \exptype{List}{\wtv{x}}, \\ \exptype{List}{\wtv{x}} \lessdot \tv{z}, \\ \exptype{List}{\tv{z}} \lessdot \tv{l2} \end{array} $ \end{constraintset} \end{minipage} The constraints $\wctype{\rwildcard{A}}{List}{\rwildcard{A}} \lessdotCC \exptype{List}{\wtv{x}}, \exptype{List}{\wtv{x}} \lessdot \tv{z}$ stem from the body of the lambda expression \texttt{shuffle(x)}. \textit{For clarification:} This method call would be represented as the following expression in \letfj{}: \texttt{let x1 :$\wctype{\rwildcard{A}}{List}{\rwildcard{A}}$ = x in id(x) :$\tv{z}$} The T-Let rule prevents us from using free variables created by the method call to \expr{id} to be used in the return type $\tv{z}$. But this has to be signaled to the \unify{} algorithm, which does not know about the origin and context of the constraints. If we naively substitute $\sigma(\tv{z}) = \exptype{List}{\rwildcard{A}}$ the return type of the \texttt{map} function would be the type $\exptype{List}{\exptype{List}{\rwildcard{A}}}$. This type solution is unsound. The type of \expr{l2} is the same as the one of \expr{l}: $\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$ \textbf{Solution:} We solve this by distinguishing between wildcard placeholders and normal placeholders. $\ntv{z}$ is a normal placeholder and is not allowed to contain free variables. \end{example} \end{itemize} %TODO: Move this part. or move the third challenge some underneath. The \unify{} algorithm only sees the constraints with no information about the program they originated from. The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}. \section{Discussion Pair Example} \begin{verbatim} Pair make(List l){ ... } boolean compare(Pair p) { ... } List l; Pair p; compare(make(l)); // Valid compare(p); // Error \end{verbatim} Our type inference algorithm is not able to solve this example. When we convert this to \TamedFJ{} and generate constraints we end up with: \begin{lstlisting}[style=tamedfj] let m = let x = l in make(x) in compare(m) \end{lstlisting} \begin{constraintset}$ \wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \ntv{x}, \ntv{x} \lessdotCC \exptype{List}{\wtv{a}} \exptype{Pair}{\wtv{a}, \wtv{a}} \lessdot \ntv{m}, %% TODO: Mark this constraint \ntv{m} \lessdotCC \exptype{Pair}{\wtv{b}, \wtv{b}} $\end{constraintset} $\ntv{x}$ will get the type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ and from the constraint $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ \unify{} deducts $\wtv{a} \doteq \rwildcard{X}$ leading to $\exptype{Pair}{\rwildcard{X}, \rwildcard{X}} \lessdot \ntv{m}$. Finding a supertype to $\exptype{Pair}{\rwildcard{X}, \rwildcard{X}}$ is the crucial part. The correct substition for $\ntv{m}$ would be $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$. But this leads to additional branching inside the \unify{} algorithm and increases runtime. %We refrain from using that type, because it is not denotable with Java syntax. %Types used for normal type placeholders should be expressable Java types. % They are not! The prefered way of dealing with this example in our opinion would be the addition of a multi-let statement to the syntax. \begin{lstlisting}[style=letfj] let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = l, m = make(x) in compare(make(x)) \end{lstlisting} We can make it work with a special rule in the \unify{}. But this will only help in this specific example and not generally solve the issue. A type $\exptype{Pair}{\rwildcard{X}, \rwildcard{X}}$ has atleast two immediate supertypes: $\wctype{\rwildcard{X}}{Pair}{\rwildcard{X}, \rwildcard{X}}$ and $\wctype{\rwildcard{X}, \rwildcard{Y}}{Pair}{\rwildcard{X}, \rwildcard{Y}}$. Imagine a type $\exptype{Triple}{\rwildcard{X},\rwildcard{X},\rwildcard{X}}$ already has % TODO: how many supertypes are there? %X.Triple <: X,Y.Triple <: %X,Y,Z.Triple % TODO example: \begin{lstlisting}[style=java] Triple sameL(List l) Triple sameP(Pair l) void triple(Triple tr){} Pair p ... List l ... make(t) { return t; } triple(make(sameP(p))); triple(make(sameL(l))); \end{lstlisting} \begin{constraintset} $ \exptype{Triple}{\rwildcard{X}, \rwildcard{X}, \rwildcard{X}} \lessdot \ntv{t}, \ntv{t} \lessdotCC \exptype{Triple}{\wtv{a}, \wtv{b}, \wtv{b}}, \\ (\textit{This constraint is added later: } \exptype{Triple}{\rwildcard{X}, \rwildcard{Y}, \rwildcard{Y}} \lessdot \ntv{t}) $ % Triple <. t % ( Triple <. t ) <- this constraint is added later % t <. Triple % t =. x.Triple \end{constraintset} %TODO % The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards} % and \cite{WildcardsNeedWitnessProtection}. % \subsection{Capture Conversion} % The \texttt{let} statements in \TamedFJ{} act as capture conversion for wildcard types. % \begin{figure} % \begin{minipage}{0.45\textwidth} % \begin{lstlisting}[style=tfgj] % List clone(List l); % example(p){ % return clone(p); % } % \end{lstlisting} % \end{minipage}% % \hfill % \begin{minipage}{0.5\textwidth} % \begin{lstlisting}[style=letfj] % List clone(List l); % (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) example((*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) p) = % let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = p in % clone(x) : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*); % \end{lstlisting} % \end{minipage} % \caption{Type inference adding capture conversion}\label{fig:addingLetExample} % \end{figure} % Figure \ref{fig:addingLetExample} shows a let statement getting added to the typed output. % The method \texttt{clone} cannot be called with the type $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$. % After a capture conversion \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$ with $\rwildcard{X}$ being a free variable. % Afterwards we have to find a supertype of $\exptype{List}{\rwildcard{X}}$, which does not contain free variables % ($\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in this case). % During the constraint generation step most types are not known yet and are represented by a type placeholder. % During a methodcall like the one in the \texttt{example} method in figure \ref{fig:ccExample} the type of the parameter \texttt{p} % is not known yet. % The type \texttt{List} would be one possibility as a parameter type for \texttt{p}. % To make wildcards work for our type inference algorithm \unify{} has to apply capture conversions if necessary. % The type placeholder $\tv{r}$ is the return type of the \texttt{example} method. % One possible type solution is $\tv{p} \doteq \tv{r} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$, % which leads to: % \begin{verbatim} % List example(List p){ % return clone(p); % } % \end{verbatim} % But by substituting $\tv{p} \doteq \wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ in the constraint % $\tv{p} \lessdotCC \exptype{List}{\wtv{x}}$ leads to % $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{x}}$. % To make this typing possible we have to introduce a capture conversion via a let statement: % $\texttt{return}\ (\texttt{let}\ \texttt{x} : \wctype{\rwildcard{X}}{List}{\rwildcard{X}} = \texttt{p}\ \texttt{in} \ % \texttt{clone}\generics{\rwildcard{X}}(x) : \wctype{\rwildcard{X}}{List}{\rwildcard{X}})$ % Inside the let statement the variable \texttt{x} has the type $\exptype{List}{\rwildcard{X}}$ % This spawns additional problems. % \begin{figure} % \begin{minipage}{0.45\textwidth} % \begin{verbatim} % List clone(List l){...} % example(p){ % return clone(p); % } % \end{verbatim} % \end{minipage}% % \hfill % \begin{minipage}{0.35\textwidth} % \begin{constraintset} % \textbf{Constraints:}\\ % $ % \tv{p} \lessdotCC \exptype{List}{\wtv{x}}, \\ % \tv{p} \lessdot \tv{r}, \\ % \tv{p} \lessdot \type{Object}, % \tv{r} \lessdot \type{Object} % $ % \end{constraintset} % \end{minipage} % \caption{Type inference example}\label{fig:ccExample} % \end{figure} % In addition with free variables this leads to unwanted behaviour. % Take the constraint % $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdot \exptype{List}{\wtv{a}}$ for example. % After a capture conversion from $\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$ to $\exptype{List}{\rwildcard{Y}}$ and a substitution $\wtv{a} \doteq \rwildcard{Y}$ % we get % $\exptype{List}{\rwildcard{Y}} \lessdot \exptype{List}{\rwildcard{Y}}$. % Which is correct if we apply capture conversion to the left side: % $\exptype{List}{\rwildcard{X}} <: \exptype{List}{\rwildcard{X}}$ % If the input constraints did not intend for this constraint to undergo a capture conversion then \unify{} would produce an invalid % type solution due to: % $\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \nless: \exptype{List}{\rwildcard{X}}$ % The reason for this is the \texttt{S-Exists} rule's premise % $\text{dom}(\Delta') \cap \text{fv}(\exptype{List}{\rwildcard{X}}) = \emptyset$. % Capture constraints cannot be stored in a set. % $\wtv{a} \lessdotCC \wtv{b}$ is not the same as $\wtv{a} \lessdotCC \wtv{b}$. % Both constraints will end up the same after a substitution for both placeholders $\tv{a}$ and $\tv{b}$. % But afterwards a capture conversion is applied, which can generate different types on the left sides. % \begin{itemize} % \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Y}}$ % \item $\text{CC}(\wctype{\rwildcard{X}}{List}{\rwildcard{X}}) \implies \exptype{List}{\rwildcard{Z}}$ % \end{itemize} % % Wildcards are not reflexive. A box of type $\wctype{\rwildcard{X}}{Box}{\rwildcard{X}}$ % is able to hold a value of any type. It could be a $\exptype{Box}{String}$ or a $\exptype{Box}{Integer}$ etc. % Also two of those boxes do not necessarily contain the same type. % But there are situations where it is possible to assume that. % For example the type $\wctype{\rwildcard{X}}{Pair}{\exptype{Box}{\rwildcard{X}}, \exptype{Box}{\rwildcard{X}}}$ % is a pair of two boxes, which always contain the same type. % Inside the scope of the \texttt{Pair} type, the wildcard $\rwildcard{X}$ stays the same.