11 Commits

Author SHA1 Message Date
Peter Thiemann
e42b0aaafe Merge branch 'master' of https://gitea.hb.dhbw-stuttgart.de/stan/Ecoop2024_TIforWildFJ 2024-05-29 11:01:32 +02:00
Peter Thiemann
8428ebdc70 changes 2024-05-29 11:01:30 +02:00
1ee343b87e Cleanup challenge 3 2024-05-28 23:06:58 +02:00
Peter Thiemann
d2f58b2489 intro 2024-05-28 18:21:37 +02:00
c714d49677 Intro to Challenges 2024-05-28 16:15:22 +02:00
fe64b87d09 Fix Titlepage 2024-05-28 16:15:10 +02:00
583b4acd5c Restructure Unify 2024-05-28 00:44:06 +02:00
3dbdce8e29 Fix TamedFJ introduction 2024-05-27 23:22:38 +02:00
49368e0d0e FIx 2024-05-27 18:23:44 +02:00
e5f577f577 Rephrase TamedFJ intro 2024-05-27 18:22:39 +02:00
a41802301e Rephrase TamedFJ intro 2024-05-27 18:22:17 +02:00
5 changed files with 559 additions and 429 deletions

4
.gitignore vendored
View File

@@ -17,3 +17,7 @@
*-blx.aux
*-blx.bib
*.run.xml
*.vtc
*.synctex.gz
auto
_region_.tex

View File

@@ -1,5 +1,5 @@
\documentclass{llncs}
\documentclass[anonymous,USenglish]{llncs}
%This is a template for producing LIPIcs articles.
%See lipics-v2021-authors-guidelines.pdf for further information.
%for A4 paper format use option "a4paper", for US-letter use option "letterpaper"
@@ -45,22 +45,19 @@
\title{Global Type Inference for Featherweight Java with Wildcards} %TODO Please add
%\titlerunning{Dummy short title} %TODO optional, please use if title is longer than one line
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% ANON
\author{Andreas Stadelmeier}{DHBW Stuttgart, Campus Horb, Germany}{a.stadelmeier@hb.dhbw-stuttgart.de}{}{}%TODO mandatory, please use full name; only 1 author per \author macro; first two parameters are mandatory, other parameters can be empty. Please provide at least the name of the affiliation and the country. The full address is optional. Use additional curly braces to indicate the correct name splitting when the last name consists of multiple name parts.
% \author{Andreas Stadelmeier\inst{1}, Martin Plümicke\inst{1}, Peter Thiemann\inst{2}}
% \institute{DHBW Stuttgart, Campus Horb, Germany\\
% \email{a.stadelmeier@hb.dhbw-stuttgart.de} \and
% Universität Freiburg, Institut für Informatik, Germany\\
% \email{thiemann@informatik.uni-freiburg.de}}
\author{Martin Plümicke}{DHBW Stuttgart, Campus Horb, Germany}{pl@dhbw.de}{}{}
% \authorrunning{A. Stadelmeier and M. Plümicke and P. Thiemann}
\author{Peter Thiemann}{Universität Freiburg, Institut für Informatik, Germany}{thiemann@informatik.uni-freiburg.de}{}{}
\authorrunning{A. Stadelmeier and M. Plümicke and P. Thiemann} %TODO mandatory. First: Use abbreviated first/middle names. Second (only in severe cases): Use first author plus 'et al.'
%\Copyright{Andreas Stadelmeier and Martin Plümicke and Peter Thiemann} %TODO mandatory, please use full first names. LIPIcs license is "CC-BY"; http://creativecommons.org/licenses/by/3.0/
%\ccsdesc[500]{Software and its engineering~Language features}
%\ccsdesc[300]{Software and its engineering~Syntax}
\keywords{type inference, Java, subtyping, generics} %TODO mandatory; please add comma-separated list of keywords
%% END ANON
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\category{} %optional, e.g. invited paper
@@ -97,8 +94,17 @@
%TODO mandatory: add short abstract of the document
\begin{abstract}
TODO: Abstract
Type inference is a hallmark of functional programming. Its use in
object-oriented programming is often restricted to type inference
for local variables and the instantiation of generic type
parameters.
Global type inference for Featherweight Generic Java is a
whole-program analysis that infers omitted method signatures.
Compared to previous work, its results are more general as it infers
types with wildcards. Global type inference is proved sound.
\end{abstract}
\keywords{type inference, Java, subtyping, generics} %TODO mandatory; please add comma-separated list of keywords
\input{introduction}

View File

@@ -26,30 +26,62 @@
% It's only restriction is that no polymorphic recursion is allowed.
% \end{recap}
\section{Introduction}
\label{sec:introduction}
\section{Type Inference for Java}
- Type inference helps rapid development
- used in Java already (var keyword)
- keeps source code clean
Type inference is a hallmark of functional programming, where it
improves readability and conciseness while
enabling safe rapid prototyping by allowing the compiler to deduce
types automatically without explicit type annotations. It allows
developers to safely refactor their programs before fixing function
signatures. In object-oriented programming (OOP), however, the use of
type inference is more restricted. It is often employed for local
variables and the instantiation of generic type parameters, offering
similar readability and conciseness benefits as in functional
programming. In languages like Java, the overall architecture of
method signatures must be designed by the programmer up-front.
Java comes with a local type inference algorithm
used for lambda expressions, method calls, and the \texttt{var} keyword.
Java features a limited form of local type inference for local
variables defined with the \texttt{var} keyword, lambda expressions,
and method calls. Java infers the type of variables defined with
\texttt{var}. The rationale here is that instantiations of generic
types (e.g., maps of maps) are often unwiedly and they can be easily
inferred from the initializing expression. The type of a lambda
expression is inferred from the target type, i.e., the type of the
method argument that accepts the lambda. The rationale is the intended
use of lambdas as callback functions for listeners etc. For calls
to generic methods, Java infers the most specific instantiation of the
generic parameters for each individual call.
A type inference algorithm that is able to recreate any type annotation
even when no type information is given at all is called a global type inference algorithm.
The global type inference algorithm presented here is able to deal with all Java types including wildcard types.
It can also be used for regular Java programs.
The local type inference features of Java add some convenience, but
programmers still have to provide method signatures beforehand. So there
is scope for a global type inference algorithm that infers
method signatures even if no type information (except class headers
and types of field variables) is given.
We present such a global type inference algorithm for Featherweight
Generic Java with wildcards, a Java core calculus with generics and
wildcards in the tradition of Featherweight Java \cite{FJ} and its
extensions \cite{WildFJ,WildcardsNeedWitnessProtection}. Our approach can also be used for regular Java programs,
but we limit the formal presentation to this core calculus.
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type inferred and thereby type checked by our algorithm.
Our algorithm enables programmers to write Java code with only a
minimum of type annotations (class headers and types of field
variables), it can fill in missing type annotations in partially
type-annotated programs, and it can suggest better (more general)
types for ordinary, typed Java programs.
Listing~\ref{lst:intro-example-typeless} shows an example input of a
method implementation without any type annotation. Our algorithm
infers the type annotations in Listing~\ref{lst:intro-example-typed}:
it adds the type arguments to \texttt{List} at object creation and it
infers the most specific return type \texttt{List<?>}, which is a
wildcard type. Our algorithm is the first to infer wildcard types that
comes with a soundness proof.
Previous work on global type inference for Java either does not
consider wildcards or it simplifies the problem by not modeling key
features of Java wildcards.
We propose a global type inference algorithm for Java supporting Wildcards and proven sound.
Global type inference allows input programs without any type annotations.
A programmer could write Java code without stressing about types and type annotations which
are infered and inserted by our algorithm.
%This leads to better types (Poster referenz)
%The algorithm proposed in this paper can determine a correct typing for the untyped Java source code example shown in listing \ref{lst:intro-example-typeless}.
%In this case our algorithm would also be able to propose a solution including wildcards as shown in listing \ref{lst:intro-example-typed}.
@@ -57,125 +89,8 @@ are infered and inserted by our algorithm.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
\subsection{Comparision to similar Type Inference Algorithms}
To outline the contributions in this paper we will list the advantages and improvements to smiliar type inference algorithms:
\begin{description}
\item[Global Type Inference for Featherweight Java] \cite{TIforFGJ} is a predecessor to our algorithm.
The type inference algorithm presented here supports Java Wildcards.
% Proven sound on type rules of Featherweight Java, which are also proven to produce sound programs
% implication rules that follow the subtyping rules directly. Easy to understand soundness proof
% capture conversion is needed
\textit{Example:} The type inference algorithm for Generic Featherweight Java produces \texttt{Object} as the return type of the
\texttt{genBox} method in listing \ref{lst:intro-example-typeless}
whereas our type inference algorithm will infer the type solution shown in listing \ref{lst:intro-example-typed}
involving a wildcard type.
\item[Type Unification for Java with Wildcards]
An existing unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but exposes some errors when it comes to method invocations.
Especially the challenges shown in chapter \ref{challenges} are handled incorrectly.
The main reasons are that Plümickes algorithm only supports types which are expressible in Java syntax
and its soundness is proven towards a self-defined subtype ordering, but never against a complete type system.
It appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
We resolve this by denoting Java wildcards as existential types
and introducing a second kind of subtype constraint. %, the current state of the art
%and is able to deal with types that are not directly denotable in Java.
Additionally the soundness of our algorithm is proven using a Featherweight Java calculus \cite{WildFJ}.
%The algorithm presented in this paper is able to solve all those challenges correctly
%and it's correctness is proven using a Featherweight Java calculus \cite{WildFJ}.
%But they are all correctly solved by our new type inference algorithm presented in this paper.
\item[Java Type Inference]
Standard Java provides type inference in a restricted form % namely {Local Type Inference}.
which only works for local environments where the surrounding context has known types.
But our global type inference algorithm is able to work on input programs which do not hold any type annotations at all.
We will show the different capabilities with an example.
In listing \ref{lst:tiExample} the method call \texttt{emptyList} is missing
its type parameters.
Java is using a matching algorithm \cite{javaTIisBroken} to replace \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
%The invocation of \texttt{emptyList} missing type parameters can be added by Java's local type inference
%because the expected return type is known. \texttt{List<String>} in this case.
\begin{lstlisting}[caption=Extract of a valid Java program, label=lst:tiExample]
<T> List<T> emptyList() { ... }
List<String> ls = emptyList();
\end{lstlisting}
%\textit{Local Type Inference limitations:}
Note that local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context its return value will be set to a \texttt{List<Object>}.
The Java code snippet in \ref{lst:tiLimit} is incorrect, because \texttt{emptyList()} returns
a \texttt{List<Object>} instead of the required $\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference, label=lst:tiLimit]
emptyList().add(new List<String>())
.get(0)
.get(0); //Typing Error
\end{lstlisting}
%List<String> <. A
%List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints.
% this makes matching impossible
The second call to \texttt{get} produces a type error, because Java expects \texttt{emptyList} to return
a \texttt{List<Object>}.
The type inference algorithm presented in this paper will correctly replace the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and proof this code snippet correct.
The local type inference algorithm based on matching cannot produce this solution.
Here our type inference algorithm based on unification is needed.
\end{description}
% %motivate constraints:
% To solve this example our Type Inference algorithm will create constraints
% $
% \exptype{List}{\tv{a}} \lessdot \tv{b},
% \tv{b} \lessdot \exptype{List}{\tv{c}},
% \tv{c} \lessdot \exptype{List}{\tv{d}}
% $
%\subsection{Conclusion}
% Additions: TODO
% - Global Type Inference Algorithm, no type annotations are needed
% - Soundness Proof
% - Easy to implement
% - Capture Conversion support
% - Existential Types support
Our contributions are
\begin{itemize}
\item
We introduce the language \tifj{} (chapter \ref{sec:tifj}).
A Featherweight Java derivative including Generics, Wildcards and Type Inference.
\item
We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
\item
We present a novel approach to deal with existential types and capture conversion during constraint unification.
\item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part.
\item
We prove soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
% class SuperPair<A,B>{
% A a;
% B b;
% }
% class Pair<A,B> extends SuperPair<B,A>{
% A a;
% B b;
% <X> X choose(X a, X b){ return b; }
% String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
% return choose(choose(a,b).value.a,b.value.b);
% }
% }
% \end{verbatim}
\begin{figure}
\begin{figure}[tp]
\begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genList() {
@@ -200,6 +115,118 @@ List<?> genList() {
\end{lstlisting}
\end{minipage}
\end{figure}
Global Type Inference for Featherweight Java \cite{TIforFGJ} is a
sound, but incomplete inference algorithm for Featherweight Generic
Java. It does not support wildcards, which means that it infers the
return type \texttt{Object} for the method \texttt{genList} in
Listing~\ref{lst:intro-example-typeless}. Like our algorithm, their
algorithm restricts to monomorphic recursion, which leads to
incompleteness.
A type unification algorithm for Java with wildcards \cite{plue09_1} states the same capabilities,
but it does not handle capture conversion correctly because it only supports types which are expressible in Java syntax (more details in
section~\ref{challenges}).
Moreover, it appears that the subtype relation changes depending on whether a type is used as an argument to a method invocation.
We resolve this problem by modeling Java wildcards as existential
types \cite{WildFJ,WildcardsNeedWitnessProtection}, which also serve
as the basis of our soundness proof.
% forces us to define a new kind of subtype constraint. %, the current state of the art
%and is able to deal with types that are not directly denotable in Java.
%The algorithm presented in this paper is able to solve all those challenges correctly
%and it's correctness is proven using a Featherweight Java calculus \cite{WildFJ}.
%But they are all correctly solved by our new type inference algorithm presented in this paper.
\begin{lstlisting}[caption=Part of a valid Java program, style=tfgj, label=lst:tiExample,float=tp]
<T> List<T> emptyList() { ... }
List<String> ls = emptyList();
\end{lstlisting}
Standard Java provides type inference in a restricted form % namely {Local Type Inference}.
which only works for local environments where the surrounding context has known types.
The example in Listing~\ref{lst:tiExample} exhibits the main differences to our global type inference algorithm.
The method call \texttt{emptyList} lacks its type parameters.
Java relies on a matching algorithm \cite{javaTIisBroken} to instantiate \texttt{T} with \texttt{String}
resulting in the correct expected type \texttt{List<String>}.
This local type inference depends on the type annotation on the left side of the assignment.
When calling the \texttt{emptyList} method without this type context
its return value will be inferred as \texttt{List<Object>}.
Therefore, Java rejects the code snippet in Listing~\ref{lst:tiLimit}:
it infers the type \texttt{List<Object>} for
\texttt{emptyList()} instead of the required
$\exptype{List}{\exptype{List}{String}}$.
\begin{lstlisting}[caption=Limitations of Java's Type Inference,
label=lst:tiLimit, float=tp]
emptyList().add(new List<String>())
.get(0)
.get(0); //Typing Error
\end{lstlisting}
%List<String> <. A
%List<A> <: List<B>, B <: List<C>
% B = A and therefore A on the left and right side of constraints.
% this makes matching impossible
Hence, the second call to \texttt{get} produces a type error.
The type inference algorithm presented in this paper correctly instantiates the type parameter \texttt{T} of the \texttt{emptyList}
method with \texttt{List<List<String>>} and render this code snippet correct.
% %motivate constraints:
% To solve this example our Type Inference algorithm will create constraints
% $
% \exptype{List}{\tv{a}} \lessdot \tv{b},
% \tv{b} \lessdot \exptype{List}{\tv{c}},
% \tv{c} \lessdot \exptype{List}{\tv{d}}
% $
%\subsection{Conclusion}
% Additions: TODO
% - Global Type Inference Algorithm, no type annotations are needed
% - Soundness Proof
% - Easy to implement
% - Capture Conversion support
% - Existential Types support
We summarize our contributions:
\begin{itemize}
\item
We introduce the language \tifj{} (section \ref{sec:tifj}),
a Featherweight Java derivative including generics, wildcards, and type inference.
\item
Our algorithm handles existential types in a form which is not
denotable by Java syntax \cite{aModelForJavaWithWildcards}. Thus, we support capture conversion and Java style method calls.
\item
We present a novel approach to deal with existential types and capture conversion during constraint unification.
% \item The algorithm is split in two parts. A constraint generation step and an unification step.
% Input language and constraint generations can be extended without altering the unification part.
\item
We prove soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
% Our algorithm finds a correct type solution for the following example, where the Java local type inference fails:
% \begin{verbatim}
% class SuperPair<A,B>{
% A a;
% B b;
% }
% class Pair<A,B> extends SuperPair<B,A>{
% A a;
% B b;
% <X> X choose(X a, X b){ return b; }
% String m(List<? extends Pair<Integer, String>> a, List<? extends Pair<Integer, String>> b){
% return choose(choose(a,b).value.a,b.value.b);
% }
% }
% \end{verbatim}
% \subsection{Wildcards}
% Java subtyping involving generics is invariant.
@@ -210,25 +237,25 @@ List<?> genList() {
% The type inference algorithm has to find the correct type involving wildcards (\texttt{List<?>}).
\section{Java Wildcards}
\label{sec:java-wildcards}
Java has invariant subtyping for polymorphic types.
As Java is an imperative language, subtyping for generic types is invariant.
%but it incooperates use-site variance via so called wildcard types.
A \texttt{List<String>} is not a subtype of \texttt{List<Object>}
even though it seems intuitive with \texttt{String} being a subtype of \texttt{Object}.
To make the type system more expressive Java incooperates use-site variance by allowing wildcards (\texttt{?}) in type annotations.
For example a type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type) %with the wildcard \texttt{?} representing a placeholder for any type
Even though \texttt{String} is subtype of \texttt{Object},
a \texttt{List<String>} is not a subtype of \texttt{List<Object>}:
because someone might store an \texttt{Integer} in the list, which is
compatible with \texttt{Object}, but not with \texttt{String} (see Listing~\ref{lst:invarianceExample}).
Invariance is overly restrictive in read-only or write-only
contexts. Hence, Java incooperates use-site variance by allowing wildcards (\texttt{?}) in types.
For example, the type \texttt{List<?>} (short for \texttt{List<? extends Object>}, with \texttt{?} being a placeholder for any type)
is a supertype of \texttt{List<String>} and \texttt{List<Object>}.
%The \texttt{?} is a wildcard type which can be replaced by any type as needed.
%
Listing~\ref{lst:wildcardIntro} shows a use of wildcards that renders the assignment \texttt{lo = ls} correct.
The program still does not compile, because the addition of an \texttt{Integer} to \texttt{lo} is still incorrect.
%Class instances in Java are mutable and passed by reference.
Subtyping must be invariant in Java otherwise the integrity of data classes like \texttt{List} cannot be guaranteed.
See listing \ref{lst:invarianceExample} for example
where an \texttt{Integer} would be added to a list of \texttt{String}
if not for the Java type system which rejects the assignment \texttt{lo = ls}.
Listing \ref{lst:wildcardIntro} shows the use of wildcards rendering the assignment \texttt{lo = ls} correct.
The program still does not compile, because now the addition of an Integer to \texttt{lo} is rightfully deemed incorrect by Java.
\begin{figure}
\begin{figure}[tp]
\begin{minipage}{0.48\textwidth}
\begin{lstlisting}[caption=Java Invariance Example,label=lst:invarianceExample]{java}
List<String> ls = ...;
@@ -248,14 +275,13 @@ lo.add(new Integer(1)); // error!
\end{minipage}
\end{figure}
Wildcard types are virtual types.
There is no instantiation of a \texttt{List<?>}.
It is a placeholder type which can hold any kind of list like
Wildcard types like \texttt{List<?>} are virtual types, i.e., the run-time
type of an object is always a fully instantiated type like
\texttt{List<String>} or \texttt{List<Object>}.
This type can also change at any given time, for example when multiple threads
share a reference to the same field.
A wildcard \texttt{?} must be considered a different type everytime it is accessed.
Therefore calling the method \texttt{concat} with two wildcard lists in the example in listing \ref{lst:concatError} is incorrect.
The issue is that the run-time type underlying a wildcard type can
change at any time, for example when multiple threads share a reference to the same field.
Hence, a wildcard \texttt{?} must be considered a different type everytime it is accessed.
For that reason, the call to the method \texttt{concat} with two wildcard lists in Listing~\ref{lst:concatError} is rejected.
% The \texttt{concat} method does not create a new list,
% but adds all elements from the second argument to the list given as the first argument.
% This is allowed in Java, because both lists are of the polymorphic type \texttt{List<X>}.
@@ -263,50 +289,52 @@ Therefore calling the method \texttt{concat} with two wildcard lists in the exam
% if Java would treat \texttt{?} as a regular type and instantiate the type variable \texttt{X}
% of the \texttt{concat} function with \texttt{?}.
\begin{figure}
\begin{figure}[tp]
\begin{lstlisting}[caption=Wildcard Example with faulty call to a concat method,label=lst:concatError]{java}
<X> List<X> concat(List<X> l1, List<X> l2){
return l1.addAll(l2);
}
List<String> ls = new List<String>();
List<?> l1 = ls;
List<?> l1 = new List<String>("foo");
List<?> l2 = new List<Integer>(1); // List containing Integer
concat(l1, l2); // Error! Would concat two different lists
\end{lstlisting}
\end{figure}
To determine the correctness of method calls involving wildcard types Java's typecheck
makes use of a concept called \textbf{Capture Conversion}.
To determine the correctness of method calls involving wildcard types Java's typechecker
makes use of a concept called \textbf{capture conversion}.
% was designed to make Java wildcards useful.
% - without capture conversion
% - is used to open wildcard types
% -
This process was formalized for Featherweight Java \cite{FJ} by replacing existential types with wildcards and capture conversion with let statements \cite{WildcardsNeedWitnessProtection}.
We propose our own Featherweight Java derivative called \TamedFJ{} defined in chapter \ref{sec:tifj}.
To express the example in listing \ref{lst:wildcardIntro} with our calculus we first have to translate the wildcard types:
\texttt{List<? extends Object>} becomes $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$.
The syntax used here allows for wildcard parameters to have a name, an uppper and lower bound,
and a type they are bound to.
In this case the name is $\rwildcard{A}$ with the upper bound $\type{Object}$ and it's bound to the the type \texttt{List}.
Before we can call the \texttt{add} method on this type we have to add a capture conversion via let statement:
One way to formalize this concept is by replacing wildcards with
existential types and modeling capture conversion with suitably
inserted let statements \cite{WildcardsNeedWitnessProtection}.
Our Featherweight Java derivative called \TamedFJ{} is modeled after
Bierhoff's calculus \cite{WildcardsNeedWitnessProtection} (see section~\ref{sec:tifj}).
To express the example in Listing~\ref{lst:wildcardIntro} in our calculus we first translate the wildcard types:
\texttt{List<? extends Object>} becomes
$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$,
where the existentially bound variable \texttt{A} has a lower bound
$\bot$ and an upper bound $\type{Object}$.
Before we can call the \texttt{add} method on this type we perform
capture conversion by inserting a let statement:
\begin{lstlisting}
let v : (*@$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{A}}$@*) = lo in v.<A>add(new Integer(1));
\end{lstlisting}
\expr{lo} is assigned to a new variable \expr{v} bearing the type $\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$,
but inside the let statement the variable \expr{v} will be treated as $\exptype{List}{\rwildcard{A}}$.
The idea is that every Wildcard type is backed by a concrete type.
By assigning \expr{lo} to a immutable variable \expr{v} we unpack the concrete type $\exptype{List}{\rwildcard{A}}$
that was concealed by \expr{lo}'s existential type.
Here $\rwildcard{A}$ is a fresh variable or a captured wildcard so to say.
The only information we have about $\rwildcard{A}$ is that it is any type inbetween the bounds $\bot$ and $\type{Object}$
The variable \expr{lo} (from Listing~\ref{lst:wildcardIntro}) is
assigned to a new immutable variable \expr{v} with type
$\wctype{\wildcard{A}{\type{Object}}{\bot}}{List}{\rwildcard{X}}$,
but inside the let statement the variable \expr{v} will be treated as
$\exptype{List}{\rwildcard{A}}$.
Here $\rwildcard{A}$ is a fresh variable or a captured wildcard.
The only information we have about $\rwildcard{A}$ is that it is a
supertype of $\bot$ and a subtype of $\type{Object}$
It is important to give the captured wildcard type $\rwildcard{A}$ an unique name which is used nowhere else.
With this formalization it gets obvious why the method call to \texttt{concat}
in listing \ref{lst:concatError} is irregular (see \ref{lst:concatTamedFJ}).
\begin{figure}
This approach also clarifies why the method call to \texttt{concat}
in listing \ref{lst:concatError} is rejected (see Listing~\ref{lst:concatTamedFJ}).
\begin{figure}[tp]
\begin{lstlisting}[style=TamedFJ,caption=\TamedFJ{} representation of the concat call from listing \ref{lst:concatError}, label=lst:concatTamedFJ]
let l1' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l1 in
let l2' : (*@$\wctype{\rwildcard{Y}}{List}{\exptype{List}{\rwildcard{Y}}}$@*) = l2 in
@@ -467,8 +495,19 @@ a type parameter to method call \texttt{<X>add(v, "String")}.
% do not substitute free type variables
Lets have a look at a few challenges:
\begin{itemize}
Global Type inference for Featherweight Java with generics but without wildcards is solved already.
But adding Wildcards to the calculus creates new problems we did not foresee
and which have not been recognized by an existing type unification algorithm for Java with wildcards \ref{plue09_1}.
% what is the problem?
% Java does invisible capture conversion steps
Java Wildcard types are represented as existential types and have to be opened before they can be used.
This can either be done implicitly (\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or explicitly via let statements (\cite{WildcardsNeedWitnessProtection}).
For all of those variations it is vital to know the argument types with which a method is called.
But our input program does not contain any type annotations.
We do not know where an existential type will emerge and where a capture conversion is necessary.
This has to be figured out during the type inference algorithm.
We detected three main challenges related to Java Wildcards and Global Type Inference:
\begin{enumerate}
\item \label{challenge:1}
One challenge is to design the algorithm in a way that it finds a correct solution for the program shown in listing \ref{lst:addExample}
and rejects the program in listing \ref{lst:concatError}.
@@ -511,12 +550,31 @@ is solved by removing the wildcard $\rwildcard{X}$ if possible.
this problem is solved by ANF transformation
\item \textbf{Wildcards as Existential Types:}
One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
A wildcard in the Java syntax has no name and is bound to its enclosing type:
$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking \emph{intermediate types}
can emerge, which have no equivalent in the Java syntax.
\item
% This problem is solved by assuming everything is a wildcard and lateron erasing excessive wildcards
% this is solved by having wildcards bound to a type. But this makes it necessary to remove wildcards lateron otherwise Unify would have to backtrack
The program in listing \ref{shuffleExample} shows a challenge involving wildcards and subtyping.
The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ representing a list of unknown lists.
Whereas $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of
$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ representing a list of lists, all of the same type $\rwildcard{X}$,
and can savely be passed to \texttt{shuffle}.
This behavior can also be explained by looking at the types and their capture converted versions:
\begin{center}
\begin{tabular}{l | l | l}
Java type & \TamedFJ{} representation & Capture Conversion \\
\hline
$\exptype{List}{\exptype{List}{\texttt{?}}}$ & $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ & $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ \\
$\exptype{List2D}{\texttt{?}}$ & $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ & $\exptype{List2D}{\rwildcard{X}}$ \\
%Supertype of $\exptype{List2D}{\texttt{?}}$ & $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ & $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ \\
\end{tabular}
\end{center}
%The direct supertype of $\exptype{List2D}{\rwildcard{X}}$ is $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
%One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
%A wildcard in the Java syntax has no name and is bound to its enclosing type:
%$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
%During type checking \emph{intermediate types}
%can emerge, which have no equivalent in the Java syntax.
\begin{lstlisting}[style=java,label=shuffleExample,caption=Intermediate Types Example]
class List<X> extends Object {...}
@@ -530,45 +588,64 @@ List2D<?> l2d = ...;
shuffle(l); // Error
shuffle(l2d); // Valid
\end{lstlisting}
Java is using local type inference to allow method invocations which are not describable with regular Java types.
The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
\begin{lstlisting}[style=TamedFJ]
let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
\end{lstlisting}
% Java is using local type inference to allow method invocations which are not describable with regular Java types.
% The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
% which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
% After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
% and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
% \begin{lstlisting}[style=TamedFJ]
% let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
% \end{lstlisting}
For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
\begin{constraintset}
\begin{center}
$
\begin{array}{l}
\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Capture Conversion:}\
\exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
\end{array}
$
\end{center}
\end{constraintset}
% For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
% \begin{constraintset}
% \begin{center}
% $
% \begin{array}{l}
% \wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \textit{Capture Conversion:}\
% \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
% \end{array}
% $
% \end{center}
% \end{constraintset}
The method call \texttt{shuffle(l)} is invalid however,
because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
There is no solution for the subtype constraint:
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
Given such a program our type inference algorithm has to allow the call \texttt{shuffle(l2d)} and
decline the call to \texttt{shuffle(l)}.
% The method call \texttt{shuffle(l)} is invalid however,
% because \texttt{l} has the type
% $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
% There is no solution for the subtype constraint:
% $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
\item \label{challenge3} \textbf{Free Variables cannot leaver their scope}:
Let's assume we have a variable \texttt{ls} with type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$
%When calling the \texttt{id} function with an element of this list we have to apply capture conversion.
and the following input program:
\noindent
\begin{minipage}{0.62\textwidth}
\begin{lstlisting}
let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = ls.get(0) in id(x) : (*@$\ntv{z}$@*)
\end{lstlisting}\end{minipage}
\hfill
\begin{minipage}{0.36\textwidth}
\begin{lstlisting}[style=constraints]
(*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{x}}$@*),
(*@$\exptype{List}{\wtv{x}} \lessdot \ntv{z},$@*)
\end{lstlisting}
\end{minipage}
% the variable z has to
\begin{example}
Take the Java program in listing \ref{lst:mapExample} for example.
It maps every element of a list
$\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
@@ -616,8 +693,11 @@ $\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
We solve this by distinguishing between wildcard placeholders and normal placeholders.
$\ntv{z}$ is a normal placeholder and is not allowed to contain free variables.
\end{example}
\end{itemize}
\end{enumerate}
%TODO: Move this part. or move the third challenge some underneath.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "TIforWildFJ"
%%% End:

View File

@@ -5,38 +5,73 @@
%Input language only described here. It is standard Featherweight Java
% we use the transformation to proof soundness. this could also be moved to the end.
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
We define our own calculus \TamedFJ{}, which is used as input aswell as output to our global type inference algorithm.
%We assume that the input to our algorithm is a program, which carries none of the optional type annotations.
%After calculating a type solution we can insert all missing types and generate a correct program.
%The input to our algorithm is a typeless version of Featherweight Java.
The syntax is shown in figure \ref{fig:syntax} with optional type annotations highlighted in yellow.
The respective type rules are defined by figure \ref{fig:expressionTyping} and \ref{fig:typing}.
\TamedFJ{} is practically a subset of the Featherweight Java calculus defined by \textit{Bierhoff} \cite{WildcardsNeedWitnessProtection}
with the exception that method argument and return type annotations are optional.
The point is that a correct and fully typed \TamedFJ{} program is also a correct Featherweight Java program,
which is vital for our soundness proof (see chapter \ref{sec:soundness}).
%The language is designed to showcase type inference involving existential types.
The input to our algorithm is a typeless version of Featherweight Java.
The syntax is shown in figure \ref{fig:syntax}
and the respective type rules in figure \ref{fig:expressionTyping} and \ref{fig:typing}.
Method parameters and return types are optional.
We still require type annotations for fields and generic class parameters.
A method assumption consists out of a method name, a list of type parameters, a list of argument types, and a return type.
The first type in the list of argument types is the type of the surrounding class also known as the \texttt{this} parameter.
See figure \ref{fig:methodTypeExample} for an example:
Here the \texttt{add} method internally is treated as a method with two arguments,
because we add \texttt{this} to its argument list.
Note that the type parameter $\type{A}$ of the surrounding class is part of the methods parameter list.
%For example the \texttt{add} method in listing \ref{lst:tamedfjSample} is represented by the assumption
%$\texttt{add} : \generics{\ol{X \triangleleft Object}}\ \type{X} \to \exptype{List}{\type{X}}$.
\begin{figure}
\center
\begin{minipage}{0.49\textwidth}
%[style=java,caption=\TamedFJ{} sample, label=lst:tamedfjSample]
\begin{lstlisting}
class List<A extends Object> {
List<A> add(A v){..,}
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\textwidth}
\begin{align*}
&\mathrm{\Pi} = \set{\\
&\texttt{add} : \generics{\ol{A \triangleleft Object}}\ \exptype{List}{\type{A}},\type{A} \to \exptype{List}{\type{X}} \\
&}
\end{align*}
\end{minipage}
% \begin{minipage}{0.49\textwidth}
% \begin{lstlisting}[style=java,caption=\TamedFJ{} sample, label=lst:tamedfjSample]
% class List<A extends Object> {}
% <A extends Object> List<A>
% add(List<A> this, A v){..,}
% \end{lstlisting}
% \end{minipage}
\caption{\TamedFJ{} class and its corresponding method type environment}\label{fig:methodTypeExample}
\end{figure}
\textit{Additional Notes:}%
\begin{itemize}
%\item Method parameters and return types are optional.
\item We still require type annotations for fields and generic class parameters.
This is a design choice by us,
as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm.
The method call naturally has type inferred generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
The syntax is described in figure \ref{fig:syntax} with optional type annotations highlighted in yellow.
It is possible to exclude all optional types.
\TamedFJ{} is a subset of the calculus defined by \textit{Bierhoff} \cite{WildcardsNeedWitnessProtection}.
%The language is designed to showcase type inference involving existential types.
The idea is for our type inference algorithm to calculate all missing types and output a fully and correctly typed \TamedFJ{} program,
which by default is also a correct Featherweight Java program (see chapter \ref{sec:soundness}).
\noindent
\textit{Notes:}
\begin{itemize}
\item The calculus does not include method overriding for simplicity reasons.
\item We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
\item \textit{Note:} The \texttt{new} expression is not requiring generic parameters.
\item Every method has an unique name.
The calculus does not include method overriding for simplicity reasons.
Type inference for that is described in \cite{TIforFGJ} and can be added to this algorithm accordingly.
%\textit{Note:}
\item The \textsc{T-Program} type rule ensures that there is one set of method assumptions used for all classes
that are part of the program.
\item The typing rules for expressions shown in figure \ref{fig:expressionTyping} refrain from restricting polymorphic recursion.
Type inference for polymorphic recursion is undecidable \cite{wells1999typability} and when proving completeness like in \cite{TIforFGJ} the calculus
needs to be restricted in that regard.
Our algorithm is not complete (see discussion in chapter \ref{sec:completeness}) and without the respective proof we can keep the \TamedFJ{} calculus as simple as possible
and as close to it's Featherweight Java correspondent \cite{WildcardsNeedWitnessProtection} as possible.
Our soundness proof works either way.
and as close to it's Featherweight Java correspondent as possible,
simplifying the soundness proof.
\end{itemize}
%Additional features like overriding methods and method overloading can be added by copying the respective parts from there.
@@ -73,6 +108,7 @@ $
& & \ \ | & \texttt{let}\ \expr{x} \highlight{: \wcNtype{\Delta}{N}} = \expr{e} \ \texttt{in} \ \expr{e}\\
& & \ \ | & \expr{e} \elvis{} \expr{e}\\
\text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
\text{Method type environment} & \mathrm{\Pi} & ::= & \overline{ \texttt{m} : \generics{\ol{X \triangleleft N}}\ \ol{T} \to \type{T}}
%\hline
\end{array}
$

332
unify.tex
View File

@@ -6,6 +6,102 @@
% the algorithm only removes wildcards, never adds them
\section{Unify}\label{sec:unify}
%\newcommand{\tw}[1]{\tv{#1}_?}
\begin{description}
\item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \lessdotCC \type{T}, \type{T} \doteq \type{T} \ldots}$
\item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description}
%The transformation steps are not applied all at once but in a specific order:
\unify{} executes the following steps until a type solution is found:
\begin{description}
\item[Step 1:]
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively,
starting with the \rulename{circle} rule.
\item[Step 2:]
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The second step is nondeterministic.
%\unify{} has to pick the right transformation for each constraint of the form $\type{N} \lessdot \tv{a}$.
%The rules in figure \ref{fig:step2-rules} offer three possibilities to deal with constraints $\type{N} \lessdot \tv{a}$.
For every $\type{T} \lessdot \tv{a}$ constraint \unify{} has to pick exactly one transformation from figure \ref{fig:step2-rules}.
The same principle goes for constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$ and the two transformations in figure \ref{fig:step2-rules2}.
%They have to be applied until the constraint set holds no constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$.
If atleast one transformation was applied in this step revert to step 1.
Otherwise proceed with step 3.
%This builds a search tree over multiple possible solutions.
%\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
%$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\textit{Hint:}
When implementing this step via backtracking
the rules \rulename{General} and \rulename{Super} should be tried first.
% is the Same rule really necessary?
The \rulename{Settle} and \rulename{Raise} rules should only be used when none of the rules in figure \ref{fig:step2-rules} can be applied.
\item[Step 3:]
Apply the rules in figure \ref{fig:cleanup-rules} exhaustively.
\rulename{Ground} and \rulename{Flatten} deal with constraints containing free variables.
If a type placeholder is solely used as lower bound \rulename{Ground} can replace it with the bottom type.
Otherwise the \rulename{Flatten} rule has to remove the wildcards responsible for the free variable.
\text{Note:} Only one of those rules has to be applied per constraint.
If the constraint set has been changed by one of these rules the algorithm must return to step 1.
But if only the \rulename{SubElim} rule is applied or the constraint set is not changed at all,
the algorithm can proceed with step 4.
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\end{description}
\item[Step 4:] \textit{(Generating Result)}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\end{description}
\subsection{Transformation Rules (Step 1)}
Our \unify{} process uses a similar concept as the standard unification by Martelli and Montanari \cite{MM82},
consisting of terms, relations and variables.
Instead of terms we have types of the form $\exptype{C}{\ol{T}}$ and
@@ -462,7 +558,75 @@ and every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice.
\subsection{Branching Step}
\begin{figure}
\begin{center}
\leavevmode
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Subst} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{T}/\tv{a}]\wildcardEnv \vdash [\type{T}/\tv{a}]
C \cup \set{\ntv{a} \doteq \type{T}}
\end{array}
\quad \begin{array}{c}
\ntv{a} \notin \type{T} \\
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\
\\
\rulename{Subst-WC} &$
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\wtv{a} \doteq \rwildcard{T}}\\
\hline
[\type{T}/\wtv{a}]\wildcardEnv \vdash [\type{T}/\wtv{a}]C
\end{array} \quad \wtv{a} \notin \type{T}
$\\
\\
\rulename{Normalize} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\ntv{b}/\wtv{b}]\wildcardEnv \vdash [\ntv{b}/\wtv{b}]
C \cup \set{\ntv{a} \doteq [\ntv{b}/\wtv{b}]\type{T}}
\end{array}
\quad \begin{array}{c}
\wtv{b} \in \text{wtv}(\type{T})\\
\ntv{b} \ \text{fresh}
\end{array}$\\
\\
\rulename{Contract} &
$\begin{array}[c]{l}
\wildcardEnv \cup \set{\wildcard{A}{\type{U}}{\type{L}}} \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{U}/\type{A}]\wildcardEnv \vdash [\type{U}/\type{A}]
C \cup [\type{U}/\type{A}]\set{\ntv{a} \doteq \type{T}, \type{L} \doteq \type{U}}
\end{array}
\quad \begin{array}{c}
\rwildcard{A} \in \text{fv}(\type{T})\\
\end{array}$\\
\end{tabular}}
\end{center}
\caption{Substitution rules}\label{fig:subst-rules}
\end{figure}
\unify{} must not replace normal type placeholders with free variables
except variables initially passed by $\Delta_{in}$.
The \rulename{Subst} rule checks if a type $\type{T}$ contains any
free variables or wildcard placeholders before replacing a normal type placeholder with it.
%This ensures that free variables are never substituted for normal type placeholders.
This ensures that a normal type placeholder is never replaced by a type containing free variables.
A type solution for a normal type placeholder will never contain free variables.
This is needed to guarantee well-formed type solutions and also keep free variables inside their scope
(see challenge \ref{challenge3}).
\rulename{Subst-WC} does not need to do that and can freely replace wildcard placehodlers with types despite their free variables.
We do not keep replacements for wildcard placeholders and they will not show up in the final type solution.
If the \rulename{Subst} rule is not applicable then either the \rulename{Normalize}
or \rulename{Contract} transformation has to be used to remove wildcard placeholders and
wildcards.
\subsection{Branching Step (Step 2)}
\unify{} is described as a nondeterministic algorithm.
Some constraints allow for multiple transformations from which
the algorithm has to pick the right one.
@@ -474,7 +638,9 @@ and gathers all possible type solutions.
We skip the definition of this practice, because it is already described in \cite{TIforFGJ}
and only needed for a proof of completeness.
\subsection{Generate Result (Step 4)}
The generation rules defined in figure \ref{fig:generation-rules} are similar to the other transformation rules but contain an additional part,
the result output consisting of a wildcard environment and a set of unifier $\sigma$.
\subsection{Adding Wildcards to the mix}
%\unify{} is able to create wildcard solutions even when the input set of constraints do not contain any wildcard variables.
@@ -622,101 +788,6 @@ The \texttt{concat} method takes two lists of the same generic type.
\subsection{Algorithm}
%\newcommand{\tw}[1]{\tv{#1}_?}
\begin{description}
\item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \lessdotCC \type{T}, \type{T} \doteq \type{T} \ldots}$
\item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description}
%The transformation steps are not applied all at once but in a specific order:
\unify{} executes the following steps until a type solution is found:
\begin{description}
\item[Step 1:]
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively,
starting with the \rulename{circle} rule.
\item[Step 2:]
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The second step is nondeterministic.
%\unify{} has to pick the right transformation for each constraint of the form $\type{N} \lessdot \tv{a}$.
%The rules in figure \ref{fig:step2-rules} offer three possibilities to deal with constraints $\type{N} \lessdot \tv{a}$.
For every $\type{T} \lessdot \tv{a}$ constraint \unify{} has to pick exactly one transformation from figure \ref{fig:step2-rules}.
The same principle goes for constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$ and the two transformations in figure \ref{fig:step2-rules2}.
%They have to be applied until the constraint set holds no constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$.
If atleast one transformation was applied in this step revert to step 1.
Otherwise proceed with step 3.
%This builds a search tree over multiple possible solutions.
%\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
%$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\textit{Hint:}
When implementing this step via backtracking
the rules \rulename{General} and \rulename{Super} should be tried first.
% is the Same rule really necessary?
The \rulename{Settle} and \rulename{Raise} rules should only be used when none of the rules in figure \ref{fig:step2-rules} can be applied.
\item[Step 3:]
Apply the rules in figure \ref{fig:cleanup-rules} exhaustively.
\rulename{Ground} and \rulename{Flatten} deal with constraints containing free variables.
If a type placeholder is solely used as lower bound \rulename{Ground} can replace it with the bottom type.
Otherwise the \rulename{Flatten} rule has to remove the wildcards responsible for the free variable.
\text{Note:} Only one of those rules has to be applied per constraint.
If the constraint set has been changed by one of these rules the algorithm must return to step 1.
But if only the \rulename{SubElim} rule is applied or the constraint set is not changed at all,
the algorithm can proceed with step 4.
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\end{description}
\item[Step 4:] \textit{(Generating Result)}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\end{description}
With \texttt{C} being class names and \texttt{A} being wildcard names.
The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$
and a name $\mathtt{X}$.
@@ -820,73 +891,6 @@ which are used for the upper and lower bounds.
% if there are a <. List<x?> constraints remaining in the end, then this can be a sign of a irregular input constraint set.
\begin{figure}
\begin{center}
\leavevmode
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Subst} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{T}/\tv{a}]\wildcardEnv \vdash [\type{T}/\tv{a}]
C \cup \set{\ntv{a} \doteq \type{T}}
\end{array}
\quad \begin{array}{c}
\ntv{a} \notin \type{T} \\
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\
\\
\rulename{Subst-WC} &$
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\wtv{a} \doteq \rwildcard{T}}\\
\hline
[\type{T}/\wtv{a}]\wildcardEnv \vdash [\type{T}/\wtv{a}]C
\end{array} \quad \wtv{a} \notin \type{T}
$\\
\\
\rulename{Normalize} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\ntv{b}/\wtv{b}]\wildcardEnv \vdash [\ntv{b}/\wtv{b}]
C \cup \set{\ntv{a} \doteq [\ntv{b}/\wtv{b}]\type{T}}
\end{array}
\quad \begin{array}{c}
\wtv{b} \in \text{wtv}(\type{T})\\
\ntv{b} \ \text{fresh}
\end{array}$\\
\\
\rulename{Contract} &
$\begin{array}[c]{l}
\wildcardEnv \cup \set{\wildcard{A}{\type{U}}{\type{L}}} \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{U}/\type{A}]\wildcardEnv \vdash [\type{U}/\type{A}]
C \cup [\type{U}/\type{A}]\set{\ntv{a} \doteq \type{T}, \type{L} \doteq \type{U}}
\end{array}
\quad \begin{array}{c}
\rwildcard{A} \in \text{fv}(\type{T})\\
\end{array}$\\
\end{tabular}}
\end{center}
\caption{Substitution rules}\label{fig:subst-rules}
\end{figure}
\unify{} must not replace normal type placeholders with free variables
except variables initially passed by $\Delta_{in}$.
The \rulename{Subst} rule checks if a type $\type{T}$ contains any
free variables or wildcard placeholders before replacing a normal type placeholder with it.
%This ensures that free variables are never substituted for normal type placeholders.
This ensures that a normal type placeholder is never replaced by a type containing free variables.
A type solution for a normal type placeholder will never contain free variables.
This is needed to guarantee well-formed type solutions and also keep free variables inside their scope
(see challenge \ref{challenge3}).
\rulename{Subst-WC} does not need to do that and can freely replace wildcard placehodlers with types despite their free variables.
We do not keep replacements for wildcard placeholders and they will not show up in the final type solution.
If the \rulename{Subst} rule is not applicable then either the \rulename{Normalize}
or \rulename{Contract} transformation has to be used to remove wildcard placeholders and
wildcards.
% \begin{example}{Free variables must not leave the scope of the surrounding \texttt{let} statement}
% \noindent