9 Commits

Author SHA1 Message Date
1ee343b87e Cleanup challenge 3 2024-05-28 23:06:58 +02:00
Peter Thiemann
d2f58b2489 intro 2024-05-28 18:21:37 +02:00
c714d49677 Intro to Challenges 2024-05-28 16:15:22 +02:00
fe64b87d09 Fix Titlepage 2024-05-28 16:15:10 +02:00
583b4acd5c Restructure Unify 2024-05-28 00:44:06 +02:00
3dbdce8e29 Fix TamedFJ introduction 2024-05-27 23:22:38 +02:00
49368e0d0e FIx 2024-05-27 18:23:44 +02:00
e5f577f577 Rephrase TamedFJ intro 2024-05-27 18:22:39 +02:00
a41802301e Rephrase TamedFJ intro 2024-05-27 18:22:17 +02:00
5 changed files with 418 additions and 286 deletions

4
.gitignore vendored
View File

@@ -17,3 +17,7 @@
*-blx.aux
*-blx.bib
*.run.xml
*.vtc
*.synctex.gz
auto
_region_.tex

View File

@@ -1,5 +1,5 @@
\documentclass{llncs}
\documentclass[anonymous,USenglish]{llncs}
%This is a template for producing LIPIcs articles.
%See lipics-v2021-authors-guidelines.pdf for further information.
%for A4 paper format use option "a4paper", for US-letter use option "letterpaper"
@@ -45,22 +45,19 @@
\title{Global Type Inference for Featherweight Java with Wildcards} %TODO Please add
%\titlerunning{Dummy short title} %TODO optional, please use if title is longer than one line
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% ANON
\author{Andreas Stadelmeier}{DHBW Stuttgart, Campus Horb, Germany}{a.stadelmeier@hb.dhbw-stuttgart.de}{}{}%TODO mandatory, please use full name; only 1 author per \author macro; first two parameters are mandatory, other parameters can be empty. Please provide at least the name of the affiliation and the country. The full address is optional. Use additional curly braces to indicate the correct name splitting when the last name consists of multiple name parts.
% \author{Andreas Stadelmeier\inst{1}, Martin Plümicke\inst{1}, Peter Thiemann\inst{2}}
% \institute{DHBW Stuttgart, Campus Horb, Germany\\
% \email{a.stadelmeier@hb.dhbw-stuttgart.de} \and
% Universität Freiburg, Institut für Informatik, Germany\\
% \email{thiemann@informatik.uni-freiburg.de}}
\author{Martin Plümicke}{DHBW Stuttgart, Campus Horb, Germany}{pl@dhbw.de}{}{}
% \authorrunning{A. Stadelmeier and M. Plümicke and P. Thiemann}
\author{Peter Thiemann}{Universität Freiburg, Institut für Informatik, Germany}{thiemann@informatik.uni-freiburg.de}{}{}
\authorrunning{A. Stadelmeier and M. Plümicke and P. Thiemann} %TODO mandatory. First: Use abbreviated first/middle names. Second (only in severe cases): Use first author plus 'et al.'
%\Copyright{Andreas Stadelmeier and Martin Plümicke and Peter Thiemann} %TODO mandatory, please use full first names. LIPIcs license is "CC-BY"; http://creativecommons.org/licenses/by/3.0/
%\ccsdesc[500]{Software and its engineering~Language features}
%\ccsdesc[300]{Software and its engineering~Syntax}
\keywords{type inference, Java, subtyping, generics} %TODO mandatory; please add comma-separated list of keywords
%% END ANON
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\category{} %optional, e.g. invited paper
@@ -97,8 +94,17 @@
%TODO mandatory: add short abstract of the document
\begin{abstract}
TODO: Abstract
Type inference is a hallmark of functional programming. Its use in
object-oriented programming is often restricted to type inference
for local variables and the instantiation of generic type
parameters.
Global type inference for Featherweight Generic Java is a
whole-program analysis that infers omitted method signatures.
Compared to previous work, its results are more general as it infers
types with wildcards. Global type inference is proved sound.
\end{abstract}
\keywords{type inference, Java, subtyping, generics} %TODO mandatory; please add comma-separated list of keywords
\input{introduction}

View File

@@ -26,24 +26,52 @@
% It's only restriction is that no polymorphic recursion is allowed.
% \end{recap}
\section{Introduction}
\label{sec:introduction}
\section{Type Inference for Java}
- Type inference helps rapid development
- used in Java already (var keyword)
- keeps source code clean
Type inference is a hallmark of functional programming, where it
improves readability and conciseness while
enabling safe rapid prototyping by allowing the compiler to deduce
types automatically without explicit type annotations. It allows
developers to safely refactor their programs before fixing function
signatures. In object-oriented programming (OOP), however, the use of
type inference is more restricted. It is often employed for local
variables and the instantiation of generic type parameters, offering
similar readability and conciseness benefits as in functional
programming. In languages like Java, the overall architecture of
method signatures must be designed by the programmer up-front.
Java comes with a local type inference algorithm
used for lambda expressions, method calls, and the \texttt{var} keyword.
Java features a limited form of local type inference for local
variables defined with the \texttt{var} keyword, lambda expressions,
and method calls. Java infers the type of variables defined with
\texttt{var}. The rationale here is that instantiations of generic
types (e.g., maps of maps) are often unwiedly and they can be easily
inferred from the initializing expression. The type of a lambda
expression is inferred from the target type, i.e., the type of the
method argument that accepts the lambda. The rationale is the intended
use of lambdas as callback functions for listeners etc. For calls
to generic methods, Java infers the most specific instantiation of the
generic parameters for each individual call.
A type inference algorithm that is able to recreate any type annotation
even when no type information is given at all is called a global type inference algorithm.
The global type inference algorithm presented here is able to deal with all Java types including wildcard types.
It can also be used for regular Java programs.
The local type inference features of Java add some convenience, but
programmers still have to provide method signatures beforehand. So there
is scope for a global type inference algorithm that infers
method signatures even if no type information (except class headers
and types of field variables) is given.
We present such a global type inference algorithm for Featherweight
Generic Java with wildcards. The same approach can also be used for
regular Java programs.
%The goal is to find a correct typing for a given Java program.
Type inference for Java has many use cases and could be used to help programmers by inserting correct types for them,
finding better type solutions for already typed Java programs (for example more generical ones),
or allowing to write typeless Java code which is then type inferred and thereby type checked by our algorithm.
Our algorithm enables programmers to write Java code with only a
minimum of type annotations (class headers and types of field
variables), it can fill in missing type annotations in partially
type-annotated programs, and it can suggest better (more general)
types for ordinary, typed Java programs.
\textbf{TO BE CONTINUED}
We propose a global type inference algorithm for Java supporting Wildcards and proven sound.
Global type inference allows input programs without any type annotations.
@@ -57,6 +85,34 @@ are infered and inserted by our algorithm.
%This paper extends a type inference algorithm for Featherweight Java \cite{TIforFGJ} by adding wildcards.
%The last step to create a type inference algorithm compatible to the Java type system.
\begin{figure}
\begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genList() {
if( ... ) {
return new List(1);
} else {
return new List("Str");
}
}
\end{lstlisting}
\end{minipage}%
\hfill
\begin{minipage}{0.55\textwidth}
\begin{lstlisting}[style=tfgj,caption=Type inference solution,label=lst:intro-example-typed]
List<?> genList() {
if( ... ) {
return new List<Integer>(1);
} else {
return new List<String>("Str");
}
}
\end{lstlisting}
\end{minipage}
\end{figure}
\subsection{Comparision to similar Type Inference Algorithms}
To outline the contributions in this paper we will list the advantages and improvements to smiliar type inference algorithms:
\begin{description}
@@ -175,32 +231,6 @@ We prove soundness and aim for a good compromise between completeness and time c
% \end{verbatim}
\begin{figure}
\begin{minipage}{0.43\textwidth}
\begin{lstlisting}[style=java,label=lst:intro-example-typeless,caption=Missing return type]
genList() {
if( ... ) {
return new List(1);
} else {
return new List("Str");
}
}
\end{lstlisting}
\end{minipage}%
\hfill
\begin{minipage}{0.55\textwidth}
\begin{lstlisting}[style=tfgj,caption=Type inference solution,label=lst:intro-example-typed]
List<?> genList() {
if( ... ) {
return new List<Integer>(1);
} else {
return new List<String>("Str");
}
}
\end{lstlisting}
\end{minipage}
\end{figure}
% \subsection{Wildcards}
% Java subtyping involving generics is invariant.
% For example \texttt{List<String>} is not a subtype of \texttt{List<Object>}.
@@ -467,8 +497,19 @@ a type parameter to method call \texttt{<X>add(v, "String")}.
% do not substitute free type variables
Lets have a look at a few challenges:
\begin{itemize}
Global Type inference for Featherweight Java with generics but without wildcards is solved already.
But adding Wildcards to the calculus creates new problems we did not foresee
and which have not been recognized by an existing type unification algorithm for Java with wildcards \ref{plue09_1}.
% what is the problem?
% Java does invisible capture conversion steps
Java Wildcard types are represented as existential types and have to be opened before they can be used.
This can either be done implicitly (\cite{aModelForJavaWithWildcards}, \cite{javaTIisBroken}) or explicitly via let statements (\cite{WildcardsNeedWitnessProtection}).
For all of those variations it is vital to know the argument types with which a method is called.
But our input program does not contain any type annotations.
We do not know where an existential type will emerge and where a capture conversion is necessary.
This has to be figured out during the type inference algorithm.
We detected three main challenges related to Java Wildcards and Global Type Inference:
\begin{enumerate}
\item \label{challenge:1}
One challenge is to design the algorithm in a way that it finds a correct solution for the program shown in listing \ref{lst:addExample}
and rejects the program in listing \ref{lst:concatError}.
@@ -511,12 +552,31 @@ is solved by removing the wildcard $\rwildcard{X}$ if possible.
this problem is solved by ANF transformation
\item \textbf{Wildcards as Existential Types:}
One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
A wildcard in the Java syntax has no name and is bound to its enclosing type:
$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
During type checking \emph{intermediate types}
can emerge, which have no equivalent in the Java syntax.
\item
% This problem is solved by assuming everything is a wildcard and lateron erasing excessive wildcards
% this is solved by having wildcards bound to a type. But this makes it necessary to remove wildcards lateron otherwise Unify would have to backtrack
The program in listing \ref{shuffleExample} shows a challenge involving wildcards and subtyping.
The method call \texttt{shuffle(l)} is incorrect, because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ representing a list of unknown lists.
Whereas $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ is a subtype of
$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ representing a list of lists, all of the same type $\rwildcard{X}$,
and can savely be passed to \texttt{shuffle}.
This behavior can also be explained by looking at the types and their capture converted versions:
\begin{center}
\begin{tabular}{l | l | l}
Java type & \TamedFJ{} representation & Capture Conversion \\
\hline
$\exptype{List}{\exptype{List}{\texttt{?}}}$ & $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ & $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$ \\
$\exptype{List2D}{\texttt{?}}$ & $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$ & $\exptype{List2D}{\rwildcard{X}}$ \\
%Supertype of $\exptype{List2D}{\texttt{?}}$ & $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ & $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$ \\
\end{tabular}
\end{center}
%The direct supertype of $\exptype{List2D}{\rwildcard{X}}$ is $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
%One problem is the divergence between denotable and expressable types in Java \cite{semanticWildcardModel}.
%A wildcard in the Java syntax has no name and is bound to its enclosing type:
%$\exptype{List}{\exptype{List}{\type{?}}}$ equates to $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
%During type checking \emph{intermediate types}
%can emerge, which have no equivalent in the Java syntax.
\begin{lstlisting}[style=java,label=shuffleExample,caption=Intermediate Types Example]
class List<X> extends Object {...}
@@ -530,45 +590,64 @@ List2D<?> l2d = ...;
shuffle(l); // Error
shuffle(l2d); // Valid
\end{lstlisting}
Java is using local type inference to allow method invocations which are not describable with regular Java types.
The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
\begin{lstlisting}[style=TamedFJ]
let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
\end{lstlisting}
% Java is using local type inference to allow method invocations which are not describable with regular Java types.
% The \texttt{shuffle} method in this case is invoked with the type $\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}}$
% which is a subtype of $\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$.
% After capture conversion \texttt{l2d'} has the type $\exptype{List}{\exptype{List}{\rwildcard{X}}}$
% and \texttt{shuffle} can be invoked with the type parameter $\rwildcard{X}$:
% \begin{lstlisting}[style=TamedFJ]
% let l2d' : (*@$\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}}$@*) = l2d in <X>shuffle(l2d')
% \end{lstlisting}
For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
\begin{constraintset}
\begin{center}
$
\begin{array}{l}
\wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Capture Conversion:}\
\exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
\\
\hline
\textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
\end{array}
$
\end{center}
\end{constraintset}
% For the example shown in listing \ref{shuffleExample} the method call \texttt{shuffle(l2d)} creates the constraints:
% \begin{constraintset}
% \begin{center}
% $
% \begin{array}{l}
% \wctype{\rwildcard{X}}{List2D}{\rwildcard{X}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \wctype{\rwildcard{X}}{List}{\exptype{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \textit{Capture Conversion:}\
% \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\wtv{x}}}
% \\
% \hline
% \textit{Solution:} \wtv{x} \doteq \rwildcard{X} \implies \exptype{List}{\exptype{List}{\rwildcard{X}}} \lessdot \exptype{List}{\exptype{List}{\rwildcard{X}}}
% \end{array}
% $
% \end{center}
% \end{constraintset}
The method call \texttt{shuffle(l)} is invalid however,
because \texttt{l} has the type
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
There is no solution for the subtype constraint:
$\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
Given such a program our type inference algorithm has to allow the call \texttt{shuffle(l2d)} and
decline the call to \texttt{shuffle(l)}.
% The method call \texttt{shuffle(l)} is invalid however,
% because \texttt{l} has the type
% $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$.
% There is no solution for the subtype constraint:
% $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}} \lessdotCC \exptype{List}{\exptype{List}{\wtv{x}}}$
\item \label{challenge3} \textbf{Free Variables cannot leaver their scope}:
Let's assume we have a variable \texttt{ls} with type $\exptype{List}{\wctype{\rwildcard{X}}{List}{\rwildcard{X}}}$
%When calling the \texttt{id} function with an element of this list we have to apply capture conversion.
and the following input program:
\noindent
\begin{minipage}{0.62\textwidth}
\begin{lstlisting}
let x : (*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}}$@*) = ls.get(0) in id(x) : (*@$\ntv{z}$@*)
\end{lstlisting}\end{minipage}
\hfill
\begin{minipage}{0.36\textwidth}
\begin{lstlisting}[style=constraints]
(*@$\wctype{\rwildcard{X}}{List}{\rwildcard{X}} \lessdotCC \exptype{List}{\wtv{x}}$@*),
(*@$\exptype{List}{\wtv{x}} \lessdot \ntv{z},$@*)
\end{lstlisting}
\end{minipage}
% the variable z has to
\begin{example}
Take the Java program in listing \ref{lst:mapExample} for example.
It maps every element of a list
$\expr{l} : \exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
@@ -616,8 +695,11 @@ $\exptype{List}{\wctype{\rwildcard{A}}{List}{\rwildcard{A}}}$
We solve this by distinguishing between wildcard placeholders and normal placeholders.
$\ntv{z}$ is a normal placeholder and is not allowed to contain free variables.
\end{example}
\end{itemize}
\end{enumerate}
%TODO: Move this part. or move the third challenge some underneath.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "TIforWildFJ"
%%% End:

View File

@@ -5,38 +5,73 @@
%Input language only described here. It is standard Featherweight Java
% we use the transformation to proof soundness. this could also be moved to the end.
% the constraint generation step assumes every method argument to be encapsulated in a let statement. This is the way Java is doing capture conversion
We define our own calculus \TamedFJ{}, which is used as input aswell as output to our global type inference algorithm.
%We assume that the input to our algorithm is a program, which carries none of the optional type annotations.
%After calculating a type solution we can insert all missing types and generate a correct program.
%The input to our algorithm is a typeless version of Featherweight Java.
The syntax is shown in figure \ref{fig:syntax} with optional type annotations highlighted in yellow.
The respective type rules are defined by figure \ref{fig:expressionTyping} and \ref{fig:typing}.
\TamedFJ{} is practically a subset of the Featherweight Java calculus defined by \textit{Bierhoff} \cite{WildcardsNeedWitnessProtection}
with the exception that method argument and return type annotations are optional.
The point is that a correct and fully typed \TamedFJ{} program is also a correct Featherweight Java program,
which is vital for our soundness proof (see chapter \ref{sec:soundness}).
%The language is designed to showcase type inference involving existential types.
The input to our algorithm is a typeless version of Featherweight Java.
The syntax is shown in figure \ref{fig:syntax}
and the respective type rules in figure \ref{fig:expressionTyping} and \ref{fig:typing}.
Method parameters and return types are optional.
We still require type annotations for fields and generic class parameters.
A method assumption consists out of a method name, a list of type parameters, a list of argument types, and a return type.
The first type in the list of argument types is the type of the surrounding class also known as the \texttt{this} parameter.
See figure \ref{fig:methodTypeExample} for an example:
Here the \texttt{add} method internally is treated as a method with two arguments,
because we add \texttt{this} to its argument list.
Note that the type parameter $\type{A}$ of the surrounding class is part of the methods parameter list.
%For example the \texttt{add} method in listing \ref{lst:tamedfjSample} is represented by the assumption
%$\texttt{add} : \generics{\ol{X \triangleleft Object}}\ \type{X} \to \exptype{List}{\type{X}}$.
\begin{figure}
\center
\begin{minipage}{0.49\textwidth}
%[style=java,caption=\TamedFJ{} sample, label=lst:tamedfjSample]
\begin{lstlisting}
class List<A extends Object> {
List<A> add(A v){..,}
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\textwidth}
\begin{align*}
&\mathrm{\Pi} = \set{\\
&\texttt{add} : \generics{\ol{A \triangleleft Object}}\ \exptype{List}{\type{A}},\type{A} \to \exptype{List}{\type{X}} \\
&}
\end{align*}
\end{minipage}
% \begin{minipage}{0.49\textwidth}
% \begin{lstlisting}[style=java,caption=\TamedFJ{} sample, label=lst:tamedfjSample]
% class List<A extends Object> {}
% <A extends Object> List<A>
% add(List<A> this, A v){..,}
% \end{lstlisting}
% \end{minipage}
\caption{\TamedFJ{} class and its corresponding method type environment}\label{fig:methodTypeExample}
\end{figure}
\textit{Additional Notes:}%
\begin{itemize}
%\item Method parameters and return types are optional.
\item We still require type annotations for fields and generic class parameters.
This is a design choice by us,
as we consider them as data declarations which are given by the programmer.
% They are inferred in for example \cite{plue14_3b}
Note the \texttt{new} expression not requiring generic parameters,
which are also inferred by our algorithm.
The method call naturally has type inferred generic parameters aswell.
We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
The syntax is described in figure \ref{fig:syntax} with optional type annotations highlighted in yellow.
It is possible to exclude all optional types.
\TamedFJ{} is a subset of the calculus defined by \textit{Bierhoff} \cite{WildcardsNeedWitnessProtection}.
%The language is designed to showcase type inference involving existential types.
The idea is for our type inference algorithm to calculate all missing types and output a fully and correctly typed \TamedFJ{} program,
which by default is also a correct Featherweight Java program (see chapter \ref{sec:soundness}).
\noindent
\textit{Notes:}
\begin{itemize}
\item The calculus does not include method overriding for simplicity reasons.
\item We add the elvis operator ($\elvis{}$) to the syntax mainly to showcase applications involving wildcard types.
\item \textit{Note:} The \texttt{new} expression is not requiring generic parameters.
\item Every method has an unique name.
The calculus does not include method overriding for simplicity reasons.
Type inference for that is described in \cite{TIforFGJ} and can be added to this algorithm accordingly.
%\textit{Note:}
\item The \textsc{T-Program} type rule ensures that there is one set of method assumptions used for all classes
that are part of the program.
\item The typing rules for expressions shown in figure \ref{fig:expressionTyping} refrain from restricting polymorphic recursion.
Type inference for polymorphic recursion is undecidable \cite{wells1999typability} and when proving completeness like in \cite{TIforFGJ} the calculus
needs to be restricted in that regard.
Our algorithm is not complete (see discussion in chapter \ref{sec:completeness}) and without the respective proof we can keep the \TamedFJ{} calculus as simple as possible
and as close to it's Featherweight Java correspondent \cite{WildcardsNeedWitnessProtection} as possible.
Our soundness proof works either way.
and as close to it's Featherweight Java correspondent as possible,
simplifying the soundness proof.
\end{itemize}
%Additional features like overriding methods and method overloading can be added by copying the respective parts from there.
@@ -73,6 +108,7 @@ $
& & \ \ | & \texttt{let}\ \expr{x} \highlight{: \wcNtype{\Delta}{N}} = \expr{e} \ \texttt{in} \ \expr{e}\\
& & \ \ | & \expr{e} \elvis{} \expr{e}\\
\text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
\text{Method type environment} & \mathrm{\Pi} & ::= & \overline{ \texttt{m} : \generics{\ol{X \triangleleft N}}\ \ol{T} \to \type{T}}
%\hline
\end{array}
$

332
unify.tex
View File

@@ -6,6 +6,102 @@
% the algorithm only removes wildcards, never adds them
\section{Unify}\label{sec:unify}
%\newcommand{\tw}[1]{\tv{#1}_?}
\begin{description}
\item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \lessdotCC \type{T}, \type{T} \doteq \type{T} \ldots}$
\item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description}
%The transformation steps are not applied all at once but in a specific order:
\unify{} executes the following steps until a type solution is found:
\begin{description}
\item[Step 1:]
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively,
starting with the \rulename{circle} rule.
\item[Step 2:]
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The second step is nondeterministic.
%\unify{} has to pick the right transformation for each constraint of the form $\type{N} \lessdot \tv{a}$.
%The rules in figure \ref{fig:step2-rules} offer three possibilities to deal with constraints $\type{N} \lessdot \tv{a}$.
For every $\type{T} \lessdot \tv{a}$ constraint \unify{} has to pick exactly one transformation from figure \ref{fig:step2-rules}.
The same principle goes for constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$ and the two transformations in figure \ref{fig:step2-rules2}.
%They have to be applied until the constraint set holds no constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$.
If atleast one transformation was applied in this step revert to step 1.
Otherwise proceed with step 3.
%This builds a search tree over multiple possible solutions.
%\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
%$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\textit{Hint:}
When implementing this step via backtracking
the rules \rulename{General} and \rulename{Super} should be tried first.
% is the Same rule really necessary?
The \rulename{Settle} and \rulename{Raise} rules should only be used when none of the rules in figure \ref{fig:step2-rules} can be applied.
\item[Step 3:]
Apply the rules in figure \ref{fig:cleanup-rules} exhaustively.
\rulename{Ground} and \rulename{Flatten} deal with constraints containing free variables.
If a type placeholder is solely used as lower bound \rulename{Ground} can replace it with the bottom type.
Otherwise the \rulename{Flatten} rule has to remove the wildcards responsible for the free variable.
\text{Note:} Only one of those rules has to be applied per constraint.
If the constraint set has been changed by one of these rules the algorithm must return to step 1.
But if only the \rulename{SubElim} rule is applied or the constraint set is not changed at all,
the algorithm can proceed with step 4.
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\end{description}
\item[Step 4:] \textit{(Generating Result)}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\end{description}
\subsection{Transformation Rules (Step 1)}
Our \unify{} process uses a similar concept as the standard unification by Martelli and Montanari \cite{MM82},
consisting of terms, relations and variables.
Instead of terms we have types of the form $\exptype{C}{\ol{T}}$ and
@@ -462,7 +558,75 @@ and every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice.
\subsection{Branching Step}
\begin{figure}
\begin{center}
\leavevmode
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Subst} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{T}/\tv{a}]\wildcardEnv \vdash [\type{T}/\tv{a}]
C \cup \set{\ntv{a} \doteq \type{T}}
\end{array}
\quad \begin{array}{c}
\ntv{a} \notin \type{T} \\
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\
\\
\rulename{Subst-WC} &$
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\wtv{a} \doteq \rwildcard{T}}\\
\hline
[\type{T}/\wtv{a}]\wildcardEnv \vdash [\type{T}/\wtv{a}]C
\end{array} \quad \wtv{a} \notin \type{T}
$\\
\\
\rulename{Normalize} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\ntv{b}/\wtv{b}]\wildcardEnv \vdash [\ntv{b}/\wtv{b}]
C \cup \set{\ntv{a} \doteq [\ntv{b}/\wtv{b}]\type{T}}
\end{array}
\quad \begin{array}{c}
\wtv{b} \in \text{wtv}(\type{T})\\
\ntv{b} \ \text{fresh}
\end{array}$\\
\\
\rulename{Contract} &
$\begin{array}[c]{l}
\wildcardEnv \cup \set{\wildcard{A}{\type{U}}{\type{L}}} \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{U}/\type{A}]\wildcardEnv \vdash [\type{U}/\type{A}]
C \cup [\type{U}/\type{A}]\set{\ntv{a} \doteq \type{T}, \type{L} \doteq \type{U}}
\end{array}
\quad \begin{array}{c}
\rwildcard{A} \in \text{fv}(\type{T})\\
\end{array}$\\
\end{tabular}}
\end{center}
\caption{Substitution rules}\label{fig:subst-rules}
\end{figure}
\unify{} must not replace normal type placeholders with free variables
except variables initially passed by $\Delta_{in}$.
The \rulename{Subst} rule checks if a type $\type{T}$ contains any
free variables or wildcard placeholders before replacing a normal type placeholder with it.
%This ensures that free variables are never substituted for normal type placeholders.
This ensures that a normal type placeholder is never replaced by a type containing free variables.
A type solution for a normal type placeholder will never contain free variables.
This is needed to guarantee well-formed type solutions and also keep free variables inside their scope
(see challenge \ref{challenge3}).
\rulename{Subst-WC} does not need to do that and can freely replace wildcard placehodlers with types despite their free variables.
We do not keep replacements for wildcard placeholders and they will not show up in the final type solution.
If the \rulename{Subst} rule is not applicable then either the \rulename{Normalize}
or \rulename{Contract} transformation has to be used to remove wildcard placeholders and
wildcards.
\subsection{Branching Step (Step 2)}
\unify{} is described as a nondeterministic algorithm.
Some constraints allow for multiple transformations from which
the algorithm has to pick the right one.
@@ -474,7 +638,9 @@ and gathers all possible type solutions.
We skip the definition of this practice, because it is already described in \cite{TIforFGJ}
and only needed for a proof of completeness.
\subsection{Generate Result (Step 4)}
The generation rules defined in figure \ref{fig:generation-rules} are similar to the other transformation rules but contain an additional part,
the result output consisting of a wildcard environment and a set of unifier $\sigma$.
\subsection{Adding Wildcards to the mix}
%\unify{} is able to create wildcard solutions even when the input set of constraints do not contain any wildcard variables.
@@ -622,101 +788,6 @@ The \texttt{concat} method takes two lists of the same generic type.
\subsection{Algorithm}
%\newcommand{\tw}[1]{\tv{#1}_?}
\begin{description}
\item[Input:] An environment $\Delta'$
and a set of constraints $C = \set{ \type{T} \lessdot \type{T}, \type{T} \lessdotCC \type{T}, \type{T} \doteq \type{T} \ldots}$
\item[Output:]
Set of unifiers $Uni = \set{\sigma_1, \ldots, \sigma_n}$ and an environment $\Delta$
\end{description}
%The transformation steps are not applied all at once but in a specific order:
\unify{} executes the following steps until a type solution is found:
\begin{description}
\item[Step 1:]
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively,
starting with the \rulename{circle} rule.
\item[Step 2:]
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The second step is nondeterministic.
%\unify{} has to pick the right transformation for each constraint of the form $\type{N} \lessdot \tv{a}$.
%The rules in figure \ref{fig:step2-rules} offer three possibilities to deal with constraints $\type{N} \lessdot \tv{a}$.
For every $\type{T} \lessdot \tv{a}$ constraint \unify{} has to pick exactly one transformation from figure \ref{fig:step2-rules}.
The same principle goes for constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$ and the two transformations in figure \ref{fig:step2-rules2}.
%They have to be applied until the constraint set holds no constraints of the form $\tv{a} \lessdot \type{N}, \tv{a} \lessdot \tv{b}$.
If atleast one transformation was applied in this step revert to step 1.
Otherwise proceed with step 3.
%This builds a search tree over multiple possible solutions.
%\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
%$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\textit{Hint:}
When implementing this step via backtracking
the rules \rulename{General} and \rulename{Super} should be tried first.
% is the Same rule really necessary?
The \rulename{Settle} and \rulename{Raise} rules should only be used when none of the rules in figure \ref{fig:step2-rules} can be applied.
\item[Step 3:]
Apply the rules in figure \ref{fig:cleanup-rules} exhaustively.
\rulename{Ground} and \rulename{Flatten} deal with constraints containing free variables.
If a type placeholder is solely used as lower bound \rulename{Ground} can replace it with the bottom type.
Otherwise the \rulename{Flatten} rule has to remove the wildcards responsible for the free variable.
\text{Note:} Only one of those rules has to be applied per constraint.
If the constraint set has been changed by one of these rules the algorithm must return to step 1.
But if only the \rulename{SubElim} rule is applied or the constraint set is not changed at all,
the algorithm can proceed with step 4.
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\end{description}
\item[Step 4:] \textit{(Generating Result)}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\end{description}
With \texttt{C} being class names and \texttt{A} being wildcard names.
The wildcard type $\wildcard{X}{U}{L}$ consist of an upper bound $\type{U}$, a lower bound $\type{L}$
and a name $\mathtt{X}$.
@@ -820,73 +891,6 @@ which are used for the upper and lower bounds.
% if there are a <. List<x?> constraints remaining in the end, then this can be a sign of a irregular input constraint set.
\begin{figure}
\begin{center}
\leavevmode
\fbox{
\begin{tabular}[t]{l@{~}l}
\rulename{Subst} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{T}/\tv{a}]\wildcardEnv \vdash [\type{T}/\tv{a}]
C \cup \set{\ntv{a} \doteq \type{T}}
\end{array}
\quad \begin{array}{c}
\ntv{a} \notin \type{T} \\
\text{fv}(\type{T}) \subseteq \Delta', \, \text{wtv}(\type{T}) = \emptyset
\end{array}$\\
\\
\rulename{Subst-WC} &$
\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\wtv{a} \doteq \rwildcard{T}}\\
\hline
[\type{T}/\wtv{a}]\wildcardEnv \vdash [\type{T}/\wtv{a}]C
\end{array} \quad \wtv{a} \notin \type{T}
$\\
\\
\rulename{Normalize} &
$\begin{array}[c]{l}
\wildcardEnv \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\ntv{b}/\wtv{b}]\wildcardEnv \vdash [\ntv{b}/\wtv{b}]
C \cup \set{\ntv{a} \doteq [\ntv{b}/\wtv{b}]\type{T}}
\end{array}
\quad \begin{array}{c}
\wtv{b} \in \text{wtv}(\type{T})\\
\ntv{b} \ \text{fresh}
\end{array}$\\
\\
\rulename{Contract} &
$\begin{array}[c]{l}
\wildcardEnv \cup \set{\wildcard{A}{\type{U}}{\type{L}}} \vdash C \cup \set{\ntv{a} \doteq \type{T}}\\
\hline
[\type{U}/\type{A}]\wildcardEnv \vdash [\type{U}/\type{A}]
C \cup [\type{U}/\type{A}]\set{\ntv{a} \doteq \type{T}, \type{L} \doteq \type{U}}
\end{array}
\quad \begin{array}{c}
\rwildcard{A} \in \text{fv}(\type{T})\\
\end{array}$\\
\end{tabular}}
\end{center}
\caption{Substitution rules}\label{fig:subst-rules}
\end{figure}
\unify{} must not replace normal type placeholders with free variables
except variables initially passed by $\Delta_{in}$.
The \rulename{Subst} rule checks if a type $\type{T}$ contains any
free variables or wildcard placeholders before replacing a normal type placeholder with it.
%This ensures that free variables are never substituted for normal type placeholders.
This ensures that a normal type placeholder is never replaced by a type containing free variables.
A type solution for a normal type placeholder will never contain free variables.
This is needed to guarantee well-formed type solutions and also keep free variables inside their scope
(see challenge \ref{challenge3}).
\rulename{Subst-WC} does not need to do that and can freely replace wildcard placehodlers with types despite their free variables.
We do not keep replacements for wildcard placeholders and they will not show up in the final type solution.
If the \rulename{Subst} rule is not applicable then either the \rulename{Normalize}
or \rulename{Contract} transformation has to be used to remove wildcard placeholders and
wildcards.
% \begin{example}{Free variables must not leave the scope of the surrounding \texttt{let} statement}
% \noindent