Add ANF transformation. Change syntax of TamedFJ. Restructure Unify description. Some fixes in Unify

This commit is contained in:
Andreas Stadelmeier 2024-03-28 03:40:39 +01:00
parent 495e37b370
commit 7b86dc0cf3
4 changed files with 402 additions and 291 deletions

View File

@ -30,7 +30,8 @@ We support capture conversion and Java style method calls.
This requires existential types in a form which is not denotable by Java syntax \cite{aModelForJavaWithWildcards}.
\item
We present a novel approach to deal with existential types and capture conversion during constraint unification.
The algorithm is split in two parts. A constraint generation step and an unification step.
\item The algorithm is split in two parts. A constraint generation step and an unification step.
Input language and constraint generations can be extended without altering the unification part.
\item
We proof soundness and aim for a good compromise between completeness and time complexity.
\end{itemize}
@ -117,11 +118,6 @@ class Example {
%\label{fig:intro-example-code}
\end{figure}
%TODO
% The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards}
% and \cite{WildcardsNeedWitnessProtection}.
\begin{figure}%[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=fgj]
@ -259,7 +255,7 @@ The input to our type inference algorithm is a modified version of the \letfj{}\
First \fjtype{} generates constraints
and afterwards \unify{} computes a solution for the given constraint set.
Constraints consist out of subtype constraints $(\type{T} \lessdot \type{T})$ and capture constraints $(\type{T} \lessdotCC \type{T})$.
\textit{Note:} a type $\type{T}$ can either be a named type, a type placeholder $\ntv{a}$ or a wildcard type placeholder $\wtv{a}$.
\textit{Note:} a type $\type{T}$ can either be a named type, a type placeholder or a wildcard type placeholder.
A subtype constraint is satisfied if the left side is a subtype of the right side according to the rules in figure \ref{fig:subtyping}.
\textit{Example:} $\exptype{List}{\ntv{a}} \lessdot \exptype{List}{\type{String}}$ is fulfilled by replacing type placeholder $\ntv{a}$ with the type $\type{String}$.
Subtype constraints and type placeholders act the same as the ones used in \emph{Type Inference for Featherweight Generic Java} \cite{TIforFGJ}.
@ -535,38 +531,16 @@ $
The \unify{} algorithm only sees the constraints with no information about the program they originated from.
The main challenge was to find an algorithm which computes $\sigma(\wtv{a}) = \rwildcard{X}$ for example \ref{intro-example1} but not for example \ref{intro-example2}.
\subsection{ANF transformation}
%TODO: describe ANF syntax (which is different then the one from the wiki: https://en.wikipedia.org/wiki/A-normal_form)
The input is a \letfj{} program without \texttt{let} statements, which is then transformed to A-normal form \cite{aNormalForm}
before getting processed by \fjtype{}.
\section{\TamedFJ{} and \letfj{}}
The input to our type inference algorithm is a \TamedFJ{} program.
The output is a valid \letfj{} program.
\letfj{} is defined in \cite{WildcardsNeedWitnessProtection}.
It is an extension to Featherweight Java which adds Wildcard types and let statements.
The let statements are used to explicitly apply capture conversion.
$
\begin{array}{lrcl|l}
\hline
& & & \textbf{\letfj{}} & \textbf{A-Normal form} \\
\text{Terms} & t & ::=
& \expr{x}
& \expr{x}
\\
& & \ \ |
& \texttt{new} \ \type{C}(\overline{t})
& \texttt{let}\ \overline{x} = \overline{t} \ \texttt{in}\ \texttt{new} \ \type{C}(\overline{x})
\\
& & \ \ |
& t.f
& \texttt{let}\ x = t \ \texttt{in}\ x.f
\\
& & \ \ |
& t.\texttt{m}(\overline{t})
& \texttt{let}\ x_1 = t \ \texttt{in}\ \texttt{let}\ \overline{x} = \overline{v} \ \texttt{in}\ x_1.\texttt{m}(\overline{x})
\\
& & \ \ |
& t \elvis{} t
& t \elvis{} t\\
%
\hline
\end{array}
$
%TODO
% The goal is to proof soundness in respect to the type rules introduced by \cite{aModelForJavaWithWildcards}
% and \cite{WildcardsNeedWitnessProtection}.
\subsection{Capture Conversion}
The input to our type inference algorithm does not contain let statements.

View File

@ -143,7 +143,6 @@
\newcommand{\ruleSubstWC}[0]{\rulename{Subst-WC}}
\newcommand{\subclass}{\mathtt{\sqsubset}\mkern-5mu :}
\newcommand{\TameFJ}{\text{TameFJ}}
\newcommand{\TamedFJ}{\textbf{TamedFJ}}
\newcommand{\TYPE}{\textbf{TYPE}}

View File

@ -38,6 +38,10 @@ Additional language constructs can be added by implementing the respective const
% on overriding methods, because their type is already determined.
% Allowing overriding therefore has no implication on our type inference algorithm.
The syntax forces every expression to undergo a capture conversion before it can be used as a method argument.
Even variables have to be catched by a let statement first.
This behaviour emulates Java's implicit capture conversion.
\begin{figure}
$
\begin{array}{lrcl}
@ -47,19 +51,176 @@ $
\text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
\text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
\text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
\text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{x}) = t \\
\text{Terms} & t & ::= & x \\
& & \ \ | & \texttt{new} \ \type{C}(\overline{t})\\
& & \ \ | & t.f\\
& & \ \ | & t.\texttt{m}(\overline{t})\\
& & \ \ | & t \elvis{} t\\
\text{Variable contexts} & \Gamma & ::= & \overline{x:\type{T}}\\
\text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{\expr{x}}) = t \\
\text{Terms} & t & ::= & \expr{x} \\
& & \ \ | & \texttt{let}\ \overline{\expr{x}_c} = \overline{t} \ \texttt{in}\ \texttt{new} \ \type{C}(\overline{\expr{x}_c}) \\
& & \ \ | & \texttt{let}\ \expr{x}_c = t \ \texttt{in}\ \expr{x}_c.f \\
& & \ \ | & \texttt{let}\ \expr{x}_c = t \ \texttt{in}\ \texttt{let}\ \overline{\expr{x}_c} = \overline{t} \ \texttt{in}\ \expr{x}_c.\texttt{m}(\overline{\expr{x}_c}) \\
& & \ \ | & t \elvis{} t \\
\text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
\hline
\end{array}
$
\caption{Input Syntax}\label{fig:syntax}
\end{figure}
% \begin{figure}
% $
% \begin{array}{lrcl}
% \hline
% \text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
% \text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
% \text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
% \text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
% \text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
% \text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{\expr{x}}) = t \\
% \text{Values} & v & ::= & \expr{x} \\
% \text{Terms} & t & ::= & v \\
% & & \ \ | & \texttt{let} \ \expr{x} = \texttt{new} \ \type{C}(\overline{v}) \ \texttt{in} \ t \\
% & & \ \ | & \texttt{let} \ \expr{x} = v.f \ \texttt{in} \ t \\
% & & \ \ | & \texttt{let} \ \expr{x} = v.\texttt{m}(\overline{v}) \ \texttt{in} \ t \\
% & & \ \ | & \texttt{let} \ \expr{x} = v \elvis{} v \ \texttt{in} \ t \\
% \text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
% \hline
% \end{array}
% $
% \caption{Input Syntax}\label{fig:syntax}
% \end{figure}
% \begin{figure}
% $
% \begin{array}{lrcl}
% \hline
% \text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
% \text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
% \text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
% \text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
% \text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
% \text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{\expr{x}}) = t \\
% \text{Terms} & t & ::= & \expr{x} \\
% & & \ \ | & \texttt{let} \ \expr{x} = t \ \texttt{in} \ t \\
% & & \ \ | & \expr{x}.f \\
% & & \ \ | & \expr{x}.\texttt{m}(\overline{\expr{x}}) \\
% & & \ \ | & t \elvis{} t \\
% \text{Variable contexts} & \Gamma & ::= & \overline{\expr{x}:\type{T}}\\
% \hline
% \end{array}
% $
% \caption{Input Syntax}\label{fig:syntax}
% \end{figure}
% \begin{figure}
% $
% \begin{array}{lrcl}
% \hline
% \text{Parameterized classes} & \mv N & ::= & \exptype{C}{\ol{T}} \\
% \text{Types} & \type{S}, \type{T}, \type{U} & ::= & \type{X} \mid \wcNtype{\Delta}{N} \\
% \text{Lower bounds} & \type{K}, \type{L} & ::= & \type{T} \mid \bot \\
% \text{Type variable contexts} & \Delta & ::= & \overline{\wildcard{X}{T}{L}} \\
% \text{Class declarations} & D & ::= & \texttt{class}\ \exptype{C}{\ol{X \triangleleft T}} \triangleleft \type{N} \set{\overline{\type{T}\ f}; \ol{M}} \\
% \text{Method declarations} & \texttt{M} & ::= & \texttt{m}(\overline{x}) = t \\
% \text{Terms} & t & ::= & x \\
% & & \ \ | & \texttt{new} \ \type{C}(\overline{t})\\
% & & \ \ | & t.f\\
% & & \ \ | & t.\texttt{m}(\overline{t})\\
% & & \ \ | & t \elvis{} t\\
% \text{Variable contexts} & \Gamma & ::= & \overline{x:\type{T}}\\
% \hline
% \end{array}
% $
% \caption{Input Syntax}\label{fig:syntax}
% \end{figure}
\subsection{ANF transformation}
\newcommand{\anf}[1]{\ensuremath{\tau}(#1)}
%https://en.wikipedia.org/wiki/A-normal_form)
Featherweight Java's syntax involves no \texttt{let} statement
and terms can be nested freely.
This is similar to Java's syntax.
To convert it to \TamedFJ{} additional let statements have to be added.
This is done by a \textit{A-Normal Form} \cite{aNormalForm} transformation.
\textit{Example:}\\
\noindent
\begin{minipage}{0.45\textwidth}
\begin{lstlisting}[style=fgj,caption=Featherweight Java]
m(l, v){
return id(l).add(v);
}
\end{lstlisting}
\end{minipage}%
\hfill
\begin{minipage}{0.5\textwidth}
\begin{lstlisting}[style=tfgj,caption=\TamedFJ{} representation]
m(l, v) =
let x = id(l) in x.add(v)
\end{lstlisting}
\end{minipage}
$
\begin{array}{|lrcl|l}
\hline
& & & \textbf{Featherweight Java Terms}\\
\text{Terms} & t & ::=
& \expr{x}
\\
& & \ \ |
& \texttt{new} \ \type{C}(\overline{t})
\\
& & \ \ |
& t.f
\\
& & \ \ |
& t.\texttt{m}(\overline{t})
\\
& & \ \ |
& t \elvis{} t\\
%
\hline
\end{array}
$
$\begin{array}{lrcl}
\text{ANF} & \anf{\expr{x}} & = & \expr{x} \\
& \anf{\texttt{new} \ \type{C}(\overline{t})} & = & \texttt{let}\ \overline{\expr{x}} = \anf{\overline{t}} \ \texttt{in}\ \texttt{new} \ \type{C}(\overline{\expr{x}}) \\
& \anf{t.f} & = & \texttt{let}\ \expr{x} = \anf{t} \ \texttt{in}\ \expr{x}.f \\
& \anf{t.\texttt{m}(\overline{t})} & = & \texttt{let}\ \expr{x} = \anf{t} \ \texttt{in}\ \texttt{let}\ \overline{\expr{x}} = \anf{\overline{t}} \ \texttt{in}\ \expr{x}.\texttt{m}(\overline{\expr{x}}) \\
& \anf{t_1 \elvis{} t_2} & = & \anf{t_1} \elvis{} \anf{t_2}
\end{array}$
% $
% \begin{array}{lrcl|l}
% \hline
% & & & \textbf{Featherweight Java} & \textbf{A-Normal form} \\
% \text{Terms} & t & ::=
% & \expr{x}
% & \expr{x}
% \\
% & & \ \ |
% & \texttt{new} \ \type{C}(\overline{t})
% & \texttt{let}\ \overline{x} = \overline{t} \ \texttt{in}\ \texttt{new} \ \type{C}(\overline{x})
% \\
% & & \ \ |
% & t.f
% & \texttt{let}\ x = t \ \texttt{in}\ x.f
% \\
% & & \ \ |
% & t.\texttt{m}(\overline{t})
% & \texttt{let}\ x_1 = t \ \texttt{in}\ \texttt{let}\ \overline{x} = \overline{t} \ \texttt{in}\ x_1.\texttt{m}(\overline{x})
% \\
% & & \ \ |
% & t \elvis{} t
% & t \elvis{} t\\
% %
% \hline
% \end{array}
% $
% Each class type has a set of wildcard types $\overline{\Delta}$ attached to it.
% The type $\wctype{\overline{\Delta}}{C}{\ol{T}}$ defines a set of wildcards $\overline{\Delta}$,
% which can be used inside the type parameters $\ol{T}$.

467
unify.tex
View File

@ -70,6 +70,7 @@ We define two types as equal if they are mutual subtypes of each other.
% then $\Delta \vdash [\type{S}/\type{S'}]\type{T} <: \type{T'}$.
\textbf{Capture Constraints:}
The equality relation on Capture constraints is not reflexive.
A capture constraint is never equal to another capture constraint even when structurally the same
($\type{T} \lessdotCC \type{S} \neq \type{T} \lessdotCC \type{S}$).
@ -78,6 +79,29 @@ All constraints are stored in a set and there are no dublicates of subtype const
Capture constraints however have to be stored as a list or have an unique number assigned
so that duplicates don't get automatically discarded.
Capture constraints are treated like regular subtype constraints.
All transformations for subtype constraints work for capture constraints aswell.
For clarification Subtype constraints are marked with a number.
Constraints with the same number stay the same type.
Newly created subtype constraints are always regular subtype constrains unless stated otherwise.
The \rulename{Adopt} rule for example takes multiple subtype constraints and adds a new one.
Having the constraints
$\ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdot \type{Object}$
would lead to
$\wtv{b} \lessdot \type{String}, \ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdotCC \type{Object}$
after applying \rulename{Adopt}.
The new generated constraint $\wtv{b} \lessdot \type{String}$ is a normal subtype constraint.
The type placeholders which are annotated as wildcard placeholders also stay wildcard placeholders.
The only rule that replaces wildcard type placeholders with regular placeholders is the \rulename{Normalize} rule.
\textbf{Wildcard Environment:}
Additional to a constraint set \unify{} holds a wildcard environment $\wildcardEnv{}$ keeping free variables.
The algorithm starts with an empty wildcard environment $\wildcardEnv{}$.
Only the \rulename{Capture} rule adds wildcards to that environment
and every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice.
% \subsection{Capture Conversion}
% % TODO: Describe Capture conversion
% Capture conversion is done during the \unify{} algorithm.
@ -119,6 +143,57 @@ The \unify{} algorithm internally uses the following data types:
$\wtype{W}$ & $::=$ & $\wildcard{X}{U}{L}$ & Wildcard \\
\end{tabular}
The algorithm is split into multiple parts:
\begin{description}
\item[Step 1:]
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively.
Starting with the \rulename{circle} rule.
\item[Step 2:]
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The rules in figure \ref{fig:step2-rules} offer multiple possibilities to deal with constraints of the form $\type{N} \lessdot \tv{a}$.
This builds a search tree over multiple possible solutions.
\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\item[Step 3:]
We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 4.
\item[Step 4:] \textit{(Generating Result)}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\end{description}
The $\wtv{a}$ type variables are flagged as wildcard type variables.
These type variables can be substituted by a wildcard or a type with free wildcard variables.
As long as a type variable is flagged as $\wtv{a}$ it can be used by the \rulename{Subst-WC} rule in step 1.
@ -131,8 +206,137 @@ The \rulename{Tame} rule eliminates wildcards. %TODO
This is done by setting the upper and lower bound to the same value.
\unify{} applies a capture conversion everywhere it is possible (see \rulename{Capture} rule).
Capture conversion removes a types bounding environment $\Delta$.
Type variables used in its type parameters are now bound to a global scope and not locally anymore.
Capture conversion removes a types bounding environment $\Delta$ and adds the included wildcard definitions to the global environment $\wildcardEnv{}$.
%Type variables used in its type parameters are now bound to a global scope and not locally anymore.
The \rulename{match} rule generates fresh wildcards $\overline{\wildcard{A}{\tv{u}}{\tv{l}}}$.
Their upper and lower bounds are fresh type variables.
\unify{} is able to remove wildcards by assigning their upper and lower bounds the same type
(Def: $\type{Object} = \wildcard{A}{Object}{Object}$ by definition \ref{def:equal}).
This is used by the \rulename{Tame} rule.
\textbf{Helper functions:}
\begin{description}
\item[$\tph{}$] returns all type placeholders inside a given type.
\textit{Example:} $\tph(\wctype{\wildcard{X}{\tv{a}}{\bot}}{Pair}{\wtv{b},\rwildcard{X}}) = \set{\tv{a}, \wtv{b}}$
\item [$\ll$ relation:]
The $\ll$ relation is the reflexive and transitive closure of the \texttt{extends} relations:
\begin{displaymath}
\begin{array}[c]{c}
\exptype{C}{\ol{X} \triangleleft \ol{N}} \triangleleft \exptype{D}{\ol{N}} \\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{D}
\end{array}
\quad
\begin{array}[c]{l}
\\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{C}
\end{array}
\quad
\begin{array}[c]{l}
\texttt{C} \ll \texttt{D}, \texttt{D} \ll \texttt{E} \\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{E}
\end{array}
\end{displaymath}
The algorithm uses it to determine if two types are possible subtypes of one another.
This is needed in the \rulename{adapt} and \rulename{match} rules.
%\textbf{Wildcard renaming}\\
\item[Wildcard renaming:]
The \rulename{reduce} rule separates wildcards from their environment.
At this point each wildcard gets a new and unique name.
To only rename the respective wildcards the reduce rule renames wildcards up to alpha conversion:
($[\ol{C}/\ol{B}]$ in the \rulename{reduce} rule)
\begin{align*}
[\type{A}/\type{B}]\type{B} &= \type{A} \\
[\type{A}/\type{B}]\type{C} &= \type{C} & \text{if}\ \type{B} \neq \type{C}\\
[\type{A}/\type{B}]\wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} &= \wcNtype{\overline{\wildcard{X}{[\type{A}/\type{B}]\type{U}}{[\type{A}/\type{B}]\type{L}}}}{[\type{A}/\type{B}]N} & \text{if}\ \type{B} \notin \overline{\rwildcard{X}} \\
[\type{A}/\type{B}]\wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} &= \wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} & \text{if}\ \type{B} \in \overline{\rwildcard{X}} \\
\end{align*}
\item[Free Variables:]
The $\text{fv}$ function assumes every wildcard type variable to be a free variable aswell.
% TODO: describe a function which determines free variables? or do an example?
\item[Fresh Wildcards:]
$\text{fresh}\ \overline{\wildcard{A}{\tv{u}}{\tv{l}}}$ generates fresh wildcards.
The new names $\ol{A}$ are fresh, aswell as the type variables $\ol{\tv{u}}$ and $\ol{\tv{l}}$,
which are used for the upper and lower bounds.
\end{description}
% \noindent
% \textbf{Example: (reduce-rule)}
% %The \ruleReduceWC{} resembles the \texttt{S-Exists} subtyping rule.
% %It converts wildcard types to fresh type variables.
% %For example take the input constraint $\exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$.
% %First we apply the \ruleReduceWC{} rule, which replaces $\wildcard{A}{\tv{c}}{\tv{d}}$ with a fresh type variable $\tv{a}$:
% We assume the input constraints $C = \exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$.
% The type on the right side $\wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$
% \begin{align*}
% \ddfrac{
% \exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}
% }{
% \ntype{Object} \doteq \tv{a}, \tv{b} \doteq \tv{a}, \tv{a} \lessdot \tv{c}, \tv{d} \lessdot \tv{a}
% } \ruleReduceWC{}
% \end{align*}
% By substition we get the result: % $\tv{a} \doteq \type{Object}$, $\tv{a} \doteq \type{Object}$, $\ldots$.
% \begin{align*}
% \ddfrac{
% \ntype{Object} \doteq \tv{a}, \tv{b} \doteq \tv{a}, \tv{a} \lessdot \tv{c}, \tv{d} \lessdot \tv{a}
% }{
% \tv{a} \doteq \ntype{Object} , \tv{b} \doteq \ntype{Object}, \ntype{Object} \lessdot \tv{c}, \tv{d} \lessdot \ntype{Object}
% } \rulename{Subst}
% \end{align*}
% \\[1em]
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\end{description}
% \textbf{Example:}
% \begin{displaymath}
% \begin{array}[c]{@{}ll}
% \begin{array}[c]{l}
% \wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash
% C \cup \, \set{ \type{Object} \doteq \type{X}, \tv{l} \lessdot \tv{u} } \\
% \hline
% \vspace*{-0.4cm}\\
% \wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash C \cup \,
% \set{\type{Object} \lessdot \type{X}, \type{X} \lessdot \type{Object}, \tv{l} \lessdot \tv{u}
% }\\
% \hline
% \vspace*{-0.4cm}\\
% \wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash C \cup \,
% \set{\type{Object} \lessdot \tv{l}, \tv{u} \lessdot \type{Object}, \tv{l} \lessdot \tv{u}
% }\\
% \hline
% \vspace*{-0.4cm}\\
% \ldots\\
% \hline
% \vspace*{-0.4cm}\\
% \wildcardEnv \cup \set{ \wildcard{X}{\type{Object}}{\type{Object}} } \vdash C \\
% \end{array}
% \end{array}
% \end{displaymath}
\begin{figure}
\begin{center}
@ -365,10 +569,6 @@ $
\caption{Constraint normalize rules}\label{fig:normalizing-rules}
\end{figure}
The \rulename{match} rule generates fresh wildcards $\overline{\wildcard{A}{\tv{u}}{\tv{l}}}$.
Their upper and lower bounds are fresh type variables.
\begin{figure}
\begin{center}
\leavevmode
@ -452,7 +652,7 @@ Their upper and lower bounds are fresh type variables.
C \cup \, \set{ \wctype{\overline{\wildcard{B}{\type{U}}{\type{L}}}}{C}{\ol{S}} \lessdotCC \type{T} } \\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \cup \overline{\wildcard{C}{\type{U}}{\type{L}}}
\wildcardEnv \cup \overline{\wildcard{C}{[\ol{\rwildcard{C}}/\ol{\rwildcard{B}}]\type{U}}{[\ol{\rwildcard{C}}/\ol{\rwildcard{B}}]\type{L}}}
\vdash C \cup \, \set{
[\ol{\rwildcard{C}}/\ol{\rwildcard{B}}] \exptype{C}{\ol{S}} \lessdot \type{T} }
\end{array}
@ -584,34 +784,6 @@ $
\caption{Constraint reduce rules}\label{fig:reduce-rules}
\end{figure}
Capture constraints are treated like regular subtype constraints.
All transformations for subtype constraints work for capture constraints aswell.
For clarification Subtype constraints are marked with a number.
Subtype constraints holding the same number keep their respective type.
Newly created subtype constraints are always regular subtype constrains unless stated otherwise.
The \rulename{Adopt} rule for example takes multiple subtype constraints and adds a new one.
Having the constraints
$\ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdot \type{Object}$
would lead to
$\wtv{b} \lessdot \type{String}, \ntv{a} \lessdotCC \wtv{b}, \ntv{a} \lessdot \type{String}, \wtv{b} \lessdotCC \type{Object}$
after applying \rulename{Adopt}.
The new generated constraint $\wtv{b} \lessdot \type{String}$ is a normal subtype constraint.
The type placeholders which are annotated as wildcard placeholders also stay wildcard placeholders.
The only rule that replaces wildcard type placeholders with regular placeholders is the \rulename{Normalize} rule.
The algorithm holds two sets.
The input constraints and a wildcard environment $\wildcardEnv{}$ keeping free variables.
The algorithm starts with an empty wildcard environment $\wildcardEnv{}$.
Only the \rulename{Capture} rule adds wildcards to that environment.
And every added wildcard gets a fresh unique name.
This ensures the wildcard environment $\wildcardEnv{}$ never
gets the same wildcard twice.
\textbf{Step 1:}
Apply the rules depicted in the figures \ref{fig:normalizing-rules}, \ref{fig:reduce-rules} and \ref{fig:wildcard-rules} exhaustively.
Starting with the \rulename{circle} rule. Afterwards the other rules in figure \ref{fig:normalizing-rules}.
\begin{figure}
If we find an illicit constraint assigning a type containing free variables to a type placeholder not flagged as a wildcard placeholder the algorithm fails.
@ -623,130 +795,6 @@ $\set{\tv{a} \doteq \type{N}} \in C$ with $\text{fv}(\type{N}) \cap \Delta_{in}
\caption{Fail conditions}
\end{figure}
The first step of the algorithm is able to remove wildcards.
Removing a wildcard works by setting its lower and upper bound to be equal.
(Def: $\type{Object} = \wildcard{A}{Object}{Object}$).
The \rulename{Equals} rule is responsible for this.
The subtype constraints are a subset of the capture constraints $(\type{T} \lessdot \type{S}) \implies (\type{T} \lessdotCC \type{S})$.
Every transformation on a subtype constraint can also be applied to a capture constraint, but the capture constraints need to be preserved.
We indicate this by numbering the constraints in each transformation.
Constraints with the same number stay the same type.
\textbf{Example:}
\begin{displaymath}
\begin{array}[c]{@{}ll}
\begin{array}[c]{l}
\wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash
C \cup \, \set{ \type{Object} \doteq \type{X}, \tv{l} \lessdot \tv{u} } \\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash C \cup \,
\set{\type{Object} \lessdot \type{X}, \type{X} \lessdot \type{Object}, \tv{l} \lessdot \tv{u}
}\\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \cup \set{ \wildcard{X}{\tv{u}}{\tv{l}} } \vdash C \cup \,
\set{\type{Object} \lessdot \tv{l}, \tv{u} \lessdot \type{Object}, \tv{l} \lessdot \tv{u}
}\\
\hline
\vspace*{-0.4cm}\\
\ldots\\
\hline
\vspace*{-0.4cm}\\
\wildcardEnv \cup \set{ \wildcard{X}{\type{Object}}{\type{Object}} } \vdash C \\
\end{array}
\end{array}
\end{displaymath}
\textbf{Helper functions:}
\begin{description}
\item[$\tph{}$] returns all type placeholders inside a given type.
\textit{Example:} $\tph(\wctype{\wildcard{X}{\tv{a}}{\bot}}{Pair}{\wtv{b},\rwildcard{X}}) = \set{\tv{a}, \wtv{b}}$
\item [$\ll$ relation:]
The $\ll$ relation is the reflexive and transitive closure of the \texttt{extends} relations:
\begin{displaymath}
\begin{array}[c]{c}
\exptype{C}{\ol{X} \triangleleft \ol{N}} \triangleleft \exptype{D}{\ol{N}} \\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{D}
\end{array}
\quad
\begin{array}[c]{l}
\\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{C}
\end{array}
\quad
\begin{array}[c]{l}
\texttt{C} \ll \texttt{D}, \texttt{D} \ll \texttt{E} \\
\hline
\vspace*{-0.4cm}\\
\texttt{C} \ll \texttt{E}
\end{array}
\end{displaymath}
The algorithm uses it to determine if two types are possible subtypes of one another.
This is needed in the \rulename{adapt} and \rulename{match} rules.
%\textbf{Wildcard renaming}\\
\item[Wildcard renaming:]
The \rulename{reduce} rule separates wildcards from their environment.
At this point each wildcard gets a new and unique name.
To only rename the respective wildcards the reduce rule renames wildcards up to alpha conversion:
($[\ol{C}/\ol{B}]$ in the \rulename{reduce} rule)
\begin{align*}
[\type{A}/\type{B}]\type{B} &= \type{A} \\
[\type{A}/\type{B}]\type{C} &= \type{C} & \text{if}\ \type{B} \neq \type{C}\\
[\type{A}/\type{B}]\wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} &= \wcNtype{\overline{\wildcard{X}{[\type{A}/\type{B}]\type{U}}{[\type{A}/\type{B}]\type{L}}}}{[\type{A}/\type{B}]N} & \text{if}\ \type{B} \notin \overline{\rwildcard{X}} \\
[\type{A}/\type{B}]\wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} &= \wcNtype{\overline{\wildcard{X}{\type{U}}{\type{L}}}}{N} & \text{if}\ \type{B} \in \overline{\rwildcard{X}} \\
\end{align*}
\item[Free Variables:]
The $\text{fv}$ function assumes every wildcard type variable to be a free variable aswell.
%\textbf{Fresh Wildcards}\\
\item[Fresh Wildcards:]
$\text{fresh}\ \overline{\wildcard{A}{\tv{u}}{\tv{l}}}$ generates fresh wildcards.
The new names $\ol{A}$ are fresh, aswell as the type variables $\ol{\tv{u}}$ and $\ol{\tv{l}}$,
which are used for the upper and lower bounds.
\end{description}
% \noindent
% \textbf{Example: (reduce-rule)}
% %The \ruleReduceWC{} resembles the \texttt{S-Exists} subtyping rule.
% %It converts wildcard types to fresh type variables.
% %For example take the input constraint $\exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$.
% %First we apply the \ruleReduceWC{} rule, which replaces $\wildcard{A}{\tv{c}}{\tv{d}}$ with a fresh type variable $\tv{a}$:
% We assume the input constraints $C = \exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$.
% The type on the right side $\wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}$
% \begin{align*}
% \ddfrac{
% \exptype{Pair}{\ntype{Object},\tv{b}} \lessdot \wctype{\wildcard{A}{\tv{c}}{\tv{d}}}{Pair}{\wildcard{A}{\tv{c}}{\tv{d}},\wildcard{A}{\tv{c}}{\tv{d}}}
% }{
% \ntype{Object} \doteq \tv{a}, \tv{b} \doteq \tv{a}, \tv{a} \lessdot \tv{c}, \tv{d} \lessdot \tv{a}
% } \ruleReduceWC{}
% \end{align*}
% By substition we get the result: % $\tv{a} \doteq \type{Object}$, $\tv{a} \doteq \type{Object}$, $\ldots$.
% \begin{align*}
% \ddfrac{
% \ntype{Object} \doteq \tv{a}, \tv{b} \doteq \tv{a}, \tv{a} \lessdot \tv{c}, \tv{d} \lessdot \tv{a}
% }{
% \tv{a} \doteq \ntype{Object} , \tv{b} \doteq \ntype{Object}, \ntype{Object} \lessdot \tv{c}, \tv{d} \lessdot \ntype{Object}
% } \rulename{Subst}
% \end{align*}
% \\[1em]
\noindent
\textbf{Step 2:}
%If there are no $(\type{T} \lessdot \tv{a})$ constraints remaining in the constraint set $C$
%resume with step 4.
The rules in figure \ref{fig:step2-rules} offer multiple possibilities to deal with constraints of the form $\type{N} \lessdot \tv{a}$.
This builds a search tree over multiple possible solutions.
\unify{} has to try each branch and accumulate the resulting solutions into a solution set.
\def\boxit#1#2{%
\smash{\color{red}\fboxrule=1pt\relax\fboxsep=2pt\relax%
\llap{\rlap{\fbox{\phantom{\rule{#1}{#2}}}}~}}\ignorespaces
@ -892,8 +940,6 @@ This builds a search tree over multiple possible solutions.
\label{fig:step2-rules}
\end{figure}
$\type{T} \lessdot \ntv{a}$ constraints have three and $\type{T} \lessdot \wtv{a}$ constraints have five possible transformations.
\begin{figure}
\begin{center}
\fbox{
@ -927,23 +973,6 @@ This builds a search tree over multiple possible solutions.
\label{fig:step2-rules}
\end{figure}
\unify{} generates wildcard types for constraints of the form $\type{N} \lessdot \tv{a}$ with the \rulename{\generalizeRule} rule.
Otherwise only the wildcards already defined in the input constraints will be included in the result.
\rulename{\generalizeRule} attempts to give $\tv{a}$ a more general type by replacing only the type parameters with fresh wildcards.
The second variation sets $\tv{a}$ to the direct supertype of type \texttt{C}.
For the constraint $\texttt{Object} \lessdot \tv{a}$ the algorithm can only apply $\tv{a} \doteq \texttt{Object}$,
because \texttt{Object} has no other supertype than itself.
Constraints of the form $\set{ \tv{a} \lessdot \type{N}, \tv{a} \lessdot^* \tv{b}}$
need to be handled in a similiar fashion.
The type variable $\tv{b}$ could either be a sub or a supertype of the type $\type{N}$.
We have to consider both possibilities.
\\[1em]
\noindent
\textbf{Step 3:}
We apply the rules in figure \ref{fig:cleanup-rules} exhaustively and proceed with step 4.
\begin{figure}
\begin{center}
\leavevmode
@ -978,70 +1007,6 @@ C \cup \set{ \tv{a} \doteq \bot }
\caption{Cleanup rules}\label{fig:cleanup-rules}
\end{figure}
The cleanup step prepares the constraint set for the last step by applying the following concepts:
%Two transformations are done which also help to remove unnecessary types from the solution set.
\begin{description}
\item[Bottom type]
The bottom type $\bot$ is used to generate \texttt{? extends} wildcard definitions.
This is the only possible solution when dealing with multiple upper bounds:
$\tv{a} \lessdot \type{T}, \tv{a} \lessdot \type{S}$ is usually not a correct solution (given $\type{S}$ and $\type{T}$ are no subtypes of eachother).
But if $\tv{a}$ is a lower bound of a wildcard it can be set to $\bot$.
Those constraints only stay in the constraint set after the first step if $\type{S}$ and $\type{T}$ do not have a common subtype.
The \rulename{Ground} rule uses this concept to generate \texttt{extends} Wildcards.
\item[Eliminating Wildcards]
Wildcards that have the same upper and lower bounds can be removed.
This is done by the \rulename{Crunch} rule.
\textit{Example:} The type $\wctype{\wildcard{X}{\type{String}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\type{String}}$.
%TODO: The a =. T (with T containing free variables) could be removed here.
% Not needed for the soundness of the algorithm, but handy for the implementation (check this when implementing the algorithm)
\end{description}
\noindent
\textbf{Step 6 (Generating Result):}
Apply the rules in figure \ref{fig:generation-rules} until $\wildcardEnv = \emptyset$ and $C = \emptyset$.
The resulting $\Delta, \sigma$ is a correct solution.
For this step to succeed there should only be four kinds of constraints left.
\begin{enumerate}
%\item\label{item:3} $\tv{a} \lessdot \tv{b}$ %, with $a$ and $b$ both isolated type variables
\item $\tv{a} \doteq \tv{b}$
%\item $\wtv{a} \doteq \type{G}$
\item\label{item:1} $\tv{a} \lessdot \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) \subseteq \Delta_in$
\item\label{item:2} $\tv{a} \doteq \wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}$, with $\tv{a} \notin \ol{\type{T}}$ % and $\text{fv}(\wctype{\ol{\wtype{W}}}{C}{\ol{\type{T}}}) = \emptyset$
\item\label{item:3} $\tv{a} \doteq \rwildcard{X}$
\end{enumerate}
% Constraints of the form $\tv{a} \doteq \rwildcard{X}$ are also possible.
% should we add those to the \Delta environment?
% How about removing them completely from the result set? Check with soundness condition
%Each type placeholder $\tv{a}$ must solely appear on the left side of a constraint.
\unify{} fails if there is any $\tv{a} \doteq \type{T}$ such that $\tv{a}$ occurs in $\type{T}$.
For the cases \ref{item:1}, \ref{item:2}, and \ref{item:3} the placeholder $\tv{a}$
cannot appear anywhere else in the constraint set.
Otherwise the generation rules \rulename{GenSigma} and \rulename{GenDelta} will not be able to process every constraint.
% After applying the GenDelta and GenSigma rules unifiers $\sigma$ do not contain
% a unifier of the form $\tv{a} \to \tv{b}$.
% Otherwise the found solution is incorrect.
% This only happens if the input constraints contain type variables with no upper bound constraint like $\tv{a} \lessdot \type{N}$.
% \begin{displaymath}
% \deduction{
% \wildcardEnv \cup \set{\wildcard{B}{\type{G}}{\type{G'}}} \vdash C \implies \Delta, \sigma
% }{
% \wildcardEnv \vdash C \implies \Delta \cup \set{\wildcard{B}{\type{G}}{\type{G'}}}, \sigma
% }\quad \text{tph}(\type{G}) = \emptyset, \text{tph}(\type{G'}) = \emptyset,
% \rwildcard{B} \notin \text{dom}(\Delta)
% \quad \rulename{AddDelta}
% \end{displaymath}
\dashbox{The \text{tph} function returns all type placeholders inside a given type.}
\begin{figure}
\begin{center}
\fbox{
@ -1159,6 +1124,7 @@ The General rule only creates the ones expressable by Java syntax, which still a
%thats not true. it can spawn X^T_T2.List<X> where T and T2 are types and we need to choose one inbetween them
Otherwise the algorithm could generate more solutions, but they have to be filterd out afterwards, because they cannot be translated into Java.
\letfj{} is not able to represent all Java programs. %TODO: this makes ist impossible for our algorithm to be complete on Java
\subsection{Implementation}
%List this under implementation details
@ -1167,8 +1133,13 @@ Every constraint that is not in solved form and is not able to be processed by a
The new constraint generated by the adopt rule may be eliminated by the match rule.
The adopt rule still needs to be applied only once per constraint.
Two additional rules can be added to remove clutter in the output.
Wildcards with same upper and lower bounds can be replaced with their upper bound.
\textbf{Eliminating Wildcards}
Wildcards that have the same upper and lower bounds can be removed.
This is done by the \rulename{Crunch} rule.
\textit{Example:} The type $\wctype{\wildcard{X}{\type{String}}{\type{String}}}{List}{\rwildcard{X}}$
becomes $\exptype{List}{\type{String}}$.
\begin{tabular}[t]{l@{~}l}
\\\\
@ -1199,4 +1170,10 @@ C \cup \set{ \tv{a} \lessdot \wctype{\Delta'}{C}{[\ol{U}/\ol{X}]\ol{S}}}
\ol{U} = \ol{L}
\end{array}
\end{array}$
\end{tabular}
\end{tabular}
% After applying the GenDelta and GenSigma rules unifiers $\sigma$ do not contain
% a unifier of the form $\tv{a} \to \tv{b}$.
% Otherwise the found solution is incorrect.
% This only happens if the input constraints contain type variables with no upper bound constraint like $\tv{a} \lessdot \type{N}$.