Add Motivation and Bib
This commit is contained in:
parent
df402dfb2b
commit
0373eb69e8
232
aspUnify.tex
232
aspUnify.tex
@ -26,19 +26,20 @@
|
||||
\usepackage{amssymb}
|
||||
\usepackage{amsthm}
|
||||
\usepackage{enumitem}
|
||||
\usepackage{xcolor}
|
||||
%\usepackage{amsthm}
|
||||
|
||||
|
||||
\newtheorem{theorem}{Theorem}
|
||||
\newtheorem{lemma}[theorem]{Lemma}
|
||||
|
||||
\title{Global Type Inference for Java using SAT Solvers}
|
||||
\title{Global Type Inference for Java using Answer Set Programming}
|
||||
\author{Andreas Stadelmeier
|
||||
\institute{DHBW\\ Stuttgart}
|
||||
\email{a.stadelmeier@hb.dhbw-stuttgart.de}
|
||||
}
|
||||
\newcommand{\authorrunning}{Andreas Stadelmeier}
|
||||
\newcommand{\titlerunning}{Type Unification to SAT}
|
||||
\newcommand{\titlerunning}{Type Unification with ASP}
|
||||
\hypersetup{
|
||||
bookmarksnumbered,
|
||||
pdftitle = {\titlerunning},
|
||||
@ -57,14 +58,16 @@
|
||||
%
|
||||
\begin{abstract}
|
||||
Global type inference for Java is able to compute correct types for an input program
|
||||
which has no type annotations at all.
|
||||
Global type inference for Featherweigt Generic Java is NP-hard.
|
||||
Former implementations of the algorithm struggled with bad runtime performance.
|
||||
In this paper we translated the type inference algorithm for Featherweight Generic Java to an
|
||||
which has no type annotations at all, but turns out to be NP-hard.
|
||||
Former implementations of the algorithm therefore struggled with bad runtime performance.
|
||||
In this paper we translate the type inference algorithm for Featherweight Generic Java to an
|
||||
Answer Set Program.
|
||||
Answer Set Programming (ASP) promises to solve complex computational search problems
|
||||
as long as they are finite domain.
|
||||
|
||||
This paper shows that it is possible to implement global type inference for FGJ as an ASP program.
|
||||
% We also show soundness, completeness and termination for our ASP implementation.
|
||||
Another advantage is that the specification of the algorithm can be translated one on one to ASP code
|
||||
leading to less errors in the respective implementation.
|
||||
%TODO: Is this correct?
|
||||
\end{abstract}
|
||||
%
|
||||
@ -94,6 +97,94 @@ fast execution times.
|
||||
These programs tend to be in exponential time upper bounded by the amount of ambiguous method calls and subtype relations.
|
||||
%TODO: Example with multiple calls to m.toString(). All possibilities are checked in a naive implementation!
|
||||
|
||||
\section{Motivation}
|
||||
The nature of the global type inference algorithm causes the search space of the unification algorithm to
|
||||
increase exponentially.
|
||||
Java allows overloaded methods causing our type inference algorithm to create so called Or-Constraints.
|
||||
This can also happen if multiple classes implement a method with the same name and the same amount of parameters.
|
||||
Given the following input program we do not know, which method \texttt{self} is meant for the method call
|
||||
\texttt{var.self()}, because there is no type annotation for \texttt{var}.
|
||||
|
||||
|
||||
%\begin{figure}[h]
|
||||
%\begin{minipage}{0.49\textwidth}
|
||||
\begin{lstlisting}
|
||||
class C1 {
|
||||
C1 self(){
|
||||
return this;
|
||||
}
|
||||
}
|
||||
class C2 {
|
||||
C2 self(){
|
||||
return this;
|
||||
}
|
||||
}
|
||||
class Example {
|
||||
untypedMethod(var){
|
||||
return var.self(); (*@$\implies \set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$@*)
|
||||
}
|
||||
}
|
||||
\end{lstlisting}
|
||||
% \end{minipage}%
|
||||
% \hfill
|
||||
% \begin{minipage}{0.49\textwidth}
|
||||
% \begin{lstlisting}[style=constraints]
|
||||
% (*@$\tv{var} \lessdot \type{C1} | $@*),
|
||||
% (*@$\sigma(\tv{xp}) = \wctype{\wildcard{A}{\type{String}}{\bot}}{List}{\rwildcard{A}}$@*)
|
||||
% \end{lstlisting}
|
||||
% \end{minipage}
|
||||
%\end{figure}
|
||||
|
||||
An Or-Constraint like $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$
|
||||
means that either the the constraint set $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }$
|
||||
or $\set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$ has to be satisfied.
|
||||
In this example the type placeholder $\tv{var}$ is a placeholder for the type of the \texttt{var} variable.
|
||||
The $\tv{ret}$ placeholder represents the return type of the \texttt{untypedMethod} method.
|
||||
If we set the type of \texttt{var} to \texttt{C1}, then the return type of the method will be \texttt{C1} aswell.
|
||||
If we set it to \texttt{C2} then also the return type will be \texttt{C2}.
|
||||
There are two possibilities therefore the Or-Constraint.
|
||||
|
||||
If we chain multiple overloaded method calls together we end up with multiple Or-Constraints.
|
||||
The type unification algorithm \unify{} only sees the supplied constraints and has to potentially try all combinations of them.
|
||||
This is proven in \cite{TIforGFJ} and is the reason why type inference for Featherweight Generic Java is NP-hard.
|
||||
Let's have a look at the following alternation of our example:
|
||||
|
||||
\begin{lstlisting}
|
||||
untypedMethod(var){
|
||||
return var.self().self().self().self();
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
Now there are four chained method calls leading to four Or-Constraints:
|
||||
\begin{align*}
|
||||
\set{\tv{var} \lessdot \type{C1}, \tv{r1} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{r1} \doteq \type{C2}} \\
|
||||
\set{\tv{r1} \lessdot \type{C1}, \tv{r2} \doteq \type{C1} }\ ||\ \set{\tv{r1} \lessdot \type{C2}, \tv{r2} \doteq \type{C2}} \\
|
||||
\set{\tv{r2} \lessdot \type{C1}, \tv{r3} \doteq \type{C1} }\ ||\ \set{\tv{r2} \lessdot \type{C2}, \tv{r3} \doteq \type{C2}} \\
|
||||
\set{\tv{r3} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{r3} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}} \\
|
||||
\end{align*}
|
||||
|
||||
The placeholder $\tv{r1}$ stands for the return type of the first call to \texttt{self} and
|
||||
$\tv{r2}$ for the return type of the second call and so on.
|
||||
It is obvious that this constraint set only has two solutions.
|
||||
The variable \texttt{var} and the return type of the method aswell as all intermediate placeholders $\tv{r1}-\tv{r3}$
|
||||
get either the type \texttt{C1} or \texttt{C2}.
|
||||
A first prototype implementation of the \unify{} algorithm simply created the cartesian product of all Or-Constraints,
|
||||
16 possibilities in this example, and iteratively tried all of them.
|
||||
This leads to a exponential runtime increase with every added overloaded method call.
|
||||
Eventhough the current algorithm is equipped with various optimizations as presented in \cite{plue181} and \cite{plue231},
|
||||
there is still a runtime increas sensible when dealing with many Or-Constraints.
|
||||
|
||||
Our global type inference algorithm should construct type solutions in real time.
|
||||
Then it can effectively used as a Java compiler which can deal with large inputs of untyped Java code.
|
||||
Another use case scenario is as an editor plugin which helps a Java programmer by enabling him to write untyped Java code
|
||||
and letting our tool insert the missing type statements.
|
||||
For both applications a short execution time is vital.
|
||||
|
||||
Answer Set programming promises to solve complex search problems in a highly optimized way.
|
||||
The idea in this paper is to implement the algorithm presented in \cite{TIforGFJ}
|
||||
as an ASP program and see how well it handles our type unification problem.
|
||||
|
||||
|
||||
\section{Subtyping}
|
||||
\begin{mathpar}
|
||||
\inferrule[Refl]{}{
|
||||
@ -125,6 +216,9 @@ These programs tend to be in exponential time upper bounded by the amount of amb
|
||||
}
|
||||
\end{mathpar}
|
||||
|
||||
%Subtyping has no bounds for generic type parameters.
|
||||
% but this is propably not needed
|
||||
|
||||
\section{Unify}
|
||||
Input: Every type placeholder must have an upper bound.
|
||||
Output: Every $\tv{a} \lessdot \type{T}$ constraint gets a
|
||||
@ -136,28 +230,22 @@ We have to formulate the unification algorithm with implication rules.
|
||||
Those can be directly translated to ASP.
|
||||
\label{sec:implicationRules}
|
||||
\begin{mathpar}
|
||||
\inferrule[Subst]{
|
||||
\tv{a} \doteq \type{N}
|
||||
}{
|
||||
\tv{a} \mapsto \type{N}
|
||||
}
|
||||
\and
|
||||
\inferrule[Subst-L]{
|
||||
\tv{a} \mapsto \type{T}_1 \\
|
||||
\tv{a} \doteq \type{T}_1 \\
|
||||
\tv{a} \lessdot \type{T}_2
|
||||
}{
|
||||
\type{T}_1 \lessdot \type{T}_2
|
||||
}
|
||||
\and
|
||||
\inferrule[Subst-R]{
|
||||
\tv{a} \mapsto \type{T}_1 \\
|
||||
\tv{a} \doteq \type{T}_1 \\
|
||||
\type{T}_2 \lessdot \tv{a}
|
||||
}{
|
||||
\type{T}_2 \lessdot \type{T}_1
|
||||
}
|
||||
\and
|
||||
\inferrule[Subst-Equal]{
|
||||
\tv{a} \mapsto \type{T}_1 \\
|
||||
\tv{a} \doteq \type{T}_1 \\
|
||||
\tv{a} \doteq \type{T}_2
|
||||
}{
|
||||
\type{T}_1 \doteq \type{T}_2
|
||||
@ -176,12 +264,19 @@ Those can be directly translated to ASP.
|
||||
}
|
||||
\and
|
||||
\inferrule[Subst-Param]{
|
||||
\tv{a} \mapsto \type{S} \\
|
||||
\tv{a} \doteq \type{S} \\
|
||||
\type{T} \doteq \exptype{C}{\type{T}_1 \ldots, \tv{a}, \ldots \type{T}_n} \\
|
||||
}{
|
||||
\type{T} \doteq \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
|
||||
}
|
||||
\and
|
||||
\inferrule[Subst-Param']{
|
||||
\tv{a} \doteq \type{S} \\
|
||||
\tv{b} \lessdot \exptype{C}{\type{T}_1 \ldots, \tv{a}, \ldots \type{T}_n} \\
|
||||
}{
|
||||
\tv{b} \lessdot \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
|
||||
}
|
||||
\and
|
||||
\inferrule[S-Object]{}{\tv{a} \lessdot \type{Object}}
|
||||
\and
|
||||
\inferrule[Match]{
|
||||
@ -224,10 +319,10 @@ Those can be directly translated to ASP.
|
||||
\begin{mathpar}
|
||||
\text{Apply only once per constraint:}\quad
|
||||
\inferrule[Super]{
|
||||
\type{T} \lessdot \tv{a}\\
|
||||
\type{T} <: \type{N}
|
||||
\type{N} \lessdot \tv{a}\\
|
||||
\type{N} <: \type{N}'
|
||||
}{
|
||||
\tv{a} \doteq \type{N}
|
||||
\tv{a} \doteq \type{N}'
|
||||
}
|
||||
\end{mathpar}
|
||||
|
||||
@ -254,40 +349,11 @@ Those can be directly translated to ASP.
|
||||
|
||||
Result:
|
||||
\begin{mathpar}
|
||||
% \inferrule[Solution]{
|
||||
% \tv{a} \doteq \type{N} \\
|
||||
% \tv{a} \notin \type{N}
|
||||
% }{
|
||||
% \sigma(\tv{a}) = \type{N}
|
||||
% }
|
||||
\inferrule[Solution]{
|
||||
\tv{a} \doteq \type{G}
|
||||
}{
|
||||
\sigma(\tv{a}) = \type{G}
|
||||
}
|
||||
% \and
|
||||
% \inferrule[Generic]{
|
||||
% \tv{a} \lessdot \type{N} %, \ldots, \tv{a} \lessdot \exptype{C_n}{\ol{T_n}} \\
|
||||
% \\
|
||||
% \text{not}\ \tv{a} \doteq \type{N}'
|
||||
% }{
|
||||
% \tv{a} \mapsto \type{A}
|
||||
% }
|
||||
% \and
|
||||
% \inferrule[Solution-Sub]{
|
||||
% \tv{a} \lessdot \exptype{C_1}{\ol{T_1}}, \ldots, \tv{a} \lessdot \exptype{C_n}{\ol{T_n}} \\
|
||||
% \forall i: \type{C_m} << \type{C_i} \\
|
||||
% \text{not}\ \tv{a} \doteq \type{N}
|
||||
% }{
|
||||
% \Delta(\type{A}) = \exptype{C_m}{\ol{T_m}} \\ \sigma(\tv{a}) = \type{A}
|
||||
% }
|
||||
% \and
|
||||
% \inferrule[Solution-Gen]{
|
||||
% \tv{a} \lessdot \type{G}_1, \ldots, \tv{a} \lessdot \type{G}_n \\
|
||||
% \forall i: \type{G} <: \type{G}_i \\
|
||||
% }{
|
||||
% \Delta(\type{A}) = \type{G} \\ \sigma(\tv{a}) = \type{A}
|
||||
% }
|
||||
\and
|
||||
\inferrule[Solution-Gen]{
|
||||
\tv{a} \lessdot \type{C}_1, \ldots, \tv{a} \lessdot \type{C}_n \\
|
||||
@ -295,6 +361,22 @@ Result:
|
||||
}{
|
||||
\tv{a} \doteq \type{C}_m
|
||||
}
|
||||
% \and
|
||||
% \inferrule[Solution-Gen]{
|
||||
% \tv{a} \lessdot \type{C}\\
|
||||
% \sigma(\tv{a}) = \emptyset
|
||||
% }{
|
||||
% \tv{a} \doteq \type{A} \\ \sigma'(\tv{a}) = \type{A}
|
||||
% }
|
||||
% \and
|
||||
% \inferrule[Solution-Gen]{
|
||||
% \tv{a} \lessdot \type{G} \\
|
||||
% \tv{a} \lessdot \type{G}_1, \ldots, \tv{a} \lessdot \type{G}_n \\
|
||||
% \forall i: \type{G} << \type{G}_i \\
|
||||
% \sigma'(\tv{a}) = \type{A}
|
||||
% }{
|
||||
% \Delta(\type{A}) = \type{G}
|
||||
% }
|
||||
\end{mathpar}
|
||||
|
||||
Fail:
|
||||
@ -314,8 +396,15 @@ Fail:
|
||||
\emptyset
|
||||
}
|
||||
\and
|
||||
\inferrule[Fail-Generic]{
|
||||
\type{X} \doteq \type{T}\\
|
||||
\type{X} \neq \type{T}
|
||||
}{
|
||||
\emptyset
|
||||
}
|
||||
\and
|
||||
\inferrule[Fail-Sigma]{
|
||||
\tv{a} \mapsto \type{N} \\
|
||||
\tv{a} \doteq \type{N} \\
|
||||
\tv{a} \in \type{N}
|
||||
}{
|
||||
\emptyset
|
||||
@ -347,14 +436,48 @@ The algorithm terminates if every type placeholder in the input constraint set h
|
||||
The implication rules defined in chapter \ref{sec:implicationRules} can be translated to an ASP program.
|
||||
|
||||
\begin{tabular}{l | r}
|
||||
$\exptype{C}{\ol{X}}$ & \texttt{type("C", paramX)}\\
|
||||
$\tv{a}$ & \texttt{tph("a")}\\
|
||||
$\exptype{C}{}$ & \texttt{type("C", null)}\\
|
||||
$\exptype{C}{\ol{X}}$ & \texttt{type("C", params(\ldots))}\\
|
||||
|
||||
\end{tabular}
|
||||
\newcommand{\aspify}{\nabla}
|
||||
\begin{align*}
|
||||
\aspify(\tv{a}) =& \texttt{tph("a")}\\
|
||||
\exptype{C}{} =& \texttt{type("C", null)}\\
|
||||
\exptype{C}{\type{T}_1, \type{T}_2, \ldots} =& \texttt{type("C", params(}\aspify(\type{T}_1), \aspify(\type{T}_1), \ldots\texttt{))}\\
|
||||
\aspify(\type{T} \doteq \type{T}') =& \texttt{equalcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
|
||||
\aspify(\type{T} \lessdot \type{T}') =& \texttt{subcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
|
||||
\aspify{} \left(\inferrule{
|
||||
c_1, c_2
|
||||
}{
|
||||
c
|
||||
}\right) =& \aspify(c)\ \texttt{:-} \aspify(c_1), \aspify(c_2)
|
||||
\end{align*}
|
||||
|
||||
The S-Class rule contains a substitution and has to be encoded with variables.
|
||||
Given a extends relation $\texttt{class}\ \exptype{C}{\type{X}} \triangleleft \exptype{D}{\type{X}}$
|
||||
we generate the ASP code:\\
|
||||
{\small{\texttt{subtype(type("C", params(X)), type("D", params(X))) :- subtype(type("C", params(X))).}}}
|
||||
|
||||
Capital letters like \texttt{X} are variables in ASP and the former statement causes any
|
||||
literal like \texttt{subtype(type("C", params(tph("a"))))} leads to the literal
|
||||
\texttt{subtype(type("C", params(tph("a"))), type("D", params(tph("a"))))}.
|
||||
|
||||
|
||||
\section{Proofs}
|
||||
\begin{lemma}{Substitution}
|
||||
|
||||
\begin{lemma}{Substitution:}
|
||||
For every $\tv{a} \doteq \type{T}$ and every constraint $c$,
|
||||
there also exists a constraint $[\type{T}/\tv{a}]c$.
|
||||
|
||||
\noindent
|
||||
\normalfont
|
||||
This lemma proofs that equal constraints lead to a substitution in every other constraint.
|
||||
For exmaple $\tv{b} \doteq \type{String}$ and $\tv{b} \lessdot \exptype{Comparable}{\tv{b}}$
|
||||
lead to a constraint $\type{String} \lessdot \exptype{Comparable}{\type{String}}$
|
||||
\end{lemma}
|
||||
\textit{Proof:}
|
||||
%TODO
|
||||
|
||||
\section{Termination}
|
||||
|
||||
@ -369,6 +492,9 @@ Otherwise there has to be a loop and this woul lead to an incorrect constraint s
|
||||
The Subst-Param rule can only be applied a finite number of times.
|
||||
Due to the substitution lemma and the Sigma-Fail rule the Subst-Param rule can only be applied once per type placeholder.
|
||||
|
||||
The Subst-Param rule can only be applied once per type placeholder.
|
||||
If $\type{T} \doteq \exptype{C}{\tv{a}}$ is substituted to $\type{T} \doteq \exptype{C}{\type{N}}$
|
||||
then there is no $\tv{a} \in \type{N}$.
|
||||
|
||||
\section{Completeness}
|
||||
To proof completeness we have to show that every type can be replaced by a placeholder in a correct constraint set.
|
||||
|
5677
martin.bib
Executable file
5677
martin.bib
Executable file
File diff suppressed because it is too large
Load Diff
@ -7,7 +7,7 @@
|
||||
%breaklines=true,
|
||||
showstringspaces=false,
|
||||
%breakatwhitespace=true,
|
||||
basicstyle=\ttfamily\fontsize{8}{9.6}\selectfont, %\footnotesize
|
||||
basicstyle=\ttfamily, %\footnotesize
|
||||
escapeinside={(*@}{@*)},
|
||||
captionpos=t,float,abovecaptionskip=-\medskipamount
|
||||
}
|
||||
|
Loading…
Reference in New Issue
Block a user