736 lines
29 KiB
TeX
736 lines
29 KiB
TeX
% This is samplepaper.tex, a sample chapter demonstrating the
|
||
% LLNCS macro package for Springer Computer Science proceedings;
|
||
% Version 2.21 of 2022/01/12
|
||
%
|
||
\documentclass[submission]{eptcs}
|
||
\providecommand{\event}{Symposium in honor of Peter Thiemann's 60th birthday} % Name of the event you are submitting to
|
||
\usepackage{underscore}
|
||
\usepackage[T1]{fontenc}
|
||
% T1 fonts will be used to generate the final print and online PDFs,
|
||
% so please use T1 fonts in your manuscript whenever possible.
|
||
% Other font encondings may result in incorrect characters.
|
||
%
|
||
\usepackage{graphicx}
|
||
% Used for displaying a sample figure. If possible, figure files should
|
||
% be included in EPS format.
|
||
%
|
||
% If you use the hyperref package, please uncomment the following two lines
|
||
% to display URLs in blue roman font according to Springer's eBook style:
|
||
%\usepackage{color}
|
||
%\renewcommand\UrlFont{\color{blue}\rmfamily}
|
||
%\urlstyle{rm}
|
||
%
|
||
\include{prolog}
|
||
\usepackage{mathpartir}
|
||
\usepackage{amsmath}
|
||
\usepackage{amssymb}
|
||
\usepackage{amsthm}
|
||
\usepackage{enumitem}
|
||
\usepackage{xcolor}
|
||
%\usepackage{amsthm}
|
||
|
||
|
||
\newtheorem{theorem}{Theorem}
|
||
\newtheorem{lemma}[theorem]{Lemma}
|
||
|
||
\title{Global Type Inference for Java using Answer Set Programming}
|
||
\author{Andreas Stadelmeier
|
||
\institute{DHBW\\ Stuttgart}
|
||
\email{a.stadelmeier@hb.dhbw-stuttgart.de}
|
||
}
|
||
\newcommand{\authorrunning}{Andreas Stadelmeier}
|
||
\newcommand{\titlerunning}{Type Unification with ASP}
|
||
\hypersetup{
|
||
bookmarksnumbered,
|
||
pdftitle = {\titlerunning},
|
||
pdfauthor = {\authorrunning},
|
||
pdfsubject = {Global Type Inference for Java}, % Consider adding a more appropriate subject or description
|
||
pdfkeywords = {typeinference, java, sat solving, answer set programming} % Uncomment and enter keywords specific to your paper
|
||
}
|
||
|
||
\begin{document}
|
||
%
|
||
% First names are abbreviated in the running head.
|
||
% If there are more than two authors, 'et al.' is used.
|
||
%
|
||
%
|
||
\maketitle % typeset the header of the contribution
|
||
%
|
||
\begin{abstract}
|
||
Global type inference for Java is able to compute correct types for an input program
|
||
which has no type annotations at all, but turns out to be NP-hard.
|
||
Former implementations of the algorithm therefore struggled with bad runtime performance.
|
||
In this paper we translate the type inference algorithm for Featherweight Generic Java to an
|
||
Answer Set Program.
|
||
Answer Set Programming (ASP) promises to solve complex computational search problems
|
||
as long as they are finite domain.
|
||
This paper shows that it is possible to implement global type inference for Featherweight Generic Java (FGJ) as an ASP program.
|
||
Afterwards we compared the ASP implementation with our own implementation in Java
|
||
exposing promising results.
|
||
% We also show soundness, completeness and termination for our ASP implementation.
|
||
Another advantage is that the specification of the algorithm can be translated one on one to ASP code
|
||
leading to less errors in the respective implementation.
|
||
Unfortunately ASP can only be used for type inference for a subset of the Java type system that excludes wildcard types.
|
||
\end{abstract}
|
||
%
|
||
%
|
||
%
|
||
\section{Global Type Inference}
|
||
% Every major typed programming language uses some form of type inference.
|
||
% Rust, Java, C++, Haskell, etc... %(see: https://en.wikipedia.org/wiki/Type_inference)
|
||
% Type inference adds alot of value to a programming language.
|
||
% \begin{itemize}
|
||
% \item Code is more readable.
|
||
% \item Type inference usually does a better job at finding types than a programmer.
|
||
% \item Type inference can use types that are not denotable by the programmer.
|
||
% \item Better reusability.
|
||
% \item Allows for faster development and the programmer to focus on the actual task instead of struggling with the type system.
|
||
% \end{itemize}
|
||
|
||
% Java has adopted more and more type inference features over time.
|
||
% %TODO: list type inference additions
|
||
% Currently Java only has local type inference.
|
||
% We want to bring type inference for Java to the next level.
|
||
|
||
Java is a strictly typed programming language.
|
||
The current versions come with a local type inference feature that allows the programmer to omit
|
||
type annotations in some places like argument types of lambda expressions for example.
|
||
But methods still have to be fully typed including argument and return types.
|
||
We work at a global type inference algorithm for Java which is able to compute correct types
|
||
for a Java program with no type annotations at all.
|
||
|
||
\begin{figure}[h]
|
||
\begin{lstlisting}
|
||
class C1 {
|
||
C1 self(){
|
||
return this;
|
||
}
|
||
}
|
||
class C2 {
|
||
C2 self(){
|
||
return this;
|
||
}
|
||
}
|
||
class Example {
|
||
untypedMethod(var){
|
||
return var.self(); // (*@$\implies \set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$@*)
|
||
}
|
||
}
|
||
\end{lstlisting}
|
||
\caption{Java program missing type annotations}\label{fig:introExample}
|
||
\end{figure}
|
||
|
||
A possible input for our algorithm is shonw in figure \ref{fig:introExample}.
|
||
Here the method \texttt{untypedMethod} is missing its argument type for \texttt{var} and its return type.
|
||
Global type inference works in two steps.
|
||
First we generate a set of subtype constraints, which are later unified resulting in a type solution consisting of type unifiers $\sigma$.
|
||
Every missing type is replaced by a type placeholder.
|
||
In this example the type placeholder $\tv{var}$ is a placeholder for the type of the \texttt{var} variable.
|
||
The $\tv{ret}$ placeholder represents the return type of the \texttt{untypedMethod} method.
|
||
Also shown as a comment behind the respective method call.
|
||
There are two types of type constraints.
|
||
A subtype constraint like $\tv{var} \lessdot \type{C1}$ meaning that the unification algorithm has to find
|
||
a type replacement for $\tv{var}$ which is a subtype of $\type{C1}$.
|
||
Due to the Java type system being reflexive one possible solution would be $\sigma(\tv{var}) = \type{C1}$.
|
||
A constraint $\tv{var} \doteq \type{C1}$ directly results in $\sigma(\tv{ret}) = \type{C1}$.
|
||
|
||
Our type inference algorithms for Java are described in a formal manner in
|
||
\cite{MartinUnify}, \cite{TIforGFJ}.
|
||
The first prototype implementation put the focus on a correct implementation rather than
|
||
fast execution times.
|
||
Depending on the input it currently takes upto several minutes or even days to compute all or even one possible type solution.
|
||
To make them usable in practice we now focus on decreasing the runtime to a feasible level.
|
||
|
||
\section{Motivation}
|
||
The nature of the global type inference algorithm causes the search space of the unification algorithm to
|
||
increase exponentially with every ambiguous method call.
|
||
Java allows overloaded methods causing our type inference algorithm to create so called Or-Constraints.
|
||
This happens if multiple classes implement a method with the same name and the same amount of parameters.
|
||
Given the input program in figure \ref{fig:introExample} we do not know, which method \texttt{self} is meant for the method call
|
||
\texttt{var.self()}, because there is no type annotation for \texttt{var}.
|
||
|
||
|
||
%\begin{figure}[h]
|
||
%\begin{minipage}{0.49\textwidth}
|
||
% \end{minipage}%
|
||
% \hfill
|
||
% \begin{minipage}{0.49\textwidth}
|
||
% \begin{lstlisting}[style=constraints]
|
||
% (*@$\tv{var} \lessdot \type{C1} | $@*),
|
||
% (*@$\sigma(\tv{xp}) = \wctype{\wildcard{A}{\type{String}}{\bot}}{List}{\rwildcard{A}}$@*)
|
||
% \end{lstlisting}
|
||
% \end{minipage}
|
||
%\end{figure}
|
||
|
||
An Or-Constraint like $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$
|
||
means that either the the constraint set $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }$
|
||
or $\set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$ has to be satisfied.
|
||
If we set the type of \texttt{var} to \texttt{C1}, then the return type of the method will be \texttt{C1} aswell.
|
||
If we set it to \texttt{C2} then also the return type will be \texttt{C2}.
|
||
There are two possibilities therefore the Or-Constraint.
|
||
|
||
If we chain multiple overloaded method calls together we end up with multiple Or-Constraints.
|
||
The type unification algorithm \unify{} only sees the supplied constraints and has to potentially try all combinations of them.
|
||
This is proven in \cite{TIforGFJ} and is the reason why type inference for Featherweight Generic Java is NP-hard.
|
||
Let's have a look at the following alternation of our example:
|
||
|
||
\begin{figure}[h]
|
||
\begin{lstlisting}
|
||
untypedMethod(var){
|
||
return var.self().self().self().self();
|
||
}
|
||
\end{lstlisting}
|
||
\caption{Stacked ambiguous method calls}\label{fig:benchmark}
|
||
\end{figure}
|
||
|
||
Now there are four chained method calls leading to four Or-Constraints:
|
||
\begin{align*}
|
||
\set{\tv{var} \lessdot \type{C1}, \tv{r1} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{r1} \doteq \type{C2}} \\
|
||
\set{\tv{r1} \lessdot \type{C1}, \tv{r2} \doteq \type{C1} }\ ||\ \set{\tv{r1} \lessdot \type{C2}, \tv{r2} \doteq \type{C2}} \\
|
||
\set{\tv{r2} \lessdot \type{C1}, \tv{r3} \doteq \type{C1} }\ ||\ \set{\tv{r2} \lessdot \type{C2}, \tv{r3} \doteq \type{C2}} \\
|
||
\set{\tv{r3} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{r3} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}} \\
|
||
\end{align*}
|
||
|
||
The placeholder $\tv{r1}$ stands for the return type of the first call to \texttt{self} and
|
||
$\tv{r2}$ for the return type of the second call and so on.
|
||
It is obvious that this constraint set only has two solutions.
|
||
The variable \texttt{var} and the return type of the method aswell as all intermediate placeholders $\tv{r1}-\tv{r3}$
|
||
get either the type \texttt{C1} or \texttt{C2}.
|
||
A first prototype implementation of the \unify{} algorithm simply created the cartesian product of all Or-Constraints,
|
||
16 possibilities in this example, and iteratively tried all of them.
|
||
This leads to a exponential runtime increase with every added overloaded method call.
|
||
Eventhough the current algorithm is equipped with various optimizations as presented in \cite{plue181} and \cite{plue231}
|
||
there are still runtime issues when dealing with many Or-Constraints.
|
||
|
||
Our global type inference algorithm should construct type solutions in real time.
|
||
Then it can effectively used as a Java compiler which can deal with large inputs of untyped Java code.
|
||
Another use case scenario is as an editor plugin which helps a Java programmer by enabling him to write untyped Java code
|
||
and letting our tool insert the missing type statements.
|
||
For both applications a short execution time is vital.
|
||
|
||
Answer Set programming promises to solve complex search problems in a highly optimized way.
|
||
The idea in this paper is to implement the algorithm presented in \cite{TIforGFJ}
|
||
as an ASP program and see how well it handles our type unification problem.
|
||
|
||
\section{ASP Implementation}
|
||
|
||
Our ASP implementation replaces the unification step of the type inference algorithm.
|
||
So the input is a set of constraints $c$ and the output is a set of unifiers $\sigma$.
|
||
An ASP program consists of implication rules and facts.
|
||
Here the facts are the input constraints and the rules are the ones depicted in chapter \ref{sec:implicationRules}.
|
||
After running the resulting program through an interpreter the output
|
||
holds the desired $\sigma(\tv{a}) = \type{T}$ literals.
|
||
|
||
\begin{figure}[h]
|
||
\center
|
||
\begin{tabular}{lcll}
|
||
$c $ &$::=$ & $\type{T} \lessdot \type{T} \mid \type{T} \doteq \type{T}$ & Constraint \\
|
||
$\type{T} $ & $::=$ & $\tv{a} \mid \type{N}$ & Type placeholder or Type \\
|
||
$\ntype{N}$ & $::=$ & $\exptype{C}{\ol{T}}$ & Class Type containing type placeholders\\
|
||
$\gtype{G}$ & $::=$ & $\exptype{C}{\ol{G}}$ & Class Type not containing type placeholders \\
|
||
\end{tabular}
|
||
\caption{Syntax of types and constraints}
|
||
\label{fig:syntax-constraints}
|
||
\end{figure}
|
||
|
||
%\subsection{Subtyping}
|
||
\begin{figure}[h]
|
||
\begin{mathpar}
|
||
\inferrule[S-Refl]{}{
|
||
\type{T} <: \type{T}
|
||
}
|
||
\and
|
||
\inferrule[S-Trans]{\type{T}_1 <: \type{T}_2 \\ \type{T}_2 <: \type{T}_3}{
|
||
\type{T}_1 <: \type{T}_3
|
||
}
|
||
\and
|
||
\inferrule[S-Var]{}{\type{A} <: \Delta(\type{A})}
|
||
\and
|
||
\inferrule[S-Class]{\texttt{class}\ \exptype{C}{\ol{X}} \triangleleft \type{N}}{
|
||
\exptype{C}{\ol{T}} <: [\ol{T}/\ol{X}]\type{N}
|
||
}
|
||
\end{mathpar}
|
||
\begin{mathpar}
|
||
\inferrule[N-Refl]{}{
|
||
\type{C} << \type{C}
|
||
}
|
||
\and
|
||
\inferrule[N-Trans]{\type{C}_1 << \type{C}_2 \\ \type{C}_2 << \type{C}_3}{
|
||
\type{C}_1 << \type{C}_3
|
||
}
|
||
\and
|
||
\inferrule[N-Class]{\texttt{class}\ \exptype{C}{\ldots} \triangleleft \exptype{D}{\ldots}}{
|
||
\type{C} << \type{D}
|
||
}
|
||
\end{mathpar}
|
||
\caption{Subtyping}\label{fig:subtyping}
|
||
\end{figure}
|
||
|
||
|
||
%Subtyping has no bounds for generic type parameters.
|
||
% but this is propably not needed
|
||
|
||
\subsection{Unification Algorithm}\label{sec:implicationRules}
|
||
|
||
%The $\tv{a} \lessdot \type{Object}$ rule has to be ensured by the incoming constraints.
|
||
%The need to have an upper bound to every type placeholder.
|
||
|
||
All the implication rules depicted in this chapter can be translated to ASP statements
|
||
described in chapter \ref{sec:translation}.
|
||
|
||
\label{sec:implicationRules}
|
||
\begin{mathpar}
|
||
\inferrule[Subst-L]{
|
||
\tv{a} \doteq \type{T}_1 \\
|
||
\tv{a} \lessdot \type{T}_2
|
||
}{
|
||
\type{T}_1 \lessdot \type{T}_2
|
||
}
|
||
\and
|
||
\inferrule[Subst-R]{
|
||
\tv{a} \doteq \type{T}_1 \\
|
||
\type{T}_2 \lessdot \tv{a}
|
||
}{
|
||
\type{T}_2 \lessdot \type{T}_1
|
||
}
|
||
\and
|
||
\inferrule[Subst-Equal]{
|
||
\tv{a} \doteq \type{T}_1 \\
|
||
\tv{a} \doteq \type{T}_2
|
||
}{
|
||
\type{T}_1 \doteq \type{T}_2
|
||
}
|
||
\and
|
||
\inferrule[Swap]{
|
||
\type{T}_1 \doteq \type{T}_2
|
||
}{
|
||
\type{T}_2 \doteq \type{T}_1
|
||
}
|
||
\and
|
||
\inferrule[Unfold]{
|
||
\tv{b} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n}
|
||
}{
|
||
\type{T}_i \doteq \type{T}_i
|
||
}
|
||
\and
|
||
\inferrule[Unfold']{
|
||
\tv{b} \lessdot \exptype{C}{\type{T}_1 \ldots \type{T}_n}
|
||
}{
|
||
\type{T}_i \doteq \type{T}_i
|
||
}
|
||
\and
|
||
\inferrule[Subst-Param]{
|
||
\type{T}' \doteq \type{S} \\
|
||
\type{T} \doteq \exptype{C}{\type{T}_1 \ldots, \type{T}', \ldots \type{T}_n} \\
|
||
}{
|
||
\type{T} \doteq \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
|
||
}
|
||
\and
|
||
\inferrule[Subst-Param']{
|
||
\type{T}' \doteq \type{S} \\
|
||
\tv{b} \lessdot \exptype{C}{\type{T}_1 \ldots, \type{T}', \ldots \type{T}_n} \\
|
||
}{
|
||
\tv{b} \lessdot \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
|
||
}
|
||
\and
|
||
\inferrule[S-Object]{}{\tv{a} \lessdot \type{Object}}
|
||
\and
|
||
\inferrule[Match]{
|
||
\tv{a} \lessdot \type{N}_1 \\
|
||
\tv{a} \lessdot \type{N}_2 \\
|
||
\type{N}_1 << \type{N}_2
|
||
}{
|
||
\type{N}_1 \lessdot \type{N}_2
|
||
}
|
||
\and
|
||
\inferrule[Adopt]{
|
||
\tv{a} \lessdot \tv{b} \\
|
||
\tv{b} \lessdot \type{T}
|
||
}{
|
||
\tv{a} \lessdot \type{T}
|
||
}
|
||
\and
|
||
% \inferrule[Subst-Param]{
|
||
% \tv{a} \doteq \type{N} \\
|
||
% \tv{a} = \type{T}_i \\
|
||
% \exptype{C}{\type{T}_1 \ldots \type{T}_n} \lessdot \type{T} \\
|
||
% }{
|
||
% \type{T}_i \doteq \type{N} \\
|
||
% }
|
||
\and
|
||
\inferrule[Adapt]{
|
||
\type{N}_1 \lessdot \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
|
||
\type{N}_1 <: \exptype{C}{\type{S}_1 \ldots \type{S}_n} \\
|
||
}{
|
||
\exptype{C}{\type{S}_1 \ldots \type{S}_n} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
|
||
}
|
||
\and
|
||
\inferrule[Reduce]{
|
||
\exptype{C}{\type{S}_1 \ldots \type{S}_n} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
|
||
}{
|
||
\type{S}_i \doteq \type{T}_i \\
|
||
}
|
||
\end{mathpar}
|
||
|
||
\begin{mathpar}
|
||
\text{Apply only once per constraint:}\quad
|
||
\inferrule[Super]{
|
||
\type{N} \lessdot \tv{a}\\
|
||
\type{N} <: \type{N}'
|
||
}{
|
||
\tv{a} \doteq \type{N}'
|
||
}
|
||
\end{mathpar}
|
||
|
||
\begin{center}
|
||
Apply one or the other:
|
||
\end{center}
|
||
\begin{mathpar}
|
||
\inferrule[Split-L]{
|
||
\tv{a} \lessdot \tv{b}\\
|
||
\tv{a} \lessdot \type{N}\\
|
||
}{
|
||
\tv{b} \lessdot \type{N}
|
||
}
|
||
\quad \quad
|
||
\vline
|
||
\quad \quad
|
||
\inferrule[Split-R]{
|
||
\tv{a} \lessdot \tv{b}\\
|
||
\tv{a} \lessdot \type{N}\\
|
||
}{
|
||
\type{N} \lessdot \tv{b}
|
||
}
|
||
\end{mathpar}
|
||
|
||
Result:
|
||
\begin{mathpar}
|
||
\inferrule[Solution]{
|
||
\tv{a} \doteq \type{G}
|
||
}{
|
||
\sigma(\tv{a}) = \type{G}
|
||
}
|
||
\and
|
||
\inferrule[Solution-Gen]{
|
||
\tv{a} \lessdot \type{C}_1, \ldots, \tv{a} \lessdot \type{C}_n \\
|
||
\forall i: \type{C}_m << \type{C}_i \\
|
||
}{
|
||
\tv{a} \doteq \type{C}_m
|
||
}
|
||
% \and
|
||
% \inferrule[Solution-Gen]{
|
||
% \tv{a} \lessdot \type{C}\\
|
||
% \sigma(\tv{a}) = \emptyset
|
||
% }{
|
||
% \tv{a} \doteq \type{A} \\ \sigma'(\tv{a}) = \type{A}
|
||
% }
|
||
% \and
|
||
% \inferrule[Solution-Gen]{
|
||
% \tv{a} \lessdot \type{G} \\
|
||
% \tv{a} \lessdot \type{G}_1, \ldots, \tv{a} \lessdot \type{G}_n \\
|
||
% \forall i: \type{G} << \type{G}_i \\
|
||
% \sigma'(\tv{a}) = \type{A}
|
||
% }{
|
||
% \Delta(\type{A}) = \type{G}
|
||
% }
|
||
\end{mathpar}
|
||
|
||
Fail:
|
||
|
||
\begin{mathpar}
|
||
% \inferrule[Fail]{
|
||
% \type{T} \lessdot \type{N}\\
|
||
% \type{T} \nless : \type{N}
|
||
% }{
|
||
% \emptyset
|
||
% }
|
||
% \and
|
||
\inferrule[Fail]{
|
||
\exptype{C}{\ldots} \doteq \exptype{D}{\ldots}\\
|
||
\type{C} \neq \type{D}
|
||
}{
|
||
\emptyset
|
||
}
|
||
\and
|
||
\inferrule[Fail-Generic]{
|
||
\type{X} \doteq \type{T}\\
|
||
\type{X} \neq \type{T}
|
||
}{
|
||
\emptyset
|
||
}
|
||
\and
|
||
\inferrule[Fail-Sigma]{
|
||
\tv{a} \doteq \type{N} \\
|
||
\tv{a} \in \type{N}
|
||
}{
|
||
\emptyset
|
||
}
|
||
\and
|
||
\inferrule[Fail]{
|
||
\tv{a} \lessdot \type{N}_1 \\
|
||
\tv{a} \lessdot \type{N}_2 \\
|
||
\text{not}\ \type{N}_1 << \type{N}_2 \\
|
||
\text{not}\ \type{N}_2 << \type{N}_1
|
||
}{
|
||
\emptyset
|
||
}
|
||
\end{mathpar}
|
||
|
||
% Subst
|
||
% a =. N, a <. T, N <: T
|
||
% --------------
|
||
% N <. T
|
||
|
||
% a <. List<b>, b <. List<a>
|
||
|
||
% how to proof completeness and termination?
|
||
% TODO: how to proof termination?
|
||
|
||
The algorithm terminates if every type placeholder in the input constraint set has an assigned type.
|
||
|
||
\section{ASP Encoding}\label{sec:translation}
|
||
Answer Set Programming is a declarative way of formulating search problems.
|
||
ASP statements consist of a head and a body separated by a implication operator \textquotedbl{}\texttt{:-}\textquotedbl{}:
|
||
\texttt{head :- body}.
|
||
This statement is interpreted as an implication rule.
|
||
If the premises in the body are satisfied the facts stated in the head are deducted.
|
||
\newcommand{\aspify}{\nabla}
|
||
The implication rules defined in chapter \ref{sec:implicationRules} are formulated in a way that can be translated to an ASP program
|
||
(see figure \ref{fig:aspTransformationRules}).
|
||
|
||
% \begin{tabular}{l | r}
|
||
% $\tv{a}$ & \texttt{tph("a")}\\
|
||
% $\exptype{C}{}$ & \texttt{type("C", null)}\\
|
||
% $\exptype{C}{\ol{X}}$ & \texttt{type("C", params(\ldots))}\\
|
||
% \end{tabular}
|
||
|
||
\begin{figure}[h]
|
||
\begin{align*}
|
||
\aspify(\tv{a}) =& \texttt{tph("a")}\\
|
||
\exptype{C}{} =& \texttt{type("C", null)}\\
|
||
\exptype{C}{\type{T}_1, \type{T}_2, \ldots} =& \texttt{type("C", params(}\aspify(\type{T}_1), \aspify(\type{T}_1), \ldots\texttt{))}\\
|
||
\aspify(\type{T} \doteq \type{T}') =& \texttt{equalcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
|
||
\aspify(\type{T} \lessdot \type{T}') =& \texttt{subcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
|
||
\aspify{} \left(\inferrule{
|
||
c_1, c_2
|
||
}{
|
||
c
|
||
}\right) =& \aspify(c)\ \texttt{:-} \aspify(c_1), \aspify(c_2) \\
|
||
c_1 \ || \ c_2 =& \texttt{orcons}(\aspify(c_1), \aspify(c_2))
|
||
\end{align*}
|
||
\caption{Transformation to ASP code}\label{fig:aspTransformationRules}
|
||
\end{figure}
|
||
|
||
%The $\aspify{}$ function converts our formal specification of the algorithm to ASP code.
|
||
|
||
This also has to be done for the subtype rules in figure \ref{fig:subtyping}.
|
||
There the S-Class rule needs special handling because it contains a substitution.
|
||
It has to be encoded using variables.
|
||
Given a extends relation $\texttt{class}\ \exptype{C}{\type{X}} \triangleleft \exptype{D}{\type{X}}$
|
||
we generate the ASP code:\\
|
||
{\small{\texttt{subtype(type("C", params(X)), type("D", params(X))) :- subtype(type("C", params(X))).}}}
|
||
|
||
Capital letters like \texttt{X} are variables in ASP and the former statement causes any
|
||
literal like \texttt{subtype(type("C", params(tph("a"))))} to imply the literal
|
||
\texttt{subtype(type("C", params(tph("a"))), type("D", params(tph("a"))))}.
|
||
|
||
The vital part is the generation of the cartesian product of the Or-Constraints.\\
|
||
{{\texttt{subcons(A,B); orCons(C,D) :- orCons(subcons(A,B), subcons(C,D)).}}}\\
|
||
The operator \textquotedbl{}\texttt{;}\textquotedbl{} tells ASP to consider either the left or the right side.
|
||
By supplying multiple Or-Constraints this way the ASP interpreter will consider all combinations, the cartesian product of the or constraints.
|
||
|
||
The rules implying a failure $\emptyset$ are translated by leaving the head of the asp statement empty.
|
||
If the body of such rules is satisfied the algorithm considers the deduction as incorrect.
|
||
% \section{Proofs}
|
||
% \begin{lemma}{Substitution:}
|
||
% For every $\tv{a} \doteq \type{T}$ and every constraint $c$,
|
||
% there also exists a constraint $[\type{T}/\tv{a}]c$.
|
||
|
||
% \noindent
|
||
% \normalfont
|
||
% This lemma proofs that equal constraints lead to a substitution in every other constraint.
|
||
% For exmaple $\tv{b} \doteq \type{String}$ and $\tv{b} \lessdot \exptype{Comparable}{\tv{b}}$
|
||
% lead to a constraint $\type{String} \lessdot \exptype{Comparable}{\type{String}}$
|
||
% \end{lemma}
|
||
% \textit{Proof:}
|
||
% %TODO
|
||
|
||
% \section{Termination}
|
||
|
||
% The amount of different constraints is limited by the maximum amount of encapsulated generics.
|
||
% The only part that is able to add an additional nesting is the Subst-Param rule.
|
||
% Here a type placeholder inside a type parameter list is replaced by another type which possibly adds
|
||
% another layer of nesting but it also removes one type placeholder.
|
||
|
||
% There must be one substitution that does not add another type placeholder.
|
||
% Otherwise there has to be a loop and this woul lead to an incorrect constraint set due to the Fail-Sigma rule.
|
||
|
||
% The Subst-Param rule can only be applied a finite number of times.
|
||
% Due to the substitution lemma and the Sigma-Fail rule the Subst-Param rule can only be applied once per type placeholder.
|
||
|
||
% The Subst-Param rule can only be applied once per type placeholder.
|
||
% If $\type{T} \doteq \exptype{C}{\tv{a}}$ is substituted to $\type{T} \doteq \exptype{C}{\type{N}}$
|
||
% then there is no $\tv{a} \in \type{N}$.
|
||
|
||
% \section{Completeness}
|
||
% To proof completeness we have to show that every type can be replaced by a placeholder in a correct constraint set.
|
||
|
||
% Completeness -> we never exclude a solution
|
||
% Following constraints stay: $\tv{a} \lessdot \type{T}$ if $\tv{a}$ is never on a right side of another constraint.
|
||
% Every other type placeholder will be reduced to $\tv{a} \doteq \type{T}$, if there is a solution.
|
||
% Proof:
|
||
% %Induction over every possible constraint variation:
|
||
% a =. T -> induction start
|
||
% a <. T -> if no other constraint then it can stay otherwise there is either a =. T or a <. T
|
||
% in latter case: a <. T, a <. T'
|
||
|
||
% Proof that every type can be replaced by a Type Placeholder.
|
||
% % Whats with a =. T, can T be replaced by a Type Placeholder?
|
||
% % What is our finish condition? a <. T constraints stay, a =. b constraints stay.
|
||
% % Algorithm does not fail -> \emptyset if a solution exists
|
||
% % Otherwise there exists a substitution. If the algorithm succeeds we have to pick one of the possible solutions
|
||
% % by: a <. T -> a =.T
|
||
% % a =. b, b =. T -> use the solution generation from other paper
|
||
% % TODO: try to include solution generation in the algorithm and proof that this solution is valid and will always occur as long as there is a solution
|
||
|
||
% Soundness -> we never make a wrong implication
|
||
% %$\tv{a} \doteq \type{T}$ means that $\[type{T}/\tv{a}]C$ is correct
|
||
% If it succeeds then we can substitute all $\tv{a} \doteq \type{T}$
|
||
% constraints in the original constraint set and
|
||
% there exists a typing for the remaining type placeholders
|
||
% so that the constraint set is satisfied.
|
||
|
||
% \begin{theorem}{Soundness}\label{lemma:soundness}
|
||
% if $\type{T} \lessdot \type{T'}$ and $\sigma(\tv{a}) = \type{N}$
|
||
% then $[\type{N}/\tv{a}]\type{T} <: [\type{N}/\tv{a}]\type{T'}$.
|
||
% \end{theorem}
|
||
|
||
% \SetEnumitemKey{ncases}{itemindent=!,before=\let\makelabel\ncasesmakelabel}
|
||
% \newcommand*\ncasesmakelabel[1]{Case #1}
|
||
|
||
% \newenvironment{subproof}
|
||
% {\def\proofname{Subproof}%
|
||
% \def\qedsymbol{$\triangleleft$}%
|
||
% \proof}
|
||
% {\endproof}
|
||
% Due to Match there must be $\type{N}_1 \lessdot \type{N}_2 \ldots \lessdot \type{N}_n$
|
||
% \begin{proof}
|
||
% \begin{enumerate}[ncases]
|
||
% \item $\tv{a} \lessdot \exptype{C}{\ol{T}}$.
|
||
% Solution-Sub
|
||
% Let $\sigma(\tv{a}) = \type{N}$. Then $\type{N} <: \exptype{C}{[\type{N}/\tv{a}]}$
|
||
% \item $\tv{a} \doteq \type{N}$.
|
||
% Solution
|
||
% \item $\tv{a} \lessdot \tv{b}$.
|
||
% There must be a $\tv{a} \lessdot \type{N}$
|
||
% \begin{subproof}
|
||
% $\sigma(\tv{a}) = \type{Object}$,
|
||
% $\sigma(\tv{b}) = \type{Object}$.
|
||
% \end{subproof}
|
||
% \item $\type{N} \lessdot \tv{a}$.
|
||
% \begin{subproof}
|
||
% $2$
|
||
% \end{subproof}
|
||
% \end{enumerate}
|
||
% And more text.
|
||
% \end{proof}
|
||
|
||
% \begin{lemma}{Substitution}
|
||
% \begin{description}
|
||
% \item[If] $\tv{a} \doteq \type{N}$ with $\tv{a} \neq \type{N}$
|
||
% \item[Then] for every $\type{T} \doteq \type{T}$ there exists a $[\type{N}/\tv{a}]\type{T} \doteq [\type{N}/\tv{a}]\type{T}$
|
||
% \item[Then] for every $\type{T} \lessdot \type{T}$ there exists a $[\type{N}/\tv{a}]\type{T} \lessdot [\type{N}/\tv{a}]\type{T}$
|
||
% \end{description}
|
||
% \end{lemma}
|
||
% \textit{Proof:}
|
||
% TODO
|
||
|
||
% \begin{lemma} \label{lemma:subtypeOnly}
|
||
% If $\sigma(\tv{a}) = \emptyset$ then $\tv{a}$ appears only on the left side of $\tv{a} \lessdot \type{T}$ constraints.
|
||
% \end{lemma}
|
||
% Proof:
|
||
% Every type placeholder gets a solution, because there must be atleast one $\tv{a} \lessdot \type{N}$ constraint.
|
||
% Then either the Solution-Sub generates a $\sigma$ or the Solution rule can be used TODO: Proof.
|
||
|
||
% The Solution-Sub rule is always correct.
|
||
|
||
% Proof:
|
||
|
||
% \begin{theorem}{Termination}
|
||
% %jede nichtendliche Menge von Constraints bleibt endlich. Die Regeln können nicht unendlich oft angewendet werden
|
||
% %Trivial. The only possibility would be if we allow a =. C<a> constraints!
|
||
% \end{theorem}
|
||
|
||
% TODO: For completeness we have to proof that not $\tv{a} \doteq \type{N}$ only is the case if $\tv{a}$ only appears on the left side of $\tv{a} \lessdot \type{T}$ constraints.
|
||
% Problem: a <. List<a>
|
||
% a <. List<b>
|
||
% then a =. b
|
||
% Solution: Keep the a <. N constraints and apply the Step 6 from the GTFGJ paper.
|
||
% Then we have to proof that only a <. N constraints remain with sigma(a) = empty. Or Fail
|
||
|
||
% \begin{theorem}{Completeness}
|
||
% $\forall \tv{a} \in C_{input}: \sigma(\tv{a}) = \type{N}$, if there is a solution for $C_{input}$
|
||
% and every type placeholder has an upper bound $\tv{a} \lessdot \type{N}$.
|
||
% \end{theorem}
|
||
% %Problem: We do not support multiple inheritance
|
||
% \begin{proof}
|
||
% \begin{enumerate}[ncases]
|
||
% \item $\tv{a} \lessdot \type{N}$.
|
||
% Solution-Sub
|
||
% \item $\tv{a} \doteq \type{N}$.
|
||
% Solution
|
||
% \item $\tv{a} \lessdot \tv{b}$.
|
||
% There must be a $\tv{a} \lessdot \type{N}$
|
||
% \begin{subproof}
|
||
% $\sigma(\tv{a}) = \type{Object}$,
|
||
% $\sigma(\tv{b}) = \type{Object}$.
|
||
% \end{subproof}
|
||
% \item $\type{N} \lessdot \tv{a}$.
|
||
% \begin{subproof}
|
||
% $2$
|
||
% \end{subproof}
|
||
% \end{enumerate}
|
||
% And more text.
|
||
% \end{proof}
|
||
|
||
|
||
%\section{Discussion}
|
||
% We cannot use Datalog, because it cannot solve NP-Hard problems.
|
||
% See: E. Dantsin, T. Eiter, G. Gottlob, and A. Voronkov. Complexity and Expressive Power of Logic Programming.
|
||
% ACM Computing Surveys, 33(3):374–425, 2001. Available at
|
||
% http://www.kr.tuwien.ac.at/staff/eiter/et-archive/
|
||
% Source: https://www.cs.ox.ac.uk/files/1018/gglecture7.pdf
|
||
|
||
|
||
\section{Outcome and Conclusion}
|
||
ASP handles Or-Constraints surprisingly well.
|
||
We tested our ASP implementation of the unification algorithm with the constraints originating from the program in
|
||
figure \ref{fig:benchmark}.
|
||
We compared it with the current implementation of the Unification algorithm in Java
|
||
\footnote{\url{https://gitea.hb.dhbw-stuttgart.de/JavaTX/JavaCompilerCore}}.
|
||
We increased the complexity by adding more stacked method calls up to ten stacked calls and the interpreter clingo \footnote{\url{https://potassco.org/clingo/}}
|
||
was still able to handle it easily and finish computation in under 50 milliseconds.
|
||
By contrast our Java implementation already takes multiple seconds processing time for the same input.
|
||
|
||
We are using this example for a fair comparision between the two implementations because it does not include generic type parameters causing the unification algorithm
|
||
not to consider wildcard types.
|
||
The Java implementation also supports Java wildcard types whereas the ASP implementation does not.
|
||
%It is only possible to implement a type inference algorithm for Java as long as we omit wildcard types.
|
||
It is unfortunately not possible to implement the unification algorithm including wildcard support with Answer Set Programming.
|
||
The reason is that subtype checking in Java is turing complete \cite{javaTuringComplete}.
|
||
The grounding process of the ASP interpreter clingo would not terminate if the problem is turing complete
|
||
\cite{kaufmann2016grounding}.
|
||
The Java implementation is still able to spawn atleast one solution in most cases and only never terminates for specific inputs,
|
||
whereas the ASP program never terminates as soon as wildcards are involved.
|
||
|
||
|
||
%\section{Future Work}
|
||
% Benchmarks
|
||
% Integrating the ASP Unify implementation into existing Java-TX Compiler
|
||
% Checking how many programs are abel to be build without wildcards
|
||
|
||
\bibliographystyle{eptcs}
|
||
\bibliography{martin}
|
||
|
||
\end{document}
|