WLP2024/aspUnify.tex

730 lines
29 KiB
TeX
Raw Normal View History

2024-06-08 11:49:49 +00:00
% This is samplepaper.tex, a sample chapter demonstrating the
% LLNCS macro package for Springer Computer Science proceedings;
% Version 2.21 of 2022/01/12
%
2024-06-28 11:54:11 +00:00
\documentclass[submission]{eptcs}
\providecommand{\event}{Symposium in honor of Peter Thiemann's 60th birthday} % Name of the event you are submitting to
\usepackage{underscore}
2024-06-08 11:49:49 +00:00
\usepackage[T1]{fontenc}
% T1 fonts will be used to generate the final print and online PDFs,
% so please use T1 fonts in your manuscript whenever possible.
% Other font encondings may result in incorrect characters.
%
\usepackage{graphicx}
% Used for displaying a sample figure. If possible, figure files should
% be included in EPS format.
%
% If you use the hyperref package, please uncomment the following two lines
% to display URLs in blue roman font according to Springer's eBook style:
%\usepackage{color}
%\renewcommand\UrlFont{\color{blue}\rmfamily}
%\urlstyle{rm}
%
2024-06-10 07:03:46 +00:00
\include{prolog}
\usepackage{mathpartir}
\usepackage{amsmath}
\usepackage{amssymb}
2024-06-28 11:54:11 +00:00
\usepackage{amsthm}
2024-06-11 10:00:41 +00:00
\usepackage{enumitem}
2024-06-30 22:12:38 +00:00
\usepackage{xcolor}
2024-06-11 10:00:41 +00:00
%\usepackage{amsthm}
2024-06-10 07:03:46 +00:00
2024-06-28 11:54:11 +00:00
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
2024-06-30 22:12:38 +00:00
\title{Global Type Inference for Java using Answer Set Programming}
2024-06-28 11:54:11 +00:00
\author{Andreas Stadelmeier
\institute{DHBW\\ Stuttgart}
\email{a.stadelmeier@hb.dhbw-stuttgart.de}
}
\newcommand{\authorrunning}{Andreas Stadelmeier}
2024-06-30 22:12:38 +00:00
\newcommand{\titlerunning}{Type Unification with ASP}
2024-06-28 11:54:11 +00:00
\hypersetup{
bookmarksnumbered,
pdftitle = {\titlerunning},
pdfauthor = {\authorrunning},
pdfsubject = {Global Type Inference for Java}, % Consider adding a more appropriate subject or description
pdfkeywords = {typeinference, java, sat solving, answer set programming} % Uncomment and enter keywords specific to your paper
}
\begin{document}
2024-06-08 11:49:49 +00:00
%
% First names are abbreviated in the running head.
% If there are more than two authors, 'et al.' is used.
%
%
\maketitle % typeset the header of the contribution
%
\begin{abstract}
2024-06-28 13:52:42 +00:00
Global type inference for Java is able to compute correct types for an input program
2024-06-30 22:12:38 +00:00
which has no type annotations at all, but turns out to be NP-hard.
Former implementations of the algorithm therefore struggled with bad runtime performance.
In this paper we translate the type inference algorithm for Featherweight Generic Java to an
2024-06-28 13:52:42 +00:00
Answer Set Program.
Answer Set Programming (ASP) promises to solve complex computational search problems
as long as they are finite domain.
2024-07-01 15:20:04 +00:00
This paper shows that it is possible to implement global type inference for Featherweight Generic Java (FGJ) as an ASP program.
Afterwards we compared the ASP implementation with our own implementation in Java
exposing promising results.
2024-06-30 22:12:38 +00:00
% We also show soundness, completeness and termination for our ASP implementation.
Another advantage is that the specification of the algorithm can be translated one on one to ASP code
leading to less errors in the respective implementation.
2024-07-01 15:20:04 +00:00
Unfortunately ASP can only be used for type inference for a subset of the Java type system that excludes wildcard types.
2024-06-08 11:49:49 +00:00
\end{abstract}
%
%
%
2024-07-01 15:20:04 +00:00
\section{Global Type Inference}
% Every major typed programming language uses some form of type inference.
% Rust, Java, C++, Haskell, etc... %(see: https://en.wikipedia.org/wiki/Type_inference)
% Type inference adds alot of value to a programming language.
% \begin{itemize}
% \item Code is more readable.
% \item Type inference usually does a better job at finding types than a programmer.
% \item Type inference can use types that are not denotable by the programmer.
% \item Better reusability.
% \item Allows for faster development and the programmer to focus on the actual task instead of struggling with the type system.
% \end{itemize}
% Java has adopted more and more type inference features over time.
% %TODO: list type inference additions
% Currently Java only has local type inference.
% We want to bring type inference for Java to the next level.
Java is a strictly typed programming language.
The current versions come with a local type inference feature that allows the programmer to omit
type annotations in some places like argument types of lambda expressions for example.
But methods still have to be fully typed including argument and return types.
We work at a global type inference algorithm for Java which is able to compute correct types
for a Java program with no type annotations at all.
\begin{figure}[h]
2024-06-30 22:12:38 +00:00
\begin{lstlisting}
class C1 {
C1 self(){
return this;
}
}
class C2 {
C2 self(){
return this;
}
}
class Example {
untypedMethod(var){
2024-07-01 15:20:04 +00:00
return var.self(); // (*@$\implies \set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$@*)
2024-06-30 22:12:38 +00:00
}
}
\end{lstlisting}
2024-07-01 15:20:04 +00:00
\caption{Java program missing type annotations}\label{fig:introExample}
\end{figure}
A possible input for our algorithm is shonw in figure \ref{fig:introExample}.
Here the method \texttt{untypedMethod} is missing its argument type for \texttt{var} and its return type.
Global type inference works in two steps.
First we generate a set of subtype constraints, which are later unified resulting in a type solution consisting of type unifiers $\sigma$.
Every missing type is replaced by a type placeholder.
In this example the type placeholder $\tv{var}$ is a placeholder for the type of the \texttt{var} variable.
The $\tv{ret}$ placeholder represents the return type of the \texttt{untypedMethod} method.
Also shown as a comment behind the respective method call.
There are two types of type constraints.
A subtype constraint like $\tv{var} \lessdot \type{C1}$ meaning that the unification algorithm has to find
a type replacement for $\tv{var}$ which is a subtype of $\type{C1}$.
Due to the Java type system being reflexive one possible solution would be $\sigma(\tv{var}) = \type{C1}$.
A constraint $\tv{var} \doteq \type{C1}$ directly results in $\sigma(\tv{ret}) = \type{C1}$.
Our type inference algorithms for Java are described in a formal manner in
\cite{MartinUnify}, \cite{TIforGFJ}.
The first prototype implementation put the focus on a correct implementation rather than
fast execution times.
Depending on the input it currently takes upto several minutes or even days to compute all or even one possible type solution.
To make them usable in practice we now focus on decreasing the runtime to a feasible level.
\section{Motivation}
The nature of the global type inference algorithm causes the search space of the unification algorithm to
increase exponentially with every ambiguous method call.
Java allows overloaded methods causing our type inference algorithm to create so called Or-Constraints.
This happens if multiple classes implement a method with the same name and the same amount of parameters.
Given the input program in figure \ref{fig:introExample} we do not know, which method \texttt{self} is meant for the method call
\texttt{var.self()}, because there is no type annotation for \texttt{var}.
%\begin{figure}[h]
%\begin{minipage}{0.49\textwidth}
2024-06-30 22:12:38 +00:00
% \end{minipage}%
% \hfill
% \begin{minipage}{0.49\textwidth}
% \begin{lstlisting}[style=constraints]
% (*@$\tv{var} \lessdot \type{C1} | $@*),
% (*@$\sigma(\tv{xp}) = \wctype{\wildcard{A}{\type{String}}{\bot}}{List}{\rwildcard{A}}$@*)
% \end{lstlisting}
% \end{minipage}
%\end{figure}
An Or-Constraint like $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$
means that either the the constraint set $\set{\tv{var} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }$
or $\set{\tv{var} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}}$ has to be satisfied.
If we set the type of \texttt{var} to \texttt{C1}, then the return type of the method will be \texttt{C1} aswell.
If we set it to \texttt{C2} then also the return type will be \texttt{C2}.
There are two possibilities therefore the Or-Constraint.
If we chain multiple overloaded method calls together we end up with multiple Or-Constraints.
The type unification algorithm \unify{} only sees the supplied constraints and has to potentially try all combinations of them.
This is proven in \cite{TIforGFJ} and is the reason why type inference for Featherweight Generic Java is NP-hard.
Let's have a look at the following alternation of our example:
2024-07-01 15:20:04 +00:00
\begin{figure}[h]
2024-06-30 22:12:38 +00:00
\begin{lstlisting}
untypedMethod(var){
return var.self().self().self().self();
}
\end{lstlisting}
2024-07-01 15:20:04 +00:00
\caption{Stacked ambiguous method calls}\label{fig:benchmark}
\end{figure}
2024-06-30 22:12:38 +00:00
Now there are four chained method calls leading to four Or-Constraints:
\begin{align*}
\set{\tv{var} \lessdot \type{C1}, \tv{r1} \doteq \type{C1} }\ ||\ \set{\tv{var} \lessdot \type{C2}, \tv{r1} \doteq \type{C2}} \\
\set{\tv{r1} \lessdot \type{C1}, \tv{r2} \doteq \type{C1} }\ ||\ \set{\tv{r1} \lessdot \type{C2}, \tv{r2} \doteq \type{C2}} \\
\set{\tv{r2} \lessdot \type{C1}, \tv{r3} \doteq \type{C1} }\ ||\ \set{\tv{r2} \lessdot \type{C2}, \tv{r3} \doteq \type{C2}} \\
\set{\tv{r3} \lessdot \type{C1}, \tv{ret} \doteq \type{C1} }\ ||\ \set{\tv{r3} \lessdot \type{C2}, \tv{ret} \doteq \type{C2}} \\
\end{align*}
The placeholder $\tv{r1}$ stands for the return type of the first call to \texttt{self} and
$\tv{r2}$ for the return type of the second call and so on.
It is obvious that this constraint set only has two solutions.
The variable \texttt{var} and the return type of the method aswell as all intermediate placeholders $\tv{r1}-\tv{r3}$
get either the type \texttt{C1} or \texttt{C2}.
A first prototype implementation of the \unify{} algorithm simply created the cartesian product of all Or-Constraints,
16 possibilities in this example, and iteratively tried all of them.
This leads to a exponential runtime increase with every added overloaded method call.
Eventhough the current algorithm is equipped with various optimizations as presented in \cite{plue181} and \cite{plue231}
there are still runtime issues when dealing with many Or-Constraints.
2024-06-30 22:12:38 +00:00
Our global type inference algorithm should construct type solutions in real time.
Then it can effectively used as a Java compiler which can deal with large inputs of untyped Java code.
Another use case scenario is as an editor plugin which helps a Java programmer by enabling him to write untyped Java code
and letting our tool insert the missing type statements.
For both applications a short execution time is vital.
Answer Set programming promises to solve complex search problems in a highly optimized way.
The idea in this paper is to implement the algorithm presented in \cite{TIforGFJ}
as an ASP program and see how well it handles our type unification problem.
2024-07-01 15:20:04 +00:00
\section{ASP Implementation}
Our ASP implementation replaces the unification step of the type inference algorithm.
So the input is a set of constraints $c$ and the output is a set of unifiers $\sigma$.
An ASP program consists of implication rules and facts.
Here the facts are the input constraints and the rules are the ones depicted in chapter \ref{sec:implicationRules}.
After running the resulting program through an interpreter the output
holds the desired $\sigma(\tv{a}) = \type{T}$ literals.
\begin{figure}[h]
\center
\begin{tabular}{lcll}
$c $ &$::=$ & $\type{T} \lessdot \type{T} \mid \type{T} \doteq \type{T}$ & Constraint \\
$\type{T} $ & $::=$ & $\tv{a} \mid \type{N}$ & Type placeholder or Type \\
$\ntype{N}$ & $::=$ & $\exptype{C}{\ol{T}}$ & Class Type containing type placeholders\\
$\gtype{G}$ & $::=$ & $\exptype{C}{\ol{G}}$ & Class Type not containing type placeholders \\
\end{tabular}
\caption{Syntax of types and constraints}
\label{fig:syntax-constraints}
\end{figure}
2024-06-30 22:12:38 +00:00
2024-07-01 15:20:04 +00:00
%\subsection{Subtyping}
\begin{figure}[h]
2024-06-11 09:49:50 +00:00
\begin{mathpar}
2024-07-01 10:32:56 +00:00
\inferrule[S-Refl]{}{
2024-06-11 09:49:50 +00:00
\type{T} <: \type{T}
}
\and
2024-07-01 10:32:56 +00:00
\inferrule[S-Trans]{\type{T}_1 <: \type{T}_2 \\ \type{T}_2 <: \type{T}_3}{
2024-06-11 09:49:50 +00:00
\type{T}_1 <: \type{T}_3
}
\and
2024-07-01 10:32:56 +00:00
\inferrule[S-Var]{}{\type{A} <: \Delta(\type{A})}
\and
2024-07-01 10:32:56 +00:00
\inferrule[S-Class]{\texttt{class}\ \exptype{C}{\ol{X}} \triangleleft \type{N}}{
2024-06-11 09:49:50 +00:00
\exptype{C}{\ol{T}} <: [\ol{T}/\ol{X}]\type{N}
}
\end{mathpar}
\begin{mathpar}
\inferrule[N-Refl]{}{
\type{C} << \type{C}
}
\and
2024-06-19 21:16:12 +00:00
\inferrule[N-Trans]{\type{C}_1 << \type{C}_2 \\ \type{C}_2 << \type{C}_3}{
\type{C}_1 << \type{C}_3
}
\and
\inferrule[N-Class]{\texttt{class}\ \exptype{C}{\ldots} \triangleleft \exptype{D}{\ldots}}{
\type{C} << \type{D}
}
\end{mathpar}
2024-07-01 15:20:04 +00:00
\caption{Subtyping}\label{fig:subtyping}
\end{figure}
2024-06-30 22:12:38 +00:00
%Subtyping has no bounds for generic type parameters.
% but this is propably not needed
2024-07-01 15:20:04 +00:00
\subsection{Unification Algorithm}\label{sec:implicationRules}
%The $\tv{a} \lessdot \type{Object}$ rule has to be ensured by the incoming constraints.
%The need to have an upper bound to every type placeholder.
2024-07-01 15:20:04 +00:00
All the implication rules depicted in this chapter can be translated to ASP statements
described in chapter \ref{sec:translation}.
2024-06-28 13:52:42 +00:00
\label{sec:implicationRules}
2024-06-10 07:03:46 +00:00
\begin{mathpar}
\inferrule[Subst-L]{
2024-06-30 22:12:38 +00:00
\tv{a} \doteq \type{T}_1 \\
\tv{a} \lessdot \type{T}_2
2024-06-10 07:03:46 +00:00
}{
\type{T}_1 \lessdot \type{T}_2
2024-06-10 07:03:46 +00:00
}
\and
\inferrule[Subst-R]{
2024-06-30 22:12:38 +00:00
\tv{a} \doteq \type{T}_1 \\
\type{T}_2 \lessdot \tv{a}
2024-06-10 07:03:46 +00:00
}{
\type{T}_2 \lessdot \type{T}_1
2024-06-10 07:03:46 +00:00
}
\and
\inferrule[Subst-Equal]{
2024-06-30 22:12:38 +00:00
\tv{a} \doteq \type{T}_1 \\
\tv{a} \doteq \type{T}_2
2024-06-10 07:03:46 +00:00
}{
\type{T}_1 \doteq \type{T}_2
2024-06-10 07:03:46 +00:00
}
\and
\inferrule[Swap]{
\type{T}_1 \doteq \type{T}_2
}{
\type{T}_2 \doteq \type{T}_1
}
\and
\inferrule[Unfold]{
\tv{b} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n}
}{
\type{T}_i \doteq \type{T}_i
}
\and
\inferrule[Subst-Param]{
2024-08-30 07:08:18 +00:00
\type{T}' \doteq \type{S} \\
\type{T} \doteq \exptype{C}{\type{T}_1 \ldots, \type{T}', \ldots \type{T}_n} \\
}{
\type{T} \doteq \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
}
\and
2024-06-30 22:12:38 +00:00
\inferrule[Subst-Param']{
2024-08-30 07:08:18 +00:00
\type{T}' \doteq \type{S} \\
\tv{b} \lessdot \exptype{C}{\type{T}_1 \ldots, \type{T}', \ldots \type{T}_n} \\
2024-06-30 22:12:38 +00:00
}{
\tv{b} \lessdot \exptype{C}{\type{T}_1, \ldots \type{S}, \ldots \type{T}_n}
}
\and
2024-06-12 12:37:35 +00:00
\inferrule[S-Object]{}{\tv{a} \lessdot \type{Object}}
\and
\inferrule[Match]{
\tv{a} \lessdot \type{N}_1 \\
\tv{a} \lessdot \type{N}_2 \\
\type{N}_1 << \type{N}_2
2024-06-10 07:03:46 +00:00
}{
2024-08-27 15:47:11 +00:00
\type{N}_1 \lessdot \type{N}_2
2024-06-10 07:03:46 +00:00
}
\and
\inferrule[Adopt]{
\tv{a} \lessdot \tv{b} \\
\tv{b} \lessdot \type{T}
}{
\tv{a} \lessdot \type{T}
}
\and
2024-06-12 12:37:35 +00:00
% \inferrule[Subst-Param]{
% \tv{a} \doteq \type{N} \\
% \tv{a} = \type{T}_i \\
% \exptype{C}{\type{T}_1 \ldots \type{T}_n} \lessdot \type{T} \\
% }{
% \type{T}_i \doteq \type{N} \\
% }
\and
\inferrule[Adapt]{
\type{N}_1 \lessdot \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
\type{N}_1 <: \exptype{C}{\type{S}_1 \ldots \type{S}_n} \\
}{
\exptype{C}{\type{S}_1 \ldots \type{S}_n} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
}
\and
\inferrule[Reduce]{
\exptype{C}{\type{S}_1 \ldots \type{S}_n} \doteq \exptype{C}{\type{T}_1 \ldots \type{T}_n} \\
}{
\type{S}_i \doteq \type{T}_i \\
}
\end{mathpar}
\begin{mathpar}
\text{Apply only once per constraint:}\quad
\inferrule[Super]{
2024-06-30 22:12:38 +00:00
\type{N} \lessdot \tv{a}\\
\type{N} <: \type{N}'
}{
2024-06-30 22:12:38 +00:00
\tv{a} \doteq \type{N}'
}
\end{mathpar}
\begin{center}
Apply one or the other:
\end{center}
\begin{mathpar}
\inferrule[Split-L]{
\tv{a} \lessdot \tv{b}\\
\tv{a} \lessdot \type{N}\\
}{
\tv{b} \lessdot \type{N}
}
\quad \quad
\vline
\quad \quad
\inferrule[Split-R]{
\tv{a} \lessdot \tv{b}\\
\tv{a} \lessdot \type{N}\\
}{
\type{N} \lessdot \tv{b}
}
\end{mathpar}
2024-06-12 12:37:35 +00:00
Result:
\begin{mathpar}
\inferrule[Solution]{
\tv{a} \doteq \type{G}
2024-06-12 12:37:35 +00:00
}{
2024-06-21 01:32:11 +00:00
\sigma(\tv{a}) = \type{G}
2024-06-12 12:37:35 +00:00
}
2024-06-30 22:12:38 +00:00
\and
\inferrule[Solution-Gen]{
\tv{a} \lessdot \type{C}_1, \ldots, \tv{a} \lessdot \type{C}_n \\
\forall i: \type{C}_m << \type{C}_i \\
}{
\tv{a} \doteq \type{C}_m
}
2024-06-19 21:16:12 +00:00
% \and
2024-06-30 22:12:38 +00:00
% \inferrule[Solution-Gen]{
% \tv{a} \lessdot \type{C}\\
% \sigma(\tv{a}) = \emptyset
% }{
2024-06-30 22:12:38 +00:00
% \tv{a} \doteq \type{A} \\ \sigma'(\tv{a}) = \type{A}
% }
2024-06-19 21:16:12 +00:00
% \and
% \inferrule[Solution-Gen]{
2024-06-30 22:12:38 +00:00
% \tv{a} \lessdot \type{G} \\
2024-06-19 21:16:12 +00:00
% \tv{a} \lessdot \type{G}_1, \ldots, \tv{a} \lessdot \type{G}_n \\
2024-06-30 22:12:38 +00:00
% \forall i: \type{G} << \type{G}_i \\
% \sigma'(\tv{a}) = \type{A}
2024-06-19 21:16:12 +00:00
% }{
2024-06-30 22:12:38 +00:00
% \Delta(\type{A}) = \type{G}
2024-06-19 21:16:12 +00:00
% }
2024-06-12 12:37:35 +00:00
\end{mathpar}
Fail:
\begin{mathpar}
% \inferrule[Fail]{
% \type{T} \lessdot \type{N}\\
% \type{T} \nless : \type{N}
% }{
% \emptyset
% }
% \and
\inferrule[Fail]{
\exptype{C}{\ldots} \doteq \exptype{D}{\ldots}\\
\type{C} \neq \type{D}
}{
\emptyset
}
\and
2024-06-30 22:12:38 +00:00
\inferrule[Fail-Generic]{
\type{X} \doteq \type{T}\\
\type{X} \neq \type{T}
}{
\emptyset
}
\and
2024-06-11 14:29:34 +00:00
\inferrule[Fail-Sigma]{
2024-06-30 22:12:38 +00:00
\tv{a} \doteq \type{N} \\
2024-06-11 14:29:34 +00:00
\tv{a} \in \type{N}
}{
\emptyset
}
\and
\inferrule[Fail]{
\tv{a} \lessdot \type{N}_1 \\
\tv{a} \lessdot \type{N}_2 \\
\text{not}\ \type{N}_1 << \type{N}_2 \\
\text{not}\ \type{N}_2 << \type{N}_1
}{
\emptyset
}
2024-06-10 07:03:46 +00:00
\end{mathpar}
2024-06-11 14:29:34 +00:00
2024-06-10 07:03:46 +00:00
% Subst
% a =. N, a <. T, N <: T
% --------------
% N <. T
2024-06-12 12:37:35 +00:00
% a <. List<b>, b <. List<a>
2024-06-10 07:03:46 +00:00
% how to proof completeness and termination?
% TODO: how to proof termination?
2024-06-12 12:37:35 +00:00
The algorithm terminates if every type placeholder in the input constraint set has an assigned type.
2024-07-01 15:20:04 +00:00
\section{ASP Encoding}\label{sec:translation}
Answer Set Programming is a declarative way of formulating search problems.
ASP statements consist of a head and a body separated by a implication operator \textquotedbl{}\texttt{:-}\textquotedbl{}:
\texttt{head :- body}.
This statement is interpreted as an implication rule.
If the premises in the body are satisfied the facts stated in the head are deducted.
2024-07-01 15:20:04 +00:00
\newcommand{\aspify}{\nabla}
The implication rules defined in chapter \ref{sec:implicationRules} are formulated in a way that can be translated to an ASP program
2024-07-01 15:20:04 +00:00
(see figure \ref{fig:aspTransformationRules}).
2024-06-28 13:52:42 +00:00
2024-07-01 15:20:04 +00:00
% \begin{tabular}{l | r}
% $\tv{a}$ & \texttt{tph("a")}\\
% $\exptype{C}{}$ & \texttt{type("C", null)}\\
% $\exptype{C}{\ol{X}}$ & \texttt{type("C", params(\ldots))}\\
% \end{tabular}
2024-06-30 22:12:38 +00:00
2024-07-01 15:20:04 +00:00
\begin{figure}[h]
2024-06-30 22:12:38 +00:00
\begin{align*}
\aspify(\tv{a}) =& \texttt{tph("a")}\\
\exptype{C}{} =& \texttt{type("C", null)}\\
\exptype{C}{\type{T}_1, \type{T}_2, \ldots} =& \texttt{type("C", params(}\aspify(\type{T}_1), \aspify(\type{T}_1), \ldots\texttt{))}\\
\aspify(\type{T} \doteq \type{T}') =& \texttt{equalcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
\aspify(\type{T} \lessdot \type{T}') =& \texttt{subcons}(\aspify(\type{T}), \aspify(\type{T}')) \\
\aspify{} \left(\inferrule{
c_1, c_2
}{
c
}\right) =& \aspify(c)\ \texttt{:-} \aspify(c_1), \aspify(c_2) \\
c_1 \ || \ c_2 =& \texttt{orcons}(\aspify(c_1), \aspify(c_2))
2024-06-30 22:12:38 +00:00
\end{align*}
2024-07-01 15:20:04 +00:00
\caption{Transformation to ASP code}\label{fig:aspTransformationRules}
\end{figure}
%The $\aspify{}$ function converts our formal specification of the algorithm to ASP code.
2024-07-01 15:20:04 +00:00
This also has to be done for the subtype rules in figure \ref{fig:subtyping}.
There the S-Class rule needs special handling because it contains a substitution.
It has to be encoded using variables.
2024-06-30 22:12:38 +00:00
Given a extends relation $\texttt{class}\ \exptype{C}{\type{X}} \triangleleft \exptype{D}{\type{X}}$
we generate the ASP code:\\
{\small{\texttt{subtype(type("C", params(X)), type("D", params(X))) :- subtype(type("C", params(X))).}}}
2024-06-29 17:46:03 +00:00
2024-06-30 22:12:38 +00:00
Capital letters like \texttt{X} are variables in ASP and the former statement causes any
2024-07-01 15:20:04 +00:00
literal like \texttt{subtype(type("C", params(tph("a"))))} to imply the literal
2024-06-30 22:12:38 +00:00
\texttt{subtype(type("C", params(tph("a"))), type("D", params(tph("a"))))}.
2024-06-29 17:46:03 +00:00
The vital part is the generation of the cartesian product of the Or-Constraints.\\
{{\texttt{subcons(A,B); orCons(C,D) :- orCons(subcons(A,B), subcons(C,D)).}}}\\
The operator \textquotedbl{}\texttt{;}\textquotedbl{} tells ASP to consider either the left or the right side.
By supplying multiple Or-Constraints this way the ASP interpreter will consider all combinations, the cartesian product of the or constraints.
2024-06-30 22:12:38 +00:00
The rules implying a failure $\emptyset$ are translated by leaving the head of the asp statement empty.
If the body of such rules is satisfied the algorithm considers the deduction as incorrect.
2024-07-01 15:20:04 +00:00
% \section{Proofs}
% \begin{lemma}{Substitution:}
% For every $\tv{a} \doteq \type{T}$ and every constraint $c$,
% there also exists a constraint $[\type{T}/\tv{a}]c$.
% \noindent
% \normalfont
% This lemma proofs that equal constraints lead to a substitution in every other constraint.
% For exmaple $\tv{b} \doteq \type{String}$ and $\tv{b} \lessdot \exptype{Comparable}{\tv{b}}$
% lead to a constraint $\type{String} \lessdot \exptype{Comparable}{\type{String}}$
% \end{lemma}
% \textit{Proof:}
% %TODO
% \section{Termination}
% The amount of different constraints is limited by the maximum amount of encapsulated generics.
% The only part that is able to add an additional nesting is the Subst-Param rule.
% Here a type placeholder inside a type parameter list is replaced by another type which possibly adds
% another layer of nesting but it also removes one type placeholder.
% There must be one substitution that does not add another type placeholder.
% Otherwise there has to be a loop and this woul lead to an incorrect constraint set due to the Fail-Sigma rule.
% The Subst-Param rule can only be applied a finite number of times.
% Due to the substitution lemma and the Sigma-Fail rule the Subst-Param rule can only be applied once per type placeholder.
% The Subst-Param rule can only be applied once per type placeholder.
% If $\type{T} \doteq \exptype{C}{\tv{a}}$ is substituted to $\type{T} \doteq \exptype{C}{\type{N}}$
% then there is no $\tv{a} \in \type{N}$.
% \section{Completeness}
% To proof completeness we have to show that every type can be replaced by a placeholder in a correct constraint set.
% Completeness -> we never exclude a solution
% Following constraints stay: $\tv{a} \lessdot \type{T}$ if $\tv{a}$ is never on a right side of another constraint.
% Every other type placeholder will be reduced to $\tv{a} \doteq \type{T}$, if there is a solution.
% Proof:
% %Induction over every possible constraint variation:
% a =. T -> induction start
% a <. T -> if no other constraint then it can stay otherwise there is either a =. T or a <. T
% in latter case: a <. T, a <. T'
% Proof that every type can be replaced by a Type Placeholder.
% % Whats with a =. T, can T be replaced by a Type Placeholder?
% % What is our finish condition? a <. T constraints stay, a =. b constraints stay.
% % Algorithm does not fail -> \emptyset if a solution exists
% % Otherwise there exists a substitution. If the algorithm succeeds we have to pick one of the possible solutions
% % by: a <. T -> a =.T
% % a =. b, b =. T -> use the solution generation from other paper
% % TODO: try to include solution generation in the algorithm and proof that this solution is valid and will always occur as long as there is a solution
% Soundness -> we never make a wrong implication
% %$\tv{a} \doteq \type{T}$ means that $\[type{T}/\tv{a}]C$ is correct
% If it succeeds then we can substitute all $\tv{a} \doteq \type{T}$
% constraints in the original constraint set and
% there exists a typing for the remaining type placeholders
% so that the constraint set is satisfied.
% \begin{theorem}{Soundness}\label{lemma:soundness}
% if $\type{T} \lessdot \type{T'}$ and $\sigma(\tv{a}) = \type{N}$
% then $[\type{N}/\tv{a}]\type{T} <: [\type{N}/\tv{a}]\type{T'}$.
% \end{theorem}
% \SetEnumitemKey{ncases}{itemindent=!,before=\let\makelabel\ncasesmakelabel}
% \newcommand*\ncasesmakelabel[1]{Case #1}
% \newenvironment{subproof}
% {\def\proofname{Subproof}%
% \def\qedsymbol{$\triangleleft$}%
% \proof}
% {\endproof}
% Due to Match there must be $\type{N}_1 \lessdot \type{N}_2 \ldots \lessdot \type{N}_n$
% \begin{proof}
% \begin{enumerate}[ncases]
% \item $\tv{a} \lessdot \exptype{C}{\ol{T}}$.
% Solution-Sub
% Let $\sigma(\tv{a}) = \type{N}$. Then $\type{N} <: \exptype{C}{[\type{N}/\tv{a}]}$
% \item $\tv{a} \doteq \type{N}$.
% Solution
% \item $\tv{a} \lessdot \tv{b}$.
% There must be a $\tv{a} \lessdot \type{N}$
% \begin{subproof}
% $\sigma(\tv{a}) = \type{Object}$,
% $\sigma(\tv{b}) = \type{Object}$.
% \end{subproof}
% \item $\type{N} \lessdot \tv{a}$.
% \begin{subproof}
% $2$
% \end{subproof}
% \end{enumerate}
% And more text.
% \end{proof}
% \begin{lemma}{Substitution}
% \begin{description}
% \item[If] $\tv{a} \doteq \type{N}$ with $\tv{a} \neq \type{N}$
% \item[Then] for every $\type{T} \doteq \type{T}$ there exists a $[\type{N}/\tv{a}]\type{T} \doteq [\type{N}/\tv{a}]\type{T}$
% \item[Then] for every $\type{T} \lessdot \type{T}$ there exists a $[\type{N}/\tv{a}]\type{T} \lessdot [\type{N}/\tv{a}]\type{T}$
% \end{description}
% \end{lemma}
% \textit{Proof:}
% TODO
% \begin{lemma} \label{lemma:subtypeOnly}
% If $\sigma(\tv{a}) = \emptyset$ then $\tv{a}$ appears only on the left side of $\tv{a} \lessdot \type{T}$ constraints.
% \end{lemma}
% Proof:
% Every type placeholder gets a solution, because there must be atleast one $\tv{a} \lessdot \type{N}$ constraint.
% Then either the Solution-Sub generates a $\sigma$ or the Solution rule can be used TODO: Proof.
% The Solution-Sub rule is always correct.
% Proof:
% \begin{theorem}{Termination}
% %jede nichtendliche Menge von Constraints bleibt endlich. Die Regeln können nicht unendlich oft angewendet werden
% %Trivial. The only possibility would be if we allow a =. C<a> constraints!
% \end{theorem}
% TODO: For completeness we have to proof that not $\tv{a} \doteq \type{N}$ only is the case if $\tv{a}$ only appears on the left side of $\tv{a} \lessdot \type{T}$ constraints.
% Problem: a <. List<a>
% a <. List<b>
% then a =. b
% Solution: Keep the a <. N constraints and apply the Step 6 from the GTFGJ paper.
% Then we have to proof that only a <. N constraints remain with sigma(a) = empty. Or Fail
% \begin{theorem}{Completeness}
% $\forall \tv{a} \in C_{input}: \sigma(\tv{a}) = \type{N}$, if there is a solution for $C_{input}$
% and every type placeholder has an upper bound $\tv{a} \lessdot \type{N}$.
% \end{theorem}
% %Problem: We do not support multiple inheritance
% \begin{proof}
% \begin{enumerate}[ncases]
% \item $\tv{a} \lessdot \type{N}$.
% Solution-Sub
% \item $\tv{a} \doteq \type{N}$.
% Solution
% \item $\tv{a} \lessdot \tv{b}$.
% There must be a $\tv{a} \lessdot \type{N}$
% \begin{subproof}
% $\sigma(\tv{a}) = \type{Object}$,
% $\sigma(\tv{b}) = \type{Object}$.
% \end{subproof}
% \item $\type{N} \lessdot \tv{a}$.
% \begin{subproof}
% $2$
% \end{subproof}
% \end{enumerate}
% And more text.
% \end{proof}
%\section{Discussion}
2024-07-01 10:32:56 +00:00
% We cannot use Datalog, because it cannot solve NP-Hard problems.
% See: E. Dantsin, T. Eiter, G. Gottlob, and A. Voronkov. Complexity and Expressive Power of Logic Programming.
% ACM Computing Surveys, 33(3):374425, 2001. Available at
% http://www.kr.tuwien.ac.at/staff/eiter/et-archive/
% Source: https://www.cs.ox.ac.uk/files/1018/gglecture7.pdf
\section{Outcome and Conclusion}
ASP handles Or-Constraints surprisingly well.
2024-07-01 15:20:04 +00:00
We tested our ASP implementation of the unification algorithm with the constraints originating from the program in
figure \ref{fig:benchmark}.
We compared it with the current implementation of the Unification algorithm in Java
\footnote{\url{https://gitea.hb.dhbw-stuttgart.de/JavaTX/JavaCompilerCore}}.
We increased the complexity by adding more stacked method calls up to ten stacked calls and the interpreter clingo \footnote{\url{https://potassco.org/clingo/}}
was still able to handle it easily and finish computation in under 50 milliseconds.
By contrast our Java implementation already takes multiple seconds processing time for the same input.
We are using this example for a fair comparision between the two implementations because it does not include generic type parameters causing the unification algorithm
not to consider wildcard types.
The Java implementation also supports Java wildcard types whereas the ASP implementation does not.
%It is only possible to implement a type inference algorithm for Java as long as we omit wildcard types.
It is unfortunately not possible to implement the unification algorithm including wildcard support with Answer Set Programming.
The reason is that subtype checking in Java is turing complete \cite{javaTuringComplete}.
The grounding process of the ASP interpreter clingo would not terminate if the problem is turing complete
\cite{kaufmann2016grounding}.
The Java implementation is still able to spawn atleast one solution in most cases and only never terminates for specific inputs,
whereas the ASP program never terminates as soon as wildcards are involved.
2024-07-01 10:32:56 +00:00
2024-07-01 15:20:04 +00:00
%\section{Future Work}
2024-07-01 10:32:56 +00:00
% Benchmarks
% Integrating the ASP Unify implementation into existing Java-TX Compiler
% Checking how many programs are abel to be build without wildcards
2024-06-28 11:54:11 +00:00
\bibliographystyle{eptcs}
\bibliography{martin}
2024-06-08 11:49:49 +00:00
\end{document}