135 Commits

Author SHA1 Message Date
ab7077d8f0 add public private to typecheck 2024-07-04 16:59:52 +02:00
53061fb73d add public private to parser 2024-07-04 16:59:37 +02:00
bac2a534b6 adjust AST for Modifier 2024-07-04 16:03:38 +02:00
574628d3f7 parser add optional parameters 2024-07-04 14:21:37 +02:00
30288731d3 mmooore Tests 2024-07-03 16:32:54 +02:00
f7d135cdd6 update pdf 2024-07-01 17:43:21 +02:00
c05aa302a4 extend parser doc 2024-07-01 17:41:47 +02:00
36d9b74df4 Merge remote-tracking branch 'origin/master' into create-parser 2024-07-01 17:41:27 +02:00
1348cea93f Merge pull request 'bytecode' (#9) from bytecode into master
Reviewed-on: #9
2024-07-01 15:05:30 +00:00
mrab
d648e5dd05 Merge branch 'master' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-07-01 17:05:07 +02:00
2b45d66ac1 extend doku 2024-07-01 17:00:37 +02:00
Matthias Raba
fa285a4af2 documentation goes LaTex 2024-06-28 11:01:17 +02:00
Matthias Raba
0e1f31080e fixed method/constructor overloading in bytecode generator 2024-06-28 09:20:03 +02:00
Matthias Raba
dee1fcb2df Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-28 08:51:19 +02:00
c7e72dbde3 move defaultconstructor injection to the beginning befor typechecking the classes 2024-06-28 08:51:35 +02:00
9657731a93 remove example.hs because its deprecated 2024-06-28 08:44:22 +02:00
b5efc76c17 add default constructor if none are present 2024-06-28 08:44:04 +02:00
Matthias Raba
25dd0802ad Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-28 08:44:03 +02:00
Matthias Raba
0a53ea14cf Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-28 07:37:02 +02:00
Matthias Raba
4c5dbd16f9 Merge branch 'create-parser' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-28 07:36:52 +02:00
d0bf34d331 update documentation 2024-06-27 19:02:19 +02:00
fe4ef2614f add new constructortypecheck with overloading 2024-06-27 18:53:01 +02:00
2154f8fd62 add method overloading 2024-06-27 18:52:18 +02:00
504e26dcdd fix objects not comparable to null 2024-06-27 18:08:36 +02:00
f09e6ad09e adjust parser for constructordeclaration 2024-06-27 08:33:41 +02:00
3e18efc097 add constructodeclarations to class in AST 2024-06-26 18:09:00 +02:00
mrab
cb462b5e75 added constructordeclaration 2024-06-26 17:58:52 +02:00
mrab
3275b045cb added list of language features 2024-06-26 17:53:37 +02:00
ae45251189 Merge pull request 'create-parser' (#8) from create-parser into master
Reviewed-on: #8
2024-06-26 15:13:42 +00:00
94ed7b5056 Merge pull request 'bytecode' (#7) from bytecode into master
Reviewed-on: #7
2024-06-26 15:13:23 +00:00
3d352035ee doku 2024-06-26 17:10:38 +02:00
mrab
6b4b9b496d fixed local variable declaration being ignore inside for loop 2024-06-26 09:42:07 +02:00
mrab
c8b3caa2af Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-26 09:22:49 +02:00
05a6de4a0d Merge remote-tracking branch 'origin/master' into create-parser 2024-06-26 09:18:11 +02:00
1eaeffb9a4 parser add for statements 2024-06-26 09:17:21 +02:00
Matthias Raba
87f629282f fixed order of compound AST types 2024-06-24 11:30:29 +02:00
Matthias Raba
6346cb237b removed redundant definitions for opcodeEncodingLength 2024-06-24 11:20:24 +02:00
Matthias Raba
74f52c3c35 fixed small typo in documentation 2024-06-24 07:47:43 +02:00
Matthias Raba
711620bdd9 Merge branch 'bytecode' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-24 07:44:42 +02:00
Matthias Raba
946a1f374c fixed stack op depth for getfield 2024-06-24 07:44:04 +02:00
9ecadc8946 updated docs again 2024-06-23 17:39:40 +02:00
09dd11bb34 updated docs 2024-06-23 17:38:52 +02:00
86aa87515b Merge pull request 'bytecode' (#6) from bytecode into master
Reviewed-on: #6
2024-06-21 07:06:07 +00:00
Matthias Raba
98735fd6ba updated bytecode.md 2024-06-21 09:03:47 +02:00
Matthias Raba
8eb9c16c7a typos, formatting, comments 2024-06-21 08:49:55 +02:00
Matthias Raba
4435f7aac8 Changed internal type of boolean, because it's obviously Z not B haha 2024-06-21 08:07:20 +02:00
Matthias Raba
79ddafbf9a Fixed class offset in constant pool when newly inserted 2024-06-21 07:37:25 +02:00
Matthias Raba
29faab5112 maxStack calculation 2024-06-20 15:07:02 +02:00
Matthias Raba
8c508e6d32 Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-20 12:35:34 +02:00
faf3d1674e fix constructor not handling nullliterals correctly 2024-06-20 11:52:35 +02:00
bcbec9209a fix nullliterals having the type null instead of their corresponding object types 2024-06-20 11:26:38 +02:00
Matthias Raba
6547ad04f5 Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-20 08:40:38 +02:00
8a6dca4e36 fix null not accepted for object method parameter 2024-06-20 08:38:56 +02:00
Matthias Raba
ee302bb245 maxLocals calculation 2024-06-18 07:37:01 +02:00
361643a85a Merge pull request 'bytecode' (#5) from bytecode into master
Reviewed-on: #5
2024-06-17 17:29:59 +00:00
mrab
d2554c9b22 updated recursive test using this. syntax 2024-06-17 19:25:24 +02:00
mrab
5269971334 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-14 16:05:46 +02:00
e4693729dc fix assignment with this 2024-06-14 15:12:59 +02:00
Matthias Raba
d1bef2193e adjusted usage example in README 2024-06-14 10:00:47 +02:00
561a0b37d8 Merge pull request 'documentation' (#4) from documentation into master
Reviewed-on: #4
2024-06-14 07:59:04 +00:00
Matthias Raba
6e400ebb9d Merge branch 'master' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into documentation 2024-06-14 09:57:48 +02:00
dc2f845049 Merge pull request 'bytecode' (#3) from bytecode into master
Reviewed-on: #3
2024-06-14 07:56:36 +00:00
36c48d013a add typecheck documentation 2024-06-14 09:56:33 +02:00
Matthias Raba
07fcda5827 Merge branch 'master' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-14 09:54:47 +02:00
b1735c6300 Merge pull request 'Add initial typechecker for AST' (#2) from typedAST into master
Reviewed-on: #2
2024-06-14 07:53:29 +00:00
Matthias Raba
fa9f8c3425 added Bytecode documentation 2024-06-14 09:49:27 +02:00
Matthias Raba
2c928ad69b renamed Generator -> Builder 2024-06-14 08:54:12 +02:00
Matthias Raba
b47da4633d refactored assemblers & builders to individual files 2024-06-14 08:47:45 +02:00
Matthias Raba
3fc804e899 malicious tests 2024-06-14 08:11:58 +02:00
Matthias Raba
807aea112e added Test classes, fixed assignment dup missing 2024-06-14 07:57:38 +02:00
mrab
9e43b015b7 injecting initializers into all constructors, multiple classes per file supported 2024-06-13 22:25:35 +02:00
mrab
79a989eecf Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-13 22:22:52 +02:00
f02226bca8 add missing typeCheckVariableDeclaration 2024-06-13 22:06:10 +02:00
mrab
3acbce8afc fixed invalid dup depth for postinc/dec 2024-06-13 21:02:00 +02:00
mrab
44c6d74afb Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-13 20:51:19 +02:00
mrab
3f6eb68e91 implemented arbitrarily nested increment operators 2024-06-13 20:51:10 +02:00
7e13b3fac3 fix variable redefinition in scope not working 2024-06-13 20:35:00 +02:00
mrab
613a280079 Method & constructor calls fully working 2024-06-13 20:17:23 +02:00
mrab
fbd76deca3 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-13 17:36:09 +02:00
2139e7832c fix missing Typed Expression for local and field variable 2024-06-13 17:33:30 +02:00
mrab
9a9c508fc7 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-13 17:16:29 +02:00
baf9362634 make implicit this to explicit this for field variables 2024-06-13 15:49:59 +02:00
Matthias Raba
4def6e5804 name resolution for fields 2024-06-13 15:35:42 +02:00
2d6c7b1a06 fix external methocall 2024-06-13 12:27:57 +02:00
Matthias Raba
3f78cdaa2d Merge branch 'typedAST' of https://gitea.hb.dhbw-stuttgart.de/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-13 09:27:19 +02:00
710ec43959 add using standard constructor for constructor call 2024-06-13 09:26:05 +02:00
mrab
b41a77ba33 boolean AND/OR, if/else goto fixed 2024-06-12 19:57:09 +02:00
mrab
7317895800 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-06-12 16:55:31 +02:00
mrab
06dad4d7f9 resolved circular imports 2024-06-12 16:55:19 +02:00
b525d14192 add typechecking for returns, fix finding of constructors, fix if statement 2024-06-11 20:04:59 +02:00
a62fe50a0d Merge remote-tracking branch 'origin/create-parser' into typedAST 2024-06-11 18:50:39 +02:00
7c52084bbe fix test 2024-06-10 17:19:56 +02:00
fanoll
98b02446ba remove unused thisMeth type. Returns are combined and already checked against return Type 2024-06-10 12:53:59 +02:00
82b2b4a6e1 fix intliteral 0 2024-05-31 17:39:56 +02:00
3d351ee02b fix false error message 2024-05-31 17:11:46 +02:00
05b599b8ff fix if typecheck 2024-05-31 17:10:50 +02:00
060321f323 Merge remote-tracking branch 'origin/create-parser' into typedAST 2024-05-31 16:26:14 +02:00
fc96eba52e Merge remote-tracking branch 'origin/bytecode' into typedAST 2024-05-31 16:26:08 +02:00
af093fa3bb parser add increment statement 2024-05-31 12:10:00 +02:00
666856b33a parser add constructor call 2024-05-31 12:06:14 +02:00
mrab
578f959d7c Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 11:44:59 +02:00
mrab
bbe0d86670 partial revert 2024-05-31 11:44:53 +02:00
30365d76bd Revert "parser add preincrement and decrement conversion"
This reverts commit 2acba0f283.
2024-05-31 11:44:10 +02:00
mrab
c0caa7ce01 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 11:24:26 +02:00
mrab
2b7d217e8a moved expression statement and expressionstatement 2024-05-31 11:24:11 +02:00
2acba0f283 parser add preincrement and decrement conversion 2024-05-31 11:23:25 +02:00
84613fabe0 parser add field subaccess 2024-05-31 11:07:19 +02:00
408111df51 parser implement field access 2024-05-31 11:04:23 +02:00
mrab
6abb9ae8ba Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 10:37:07 +02:00
mrab
5aa193bc08 Merge branch 'bytecode' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 09:44:45 +02:00
mrab
2e7c28812b Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 09:42:16 +02:00
mrab
2ac3b60e79 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-31 09:41:37 +02:00
mrab
6f4143a60a increment/decrement 2024-05-16 11:42:48 +02:00
mrab
fadbdf1035 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-16 11:19:46 +02:00
mrab
d0d2cbd081 restructured bytecode code 2024-05-16 10:39:27 +02:00
mrab
ddab04f063 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-16 09:55:03 +02:00
mrab
09f70ca789 While assembly 2024-05-16 09:53:36 +02:00
mrab
d13e24c216 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-16 09:33:16 +02:00
mrab
53e1afc9e4 Merge branch 'typedAST' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-16 09:28:18 +02:00
mrab
535a6891ad Local variable assignment 2024-05-16 09:28:00 +02:00
e572975bda add: inject default constructor 2024-05-15 11:45:41 +02:00
mrab
207fb5c5f3 local variables working 2024-05-15 11:23:40 +02:00
b095678769 add: local variable assembler 2024-05-15 10:01:01 +02:00
mrab
a4fff37b07 Bugfix: else assembly 2024-05-14 14:09:24 +02:00
mrab
c02de8f9b2 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-14 13:57:25 +02:00
mrab
a179dec3ea else Assembly 2024-05-14 13:57:20 +02:00
mrab
95178366a2 Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-14 13:21:12 +02:00
mrab
f9f984568f block and if assembly (no else) 2024-05-14 13:20:37 +02:00
mrab
5284d6ecba Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-14 12:34:11 +02:00
mrab
d1d9a5d6e1 constructor and method call using this 2024-05-14 12:29:55 +02:00
mrab
86e15b5856 method assembly is actually used (yay) 2024-05-14 11:12:12 +02:00
mrab
393253c9bb Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-14 10:29:27 +02:00
mrab
09469e0e45 added boolean expression assembly 2024-05-14 10:28:57 +02:00
mrab
d54c7cd7e6 Working pipeline for whole compiler 2024-05-08 15:23:18 +02:00
mrab
9d7af6effb Merge branch 'create-parser' of ssh://gitea.hb.dhbw-stuttgart.de:2222/MisterChaos69/MiniJavaCompiler into bytecode 2024-05-08 14:35:45 +02:00
40 changed files with 1869 additions and 800 deletions

6
.gitignore vendored
View File

@@ -8,7 +8,6 @@ cabal-dev
*.chs.h
*.dyn_o
*.dyn_hi
*.java
*.class
*.local~*
src/Parser/JavaParser.hs
@@ -27,3 +26,8 @@ cabal.project.local
cabal.project.local~
.HTF/
.ghc.environment.*
texput.log
doc/output/
doc/*.aux
doc/*.log
doc/*.out

View File

@@ -5,7 +5,7 @@ Written in Haskell.
# Cabal Commands
run main
```
cabal run
cabal run compiler <FILENAME>
```
run tests

View File

@@ -0,0 +1,64 @@
// compile all test files using:
// ls Test/JavaSources/*.java | grep -v ".*Main.java" | xargs -I {} cabal run compiler {}
// compile (in project root) using:
// pushd Test/JavaSources; javac -g:none Main.java; popd
// afterwards, run using
// java -ea -cp Test/JavaSources/ Main
public class Main {
public static void main(String[] args)
{
TestEmpty empty = new TestEmpty();
TestFields fields = new TestFields();
TestConstructor constructor = new TestConstructor(42);
TestArithmetic arithmetic = new TestArithmetic();
TestMultipleClasses multipleClasses = new TestMultipleClasses();
TestRecursion recursion = new TestRecursion(10);
TestMalicious malicious = new TestMalicious();
TestLoop loop = new TestLoop();
TestMethodOverload overload = new TestMethodOverload();
TestShenanigance shenanigance = new TestShenanigance();
TestOptionalParameter optionalParameter = new TestOptionalParameter();
// constructing a basic class works
assert empty != null;
// initializers (and default initializers to 0/null) work
assert fields.a == 0 && fields.b == 42;
// constructor parameters override initializers
assert constructor.a == 42;
// basic arithmetics
assert arithmetic.basic(1, 2, 3) == 2;
// we have boolean logic as well
assert arithmetic.logic(false, false, true) == true;
// multiple classes within one file work. Referencing another classes fields/methods works.
assert multipleClasses.a.a == 42;
// self-referencing classes work.
assert recursion.child.child.child.child.child.value == 5;
// self-referencing methods work.
assert recursion.fibonacci(15) == 610;
assert loop.factorial(5) == 120;
assert loop.weirdFor() == 5;
// methods with the same name but different parameters work
assert overload.MethodOverload() == 42;
assert overload.MethodOverload(15) == 42 + 15;
// constructor overloading works, too.
assert (new TestConstructorOverload()).a == 42;
assert (new TestConstructorOverload(12)).a == 12;
// intentionally dodgy expressions work
assert malicious.assignNegativeIncrement(42) == -42;
assert malicious.tripleAddition(1, 2, 3) == 6;
for(int i = 0; i < 3; i++)
{
assert malicious.cursedFormatting(i) == i;
}
// other syntactic sugar
assert shenanigance.testAssignment() == 5;
assert shenanigance.divEqual() == 234_343_000 / 4;
assert shenanigance.testIf(5);
// optional parameters
assert optionalParameter.oneOptional() == 1;
assert optionalParameter.oneOptional(2) == 2;
assert optionalParameter.normalAndOptional(1) == 6;
assert optionalParameter.normalAndOptional(1, 0) == 4;
}
}

View File

@@ -0,0 +1,11 @@
public class TestArithmetic {
public int basic(int a, int b, int c)
{
return a + b - c * a / b % c;
}
public boolean logic(boolean a, boolean b, boolean c)
{
return !a && (c || b);
}
}

View File

@@ -0,0 +1,9 @@
public class TestConstructor
{
public int a = -1;
public TestConstructor(int initial_value)
{
a = initial_value;
}
}

View File

@@ -0,0 +1,12 @@
public class TestConstructorOverload {
public int a = 42;
TestConstructorOverload() {
// nothing here, so a will assume the default value 42.
}
TestConstructorOverload(int a) {
this.a = a;
}
}

View File

@@ -0,0 +1,4 @@
public class TestEmpty
{
}

View File

@@ -0,0 +1,5 @@
public class TestFields
{
public int a;
public int b = 42;
}

View File

@@ -0,0 +1,19 @@
public class TestLoop {
public int factorial(int n)
{
int tally = 1;
for(int i = 1; i <= n; i++)
{
tally *= i;
}
return tally;
}
int weirdFor() {
int k = 0;
for (; k < 5; k++) {
}
return k;
}
}

View File

@@ -0,0 +1,41 @@
public class TestMalicious {
public int assignNegativeIncrement(int n)
{
return n=-++n+1;
}
public int tripleAddition(int a, int b, int c)
{
return a+++b+++c++;
}
public int cursedFormatting(int n)
{
if
(n == 0)
{
return ((((0))));
}
else
if(n ==
1)
{
return
1;
}else {
return
2
;
}
}
}

View File

@@ -0,0 +1,10 @@
public class TestMethodOverload {
public int MethodOverload() {
return 42;
}
public int MethodOverload(int a) {
return 42 + a;
}
}

View File

@@ -0,0 +1,9 @@
public class TestMultipleClasses
{
public AnotherTestClass a = new AnotherTestClass();
}
class AnotherTestClass
{
public int a = 42;
}

View File

@@ -0,0 +1,10 @@
class TestOptionalParameter {
int oneOptional(int p = 1) {
return p;
}
int normalAndOptional(int a, int b = 2, int c = 3) {
return a + b + c;
}
}

View File

@@ -0,0 +1,34 @@
public class TestRecursion {
public int value = 0;
public TestRecursion child = null;
public TestRecursion(int n)
{
this.value = n;
if(n > 0)
{
child = new TestRecursion(n - 1);
}
}
public int fibonacci(int n)
{
if(n < 2)
{
return n;
}
else
{
return fibonacci(n - 1) + this.fibonacci(n - 2);
}
}
public int ackermann(int m, int n)
{
if (m == 0) return n + 1;
if (n == 0) return ackermann(m - 1, 1);
return ackermann(m - 1, ackermann(m, n - 1));
}
}

View File

@@ -0,0 +1,25 @@
class TestShenanigance {
int testAssignment() {
int x = 1;
int y = x = 5;
return y;
}
int divEqual() {
int x = 234_343_000;
x /= 4;
return x;
}
boolean testIf(int x) {
if (true && x < 8) {
char f = 'c';
return f > x ;
}
ifn't {
return false;
}
}
}

View File

@@ -0,0 +1,15 @@
public class TestSingleton {
TestSingleton instance;
TestSingleton() {
}
public TestSingleton getInstance() {
if (instance == null) {
instance = new TestSingleton();
}
return instance;
}
}

View File

@@ -1,116 +0,0 @@
module TestByteCodeGenerator where
import Test.HUnit
import ByteCode.ClassFile.Generator
import ByteCode.ClassFile
import ByteCode.Constants
import Ast
nakedClass = Class "Testklasse" [] []
expectedClass = ClassFile {
constantPool = [
ClassInfo 4,
MethodRefInfo 1 3,
NameAndTypeInfo 5 6,
Utf8Info "java/lang/Object",
Utf8Info "<init>",
Utf8Info "()V",
Utf8Info "Code",
ClassInfo 9,
Utf8Info "Testklasse"
],
accessFlags = accessPublic,
thisClass = 8,
superClass = 1,
fields = [],
methods = [],
attributes = []
}
classWithFields = Class "Testklasse" [] [VariableDeclaration "int" "testvariable" Nothing]
expectedClassWithFields = ClassFile {
constantPool = [
ClassInfo 4,
MethodRefInfo 1 3,
NameAndTypeInfo 5 6,
Utf8Info "java/lang/Object",
Utf8Info "<init>",
Utf8Info "()V",
Utf8Info "Code",
ClassInfo 9,
Utf8Info "Testklasse",
FieldRefInfo 8 11,
NameAndTypeInfo 12 13,
Utf8Info "testvariable",
Utf8Info "I"
],
accessFlags = accessPublic,
thisClass = 8,
superClass = 1,
fields = [
MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = 12,
memberDescriptorIndex = 13,
memberAttributes = []
}
],
methods = [],
attributes = []
}
method = MethodDeclaration "int" "add_two_numbers" [
ParameterDeclaration "int" "a",
ParameterDeclaration "int" "b" ]
(Block [Return (Just (BinaryOperation Addition (Reference "a") (Reference "b")))])
classWithMethod = Class "Testklasse" [method] []
expectedClassWithMethod = ClassFile {
constantPool = [
ClassInfo 4,
MethodRefInfo 1 3,
NameAndTypeInfo 5 6,
Utf8Info "java/lang/Object",
Utf8Info "<init>",
Utf8Info "()V",
Utf8Info "Code",
ClassInfo 9,
Utf8Info "Testklasse",
FieldRefInfo 8 11,
NameAndTypeInfo 12 13,
Utf8Info "add_two_numbers",
Utf8Info "(II)I"
],
accessFlags = accessPublic,
thisClass = 8,
superClass = 1,
fields = [],
methods = [
MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = 12,
memberDescriptorIndex = 13,
memberAttributes = [
CodeAttribute {
attributeMaxStack = 420,
attributeMaxLocals = 420,
attributeCode = [Opiadd]
}
]
}
],
attributes = []
}
testBasicConstantPool = TestCase $ assertEqual "basic constant pool" expectedClass $ classBuilder nakedClass emptyClassFile
testFields = TestCase $ assertEqual "fields in constant pool" expectedClassWithFields $ classBuilder classWithFields emptyClassFile
testMethodDescriptor = TestCase $ assertEqual "method descriptor" "(II)I" (methodDescriptor method)
testMethodAssembly = TestCase $ assertEqual "method assembly" expectedClassWithMethod (classBuilder classWithMethod emptyClassFile)
tests = TestList [
TestLabel "Basic constant pool" testBasicConstantPool,
TestLabel "Fields constant pool" testFields,
TestLabel "Method descriptor building" testMethodDescriptor,
TestLabel "Method assembly" testMethodAssembly
]

View File

@@ -7,56 +7,56 @@ import Ast
testSingleEmptyClass = TestCase $
assertEqual "expect single empty class hello" [Class "Hello" [] []] $
assertEqual "expect single empty class hello" [Class "Hello" [] [] []] $
parse [CLASS, IDENTIFIER "Hello", LBRACKET, RBRACKET]
testTwoEmptyClasses = TestCase $
assertEqual "expect two empty classes" [Class "Class1" [] [], Class "Class2" [] []] $
assertEqual "expect two empty classes" [Class "Class1" [] [] [], Class "Class2" [] [] []] $
parse [CLASS,IDENTIFIER "Class1",LBRACKET,RBRACKET,CLASS,IDENTIFIER "Class2",LBRACKET,RBRACKET]
testBooleanField = TestCase $
assertEqual "expect class with boolean field" [Class "WithBool" [] [VariableDeclaration "boolean" "value" Nothing]] $
assertEqual "expect class with boolean field" [Class "WithBool" [] [] [VariableDeclaration "boolean" "value" Nothing]] $
parse [CLASS,IDENTIFIER "WithBool",LBRACKET,BOOLEAN,IDENTIFIER "value",SEMICOLON,RBRACKET]
testIntField = TestCase $
assertEqual "expect class with int field" [Class "WithInt" [] [VariableDeclaration "int" "value" Nothing]] $
assertEqual "expect class with int field" [Class "WithInt" [] [] [VariableDeclaration "int" "value" Nothing]] $
parse [CLASS,IDENTIFIER "WithInt",LBRACKET,INT,IDENTIFIER "value",SEMICOLON,RBRACKET]
testCustomTypeField = TestCase $
assertEqual "expect class with foo field" [Class "WithFoo" [] [VariableDeclaration "Foo" "value" Nothing]] $
assertEqual "expect class with foo field" [Class "WithFoo" [] [] [VariableDeclaration "Foo" "value" Nothing]] $
parse [CLASS,IDENTIFIER "WithFoo",LBRACKET,IDENTIFIER "Foo",IDENTIFIER "value",SEMICOLON,RBRACKET]
testMultipleDeclarationSameLine = TestCase $
assertEqual "expect class with two int fields" [Class "TwoInts" [] [VariableDeclaration "int" "num1" Nothing, VariableDeclaration "int" "num2" Nothing]] $
assertEqual "expect class with two int fields" [Class "TwoInts" [] [] [VariableDeclaration "int" "num1" Nothing, VariableDeclaration "int" "num2" Nothing]] $
parse [CLASS,IDENTIFIER "TwoInts",LBRACKET,INT,IDENTIFIER "num1",COMMA,IDENTIFIER "num2",SEMICOLON,RBRACKET]
testMultipleDeclarations = TestCase $
assertEqual "expect class with int and char field" [Class "Multiple" [] [VariableDeclaration "int" "value" Nothing, VariableDeclaration "char" "letter" Nothing]] $
assertEqual "expect class with int and char field" [Class "Multiple" [] [] [VariableDeclaration "int" "value" Nothing, VariableDeclaration "char" "letter" Nothing]] $
parse [CLASS,IDENTIFIER "Multiple",LBRACKET,INT,IDENTIFIER "value",SEMICOLON,CHAR,IDENTIFIER "letter",SEMICOLON,RBRACKET]
testWithModifier = TestCase $
assertEqual "expect class with int field" [Class "WithInt" [] [VariableDeclaration "int" "value" Nothing]] $
assertEqual "expect class with int field" [Class "WithInt" [] [] [VariableDeclaration "int" "value" Nothing]] $
parse [ABSTRACT,CLASS,IDENTIFIER "WithInt",LBRACKET,PUBLIC,INT,IDENTIFIER "value",SEMICOLON,RBRACKET]
testEmptyMethod = TestCase $
assertEqual "expect class with method" [Class "WithMethod" [MethodDeclaration "int" "foo" [] (Block [])] []] $
assertEqual "expect class with method" [Class "WithMethod" [] [MethodDeclaration "int" "foo" [] (Block [])] []] $
parse [CLASS,IDENTIFIER "WithMethod",LBRACKET,INT,IDENTIFIER "foo",LBRACE,RBRACE,SEMICOLON,RBRACKET]
testEmptyPrivateMethod = TestCase $
assertEqual "expect class with method" [Class "WithMethod" [MethodDeclaration "int" "foo" [] (Block [])] []] $
assertEqual "expect class with method" [Class "WithMethod" [] [MethodDeclaration "int" "foo" [] (Block [])] []] $
parse [CLASS,IDENTIFIER "WithMethod",LBRACKET,PRIVATE,INT,IDENTIFIER "foo",LBRACE,RBRACE,LBRACKET,RBRACKET,RBRACKET]
testEmptyVoidMethod = TestCase $
assertEqual "expect class with method" [Class "WithMethod" [MethodDeclaration "void" "foo" [] (Block [])] []] $
assertEqual "expect class with method" [Class "WithMethod" [] [MethodDeclaration "void" "foo" [] (Block [])] []] $
parse [CLASS,IDENTIFIER "WithMethod",LBRACKET,VOID,IDENTIFIER "foo",LBRACE,RBRACE,LBRACKET,RBRACKET,RBRACKET]
testEmptyMethodWithParam = TestCase $
assertEqual "expect class with method with param" [Class "WithParam" [MethodDeclaration "void" "foo" [ParameterDeclaration "int" "param"] (Block [])] []] $
assertEqual "expect class with method with param" [Class "WithParam" [] [MethodDeclaration "void" "foo" [ParameterDeclaration "int" "param"] (Block [])] []] $
parse [CLASS,IDENTIFIER "WithParam",LBRACKET,VOID,IDENTIFIER "foo",LBRACE,INT,IDENTIFIER "param",RBRACE,SEMICOLON,RBRACKET]
testEmptyMethodWithParams = TestCase $
assertEqual "expect class with multiple params" [Class "WithParams" [MethodDeclaration "void" "foo" [ParameterDeclaration "int" "p1",ParameterDeclaration "Custom" "p2"] (Block [])] []] $
assertEqual "expect class with multiple params" [Class "WithParams" [] [MethodDeclaration "void" "foo" [ParameterDeclaration "int" "p1",ParameterDeclaration "Custom" "p2"] (Block [])] []] $
parse [CLASS,IDENTIFIER "WithParams",LBRACKET,VOID,IDENTIFIER "foo",LBRACE,INT,IDENTIFIER "p1",COMMA,IDENTIFIER "Custom",IDENTIFIER "p2",RBRACE,SEMICOLON,RBRACKET]
testClassWithMethodAndField = TestCase $
assertEqual "expect class with method and field" [Class "WithMethodAndField" [MethodDeclaration "void" "foo" [] (Block []), MethodDeclaration "int" "bar" [] (Block [])] [VariableDeclaration "int" "value" Nothing]] $
assertEqual "expect class with method and field" [Class "WithMethodAndField" [] [MethodDeclaration "void" "foo" [] (Block []), MethodDeclaration "int" "bar" [] (Block [])] [VariableDeclaration "int" "value" Nothing]] $
parse [CLASS,IDENTIFIER "WithMethodAndField",LBRACKET,VOID,IDENTIFIER "foo",LBRACE,RBRACE,LBRACKET,RBRACKET,INT,IDENTIFIER "value",SEMICOLON,INT,IDENTIFIER "bar",LBRACE,RBRACE,SEMICOLON,RBRACKET]
testClassWithConstructor = TestCase $
assertEqual "expect class with constructor" [Class "WithConstructor" [MethodDeclaration "void" "<init>" [] (Block [])] []] $
assertEqual "expect class with constructor" [Class "WithConstructor" [ConstructorDeclaration "WithConstructor" [] (Block [])] [] []] $
parse [CLASS,IDENTIFIER "WithConstructor",LBRACKET,IDENTIFIER "WithConstructor",LBRACE,RBRACE,LBRACKET,RBRACKET,RBRACKET]
testConstructorWithParams = TestCase $
assertEqual "expect constructor with params" [Class "WithParams" [MethodDeclaration "void" "<init>" [ParameterDeclaration "int" "p1"] (Block [])] []] $
assertEqual "expect constructor with params" [Class "WithParams" [ConstructorDeclaration "WithParams" [ParameterDeclaration "int" "p1"] (Block [])] [] []] $
parse [CLASS,IDENTIFIER "WithParams",LBRACKET,IDENTIFIER "WithParams",LBRACE,INT,IDENTIFIER "p1",RBRACE,LBRACKET,RBRACKET,RBRACKET]
testConstructorWithStatements = TestCase $
assertEqual "expect constructor with statement" [Class "WithConstructor" [MethodDeclaration "void" "<init>" [] (Block [Return Nothing])] []] $
assertEqual "expect constructor with statement" [Class "WithConstructor" [ConstructorDeclaration "WithConstructor" [] (Block [Return Nothing])] [] []] $
parse [CLASS,IDENTIFIER "WithConstructor",LBRACKET,IDENTIFIER "WithConstructor",LBRACE,RBRACE,LBRACKET,RETURN,SEMICOLON,RBRACKET,RBRACKET]
@@ -78,13 +78,13 @@ testExpressionIntLiteral = TestCase $
assertEqual "expect IntLiteral" (IntegerLiteral 3) $
parseExpression [INTEGERLITERAL 3]
testFieldWithInitialization = TestCase $
assertEqual "expect Class with initialized field" [Class "WithInitField" [] [VariableDeclaration "int" "number" $ Just $ IntegerLiteral 3]] $
assertEqual "expect Class with initialized field" [Class "WithInitField" [] [] [VariableDeclaration "int" "number" $ Just $ IntegerLiteral 3]] $
parse [CLASS,IDENTIFIER "WithInitField",LBRACKET,INT,IDENTIFIER "number",ASSIGN,INTEGERLITERAL 3,SEMICOLON,RBRACKET]
testLocalBoolWithInitialization = TestCase $
assertEqual "expect block with with initialized local var" [Block [LocalVariableDeclaration $ VariableDeclaration "boolean" "b" $ Just $ BooleanLiteral False]] $
parseStatement [LBRACKET,BOOLEAN,IDENTIFIER "b",ASSIGN,BOOLLITERAL False,SEMICOLON,RBRACKET]
testFieldNullWithInitialization = TestCase $
assertEqual "expect Class with initialized field" [Class "WithInitField" [] [VariableDeclaration "Object" "bar" $ Just NullLiteral]] $
assertEqual "expect Class with initialized field" [Class "WithInitField" [] [] [VariableDeclaration "Object" "bar" $ Just NullLiteral]] $
parse [CLASS,IDENTIFIER "WithInitField",LBRACKET,IDENTIFIER "Object",IDENTIFIER "bar",ASSIGN,NULLLITERAL,SEMICOLON,RBRACKET]
testReturnVoid = TestCase $
assertEqual "expect block with return nothing" [Block [Return Nothing]] $
@@ -191,6 +191,25 @@ testExpressionThisMethodCall = TestCase $
testExpressionThisMethodCallParam = TestCase $
assertEqual "expect this methocall" (StatementExpressionExpression (MethodCall (Reference "this") "foo" [Reference "x"])) $
parseExpression [THIS,DOT,IDENTIFIER "foo",LBRACE,IDENTIFIER "x",RBRACE]
testExpressionFieldAccess = TestCase $
assertEqual "expect NameResolution" (BinaryOperation NameResolution (Reference "this") (Reference "b")) $
parseExpression [THIS,DOT,IDENTIFIER "b"]
testExpressionSimpleFieldAccess = TestCase $
assertEqual "expect Reference" (Reference "a") $
parseExpression [IDENTIFIER "a"]
testExpressionFieldSubAccess = TestCase $
assertEqual "expect NameResolution without this" (BinaryOperation NameResolution (Reference "a") (Reference "b")) $
parseExpression [IDENTIFIER "a",DOT,IDENTIFIER "b"]
testExpressionConstructorCall = TestCase $
assertEqual "expect constructor call" (StatementExpressionExpression (ConstructorCall "Foo" [])) $
parseExpression [NEW,IDENTIFIER "Foo",LBRACE,RBRACE]
testExpresssionExternalMethodCall = TestCase $
assertEqual "expect method call on sub" (StatementExpressionExpression (MethodCall (Reference "Obj") "foo" [])) $
parseExpression [IDENTIFIER "Obj",DOT,IDENTIFIER "foo",LBRACE,RBRACE]
testExpressionAssignWithThis = TestCase $
assertEqual "expect assignment on Field" (StatementExpressionExpression (Assignment (BinaryOperation NameResolution (Reference "this") (Reference "x")) (Reference "y"))) $
parseExpression [THIS,DOT,IDENTIFIER "x",ASSIGN,IDENTIFIER "y"]
testStatementIfThen = TestCase $
assertEqual "expect empty ifthen" [If (Reference "a") (Block [Block []]) Nothing] $
@@ -208,6 +227,49 @@ testStatementAssign = TestCase $
testStatementMethodCallNoParams = TestCase $
assertEqual "expect methodcall statement no params" [StatementExpressionStatement (MethodCall (Reference "this") "foo" [])] $
parseStatement [IDENTIFIER "foo",LBRACE,RBRACE,SEMICOLON]
testStatementConstructorCall = TestCase $
assertEqual "expect constructor call" [StatementExpressionStatement (ConstructorCall "Foo" [])] $
parseStatement [NEW,IDENTIFIER "Foo",LBRACE,RBRACE,SEMICOLON]
testStatementConstructorCallWithArgs = TestCase $
assertEqual "expect constructor call" [StatementExpressionStatement (ConstructorCall "Foo" [Reference "b"])] $
parseStatement [NEW,IDENTIFIER "Foo",LBRACE,IDENTIFIER "b",RBRACE,SEMICOLON]
testStatementPreIncrement = TestCase $
assertEqual "expect increment" [StatementExpressionStatement $ PostIncrement $ Reference "a"] $
parseStatement [IDENTIFIER "a",INCREMENT,SEMICOLON]
testForLoop = TestCase $
assertEqual "expect for loop" [Block [
LocalVariableDeclaration (VariableDeclaration "int" "i" (Just (IntegerLiteral 0))),
While (BinaryOperation CompareLessThan (Reference "i") (IntegerLiteral 3)) (Block [Block [], StatementExpressionStatement (PostIncrement (Reference "i"))])
]] $
parseStatement [FOR,LBRACE,INT,IDENTIFIER "i",ASSIGN,INTEGERLITERAL 0,SEMICOLON,IDENTIFIER "i",LESS,INTEGERLITERAL 3,SEMICOLON,IDENTIFIER "i",INCREMENT,RBRACE,LBRACKET,RBRACKET]
testForLoopExpressionlistInInit = TestCase $
assertEqual "expect expressionlist in init part of for loop" [Block [
StatementExpressionStatement (PostIncrement (Reference "i")),
While (BinaryOperation CompareLessThan (Reference "i") (IntegerLiteral 3)) (Block [Block [], StatementExpressionStatement (PostIncrement (Reference "i"))])
]] $
parseStatement [FOR,LBRACE,IDENTIFIER "i",INCREMENT,SEMICOLON,IDENTIFIER "i",LESS,INTEGERLITERAL 3,SEMICOLON,IDENTIFIER "i",INCREMENT,RBRACE,LBRACKET,RBRACKET]
testForLoopMultipleUpdateExpressions = TestCase $
assertEqual "expect for loop with multiple update statements" [Block [
LocalVariableDeclaration (VariableDeclaration "int" "i" (Just (IntegerLiteral 0))),
While (BinaryOperation CompareLessThan (Reference "i") (IntegerLiteral 3)) (Block [Block [], StatementExpressionStatement (PostIncrement (Reference "i")), StatementExpressionStatement (PostIncrement (Reference "k"))])
]] $
parseStatement [FOR,LBRACE,INT,IDENTIFIER "i",ASSIGN,INTEGERLITERAL 0,SEMICOLON,IDENTIFIER "i",LESS,INTEGERLITERAL 3,SEMICOLON,IDENTIFIER "i",INCREMENT,COMMA,IDENTIFIER "k",INCREMENT,RBRACE,LBRACKET,RBRACKET]
testForLoopEmptyFirstPart = TestCase $
assertEqual "expect for loop with empty init part" [Block [
While (BinaryOperation CompareLessThan (Reference "i") (IntegerLiteral 3)) (Block [Block [], StatementExpressionStatement (PostIncrement (Reference "i"))])
]] $
parseStatement [FOR,LBRACE,SEMICOLON,IDENTIFIER "i",LESS,INTEGERLITERAL 3,SEMICOLON,IDENTIFIER "i",INCREMENT,RBRACE,LBRACKET,RBRACKET]
testForLoopEmtpySecondPart = TestCase $
assertEqual "expect for loop with empty expresion part" [Block [
While (BooleanLiteral True) (Block [Block [], StatementExpressionStatement (PostIncrement (Reference "i"))])
]] $
parseStatement [FOR,LBRACE,SEMICOLON,SEMICOLON,IDENTIFIER "i",INCREMENT,RBRACE,LBRACKET,RBRACKET]
testForLoopEmtpy = TestCase $
assertEqual "expect empty for loop" [Block [While (BooleanLiteral True) (Block [Block []])]] $
parseStatement [FOR,LBRACE,SEMICOLON,SEMICOLON,RBRACE,LBRACKET,RBRACKET]
@@ -267,9 +329,24 @@ tests = TestList [
testExpressionMethodCallTwoParams,
testExpressionThisMethodCall,
testExpressionThisMethodCallParam,
testExpressionFieldAccess,
testExpressionSimpleFieldAccess,
testExpressionFieldSubAccess,
testExpressionConstructorCall,
testExpresssionExternalMethodCall,
testExpressionAssignWithThis,
testStatementIfThen,
testStatementIfThenElse,
testStatementWhile,
testStatementAssign,
testStatementMethodCallNoParams
testStatementMethodCallNoParams,
testStatementConstructorCall,
testStatementConstructorCallWithArgs,
testStatementPreIncrement,
testForLoop,
testForLoopExpressionlistInInit,
testForLoopMultipleUpdateExpressions,
testForLoopEmptyFirstPart,
testForLoopEmtpySecondPart,
testForLoopEmtpy
]

View File

@@ -2,13 +2,11 @@ module Main where
import Test.HUnit
import TestLexer
import TestByteCodeGenerator
import TestParser
tests = TestList [
TestLabel "TestLexer" TestLexer.tests,
TestLabel "TestParser" TestParser.tests,
TestLabel "TestByteCodeGenerator" TestByteCodeGenerator.tests]
TestLabel "TestLexer" TestLexer.tests,
TestLabel "TestParser" TestParser.tests
]
main = do runTestTTAndExit Main.tests

107
doc/bytecode.tex Normal file
View File

@@ -0,0 +1,107 @@
\section{Bytecodegenerierung}
Die Bytecodegenerierung ist letztendlich eine zweistufige Transformation:
\vspace{20px}
\texttt{Getypter AST -> [ClassFile] -> [[Word8]]}
\vspace{20px}
Vom AST, der bereits den Typcheck durchlaufen hat, wird zunächst eine Abbildung in die einzelnen ClassFiles vorgenommen.
Diese ClassFiles werden anschließend in deren Byte-Repräsentation serialisiert.
\subsection{Codegenerierung}
Für die erste der beiden Transformationen (\texttt{Getypter AST -> [ClassFile]}) werden die Konzepte der ``Builder'' und ``Assembler'' eingeführt.
Sie sind wie folgt definiert:
\vspace{20px}
\begin{lstlisting}[language=haskell]
type ClassFileBuilder a = a -> ClassFile -> ClassFile
type Assembler a = ([ConstantInfo], [Operation], [String]) -> a
-> ([ConstantInfo], [Operation], [String])
\end{lstlisting}
\vspace{20px}
Die Idee hinter beiden ist, dass sie jeweils zwei Inputs haben, wobei der Rückgabewert immer den gleichen Typ hat wie einer der Inputs.
Das erlaubt es, eine Faltung durchzuführen. Ein ClassFileBuilder z.B bekommt als ersten Parameter den AST,
und als zweiten Parameter (und Rückgabewert) eine ClassFile. Soll nun eine Klasse gebaut werden,
wird der ClassFileBuilder mit dem AST und einer leeren ClassFile aufgerufen.
Der Zustand dieser anfangs leeren ClassFile wird durch alle folgenden Builder/Assembler durchgeschleift, was es erlaubt,
nach und nach kleinere Transformationen auf sie anzuwenden. Der Nutzer ruft beispielsweise die Funktion \texttt{classBuilder} auf.
Diese wendet nach und nach folgende Transformationen an:
\vspace{20px}
\begin{enumerate}
\item Allen Konstruktoren werden Initialisierer aller Felder hinzugefügt
\item Für jedes Feld der Klasse wird ein Eintrag im Konstantenpool \& der Classfile erstellt
\item Für jede Methode wird ein Eintrag im Konstantenpool \& der Classfile erstellt
\item Allen Methoden wird der zugehörige Bytecode erstellt und zugewiesen
\item Allen Konstruktoren wird der zugehörige Bytecode erstellt und zugewiesen
\end{enumerate}
\vspace{20px}
Die Unterteilung von Deklaration der Methoden/Konstruktoren und Bytecodeerzeugung ist deswegen notwendig,
weil der Code einer Methode auch eine andere, erst nachher deklarierte Methode aufrufen kann.
Nach dem Hinzufügen der Deklarationen sind alle Methoden/Konstruktoren der Klasse bekannt.
Wie oben beschrieben wird auch hier der Zustand über alle Faltungen mitgenommen.
Jeder Schritt hat Zugriff auf alle Daten, die aus dem vorherigen Schritt bleiben. Sukzessive wird eine korrekte ClassFile aufgebaut.
Besonders interessant sind hierbei die beiden letzten Schritte. Dort wird das Verhalten jeder einzelnen Methode/Konstruktor in Bytecode übersetzt.
In diesem Schritt werden zusätzlich zu den \texttt{Buildern} noch die \texttt{Assembler} verwendet (Definition siehe oben.).
Die Assembler funktionieren ähnlich wie die Builder, arbeiten allerdings nicht auf einer ClassFile, sondern auf dem Inhalt einer Methode;
Sie verarbeiten jeweils ein Tupel der Form:
\vspace{20px}
\texttt{([ConstantInfo], [Operation], [String])}
\vspace{20px}
Dieses repräsentiert:
\vspace{20px}
\texttt{(Konstantenpool, Bytecode, Lokale Variablen)}
\vspace{20px}
In der Praxis werden meist nur Bytecode und Konstanten hinzugefügt. Prinzipiell können Assembler auch Code/Konstanten entfernen oder modifizieren.
Als Beispiel dient hier der Assembler \texttt{assembleExpression}:
\vspace{20px}
\begin{lstlisting}[language=haskell]
assembleExpression (constants, ops, lvars) (TypedExpression _ NullLiteral)
= (constants, ops ++ [Opaconst_null], lvars)
\end{lstlisting}
\vspace{20px}
Hier werden die Konstanten und lokalen Variablen des Inputs nicht berührt, dem Bytecode wird lediglich die Operation \texttt{aconst\_null} hinzugefügt.
Damit ist das Verhalten des gematchten Inputs - eines Nullliterals - abgebildet.
Die Assembler rufen sich teilweise rekursiv selbst auf, da ja auch der AST verschachteltes Verhalten abbilden kann.
Der Startpunkt für die Assembly einer Methode ist der Builder \texttt{methodAssembler}. Dieser entspricht Schritt 3 in der obigen Übersicht.
\subsection{Serialisierung}
Damit Bytecode generiert werden kann, braucht es Strukturen, die die Daten halten, die letztendlich serialisiert werden.
Die JVM erwartet den kompilierten Code in handliche Pakete verpackt.
Die Struktur dieser Pakete ist \href{https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html}{hier dokumentiert}.
Jede Struktur, die in dieser übergreifenden Class File vorkommt, haben wir in Haskell abgebildet.
Es gibt z.B die Struktur "ClassFile", die wiederum weitere Strukturen wie z.B Informationen über Felder oder Methoden der Klasse beinhaltet.
Alle diese Strukturen implementieren folgende TypeClass:
\vspace{20px}
\begin{lstlisting}[language=haskell]
class Serializable a where
serialize :: a -> [Word8]
\end{lstlisting}
\vspace{20px}
Hier ist ein Beispiel anhand der Serialisierung der einzelnen Operationen:
\vspace{20px}
\begin{lstlisting}[language=haskell]
instance Serializable Operation where
serialize Opiadd = [0x60]
serialize Opisub = [0x64]
serialize Opimul = [0x68]
...
serialize (Opgetfield index) = 0xB4 : unpackWord16 index
\end{lstlisting}
\vspace{20px}
Die Struktur ClassFile ruft für deren Kinder rekursiv diese \texttt{serialize} Funktion auf und konkateniert die Ergebnisse.
Am Ende bleibt eine flache Word8-Liste übrig, die Serialisierung ist damit abgeschlossen.
Da der Typecheck sicherstellt, dass alle referenzierten Methoden/Felder gültig sind,
kann die Übersetzung der einzelnen Klassen voneinander unabhängig geschehen.

BIN
doc/documentation.pdf Normal file

Binary file not shown.

34
doc/documentation.tex Normal file
View File

@@ -0,0 +1,34 @@
\documentclass[12pt, parskip=half, headheight=12pt, BCOR=8mm, footheight=16pt]{extarticle}
% General document formatting
\usepackage[margin=1.0in]{geometry}
\usepackage[parfill]{parskip}
\usepackage[utf8]{inputenc}
\usepackage[german]{babel}
\usepackage{enumitem}
\usepackage{listings}
\usepackage{hyperref}
\renewcommand\descriptionlabel[1]{$\bullet$ \textbf{#1}}
\hypersetup{
colorlinks=true,
linkcolor=blue,
filecolor=magenta,
urlcolor=cyan,
}
\let\clearpage\relax
\begin{document}
\include{features}
\newpage
\include{parser}
\newpage
\include{typecheck}
\newpage
\include{bytecode}
\newpage
\include{whodunit}
\newpage
\end{document}

34
doc/features.tex Normal file
View File

@@ -0,0 +1,34 @@
\section{Sprach-Features}
\begin{itemize}
\item Klassen
\item Felder
\item Methoden (mit Parametern)
\item Konstruktoren (mit Parametern)
\item Standardkonstruktoren
\item Lokale Variablen
\item Zuweisungen (Feld- und lokale Variablen)
\item Arithmetik (\texttt{+, -, *, /, \%,} Klammern, Korrekte Operations-Präzedenz)
\item Arithmetische Zuweisungen (\texttt{+=, -=, *=, /=, \%=, \&=, |=, \^{}=})
\item Vergleichsoperationen (\texttt{<, >, <=, >=, ==, !=})
\item Boolsche Operationen (\texttt{||, \&\&})
\item Unäre Operationen (\texttt{-, ~})
\item Binar-Operationen (\texttt{\&, |, \^})
\item Pre/Post-Inkrement \& Dekrement
\item Kontrollflussstrukturen:
\begin{itemize}
\item If/Else
\item While
\item For
\item Return (mit/ohne Rückgabewert)
\end{itemize}
\item Default-Werte für alle Klassenfelder
\item Mehrere Klassen in einer Datei
\item Implizites \texttt{this}
\item Beliebig verschachtelte Namensketten
\item Beliebige Deklarationsreihenfolge
\item Literale für Integer, Characters, Booleans
\item Platzhalter/Separatoren in Integerliteralen (z.B. \texttt{1\_000\_000})
\item Deklaration und Zuweisung in einer Anweisung
\item Beliebig verschachtelte Blöcke
\item Überladung von Methoden \& Konstruktoren
\end{itemize}

1
doc/generate.sh Normal file
View File

@@ -0,0 +1 @@
pdflatex documentation.tex

65
doc/parser.tex Normal file
View File

@@ -0,0 +1,65 @@
\section{Lexer \& Parser}
\subsection{Lexer}
Der Lexer wurde mit dem Alex tool implementiert. Dieser ist dafür zuständig den langen String in einzelne Tokens umzuwandeln. In der Alex Datei gibt es für jedes Token einen regulären Ausdruck. Bei den meisten Tokens ist das einfach das Schlüsselwort. Etwas komplexer waren Identifier, Integerliterale Strings und Chars. Für die Definition wurde sich eng an die offizielle Java Language Specification gehalten. Es ist beispielsweise auch möglich Unterstriche in Integerliterale einzubauen (Bsp.: \verb|234_343_000|) Es sind fast alle Schlüsselwörter von Java im Lexer implementiert, auch wenn nicht alle davon vom Parser geparst werden können. Whitespace und Kommentare werden direkt ignoriert und verworfen. Für Charliterale und Integerliterale gibt es auch spezielle Fehlermeldungen. Die meisten Tokens haben nur die Information, zu welchem Keyword sie gehören. Eine Ausnahme bilden der Identifier und die Literale. Für den Identifier wird noch der Name gespeichert und für die Literale der entsprechende Wert. Mit der Funktion alexScanTokens kann dann ein beliebiger String in Tokens umgewandelt werden.
Die komplexeren Tokens haben Unittests, welche mit dem Testframework HUnit geschrieben wurden. Es gibt Tests für Kommentare, Identifier, Literale und ein paar weitere Tokens.
\subsection{Parser}
Der Parser wurde mit dem Happy tool implementiert. Er baut aus einer Liste von Tokens einen ungetypten AST. Wir haben bereits eine Grammatik bekommen und mussten diese noch in den AST umwandeln.
Um den Parser aufzubauen wurde zuerst ein Großteil der Grammatik auskommentiert und Stück für Stück wurden die Umwandlungen hinzugefügt. Immer wenn ein neues Feature umgesetzt wurde, wurde dafür ein weiterer Unit Test geschrieben. Es gibt also für jede komplexe Ableitungsregel mindestens einen Unittest.
\subsubsection{Klassenaufbau}
Als erstes wurden leere Konstruktoren Methoden und Felder umgesetzt. Da in Java Konstruktoren, Methoden und Felder durcheinander vorkommen können geben die Ableitungsregeln einen Datentyp namens `MemberDeclaration` zurück. Die classbodydeclarations Regel baut dann einen 3-Tupel mit einer Liste aus Konstruktoren, einer aus Methoden und einer aus Feldern. Über pattern matching werden diese Listen dann erweitert und in der darüberliegenden Regel schließlich extrahiert.
Bei folgender Klasse:
\begin{lstlisting}[language=Java]
class TestClass {
int field;
TestClass() {}
void foo() {}
}
\end{lstlisting}
würde die Regel folgendes Tupel zurückgeben:
\begin{lstlisting}[language=Haskell]
(
[ConstructorDeclaration "TestClass" [] (Block [])],
[MethodDeclaration "void" "foo" [] (Block [])],
[VariableDeclaration "int" "field" Nothing]
)
\end{lstlisting}
und folgende Klasse wird erstellt
\begin{lstlisting}[language=Haskell]
Class "TestClass"
[ConstructorDeclaration "TestClass" [] (Block [])]
[MethodDeclaration "void" "foo" [] (Block [])]
[VariableDeclaration "int" "field" Nothing]
\end{lstlisting}
Das Nothing ist in diesem Fall ein Platzhalter für eine Zuweisung, da unser Compiler auch Zuweisung bei der Felddeklaration unterstützt.
In Java ist es möglich mehrere Variablen in einer Zeile zu deklarieren (Bsp.: `int x, y;`). Beim Parsen ergiebt sich dann die Schwierigkeit, dass man in dem Moment, wo man die Variable parst nicht weiß welchen Datentyp diese hat. Aus diesem Grund gibt es den Datentyp Declarator, welcher nur den Identifier und eventuell eine Zuweisung enthält. In den darüberliegenden Regeln fielddeclaration und localvariabledeclaration wird dann die Typinformation hinzugefügt mithilfe der Funktion convertDeclarator.
\subsubsection{Syntactic Sugar}
Für die Zuweisung wird auch die Kombination mit Rechenoperatoren unterstützt. Das ganze ist durch Syntactic Sugar im Parser umgesetzt. Wenn es einen Zuweisungsoperator gibt, dann wird der Ausdruck in eine Zuweisung und Rechnung aufgeteilt. Bsp.: \verb|x += 3;| wird umgewandelt in \verb|x = x + 3|.
For-Schleifen wurde auch rein im Parser durch Syntactic Sugar implementiert. Eine For-Schleife wird dabei in eine While-Schleife umgewandelt. Dafür wird zuerst ein Block erstellt, sodass die deklarierten Variablen auch nur für den Bereich der Schleife gültig sind. Die Bedingung der For-Schleife kann in die While-Schleife übernommen werden. Innerhalb der While-Schleife folgen zuerst die Statements, die im Block der For-Schleife waren und danach die Update-Statements.
\begin{lstlisting}[language=Java]
for (int i = 0; i < 9; i++) {
foo();
}
\end{lstlisting}
wird umgewandelt in:
\begin{lstlisting}[language=Java]
{
int i = 0;
while (i < 9) {
foo();
i++;
}
}
\end{lstlisting}

105
doc/typecheck.tex Normal file
View File

@@ -0,0 +1,105 @@
\section{Typecheck}
\subsection{Überblick \& Struktur}
Die Typprüfung beginnt mit der Funktion \texttt{typeCheckCompilationUnit}, die eine Kompilationseinheit als Eingabe erhält.
Diese Kompilationseinheit besteht aus einer Liste von Klassen. Jede Klasse wird einzeln durch die Funktion \texttt{typeCheckClass} überprüft.
Innerhalb dieser Funktion wird eine Symboltabelle erstellt, die den Namen der Klasse als Typ und \texttt{this} als Identifier enthält.
Diese Symboltabelle wird verwendet, um Typinformationen nach dem Lokalitätsprinzip während der Typprüfung zugänglich zu machen und zu verwalten.
Die Typprüfung einer Klasse umfasst die Überprüfung aller Konstruktoren, Methoden und Felder.
Die Methode \texttt{typeCheckConstructorDeclaration} ist für die Typprüfung einzelner Konstruktordeklarationen verantwortlich,
während \texttt{typeCheckMethodDeclaration} für die Typprüfung einzelner Methodendeklarationen zuständig ist.
Beide Funktionen überprüfen die Parameter und den Rumpf der jeweiligen Konstruktoren bzw. Methoden.
Der Rumpf wird durch rekursive Aufrufe von \texttt{typeCheckStatement} überprüft, die verschiedene Arten von Anweisungen wie If-Anweisungen,
While-Schleifen, Rückgabeanweisungen und Blockanweisungen behandelt.
\subsection{Ablauf \& Symboltabellen}
Eine zentrale Komponente des Typecheckers ist die Symboltabelle ("symtab"), die Informationen über die Bezeichner und ihre zugehörigen Datentypen speichert.
Die Symboltabelle wird kontinuierlich angepasst, während der Typechecker die verschiedenen Teile des Programms durchläuft.
\subsubsection{Anpassung der Symboltabelle}
\begin{description}
\item[Klassenkontext] Beim Typcheck einer Klasse wird eine initiale Symboltabelle erstellt, die die \texttt{this}-Referenz enthält.
Dies geschieht in der Funktion \texttt{typeCheckClass}.
\item[Konstruktorkontext] Innerhalb eines Konstruktors wird die Symboltabelle um die Parameter des Konstruktors erweitert.
Dies geschieht in \texttt{typeCheckConstructorDeclaration}. Der Rückgabetyp eines Konstruktors ist implizit \texttt{void},
was überprüft wird, um sicherzustellen, dass kein Wert zurückgegeben wird.
\item[Methodenkontext] Innerhalb einer Methode wird die Symboltabelle um die Parameter der Methode erweitert sowie den Rückgabetyp der Methode,
um die einzelnen Returns dagegen zu prüfen. Dies geschieht in \texttt{typeCheckMethodDeclaration}.
\item[Blockkontext] Bei der Überprüfung eines Blocks (\texttt{typeCheckStatement} für Block) wird die Symboltabelle für jede Anweisung
innerhalb des Blocks aktualisiert. Lokale Variablen, die innerhalb des Blocks deklariert werden, werden zur Symboltabelle hinzugefügt.
Das bedeutet, dass automatisch, sobald der Block zu Ende ist, alle dort deklarierten Variablen danach nicht mehr zugänglich sind.
\end{description}
\subsubsection{Unterscheidung zwischen lokalen und Feldvariablen}
Bei der Typprüfung von Referenzen (\texttt{typeCheckExpression} für Reference) wird zuerst in der Symboltabelle nach dem Bezeichner gesucht.
Sollte dieser gefunden werden, handelt es sich um eine lokale Variable. Wenn der Bezeichner nicht gefunden wird, wird angenommen,
dass es sich um eine Feldvariable handelt. In diesem Fall wird die Klasse, zu der die \texttt{this}-Referenz gehört, durchsucht,
um die Feldvariable zu finden. Dies ermöglicht die Unterscheidung zwischen lokalen Variablen und Feldvariablen.
Dies ist auch nur möglich, da wir die Feldvariablen und Methoden nicht in die Symboltabelle gelegt haben und stattdessen nur die \texttt{this}-Referenz.
\subsection{Fehlerbehandlung}
Ein zentraler Aspekt des Typecheckers ist die Fehlerbehandlung. Bei Typinkonsistenzen oder ungültigen Operationen werden
aussagekräftige Fehlermeldungen generiert. Beispiele für solche Fehlermeldungen sind:
\begin{description}
\item[Typinkonsistenzen] Wenn der Rückgabetyp einer Methode nicht mit dem deklarierten Rückgabetyp übereinstimmt oder die Anzahl der Parameter nicht übereinstimmt.
\item[Ungültige Operationen] Wenn eine arithmetische Operation auf inkompatiblen Typen durchgeführt wird.
\item[Nicht gefundene Bezeichner] Wenn eine Referenz auf eine nicht definierte Variable verweist.
\end{description}
Diese Fehlermeldungen helfen Entwicklern, die Ursachen von Typfehlern schnell zu identifizieren und zu beheben.
Generell sind diese oftmals sehr spezifisch, was das Problem recht schnell identifizieren sollte.
Z.B. falsche Reihenfolge / falsche Typen der Parameter beim Methodenaufruf sind direkt erkennbar.
\subsection{Typprüfung von Kontrollstrukturen und Blöcken}
\subsubsection{If-Anweisungen}
Bei der Typprüfung einer If-Anweisung (\texttt{typeCheckStatement} für If) wird zuerst der Typ der Bedingung überprüft, um sicherzustellen,
dass es sich um einen booleschen Ausdruck handelt. Anschließend werden die Then- und Else-Zweige geprüft.
Der Typ der If-Anweisung selbst wird durch die Vereinheitlichung der Typen der Then- und Else-Zweige bestimmt.
Falls einer der Zweige keinen Rückgabewert hat, wird angenommen, dass der Rückgabewert \texttt{void} ist.
Dies wurde so gelöst, um im Typchecker feststellen zu können, ob beide Zweige einen Return haben.
Wenn nur einer der Zweige ein Return hat, wird im umliegenden Block ein weiteres benötigt, was durch den Typ \texttt{void} erzwungen wird.
Dadurch weiß der Typchecker Bescheid.
\subsubsection{Block-Anweisungen}
Die Typprüfung eines Blocks erfolgt in \texttt{typeCheckStatement} für Block.
Jede Anweisung im Block wird nacheinander überprüft und die Symboltabelle wird entsprechend aktualisiert.
Der Typ des Blocks wird durch die Vereinheitlichung der Typen aller Anweisungen im Block bestimmt.
Wenn der Block keine Anweisungen hat, wird der Typ \texttt{void} angenommen.
\subsubsection{Rückgabeanweisungen}
Die Typprüfung einer Rückgabeanweisung (\texttt{typeCheckStatement} für Return) überprüft,
ob der Rückgabewert der Anweisung mit dem deklarierten Rückgabetyp der Methode übereinstimmt.
Dafür wurde zu Beginn der Methodentypprüfung der Rückgabetyp der Methode in die Symboltabelle eingetragen. Wenn der Rückgabewert \texttt{null} ist,
wird überprüft, ob der deklarierte Rückgabetyp ein Objekttyp ist. Dies stellt sicher, dass Methoden immer den korrekten Typ zurückgeben.
Generell wird bei der Prüfung nach dem UpperBound geschaut und ebenfalls wird nachgeschaut, ob, wenn der Rückgabetyp \texttt{Object} ist,
der Return-Wert auch eine tatsächlich existierende Klasse ist, indem in die Klassentabelle geschaut wird.
\subsubsection{Konstruktorüberladung und -prüfung}
Die Typprüfung unterstützt Konstruktorüberladung. Bei der Typprüfung von Konstruktoraufrufen (\texttt{typeCheckStatementExpression}
für \texttt{ConstructorCall}) wird überprüft, ob es mehrere Konstruktoren mit derselben Anzahl von Parametern gibt.
Falls mehrere passende Konstruktoren gefunden werden, wird ein Fehler gemeldet.
\begin{description}
\item[Parameterabgleich] Die Parameter eines Konstruktors werden gegen die Argumente des Aufrufs abgeglichen.
Dies umfasst die Prüfung der Typen und, falls es sich um \texttt{null} handelt, die Überprüfung, ob der Parameter ein Objekttyp ist.
\item[Fehlerbehandlung] Wenn kein passender Konstruktor gefunden wird, wird eine detaillierte Fehlermeldung generiert,
die die erwarteten Signaturen und die tatsächlichen Argumenttypen anzeigt. Wenn mehrere passende Konstruktoren gefunden werden,
wird ebenfalls ein Fehler gemeldet.
\end{description}
\subsubsection{Methodenüberladung und -prüfung}
Die Typprüfung unterstützt auch Methodenüberladung. Bei der Typprüfung von Methodenaufrufen (\texttt{typeCheckStatementExpression} für \texttt{MethodCall})
wird überprüft, ob es mehrere Methoden mit demselben Namen, aber unterschiedlichen Parametertypen gibt.
\begin{description}
\item[Parameterabgleich] Die Parameter einer Methode werden gegen die Argumente des Aufrufs abgeglichen.
Dies umfasst die Prüfung der Typen und, falls es sich um \texttt{null} handelt, die Überprüfung, ob der Parameter ein Objekttyp ist.
\item[Fehlerbehandlung] Wenn keine passende Methode gefunden wird, wird eine detaillierte Fehlermeldung generiert,
die die erwarteten Signaturen und die tatsächlichen Argumenttypen anzeigt. Wenn mehrere passende Methoden gefunden werden,
wird ebenfalls ein Fehler gemeldet.
\end{description}

19
doc/whodunit.tex Normal file
View File

@@ -0,0 +1,19 @@
\section{Aufgabenverteilung}
\begin{description}
\item[Marvin Schlegel] Parser \& Lexer
\item[Fabian Noll] Semantik- \& Typcheck
\item[Christian Brier] Bytecodegenerierung
\item[Matthias Raba] Bytecodegenerierung
\end{description}
\vspace{20px}
Marvin Schlegel und Fabian Noll haben ihre Teilaufgaben eigenständig bearbeitet.
Die Bytecodegenerierung wurde von Matthias Raba und Christian Brier im Stile des Pair Programmings zu zweit erarbeitet.
Durch bisher gute Erfahrungen in vorherigen Projekten, sowie dem Interesse, alle Teile der Bytecodegenerierung zu sehen,
wurde diese Programmierungsform als die Beste ausgewählt.
Während der Implementierungsphase wurde viel zwischen den 3 einzelnen Teams kommuniziert.
Wurden Fehler in einer der Komponenten gefunden, wurden die jeweiligen Verantwortlichen informiert um das Problem zu beheben.
Jedes der Teams arbeitete auf einem eigenen Branch, die einzelnen Beiträge wurde regelmäßig auf dem master-Branch zusammengeführt.
Insgesamt lief die Implementierungsphase wie geplant und ohne weitere Komplikationen ab.

View File

@@ -10,20 +10,20 @@ executable compiler
array,
HUnit,
utf8-string,
bytestring
bytestring,
filepath
default-language: Haskell2010
hs-source-dirs: src,
src/ByteCode,
src/ByteCode/ClassFile
hs-source-dirs: src
build-tool-depends: alex:alex, happy:happy
other-modules: Parser.Lexer,
Parser.JavaParser
Parser.JavaParser,
Ast,
Example,
Typecheck,
ByteCode.Util,
ByteCode.ByteUtil,
ByteCode.ClassFile,
ByteCode.ClassFile.Generator,
ByteCode.Assembler,
ByteCode.Builder,
ByteCode.Constants
test-suite tests
@@ -34,15 +34,11 @@ test-suite tests
array,
HUnit,
utf8-string,
bytestring
bytestring,
filepath
build-tool-depends: alex:alex, happy:happy
other-modules: Parser.Lexer,
Parser.JavaParser,
Ast,
TestLexer,
TestParser,
TestByteCodeGenerator,
ByteCode.ByteUtil,
ByteCode.ClassFile,
ByteCode.ClassFile.Generator,
ByteCode.Constants
TestParser

View File

@@ -2,12 +2,14 @@ module Ast where
type CompilationUnit = [Class]
type DataType = String
type Identifier = String
type Identifier = String
type Modifier = String
data ParameterDeclaration = ParameterDeclaration DataType Identifier deriving (Show, Eq)
data VariableDeclaration = VariableDeclaration DataType Identifier (Maybe Expression) deriving (Show, Eq)
data Class = Class DataType [MethodDeclaration] [VariableDeclaration] deriving (Show, Eq)
data MethodDeclaration = MethodDeclaration DataType Identifier [ParameterDeclaration] Statement deriving (Show, Eq)
data VariableDeclaration = VariableDeclaration DataType Modifier Identifier (Maybe Expression) deriving (Show, Eq)
data Class = Class DataType Modifier [ConstructorDeclaration] [MethodDeclaration] [VariableDeclaration] deriving (Show, Eq)
data MethodDeclaration = MethodDeclaration DataType Modifier Identifier [ParameterDeclaration] Statement deriving (Show, Eq)
data ConstructorDeclaration = ConstructorDeclaration Modifier Identifier [ParameterDeclaration] Statement deriving (Show, Eq)
data Statement
= If Expression Statement (Maybe Statement)
@@ -23,11 +25,11 @@ data StatementExpression
= Assignment Expression Expression
| ConstructorCall DataType [Expression]
| MethodCall Expression Identifier [Expression]
| TypedStatementExpression DataType StatementExpression
| PostIncrement Expression
| PostDecrement Expression
| PreIncrement Expression
| PreDecrement Expression
| TypedStatementExpression DataType StatementExpression
deriving (Show, Eq)
data BinaryOperator

275
src/ByteCode/Assembler.hs Normal file
View File

@@ -0,0 +1,275 @@
module ByteCode.Assembler where
import ByteCode.Constants
import ByteCode.ClassFile (ClassFile (..), ConstantInfo (..), MemberInfo(..), Operation(..), Attribute(..), opcodeEncodingLength)
import ByteCode.Util
import Ast
import Data.Char
import Data.List
import Data.Word
type Assembler a = ([ConstantInfo], [Operation], [String]) -> a -> ([ConstantInfo], [Operation], [String])
assembleExpression :: Assembler Expression
assembleExpression (constants, ops, lvars) (TypedExpression _ (BinaryOperation op a b))
| op `elem` [Addition, Subtraction, Multiplication, Division, Modulo, BitwiseAnd, BitwiseOr, BitwiseXor, And, Or] = let
(aConstants, aOps, _) = assembleExpression (constants, ops, lvars) a
(bConstants, bOps, _) = assembleExpression (aConstants, aOps, lvars) b
in
(bConstants, bOps ++ [binaryOperation op], lvars)
| op `elem` [CompareEqual, CompareNotEqual, CompareLessThan, CompareLessOrEqual, CompareGreaterThan, CompareGreaterOrEqual] = let
(aConstants, aOps, _) = assembleExpression (constants, ops, lvars) a
(bConstants, bOps, _) = assembleExpression (aConstants, aOps, lvars) b
cmp_op = comparisonOperation op 9
cmp_ops = [cmp_op, Opsipush 0, Opgoto 6, Opsipush 1]
in
(bConstants, bOps ++ cmp_ops, lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ (BinaryOperation NameResolution (TypedExpression atype a) (TypedExpression btype (FieldVariable b)))) = let
(fConstants, fieldIndex) = getFieldIndex constants (atype, b, datatypeDescriptor btype)
(aConstants, aOps, _) = assembleExpression (fConstants, ops, lvars) (TypedExpression atype a)
in
(aConstants, aOps ++ [Opgetfield (fromIntegral fieldIndex)], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ (CharacterLiteral literal)) =
(constants, ops ++ [Opsipush (fromIntegral (ord literal))], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ (BooleanLiteral literal)) =
(constants, ops ++ [Opsipush (if literal then 1 else 0)], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ (IntegerLiteral literal))
| literal <= 32767 && literal >= -32768 = (constants, ops ++ [Opsipush (fromIntegral literal)], lvars)
| otherwise = (constants ++ [IntegerInfo (fromIntegral literal)], ops ++ [Opldc_w (fromIntegral (1 + length constants))], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ NullLiteral) =
(constants, ops ++ [Opaconst_null], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression etype (UnaryOperation Not expr)) = let
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
newConstant = fromIntegral (1 + length exprConstants)
in case etype of
"int" -> (exprConstants ++ [IntegerInfo 0x7FFFFFFF], exprOps ++ [Opldc_w newConstant, Opixor], lvars)
"char" -> (exprConstants, exprOps ++ [Opsipush 0xFFFF, Opixor], lvars)
"boolean" -> (exprConstants, exprOps ++ [Opsipush 0x01, Opixor], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression _ (UnaryOperation Minus expr)) = let
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
in
(exprConstants, exprOps ++ [Opineg], lvars)
assembleExpression (constants, ops, lvars) (TypedExpression dtype (LocalVariable name))
| name == "this" = (constants, ops ++ [Opaload 0], lvars)
| otherwise = let
localIndex = elemIndex name lvars
isPrimitive = elem dtype ["char", "boolean", "int"]
in case localIndex of
Just index -> (constants, ops ++ if isPrimitive then [Opiload (fromIntegral index)] else [Opaload (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
assembleExpression (constants, ops, lvars) (TypedExpression dtype (StatementExpressionExpression stmtexp)) =
assembleStatementExpression (constants, ops, lvars) stmtexp
assembleExpression _ expr = error ("Unknown expression: " ++ show expr)
assembleNameChain :: Assembler Expression
assembleNameChain input (TypedExpression _ (BinaryOperation NameResolution (TypedExpression atype a) (TypedExpression _ (FieldVariable _)))) =
assembleExpression input (TypedExpression atype a)
assembleNameChain input expr = assembleExpression input expr
assembleStatementExpression :: Assembler StatementExpression
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression _ (Assignment (TypedExpression dtype receiver) expr)) = let
target = resolveNameChain (TypedExpression dtype receiver)
in case target of
(TypedExpression dtype (LocalVariable name)) -> let
localIndex = elemIndex name lvars
(constants_a, ops_a, _) = assembleExpression (constants, ops, lvars) expr
isPrimitive = elem dtype ["char", "boolean", "int"]
in case localIndex of
Just index -> (constants_a, ops_a ++ if isPrimitive then [Opdup, Opistore (fromIntegral index)] else [Opastore (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
(TypedExpression dtype (FieldVariable name)) -> let
owner = resolveNameChainOwner (TypedExpression dtype receiver)
in case owner of
(TypedExpression otype _) -> let
(constants_f, fieldIndex) = getFieldIndex constants (otype, name, datatypeDescriptor dtype)
(constants_r, ops_r, _) = assembleNameChain (constants_f, ops, lvars) (TypedExpression dtype receiver)
(constants_a, ops_a, _) = assembleExpression (constants_r, ops_r, lvars) expr
in
(constants_a, ops_a ++ [Opdup_x1, Opputfield (fromIntegral fieldIndex)], lvars)
something_else -> error ("Expected TypedExpression, but got: " ++ show something_else)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression _ (PreIncrement (TypedExpression dtype receiver))) = let
target = resolveNameChain (TypedExpression dtype receiver)
in case target of
(TypedExpression dtype (LocalVariable name)) -> let
localIndex = elemIndex name lvars
expr = TypedExpression dtype (LocalVariable name)
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
in case localIndex of
Just index -> (exprConstants, exprOps ++ [Opsipush 1, Opiadd, Opdup, Opistore (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
(TypedExpression dtype (FieldVariable name)) -> let
owner = resolveNameChainOwner (TypedExpression dtype receiver)
in case owner of
(TypedExpression otype _) -> let
(constants_f, fieldIndex) = getFieldIndex constants (otype, name, datatypeDescriptor dtype)
(constants_r, ops_r, _) = assembleNameChain (constants_f, ops, lvars) (TypedExpression dtype receiver)
in
(constants_r, ops_r ++ [Opdup, Opgetfield (fromIntegral fieldIndex), Opsipush 1, Opiadd, Opdup_x1, Opputfield (fromIntegral fieldIndex)], lvars)
something_else -> error ("Expected TypedExpression, but got: " ++ show something_else)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression _ (PreDecrement (TypedExpression dtype receiver))) = let
target = resolveNameChain (TypedExpression dtype receiver)
in case target of
(TypedExpression dtype (LocalVariable name)) -> let
localIndex = elemIndex name lvars
expr = TypedExpression dtype (LocalVariable name)
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
in case localIndex of
Just index -> (exprConstants, exprOps ++ [Opsipush 1, Opisub, Opdup, Opistore (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
(TypedExpression dtype (FieldVariable name)) -> let
owner = resolveNameChainOwner (TypedExpression dtype receiver)
in case owner of
(TypedExpression otype _) -> let
(constants_f, fieldIndex) = getFieldIndex constants (otype, name, datatypeDescriptor dtype)
(constants_r, ops_r, _) = assembleNameChain (constants_f, ops, lvars) (TypedExpression dtype receiver)
in
(constants_r, ops_r ++ [Opdup, Opgetfield (fromIntegral fieldIndex), Opsipush 1, Opisub, Opdup_x1, Opputfield (fromIntegral fieldIndex)], lvars)
something_else -> error ("Expected TypedExpression, but got: " ++ show something_else)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression _ (PostIncrement (TypedExpression dtype receiver))) = let
target = resolveNameChain (TypedExpression dtype receiver)
in case target of
(TypedExpression dtype (LocalVariable name)) -> let
localIndex = elemIndex name lvars
expr = TypedExpression dtype (LocalVariable name)
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
in case localIndex of
Just index -> (exprConstants, exprOps ++ [Opdup, Opsipush 1, Opiadd, Opistore (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
(TypedExpression dtype (FieldVariable name)) -> let
owner = resolveNameChainOwner (TypedExpression dtype receiver)
in case owner of
(TypedExpression otype _) -> let
(constants_f, fieldIndex) = getFieldIndex constants (otype, name, datatypeDescriptor dtype)
(constants_r, ops_r, _) = assembleNameChain (constants_f, ops, lvars) (TypedExpression dtype receiver)
in
(constants_r, ops_r ++ [Opdup, Opgetfield (fromIntegral fieldIndex), Opdup_x1, Opsipush 1, Opiadd, Opputfield (fromIntegral fieldIndex)], lvars)
something_else -> error ("Expected TypedExpression, but got: " ++ show something_else)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression _ (PostDecrement (TypedExpression dtype receiver))) = let
target = resolveNameChain (TypedExpression dtype receiver)
in case target of
(TypedExpression dtype (LocalVariable name)) -> let
localIndex = elemIndex name lvars
expr = TypedExpression dtype (LocalVariable name)
(exprConstants, exprOps, _) = assembleExpression (constants, ops, lvars) expr
in case localIndex of
Just index -> (exprConstants, exprOps ++ [Opdup, Opsipush 1, Opisub, Opistore (fromIntegral index)], lvars)
Nothing -> error ("No such local variable found in local variable pool: " ++ name)
(TypedExpression dtype (FieldVariable name)) -> let
owner = resolveNameChainOwner (TypedExpression dtype receiver)
in case owner of
(TypedExpression otype _) -> let
(constants_f, fieldIndex) = getFieldIndex constants (otype, name, datatypeDescriptor dtype)
(constants_r, ops_r, _) = assembleNameChain (constants_f, ops, lvars) (TypedExpression dtype receiver)
in
(constants_r, ops_r ++ [Opdup, Opgetfield (fromIntegral fieldIndex), Opdup_x1, Opsipush 1, Opisub, Opputfield (fromIntegral fieldIndex)], lvars)
something_else -> error ("Expected TypedExpression, but got: " ++ show something_else)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression rtype (MethodCall (TypedExpression otype receiver) name params)) = let
(constants_r, ops_r, lvars_r) = assembleExpression (constants, ops, lvars) (TypedExpression otype receiver)
(constants_p, ops_p, lvars_p) = foldl assembleExpression (constants_r, ops_r, lvars_r) params
(constants_m, methodIndex) = getMethodIndex constants_p (otype, name, methodDescriptorFromParamlist params rtype)
in
(constants_m, ops_p ++ [Opinvokevirtual (fromIntegral methodIndex)], lvars_p)
assembleStatementExpression
(constants, ops, lvars)
(TypedStatementExpression rtype (ConstructorCall name params)) = let
(constants_c, classIndex) = getClassIndex constants name
(constants_p, ops_p, lvars_p) = foldl assembleExpression (constants_c, ops ++ [Opnew (fromIntegral classIndex), Opdup], lvars) params
(constants_m, methodIndex) = getMethodIndex constants_p (name, "<init>", methodDescriptorFromParamlist params "void")
in
(constants_m, ops_p ++ [Opinvokespecial (fromIntegral methodIndex)], lvars_p)
assembleStatement :: Assembler Statement
assembleStatement (constants, ops, lvars) (TypedStatement stype (Return expr)) = case expr of
Nothing -> (constants, ops ++ [Opreturn], lvars)
Just expr -> let
(expr_constants, expr_ops, _) = assembleExpression (constants, ops, lvars) expr
in
(expr_constants, expr_ops ++ [returnOperation stype], lvars)
assembleStatement (constants, ops, lvars) (TypedStatement _ (Block statements)) =
foldl assembleStatement (constants, ops, lvars) statements
assembleStatement (constants, ops, lvars) (TypedStatement dtype (If expr if_stmt else_stmt)) = let
(constants_cmp, ops_cmp, _) = assembleExpression (constants, [], lvars) expr
(constants_ifa, ops_ifa, lvars_ifa) = assembleStatement (constants_cmp, [], lvars) if_stmt
(constants_elsea, ops_elsea, _) = case else_stmt of
Nothing -> (constants_ifa, [], lvars_ifa)
Just stmt -> assembleStatement (constants_ifa, [], lvars_ifa) stmt
-- +6 because we insert 2 gotos, one for if, one for else
if_length = sum (map opcodeEncodingLength ops_ifa)
-- +3 because we need to account for the goto in the if statement.
else_length = sum (map opcodeEncodingLength ops_elsea)
in case dtype of
"void" -> (constants_ifa, ops ++ ops_cmp ++ [Opsipush 0, Opif_icmpeq (if_length + 6)] ++ ops_ifa ++ [Opgoto (else_length + 3)] ++ ops_elsea, lvars_ifa)
_ -> (constants_ifa, ops ++ ops_cmp ++ [Opsipush 0, Opif_icmpeq (if_length + 3)] ++ ops_ifa ++ ops_elsea, lvars_ifa)
assembleStatement (constants, ops, lvars) (TypedStatement _ (While expr stmt)) = let
(constants_cmp, ops_cmp, _) = assembleExpression (constants, [], lvars) expr
(constants_stmta, ops_stmta, lvars_stmta) = assembleStatement (constants_cmp, [], lvars) stmt
-- +3 because we insert 2 gotos, one for the comparison, one for the goto back to the comparison
stmt_length = sum (map opcodeEncodingLength ops_stmta) + 6
entire_length = stmt_length + sum (map opcodeEncodingLength ops_cmp)
in
(constants_stmta, ops ++ ops_cmp ++ [Opsipush 0, Opif_icmpeq stmt_length] ++ ops_stmta ++ [Opgoto (-entire_length)], lvars_stmta)
assembleStatement (constants, ops, lvars) (TypedStatement _ (LocalVariableDeclaration (VariableDeclaration dtype name expr))) = let
isPrimitive = elem dtype ["char", "boolean", "int"]
(constants_init, ops_init, _) = case expr of
Just exp -> assembleExpression (constants, ops, lvars) exp
Nothing -> (constants, ops ++ if isPrimitive then [Opsipush 0] else [Opaconst_null], lvars)
localIndex = fromIntegral (length lvars)
storeLocal = if isPrimitive then [Opistore localIndex] else [Opastore localIndex]
in
(constants_init, ops_init ++ storeLocal, lvars ++ [name])
assembleStatement (constants, ops, lvars) (TypedStatement _ (StatementExpressionStatement expr)) = let
(constants_e, ops_e, lvars_e) = assembleStatementExpression (constants, ops, lvars) expr
in
(constants_e, ops_e ++ [Oppop], lvars_e)
assembleStatement _ stmt = error ("Unknown statement: " ++ show stmt)
assembleMethod :: Assembler MethodDeclaration
assembleMethod (constants, ops, lvars) (MethodDeclaration returntype name _ (TypedStatement _ (Block statements)))
| name == "<init>" = let
(constants_a, ops_a, lvars_a) = foldl assembleStatement (constants, ops, lvars) statements
in
(constants_a, [Opaload 0, Opinvokespecial 2] ++ ops_a ++ [Opreturn], lvars_a)
| otherwise = case returntype of
"void" -> let
(constants_a, ops_a, lvars_a) = foldl assembleStatement (constants, ops, lvars) statements
in
(constants_a, ops_a ++ [Opreturn], lvars_a)
_ -> foldl assembleStatement (constants, ops, lvars) statements
assembleMethod _ (MethodDeclaration _ _ _ stmt) = error ("Typed block expected for method body, got: " ++ show stmt)

128
src/ByteCode/Builder.hs Normal file
View File

@@ -0,0 +1,128 @@
module ByteCode.Builder where
import ByteCode.Constants
import ByteCode.ClassFile (ClassFile (..), ConstantInfo (..), MemberInfo(..), Operation(..), Attribute(..), opcodeEncodingLength)
import ByteCode.Assembler
import ByteCode.Util
import Ast
import Data.Char
import Data.List
import Data.Word
type ClassFileBuilder a = a -> ClassFile -> ClassFile
fieldBuilder :: ClassFileBuilder VariableDeclaration
fieldBuilder (VariableDeclaration datatype name _) input = let
baseIndex = 1 + length (constantPool input)
constants = [
FieldRefInfo (fromIntegral (thisClass input)) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info name,
Utf8Info (datatypeDescriptor datatype)
]
field = MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = fromIntegral (baseIndex + 2),
memberDescriptorIndex = fromIntegral (baseIndex + 3),
memberAttributes = []
}
in
input {
constantPool = constantPool input ++ constants,
fields = fields input ++ [field]
}
methodBuilder :: ClassFileBuilder MethodDeclaration
methodBuilder (MethodDeclaration returntype name parameters statement) input = let
baseIndex = 1 + length (constantPool input)
constants = [
MethodRefInfo (fromIntegral (thisClass input)) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info name,
Utf8Info (methodDescriptor (MethodDeclaration returntype name parameters (Block [])))
]
method = MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = fromIntegral (baseIndex + 2),
memberDescriptorIndex = fromIntegral (baseIndex + 3),
memberAttributes = []
}
in
input {
constantPool = constantPool input ++ constants,
methods = methods input ++ [method]
}
constructorBuilder :: ClassFileBuilder ConstructorDeclaration
constructorBuilder (ConstructorDeclaration name parameters statement) = methodBuilder (MethodDeclaration "void" "<init>" parameters statement)
methodAssembler :: ClassFileBuilder MethodDeclaration
methodAssembler (MethodDeclaration returntype name parameters statement) input = let
methodConstantIndex = findMethodIndex input (MethodDeclaration returntype name parameters statement)
in case methodConstantIndex of
Nothing -> error ("Cannot find method entry in method pool for method: " ++ name)
Just index -> let
declaration = MethodDeclaration returntype name parameters statement
paramNames = "this" : [name | ParameterDeclaration _ name <- parameters]
in case splitAt index (methods input) of
(pre, []) -> input
(pre, method : post) -> let
(constants, bytecode, aParamNames) = assembleMethod (constantPool input, [], paramNames) declaration
assembledMethod = method {
memberAttributes = [
CodeAttribute {
attributeMaxStack = fromIntegral $ maxStackDepth constants bytecode,
attributeMaxLocals = fromIntegral $ length aParamNames,
attributeCode = bytecode
}
]
}
in
input {
constantPool = constants,
methods = pre ++ (assembledMethod : post)
}
constructorAssembler :: ClassFileBuilder ConstructorDeclaration
constructorAssembler (ConstructorDeclaration name parameters statement) = methodAssembler (MethodDeclaration "void" "<init>" parameters statement)
classBuilder :: ClassFileBuilder Class
classBuilder (Class name constructors methods fields) _ = let
baseConstants = [
ClassInfo 4,
MethodRefInfo 1 3,
NameAndTypeInfo 5 6,
Utf8Info "java/lang/Object",
Utf8Info "<init>",
Utf8Info "()V",
Utf8Info "Code",
ClassInfo 9,
Utf8Info name
]
nakedClassFile = ClassFile {
constantPool = baseConstants,
accessFlags = accessPublic,
thisClass = 8,
superClass = 1,
fields = [],
methods = [],
attributes = []
}
-- for every constructor, prepend all initialization assignments for fields.
constructorsWithInitializers = injectFieldInitializers name fields constructors
-- add fields, then method bodies, then constructor bodies to the classfile. After all referable names are known,
-- assemble the methods and constructors into bytecode.
fieldsAdded = foldr fieldBuilder nakedClassFile fields
methodsAdded = foldr methodBuilder fieldsAdded methods
constructorsAdded = foldr constructorBuilder methodsAdded constructorsWithInitializers
methodsAssembled = foldr methodAssembler constructorsAdded methods
constructorsAssembled = foldr constructorAssembler methodsAssembled constructorsWithInitializers
in
constructorsAssembled

View File

@@ -1,7 +1,6 @@
module ByteCode.ByteUtil(unpackWord16, unpackWord32) where
module ByteCode.ByteUtil where
import Data.Word ( Word8, Word16, Word32 )
import Data.Int
import Data.Bits
unpackWord16 :: Word16 -> [Word8]

View File

@@ -1,19 +1,11 @@
module ByteCode.ClassFile(
ConstantInfo(..),
Attribute(..),
MemberInfo(..),
ClassFile(..),
Operation(..),
serialize,
emptyClassFile
) where
module ByteCode.ClassFile where
import Data.Word
import Data.Int
import Data.ByteString (unpack)
import Data.ByteString.UTF8 (fromString)
import ByteCode.ByteUtil
import ByteCode.Constants
import ByteCode.ByteUtil
data ConstantInfo = ClassInfo Word16
| FieldRefInfo Word16 Word16
@@ -27,10 +19,13 @@ data Operation = Opiadd
| Opisub
| Opimul
| Opidiv
| Opirem
| Opiand
| Opior
| Opixor
| Opineg
| Opdup
| Opnew Word16
| Opif_icmplt Word16
| Opif_icmple Word16
| Opif_icmpgt Word16
@@ -41,6 +36,11 @@ data Operation = Opiadd
| Opreturn
| Opireturn
| Opareturn
| Opdup_x1
| Oppop
| Opinvokespecial Word16
| Opinvokevirtual Word16
| Opgoto Word16
| Opsipush Word16
| Opldc_w Word16
| Opaload Word16
@@ -48,7 +48,7 @@ data Operation = Opiadd
| Opastore Word16
| Opistore Word16
| Opputfield Word16
| OpgetField Word16
| Opgetfield Word16
deriving (Show, Eq)
@@ -87,6 +87,17 @@ emptyClassFile = ClassFile {
attributes = []
}
className :: ClassFile -> String
className classFile = let
classInfo = constantPool classFile !! fromIntegral (thisClass classFile)
in case classInfo of
Utf8Info className -> className
unexpected_element -> error ("expected Utf8Info but got: " ++ show unexpected_element)
opcodeEncodingLength :: Operation -> Word16
opcodeEncodingLength op = fromIntegral . length . serialize $ op
class Serializable a where
serialize :: a -> [Word8]
@@ -108,41 +119,49 @@ instance Serializable MemberInfo where
++ concatMap serialize (memberAttributes member)
instance Serializable Operation where
serialize Opiadd = [0x60]
serialize Opisub = [0x64]
serialize Opimul = [0x68]
serialize Opidiv = [0x6C]
serialize Opiand = [0x7E]
serialize Opior = [0x80]
serialize Opixor = [0x82]
serialize Opineg = [0x74]
serialize (Opif_icmplt branch) = 0xA1 : unpackWord16 branch
serialize (Opif_icmple branch) = 0xA4 : unpackWord16 branch
serialize (Opif_icmpgt branch) = 0xA3 : unpackWord16 branch
serialize (Opif_icmpge branch) = 0xA2 : unpackWord16 branch
serialize (Opif_icmpeq branch) = 0x9F : unpackWord16 branch
serialize (Opif_icmpne branch) = 0xA0 : unpackWord16 branch
serialize Opaconst_null = [0x01]
serialize Opreturn = [0xB1]
serialize Opireturn = [0xAC]
serialize Opareturn = [0xB0]
serialize (Opsipush index) = 0x11 : unpackWord16 index
serialize (Opldc_w index) = 0x13 : unpackWord16 index
serialize (Opaload index) = [0xC4, 0x19] ++ unpackWord16 index
serialize (Opiload index) = [0xC4, 0x15] ++ unpackWord16 index
serialize (Opastore index) = [0xC4, 0x3A] ++ unpackWord16 index
serialize (Opistore index) = [0xC4, 0x36] ++ unpackWord16 index
serialize (Opputfield index) = 0xB5 : unpackWord16 index
serialize (OpgetField index) = 0xB4 : unpackWord16 index
serialize Opiadd = [0x60]
serialize Opisub = [0x64]
serialize Opimul = [0x68]
serialize Opidiv = [0x6C]
serialize Opirem = [0x70]
serialize Opiand = [0x7E]
serialize Opior = [0x80]
serialize Opixor = [0x82]
serialize Opineg = [0x74]
serialize Opdup = [0x59]
serialize (Opnew index) = 0xBB : unpackWord16 index
serialize (Opif_icmplt branch) = 0xA1 : unpackWord16 branch
serialize (Opif_icmple branch) = 0xA4 : unpackWord16 branch
serialize (Opif_icmpgt branch) = 0xA3 : unpackWord16 branch
serialize (Opif_icmpge branch) = 0xA2 : unpackWord16 branch
serialize (Opif_icmpeq branch) = 0x9F : unpackWord16 branch
serialize (Opif_icmpne branch) = 0xA0 : unpackWord16 branch
serialize Opaconst_null = [0x01]
serialize Opreturn = [0xB1]
serialize Opireturn = [0xAC]
serialize Opareturn = [0xB0]
serialize Opdup_x1 = [0x5A]
serialize Oppop = [0x57]
serialize (Opinvokespecial index) = 0xB7 : unpackWord16 index
serialize (Opinvokevirtual index) = 0xB6 : unpackWord16 index
serialize (Opgoto index) = 0xA7 : unpackWord16 index
serialize (Opsipush index) = 0x11 : unpackWord16 index
serialize (Opldc_w index) = 0x13 : unpackWord16 index
serialize (Opaload index) = [0xC4, 0x19] ++ unpackWord16 index
serialize (Opiload index) = [0xC4, 0x15] ++ unpackWord16 index
serialize (Opastore index) = [0xC4, 0x3A] ++ unpackWord16 index
serialize (Opistore index) = [0xC4, 0x36] ++ unpackWord16 index
serialize (Opputfield index) = 0xB5 : unpackWord16 index
serialize (Opgetfield index) = 0xB4 : unpackWord16 index
instance Serializable Attribute where
serialize (CodeAttribute { attributeMaxStack = maxStack,
attributeMaxLocals = maxLocals,
attributeCode = code }) = let
assembledCode = concat (map serialize code)
assembledCode = concatMap serialize code
in
unpackWord16 7 -- attribute_name_index
++ unpackWord32 (12 + (fromIntegral (length assembledCode))) -- attribute_length
++ unpackWord32 (12 + fromIntegral (length assembledCode)) -- attribute_length
++ unpackWord16 maxStack -- max_stack
++ unpackWord16 maxLocals -- max_locals
++ unpackWord32 (fromIntegral (length assembledCode)) -- code_length
@@ -151,7 +170,7 @@ instance Serializable Attribute where
++ unpackWord16 0 -- attributes_count
instance Serializable ClassFile where
serialize classfile = unpackWord32 0xC0FEBABE -- magic
serialize classfile = unpackWord32 0xCAFEBABE -- magic
++ unpackWord16 0 -- minor version
++ unpackWord16 49 -- major version
++ unpackWord16 (fromIntegral (1 + length (constantPool classfile))) -- constant pool count

View File

@@ -1,169 +0,0 @@
module ByteCode.ClassFile.Generator(
classBuilder,
datatypeDescriptor,
methodParameterDescriptor,
methodDescriptor,
) where
import ByteCode.Constants
import ByteCode.ClassFile (ClassFile (..), ConstantInfo (..), MemberInfo(..), Operation(..), Attribute(..))
import Ast
import Data.Char
type ClassFileBuilder a = a -> ClassFile -> ClassFile
datatypeDescriptor :: String -> String
datatypeDescriptor "void" = "V"
datatypeDescriptor "int" = "I"
datatypeDescriptor "char" = "C"
datatypeDescriptor "boolean" = "B"
datatypeDescriptor x = "L" ++ x
methodParameterDescriptor :: String -> String
methodParameterDescriptor "void" = "V"
methodParameterDescriptor "int" = "I"
methodParameterDescriptor "char" = "C"
methodParameterDescriptor "boolean" = "B"
methodParameterDescriptor x = "L" ++ x ++ ";"
methodDescriptor :: MethodDeclaration -> String
methodDescriptor (MethodDeclaration returntype _ parameters _) = let
parameter_types = [datatype | ParameterDeclaration datatype _ <- parameters]
in
"("
++ (concat (map methodParameterDescriptor parameter_types))
++ ")"
++ datatypeDescriptor returntype
classBuilder :: ClassFileBuilder Class
classBuilder (Class name methods fields) _ = let
baseConstants = [
ClassInfo 4,
MethodRefInfo 1 3,
NameAndTypeInfo 5 6,
Utf8Info "java/lang/Object",
Utf8Info "<init>",
Utf8Info "()V",
Utf8Info "Code"
]
nameConstants = [ClassInfo 9, Utf8Info name]
nakedClassFile = ClassFile {
constantPool = baseConstants ++ nameConstants,
accessFlags = accessPublic,
thisClass = 8,
superClass = 1,
fields = [],
methods = [],
attributes = []
}
in
foldr methodBuilder (foldr fieldBuilder nakedClassFile fields) methods
fieldBuilder :: ClassFileBuilder VariableDeclaration
fieldBuilder (VariableDeclaration datatype name _) input = let
baseIndex = 1 + length (constantPool input)
constants = [
FieldRefInfo (fromIntegral (thisClass input)) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info name,
Utf8Info (datatypeDescriptor datatype)
]
field = MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = (fromIntegral (baseIndex + 2)),
memberDescriptorIndex = (fromIntegral (baseIndex + 3)),
memberAttributes = []
}
in
input {
constantPool = (constantPool input) ++ constants,
fields = (fields input) ++ [field]
}
methodBuilder :: ClassFileBuilder MethodDeclaration
methodBuilder (MethodDeclaration returntype name parameters statement) input = let
baseIndex = 1 + length (constantPool input)
constants = [
FieldRefInfo (fromIntegral (thisClass input)) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info name,
Utf8Info (methodDescriptor (MethodDeclaration returntype name parameters (Block [])))
]
--code = assembleByteCode statement
method = MemberInfo {
memberAccessFlags = accessPublic,
memberNameIndex = (fromIntegral (baseIndex + 2)),
memberDescriptorIndex = (fromIntegral (baseIndex + 3)),
memberAttributes = [
CodeAttribute {
attributeMaxStack = 420,
attributeMaxLocals = 420,
attributeCode = [Opiadd]
}
]
}
in
input {
constantPool = (constantPool input) ++ constants,
methods = (fields input) ++ [method]
}
type Assembler a = a -> ([ConstantInfo], [Operation]) -> ([ConstantInfo], [Operation])
returnOperation :: DataType -> Operation
returnOperation dtype
| elem dtype ["int", "char", "boolean"] = Opireturn
| otherwise = Opareturn
binaryOperation :: BinaryOperator -> Operation
binaryOperation Addition = Opiadd
binaryOperation Subtraction = Opisub
binaryOperation Multiplication = Opimul
binaryOperation Division = Opidiv
binaryOperation BitwiseAnd = Opiand
binaryOperation BitwiseOr = Opior
binaryOperation BitwiseXor = Opixor
assembleMethod :: Assembler MethodDeclaration
assembleMethod (MethodDeclaration _ _ _ (Block statements)) (constants, ops) =
foldr assembleStatement (constants, ops) statements
assembleStatement :: Assembler Statement
assembleStatement (TypedStatement stype (Return expr)) (constants, ops) = case expr of
Nothing -> (constants, ops ++ [Opreturn])
Just expr -> let
(expr_constants, expr_ops) = assembleExpression expr (constants, ops)
in
(expr_constants, expr_ops ++ [returnOperation stype])
assembleExpression :: Assembler Expression
assembleExpression (TypedExpression _ (BinaryOperation op a b)) (constants, ops)
| elem op [Addition, Subtraction, Multiplication, Division, BitwiseAnd, BitwiseOr, BitwiseXor] = let
(aConstants, aOps) = assembleExpression a (constants, ops)
(bConstants, bOps) = assembleExpression b (aConstants, aOps)
in
(bConstants, bOps ++ [binaryOperation op])
assembleExpression (TypedExpression _ (CharacterLiteral literal)) (constants, ops) =
(constants, ops ++ [Opsipush (fromIntegral (ord literal))])
assembleExpression (TypedExpression _ (BooleanLiteral literal)) (constants, ops) =
(constants, ops ++ [Opsipush (if literal then 1 else 0)])
assembleExpression (TypedExpression _ (IntegerLiteral literal)) (constants, ops)
| literal <= 32767 && literal >= -32768 = (constants, ops ++ [Opsipush (fromIntegral literal)])
| otherwise = (constants ++ [IntegerInfo (fromIntegral literal)], ops ++ [Opldc_w (fromIntegral (1 + length constants))])
assembleExpression (TypedExpression _ NullLiteral) (constants, ops) =
(constants, ops ++ [Opaconst_null])
assembleExpression (TypedExpression etype (UnaryOperation Not expr)) (constants, ops) = let
(exprConstants, exprOps) = assembleExpression expr (constants, ops)
newConstant = fromIntegral (1 + length exprConstants)
in case etype of
"int" -> (exprConstants ++ [IntegerInfo 0x7FFFFFFF], exprOps ++ [Opldc_w newConstant, Opixor])
"char" -> (exprConstants, exprOps ++ [Opsipush 0xFFFF, Opixor])
"boolean" -> (exprConstants, exprOps ++ [Opsipush 0x01, Opixor])
assembleExpression (TypedExpression _ (UnaryOperation Minus expr)) (constants, ops) = let
(exprConstants, exprOps) = assembleExpression expr (constants, ops)
in
(exprConstants, exprOps ++ [Opineg])

291
src/ByteCode/Util.hs Normal file
View File

@@ -0,0 +1,291 @@
{-# OPTIONS_GHC -Wno-unrecognised-pragmas #-}
{-# HLINT ignore "Use lambda-case" #-}
module ByteCode.Util where
import Data.Int
import Ast
import ByteCode.ClassFile
import Data.List
import Data.Maybe (mapMaybe, isJust)
import Data.Word (Word8, Word16, Word32)
-- walks the name resolution chain. returns the innermost Just LocalVariable/FieldVariable or Nothing.
resolveNameChain :: Expression -> Expression
resolveNameChain (TypedExpression _ (BinaryOperation NameResolution a b)) = resolveNameChain b
resolveNameChain (TypedExpression dtype (LocalVariable name)) = TypedExpression dtype (LocalVariable name)
resolveNameChain (TypedExpression dtype (FieldVariable name)) = TypedExpression dtype (FieldVariable name)
resolveNameChain invalidExpression = error ("expected a NameResolution or Local/Field Variable but got: " ++ show invalidExpression)
-- walks the name resolution chain. returns the second-to-last item of the namechain.
resolveNameChainOwner :: Expression -> Expression
resolveNameChainOwner (TypedExpression _ (BinaryOperation NameResolution a (TypedExpression dtype (FieldVariable name)))) = a
resolveNameChainOwner (TypedExpression _ (BinaryOperation NameResolution a b)) = resolveNameChain b
resolveNameChainOwner invalidExpression = error ("expected a NameResolution or Local/Field Variable but got: " ++ show invalidExpression)
methodDescriptor :: MethodDeclaration -> String
methodDescriptor (MethodDeclaration returntype _ parameters _) = let
parameter_types = [datatype | ParameterDeclaration datatype _ <- parameters]
in
"("
++ concatMap datatypeDescriptor parameter_types
++ ")"
++ datatypeDescriptor returntype
methodDescriptorFromParamlist :: [Expression] -> String -> String
methodDescriptorFromParamlist parameters returntype = let
parameter_types = [datatype | TypedExpression datatype _ <- parameters]
in
"("
++ concatMap datatypeDescriptor parameter_types
++ ")"
++ datatypeDescriptor returntype
-- recursively parses a given type signature into a list of parameter types and the method return type.
-- As an initial parameter, you can supply ([], "void").
parseMethodType :: ([String], String) -> String -> ([String], String)
parseMethodType (params, returnType) ('(' : descriptor) = parseMethodType (params, returnType) descriptor
parseMethodType (params, returnType) ('I' : descriptor) = parseMethodType (params ++ ["I"], returnType) descriptor
parseMethodType (params, returnType) ('C' : descriptor) = parseMethodType (params ++ ["C"], returnType) descriptor
parseMethodType (params, returnType) ('Z' : descriptor) = parseMethodType (params ++ ["Z"], returnType) descriptor
parseMethodType (params, returnType) ('L' : descriptor) = let
typeLength = elemIndex ';' descriptor
in case typeLength of
Just length -> let
(typeName, semicolon : restOfDescriptor) = splitAt length descriptor
in
parseMethodType (params ++ [typeName], returnType) restOfDescriptor
Nothing -> error $ "unterminated class type in function signature: " ++ show descriptor
parseMethodType (params, _) (')' : descriptor) = (params, descriptor)
parseMethodType _ descriptor = error $ "expected start of type name (L, I, C, Z) but got: " ++ descriptor
-- given a method index (constant pool index),
-- returns the full type of the method. (i.e (LSomething;II)V)
methodTypeFromIndex :: [ConstantInfo] -> Int -> String
methodTypeFromIndex constants index = case constants !! fromIntegral (index - 1) of
MethodRefInfo _ nameAndTypeIndex -> case constants !! fromIntegral (nameAndTypeIndex - 1) of
NameAndTypeInfo _ typeIndex -> case constants !! fromIntegral (typeIndex - 1) of
Utf8Info typeLiteral -> typeLiteral
unexpectedElement -> error "Expected Utf8Info but got: " ++ show unexpectedElement
unexpectedElement -> error "Expected NameAndTypeInfo but got: " ++ show unexpectedElement
unexpectedElement -> error "Expected MethodRefInfo but got: " ++ show unexpectedElement
methodParametersFromIndex :: [ConstantInfo] -> Int -> ([String], String)
methodParametersFromIndex constants index = parseMethodType ([], "V") (methodTypeFromIndex constants index)
memberInfoIsMethod :: [ConstantInfo] -> MemberInfo -> Bool
memberInfoIsMethod constants info = '(' `elem` memberInfoDescriptor constants info
datatypeDescriptor :: String -> String
datatypeDescriptor "void" = "V"
datatypeDescriptor "int" = "I"
datatypeDescriptor "char" = "C"
datatypeDescriptor "boolean" = "Z"
datatypeDescriptor x = "L" ++ x ++ ";"
memberInfoDescriptor :: [ConstantInfo] -> MemberInfo -> String
memberInfoDescriptor constants MemberInfo { memberDescriptorIndex = descriptorIndex } = let
descriptor = constants !! (fromIntegral descriptorIndex - 1)
in case descriptor of
Utf8Info descriptorText -> descriptorText
_ -> "Invalid Item at Constant pool index " ++ show descriptorIndex
memberInfoName :: [ConstantInfo] -> MemberInfo -> String
memberInfoName constants MemberInfo { memberNameIndex = nameIndex } = let
name = constants !! (fromIntegral nameIndex - 1)
in case name of
Utf8Info nameText -> nameText
_ -> "Invalid Item at Constant pool index " ++ show nameIndex
returnOperation :: DataType -> Operation
returnOperation dtype
| dtype `elem` ["int", "char", "boolean"] = Opireturn
| otherwise = Opareturn
binaryOperation :: BinaryOperator -> Operation
binaryOperation Addition = Opiadd
binaryOperation Subtraction = Opisub
binaryOperation Multiplication = Opimul
binaryOperation Division = Opidiv
binaryOperation Modulo = Opirem
binaryOperation BitwiseAnd = Opiand
binaryOperation BitwiseOr = Opior
binaryOperation BitwiseXor = Opixor
binaryOperation And = Opiand
binaryOperation Or = Opior
comparisonOperation :: BinaryOperator -> Word16 -> Operation
comparisonOperation CompareEqual branchLocation = Opif_icmpeq branchLocation
comparisonOperation CompareNotEqual branchLocation = Opif_icmpne branchLocation
comparisonOperation CompareLessThan branchLocation = Opif_icmplt branchLocation
comparisonOperation CompareLessOrEqual branchLocation = Opif_icmple branchLocation
comparisonOperation CompareGreaterThan branchLocation = Opif_icmpgt branchLocation
comparisonOperation CompareGreaterOrEqual branchLocation = Opif_icmpge branchLocation
comparisonOffset :: Operation -> Maybe Int
comparisonOffset (Opif_icmpeq offset) = Just $ fromIntegral offset
comparisonOffset (Opif_icmpne offset) = Just $ fromIntegral offset
comparisonOffset (Opif_icmplt offset) = Just $ fromIntegral offset
comparisonOffset (Opif_icmple offset) = Just $ fromIntegral offset
comparisonOffset (Opif_icmpgt offset) = Just $ fromIntegral offset
comparisonOffset (Opif_icmpge offset) = Just $ fromIntegral offset
comparisonOffset anything_else = Nothing
isComparisonOperation :: Operation -> Bool
isComparisonOperation op = isJust (comparisonOffset op)
findMethodIndex :: ClassFile -> MethodDeclaration -> Maybe Int
findMethodIndex classFile (MethodDeclaration rtype name params stmt) = let
constants = constantPool classFile
descriptor = methodDescriptor (MethodDeclaration rtype name params stmt)
in
findIndex (\method -> memberInfoIsMethod constants method && memberInfoName constants method == name && memberInfoDescriptor constants method == descriptor) (methods classFile)
findClassIndex :: [ConstantInfo] -> String -> Maybe Int
findClassIndex constants name = let
classNameIndices = [(index, constants!!(fromIntegral nameIndex - 1)) | (index, ClassInfo nameIndex) <- zip [1..] constants]
classNames = map (\(index, nameInfo) -> case nameInfo of
Utf8Info className -> (index, className)
something_else -> error ("Expected UTF8Info but got " ++ show something_else))
classNameIndices
desiredClassIndex = find (\(index, className) -> className == name) classNames
in case desiredClassIndex of
Just (index, _) -> Just index
Nothing -> Nothing
getKnownMembers :: [ConstantInfo] -> [(Int, (String, String, String))]
getKnownMembers constants = let
fieldsClassAndNT = [
(index, constants!!(fromIntegral classIndex - 1), constants!!(fromIntegral nameTypeIndex - 1))
| (index, FieldRefInfo classIndex nameTypeIndex) <- zip [1..] constants
] ++ [
(index, constants!!(fromIntegral classIndex - 1), constants!!(fromIntegral nameTypeIndex - 1))
| (index, MethodRefInfo classIndex nameTypeIndex) <- zip [1..] constants
]
fieldsClassNameType = map (\(index, nameInfo, nameTypeInfo) -> case (nameInfo, nameTypeInfo) of
(ClassInfo nameIndex, NameAndTypeInfo fnameIndex ftypeIndex) -> (index, (constants!!(fromIntegral nameIndex - 1), constants!!(fromIntegral fnameIndex - 1), constants!!(fromIntegral ftypeIndex - 1)))
something_else -> error ("Expected Class and NameType info, but got: " ++ show nameInfo ++ " and " ++ show nameTypeInfo))
fieldsClassAndNT
fieldsResolved = map (\(index, (nameInfo, fnameInfo, ftypeInfo)) -> case (nameInfo, fnameInfo, ftypeInfo) of
(Utf8Info cname, Utf8Info fname, Utf8Info ftype) -> (index, (cname, fname, ftype))
something_else -> error ("Expected UTF8Infos but got " ++ show something_else))
fieldsClassNameType
in
fieldsResolved
-- same as findClassIndex, but inserts a new entry into constant pool if not existing
getClassIndex :: [ConstantInfo] -> String -> ([ConstantInfo], Int)
getClassIndex constants name = case findClassIndex constants name of
Just index -> (constants, index)
Nothing -> (constants ++ [ClassInfo (fromIntegral (length constants) + 2), Utf8Info name], fromIntegral (length constants) + 1)
-- get the index for a field within a class, creating it if it does not exist.
getFieldIndex :: [ConstantInfo] -> (String, String, String) -> ([ConstantInfo], Int)
getFieldIndex constants (cname, fname, ftype) = case findMemberIndex constants (cname, fname, ftype) of
Just index -> (constants, index)
Nothing -> let
(constantsWithClass, classIndex) = getClassIndex constants cname
baseIndex = 1 + length constantsWithClass
in
(constantsWithClass ++ [
FieldRefInfo (fromIntegral classIndex) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info fname,
Utf8Info (datatypeDescriptor ftype)
], baseIndex)
getMethodIndex :: [ConstantInfo] -> (String, String, String) -> ([ConstantInfo], Int)
getMethodIndex constants (cname, mname, mtype) = case findMemberIndex constants (cname, mname, mtype) of
Just index -> (constants, index)
Nothing -> let
(constantsWithClass, classIndex) = getClassIndex constants cname
baseIndex = 1 + length constantsWithClass
in
(constantsWithClass ++ [
MethodRefInfo (fromIntegral classIndex) (fromIntegral (baseIndex + 1)),
NameAndTypeInfo (fromIntegral (baseIndex + 2)) (fromIntegral (baseIndex + 3)),
Utf8Info mname,
Utf8Info mtype
], baseIndex)
findMemberIndex :: [ConstantInfo] -> (String, String, String) -> Maybe Int
findMemberIndex constants (cname, fname, ftype) = let
allMembers = getKnownMembers constants
desiredMember = find (\(index, (c, f, ft)) -> (c, f, ft) == (cname, fname, ftype)) allMembers
in
fmap fst desiredMember
injectFieldInitializers :: String -> [VariableDeclaration] -> [ConstructorDeclaration] -> [ConstructorDeclaration]
injectFieldInitializers classname vars constructors = let
initializers = mapMaybe (\variable -> case variable of
VariableDeclaration dtype name (Just initializer) -> Just (
TypedStatement dtype (
StatementExpressionStatement (
TypedStatementExpression dtype (
Assignment
(TypedExpression dtype (BinaryOperation NameResolution (TypedExpression classname (LocalVariable "this")) (TypedExpression dtype (FieldVariable name))))
initializer
)
)
)
)
otherwise -> Nothing
) vars
in
map (\con -> let
ConstructorDeclaration classname params (TypedStatement "void" (Block statements)) = con
in
ConstructorDeclaration classname params (TypedStatement "void" (Block (initializers ++ statements)))
) constructors
-- effect of one instruction/operation on the stack
operationStackCost :: [ConstantInfo] -> Operation -> Int
operationStackCost constants Opiadd = -1
operationStackCost constants Opisub = -1
operationStackCost constants Opimul = -1
operationStackCost constants Opidiv = -1
operationStackCost constants Opirem = -1
operationStackCost constants Opiand = -1
operationStackCost constants Opior = -1
operationStackCost constants Opixor = -1
operationStackCost constants Opineg = 0
operationStackCost constants Opdup = 1
operationStackCost constants (Opnew _) = 1
operationStackCost constants (Opif_icmplt _) = -2
operationStackCost constants (Opif_icmple _) = -2
operationStackCost constants (Opif_icmpgt _) = -2
operationStackCost constants (Opif_icmpge _) = -2
operationStackCost constants (Opif_icmpeq _) = -2
operationStackCost constants (Opif_icmpne _) = -2
operationStackCost constants Opaconst_null = 1
operationStackCost constants Opreturn = 0
operationStackCost constants Opireturn = -1
operationStackCost constants Opareturn = -1
operationStackCost constants Opdup_x1 = 1
operationStackCost constants Oppop = -1
operationStackCost constants (Opinvokespecial idx) = let
(params, returnType) = methodParametersFromIndex constants (fromIntegral idx)
in (length params + 1) - fromEnum (returnType /= "V")
operationStackCost constants (Opinvokevirtual idx) = let
(params, returnType) = methodParametersFromIndex constants (fromIntegral idx)
in (length params + 1) - fromEnum (returnType /= "V")
operationStackCost constants (Opgoto _) = 0
operationStackCost constants (Opsipush _) = 1
operationStackCost constants (Opldc_w _) = 1
operationStackCost constants (Opaload _) = 1
operationStackCost constants (Opiload _) = 1
operationStackCost constants (Opastore _) = -1
operationStackCost constants (Opistore _) = -1
operationStackCost constants (Opputfield _) = -2
operationStackCost constants (Opgetfield _) = 0
simulateStackOperation :: (Int, Int) -> [ConstantInfo] -> Operation -> (Int, Int)
simulateStackOperation (cd, md) constants op = let
depth = cd + operationStackCost constants op
in if depth < 0
then error ("Consuming value off of empty stack: " ++ show op)
else (depth, max depth md)
maxStackDepth :: [ConstantInfo] -> [Operation] -> Int
maxStackDepth constants ops = snd $ foldl (`simulateStackOperation` constants) (0, 0) ops

View File

@@ -1,267 +0,0 @@
module Example where
import Ast
import Typecheck
import Control.Exception (catch, evaluate, SomeException, displayException)
import Control.Exception.Base
import System.IO (stderr, hPutStrLn)
import Data.Maybe
import Data.List
green, red, yellow, blue, magenta, cyan, white :: String -> String
green str = "\x1b[32m" ++ str ++ "\x1b[0m"
red str = "\x1b[31m" ++ str ++ "\x1b[0m"
yellow str = "\x1b[33m" ++ str ++ "\x1b[0m"
blue str = "\x1b[34m" ++ str ++ "\x1b[0m"
magenta str = "\x1b[35m" ++ str ++ "\x1b[0m"
cyan str = "\x1b[36m" ++ str ++ "\x1b[0m"
white str = "\x1b[37m" ++ str ++ "\x1b[0m"
printSuccess :: String -> IO ()
printSuccess msg = putStrLn $ green "Success:" ++ white msg
handleError :: SomeException -> IO ()
handleError e = hPutStrLn stderr $ red ("Error: " ++ displayException e)
printResult :: Show a => String -> a -> IO ()
printResult title result = do
putStrLn $ green title
print result
sampleClasses :: [Class]
sampleClasses = [
Class "Person" [
MethodDeclaration "void" "setAge" [ParameterDeclaration "int" "newAge"]
(Block [
LocalVariableDeclaration (VariableDeclaration "int" "age" (Just (Reference "newAge")))
]),
MethodDeclaration "int" "getAge" [] (Return (Just (Reference "age"))),
MethodDeclaration "Person" "Person" [ParameterDeclaration "int" "initialAge"] (Block [])
] [
VariableDeclaration "int" "age" (Just (IntegerLiteral 25))
]
]
initialSymtab :: [(DataType, Identifier)]
initialSymtab = []
exampleExpression :: Expression
exampleExpression = BinaryOperation NameResolution (Reference "bob") (Reference "age")
exampleAssignment :: Expression
exampleAssignment = StatementExpressionExpression (Assignment (Reference "a") (IntegerLiteral 30))
exampleMethodCall :: Statement
exampleMethodCall = StatementExpressionStatement (MethodCall (Reference "this") "setAge" [IntegerLiteral 30])
exampleConstructorCall :: Statement
exampleConstructorCall = LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30]))))
exampleNameResolution :: Expression
exampleNameResolution = BinaryOperation NameResolution (Reference "bob2") (Reference "age")
exampleBlockResolution :: Statement
exampleBlockResolution = Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30])))),
LocalVariableDeclaration (VariableDeclaration "int" "age" (Just (StatementExpressionExpression (MethodCall (Reference "bob") "getAge" [])))),
StatementExpressionStatement (MethodCall (Reference "bob") "setAge" [IntegerLiteral 30])
]
exampleBlockResolutionFail :: Statement
exampleBlockResolutionFail = Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30])))),
LocalVariableDeclaration (VariableDeclaration "bool" "age" (Just (StatementExpressionExpression (MethodCall (Reference "bob") "getAge" [])))),
StatementExpressionStatement (MethodCall (Reference "bob") "setAge" [IntegerLiteral 30])
]
exampleMethodCallAndAssignment :: Statement
exampleMethodCallAndAssignment = Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30])))),
LocalVariableDeclaration (VariableDeclaration "int" "age" (Just (StatementExpressionExpression (MethodCall (Reference "bob") "getAge" [])))),
StatementExpressionStatement (MethodCall (Reference "bob") "setAge" [IntegerLiteral 30]),
LocalVariableDeclaration (VariableDeclaration "int" "a" Nothing),
StatementExpressionStatement (Assignment (Reference "a") (Reference "age"))
]
exampleMethodCallAndAssignmentFail :: Statement
exampleMethodCallAndAssignmentFail = Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30])))),
LocalVariableDeclaration (VariableDeclaration "int" "age" (Just (StatementExpressionExpression (MethodCall (Reference "bob") "getAge" [])))),
StatementExpressionStatement (MethodCall (Reference "bob") "setAge" [IntegerLiteral 30]),
StatementExpressionStatement (Assignment (Reference "a") (Reference "age"))
]
exampleNameResolutionAssignment :: Statement
exampleNameResolutionAssignment = Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 30])))),
StatementExpressionStatement (Assignment (BinaryOperation NameResolution (Reference "bob") (Reference "age")) (IntegerLiteral 30))
]
exampleCharIntOperation :: Expression
exampleCharIntOperation = BinaryOperation Addition (CharacterLiteral 'a') (IntegerLiteral 1)
exampleNullDeclaration :: Statement
exampleNullDeclaration = LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just NullLiteral))
exampleNullDeclarationFail :: Statement
exampleNullDeclarationFail = LocalVariableDeclaration (VariableDeclaration "int" "a" (Just NullLiteral))
exampleNullAssignment :: Statement
exampleNullAssignment = StatementExpressionStatement (Assignment (Reference "a") NullLiteral)
exampleIncrement :: Statement
exampleIncrement = StatementExpressionStatement (PostIncrement (Reference "a"))
testClasses :: [Class]
testClasses = [
Class "Person" [
MethodDeclaration "Person" "Person" [ParameterDeclaration "int" "initialAge"]
(Block [
Return (Just (Reference "this"))
]),
MethodDeclaration "void" "setAge" [ParameterDeclaration "int" "newAge"]
(Block [
LocalVariableDeclaration (VariableDeclaration "int" "age" (Just (Reference "newAge")))
]),
MethodDeclaration "int" "getAge" []
(Return (Just (Reference "age")))
] [
VariableDeclaration "int" "age" Nothing -- initially unassigned
],
Class "Main" [
MethodDeclaration "int" "main" []
(Block [
LocalVariableDeclaration (VariableDeclaration "Person" "bob" (Just (StatementExpressionExpression (ConstructorCall "Person" [IntegerLiteral 25])))),
StatementExpressionStatement (MethodCall (Reference "bob") "setAge" [IntegerLiteral 30]),
LocalVariableDeclaration (VariableDeclaration "int" "bobAge" (Just (StatementExpressionExpression (MethodCall (Reference "bob2") "getAge" [])))),
Return (Just (Reference "bobAge"))
])
] [
VariableDeclaration "Person" "bob2" Nothing
]
]
runTypeCheck :: IO ()
runTypeCheck = do
catch (do
print "====================================================================================="
evaluatedExpression <- evaluate (typeCheckExpression exampleExpression [("bob", "Person")] sampleClasses)
printSuccess "Type checking of expression completed successfully"
printResult "Result Expression:" evaluatedExpression
) handleError
catch (do
print "====================================================================================="
evaluatedAssignment <- evaluate (typeCheckExpression exampleAssignment [("a", "int")] sampleClasses)
printSuccess "Type checking of assignment completed successfully"
printResult "Result Assignment:" evaluatedAssignment
) handleError
catch (do
print "====================================================================================="
evaluatedMethodCall <- evaluate (typeCheckStatement exampleMethodCall [("this", "Person"), ("setAge", "Person"), ("getAge", "Person")] sampleClasses)
printSuccess "Type checking of method call this completed successfully"
printResult "Result MethodCall:" evaluatedMethodCall
) handleError
catch (do
print "====================================================================================="
evaluatedConstructorCall <- evaluate (typeCheckStatement exampleConstructorCall [] sampleClasses)
printSuccess "Type checking of constructor call completed successfully"
printResult "Result Constructor Call:" evaluatedConstructorCall
) handleError
catch (do
print "====================================================================================="
evaluatedNameResolution <- evaluate (typeCheckExpression exampleNameResolution [("this", "Main")] testClasses)
printSuccess "Type checking of name resolution completed successfully"
printResult "Result Name Resolution:" evaluatedNameResolution
) handleError
catch (do
print "====================================================================================="
evaluatedBlockResolution <- evaluate (typeCheckStatement exampleBlockResolution [] sampleClasses)
printSuccess "Type checking of block resolution completed successfully"
printResult "Result Block Resolution:" evaluatedBlockResolution
) handleError
catch (do
print "====================================================================================="
evaluatedBlockResolutionFail <- evaluate (typeCheckStatement exampleBlockResolutionFail [] sampleClasses)
printSuccess "Type checking of block resolution failed"
printResult "Result Block Resolution:" evaluatedBlockResolutionFail
) handleError
catch (do
print "====================================================================================="
evaluatedMethodCallAndAssignment <- evaluate (typeCheckStatement exampleMethodCallAndAssignment [] sampleClasses)
printSuccess "Type checking of method call and assignment completed successfully"
printResult "Result Method Call and Assignment:" evaluatedMethodCallAndAssignment
) handleError
catch (do
print "====================================================================================="
evaluatedMethodCallAndAssignmentFail <- evaluate (typeCheckStatement exampleMethodCallAndAssignmentFail [] sampleClasses)
printSuccess "Type checking of method call and assignment failed"
printResult "Result Method Call and Assignment:" evaluatedMethodCallAndAssignmentFail
) handleError
catch (do
print "====================================================================================="
let mainClass = fromJust $ find (\(Class className _ _) -> className == "Main") testClasses
case mainClass of
Class _ [mainMethod] _ -> do
let result = typeCheckMethodDeclaration mainMethod [("this", "Main")] testClasses
printSuccess "Full program type checking completed successfully."
printResult "Main method result:" result
) handleError
catch (do
print "====================================================================================="
let typedProgram = typeCheckCompilationUnit testClasses
printSuccess "Type checking of Program completed successfully"
printResult "Typed Program:" typedProgram
) handleError
catch (do
print "====================================================================================="
typedAssignment <- evaluate (typeCheckStatement exampleNameResolutionAssignment [] sampleClasses)
printSuccess "Type checking of name resolution assignment completed successfully"
printResult "Result Name Resolution Assignment:" typedAssignment
) handleError
catch (do
print "====================================================================================="
evaluatedCharIntOperation <- evaluate (typeCheckExpression exampleCharIntOperation [] sampleClasses)
printSuccess "Type checking of char int operation completed successfully"
printResult "Result Char Int Operation:" evaluatedCharIntOperation
) handleError
catch (do
print "====================================================================================="
evaluatedNullDeclaration <- evaluate (typeCheckStatement exampleNullDeclaration [] sampleClasses)
printSuccess "Type checking of null declaration completed successfully"
printResult "Result Null Declaration:" evaluatedNullDeclaration
) handleError
catch (do
print "====================================================================================="
evaluatedNullDeclarationFail <- evaluate (typeCheckStatement exampleNullDeclarationFail [] sampleClasses)
printSuccess "Type checking of null declaration failed"
printResult "Result Null Declaration:" evaluatedNullDeclarationFail
) handleError
catch (do
print "====================================================================================="
evaluatedNullAssignment <- evaluate (typeCheckStatement exampleNullAssignment [("a", "Person")] sampleClasses)
printSuccess "Type checking of null assignment completed successfully"
printResult "Result Null Assignment:" evaluatedNullAssignment
) handleError
catch (do
print "====================================================================================="
evaluatedIncrement <- evaluate (typeCheckStatement exampleIncrement [("a", "int")] sampleClasses)
printSuccess "Type checking of increment completed successfully"
printResult "Result Increment:" evaluatedIncrement
) handleError

View File

@@ -1,8 +1,32 @@
module Main where
import Example
import Typecheck
import Parser.Lexer (alexScanTokens)
import Parser.JavaParser
import ByteCode.Builder
import ByteCode.ClassFile
import Data.ByteString (pack, writeFile)
import System.Environment
import System.FilePath.Posix (takeDirectory)
main = do
Example.runTypeCheck
args <- getArgs
let filename = if null args
then error "Missing filename, I need to know what to compile"
else head args
let outputDirectory = takeDirectory filename
print ("Compiling " ++ filename)
file <- readFile filename
let untypedAST = parse $ alexScanTokens file
let typedAST = typeCheckCompilationUnit untypedAST
let assembledClasses = map (`classBuilder` emptyClassFile) typedAST
mapM_ (\classFile -> let
fileContent = pack (serialize classFile)
fileName = outputDirectory ++ "/" ++ className classFile ++ ".class"
in Data.ByteString.writeFile fileName fileContent
) assembledClasses

View File

@@ -1,5 +1,5 @@
{
module Parser.JavaParser (parse, parseStatement, parseExpression) where
module Parser.JavaParser (parse, parseStatement, parseExpression, parseMethod) where
import Ast
import Parser.Lexer
}
@@ -7,6 +7,7 @@ import Parser.Lexer
%name parse
%name parseStatement statement
%name parseExpression expression
%name parseMethod classbodydeclarations
%tokentype { Token }
%error { parseError }
%errorhandlertype explist
@@ -75,6 +76,7 @@ import Parser.Lexer
OREQUAL { OREQUAL }
COLON { COLON }
LESS { LESS }
FOR { FOR }
%%
compilationunit : typedeclarations { $1 }
@@ -83,41 +85,43 @@ typedeclarations : typedeclaration { [$1] }
| typedeclarations typedeclaration { $1 ++ [$2] }
name : simplename { Reference $1 }
-- | qualifiedname { }
| qualifiedname { $1 }
typedeclaration : classdeclaration { $1 }
qualifiedname : name DOT IDENTIFIER { }
qualifiedname : name DOT IDENTIFIER { BinaryOperation NameResolution $1 (Reference $3) }
simplename : IDENTIFIER { $1 }
classdeclaration : CLASS IDENTIFIER classbody { case $3 of (methods, fields) -> Class $2 methods fields }
| modifiers CLASS IDENTIFIER classbody { case $4 of (methods, fields) -> Class $3 methods fields }
classdeclaration : CLASS IDENTIFIER classbody { case $3 of (constructors, methods, fields) -> Class $2 "public" constructors methods fields }
| modifiers CLASS IDENTIFIER classbody { case $4 of (constructors, methods, fields) -> Class $3 (head $1) constructors methods fields }
classbody : LBRACKET RBRACKET { ([], []) }
classbody : LBRACKET RBRACKET { ([], [], []) }
| LBRACKET classbodydeclarations RBRACKET { $2 }
modifiers : modifier { }
| modifiers modifier { }
modifiers : modifier { [$1] }
| modifiers modifier { $1 ++ [$2] }
classbodydeclarations : classbodydeclaration {
case $1 of
MethodDecl method -> ([method], [])
FieldDecls fields -> ([], fields)
ConstructorDecl constructor -> ([constructor], [], [])
MethodDecl method -> ([], (convertMethodDeclarationWithOptionals method), [])
FieldDecls fields -> ([], [], fields)
}
| classbodydeclarations classbodydeclaration {
case ($1, $2) of
((methods, fields), MethodDecl method) -> ((methods ++ [method]), fields)
((methods, fields), FieldDecls newFields) -> (methods, (fields ++ newFields))
((constructors, methods, fields), ConstructorDecl constructor) -> ((constructors ++ [constructor]), methods, fields)
((constructors, methods, fields), MethodDecl method) -> (constructors, (methods ++ (convertMethodDeclarationWithOptionals method)), fields)
((constructors, methods, fields), FieldDecls newFields) -> (constructors, methods, (fields ++ newFields))
}
modifier : PUBLIC { }
| PROTECTED { }
| PRIVATE { }
| STATIC { }
| ABSTRACT { }
modifier : PUBLIC { "public" }
| PROTECTED { "protected" }
| PRIVATE { "private" }
| STATIC { "static" }
| ABSTRACT { "abstract" }
classtype : classorinterfacetype{ }
classtype : classorinterfacetype { $1 }
classbodydeclaration : classmemberdeclaration { $1 }
| constructordeclaration { $1 }
@@ -127,29 +131,29 @@ classorinterfacetype : simplename { $1 }
classmemberdeclaration : fielddeclaration { $1 }
| methoddeclaration { $1 }
constructordeclaration : constructordeclarator constructorbody { MethodDecl $ MethodDeclaration "void" "<init>" $1 $2 }
| modifiers constructordeclarator constructorbody { MethodDecl $ MethodDeclaration "void" "<init>" $2 $3 }
constructordeclaration : constructordeclarator constructorbody { case $1 of (identifier, parameters) -> ConstructorDecl $ ConstructorDeclaration "public" identifier parameters $2 }
| modifiers constructordeclarator constructorbody { case $2 of (identifier, parameters) -> ConstructorDecl $ ConstructorDeclaration (head $1) identifier parameters $3 }
fielddeclaration : type variabledeclarators SEMICOLON { FieldDecls $ map (convertDeclarator $1) $2 }
| modifiers type variabledeclarators SEMICOLON { FieldDecls $ map (convertDeclarator $2) $3 }
fielddeclaration : type variabledeclarators SEMICOLON { FieldDecls $ map (convertDeclarator "public" $1) $2 }
| modifiers type variabledeclarators SEMICOLON { FieldDecls $ map (convertDeclarator (head $1) $2) $3 }
methoddeclaration : methodheader methodbody { case $1 of (returnType, (name, parameters)) -> MethodDecl (MethodDeclaration returnType name parameters $2) }
methoddeclaration : methodheader methodbody { case $1 of (returnType, modifier, (name, (parameters, optionalparameters))) -> MethodDecl (MethodDeclarationWithOptionals returnType modifier name parameters optionalparameters $2) }
block : LBRACKET RBRACKET { Block [] }
| LBRACKET blockstatements RBRACKET { Block $2 }
constructordeclarator : simplename LBRACE RBRACE { [] }
| simplename LBRACE formalparameterlist RBRACE { $3 }
constructordeclarator : simplename LBRACE RBRACE { ($1, []) }
| simplename LBRACE formalparameterlist RBRACE { ($1, $3) }
constructorbody : LBRACKET RBRACKET { Block [] }
-- | LBRACKET explicitconstructorinvocation RBRACKET { }
| LBRACKET blockstatements RBRACKET { Block $2 }
-- | LBRACKET explicitconstructorinvocation blockstatements RBRACKET { }
methodheader : type methoddeclarator { ($1, $2) }
| modifiers type methoddeclarator { ($2, $3) }
| VOID methoddeclarator { ("void", $2) }
| modifiers VOID methoddeclarator { ("void", $3)}
methodheader : type methoddeclarator { ($1, "public", $2) }
| modifiers type methoddeclarator { ($2, head $1, $3) }
| VOID methoddeclarator { ("void", "public", $2) }
| modifiers VOID methoddeclarator { ("void", head $1, $3) }
type : primitivetype { $1 }
| referencetype { $1 }
@@ -161,7 +165,11 @@ methodbody : block { $1 }
| SEMICOLON { Block [] }
blockstatements : blockstatement { $1 }
| blockstatements blockstatement { $1 ++ $2}
| blockstatements blockstatement { $1 ++ $2 }
formalandoptionalparameterlist : formalparameterlist { ($1, []) }
| formalparameterlist COMMA optionalparameterlist { ($1, $3) }
| optionalparameterlist { ([], $1) }
formalparameterlist : formalparameter { [$1] }
| formalparameterlist COMMA formalparameter { $1 ++ [$3] }
@@ -172,15 +180,19 @@ explicitconstructorinvocation : THIS LBRACE RBRACE SEMICOLON { }
classtypelist : classtype { }
| classtypelist COMMA classtype { }
methoddeclarator : IDENTIFIER LBRACE RBRACE { ($1, []) }
| IDENTIFIER LBRACE formalparameterlist RBRACE { ($1, $3) }
methoddeclarator : IDENTIFIER LBRACE RBRACE { ($1, ([], [])) }
| IDENTIFIER LBRACE formalandoptionalparameterlist RBRACE { ($1, $3) }
optionalparameterlist : optionalparameter { [$1] }
| optionalparameterlist COMMA optionalparameter { $1 ++ [$3] }
optionalparameter : type variabledeclaratorid ASSIGN variableinitializer { OptionalParameter $1 $2 $4 }
primitivetype : BOOLEAN { "boolean" }
| numerictype { $1 }
referencetype : classorinterfacetype { $1 }
variabledeclarator : variabledeclaratorid { Declarator $1 Nothing }
| variabledeclaratorid ASSIGN variableinitializer { Declarator $1 (Just $3) }
@@ -204,6 +216,7 @@ statement : statementwithouttrailingsubstatement{ $1 } -- statement retu
| ifthenstatement { [$1] }
| ifthenelsestatement { [$1] }
| whilestatement { [$1] }
| forstatement { [$1] }
expression : assignmentexpression { $1 }
@@ -211,7 +224,8 @@ expression : assignmentexpression { $1 }
integraltype : INT { "int" }
| CHAR { "char" }
localvariabledeclaration : type variabledeclarators { map LocalVariableDeclaration $ map (convertDeclarator $1) $2 }
localvariabledeclaration : type variabledeclarators { map (LocalVariableDeclaration . convertDeclarator "public" $1) $2 }
| modifiers type variabledeclarators { map (LocalVariableDeclaration . convertDeclarator (unwords $1) $2) $3 }
statementwithouttrailingsubstatement : block { [$1] }
| emptystatement { [] }
@@ -224,6 +238,21 @@ ifthenelsestatement : IF LBRACE expression RBRACE statementnoshortif ELSE state
whilestatement : WHILE LBRACE expression RBRACE statement { While $3 (Block $5) }
forstatement : FOR LBRACE forinit optionalexpression forupdate statement { Block ($3 ++ [While ($4) (Block ($6 ++ $5))]) }
forinit : statementexpressionlist SEMICOLON { $1 }
| localvariabledeclaration SEMICOLON { $1 }
| SEMICOLON { [] }
optionalexpression : expression SEMICOLON { $1 }
| SEMICOLON { BooleanLiteral True }
forupdate : statementexpressionlist RBRACE { $1 }
| RBRACE { [] }
statementexpressionlist : statementexpression { [StatementExpressionStatement $1] }
| statementexpressionlist COMMA statementexpression { $1 ++ [StatementExpressionStatement $3] }
assignmentexpression : conditionalexpression { $1 }
| assignment { StatementExpressionExpression $1 }
@@ -249,12 +278,12 @@ assignment : lefthandside assignmentoperator assignmentexpression {
statementexpression : assignment { $1 }
-- | preincrementexpression { }
-- | predecrementexpression { }
-- | postincrementexpression { }
-- | postdecrementexpression { }
| preincrementexpression { $1 }
| predecrementexpression { $1 }
| postincrementexpression { $1 }
| postdecrementexpression { $1 }
| methodinvocation { $1 }
-- | classinstancecreationexpression { }
| classinstancecreationexpression { $1 }
ifthenelsestatementnoshortif :IF LBRACE expression RBRACE statementnoshortif
ELSE statementnoshortif { }
@@ -265,6 +294,7 @@ conditionalorexpression : conditionalandexpression { $1 }
-- | conditionalorexpression LOGICALOR conditionalandexpression{ }
lefthandside : name { $1 }
| primary DOT IDENTIFIER { BinaryOperation NameResolution $1 (Reference $3) }
assignmentoperator : ASSIGN { Nothing }
| TIMESEQUAL { Just Multiplication }
@@ -287,17 +317,17 @@ postincrementexpression : postfixexpression INCREMENT { PostIncrement $1 }
postdecrementexpression : postfixexpression DECREMENT { PostDecrement $1 }
methodinvocation : simplename LBRACE RBRACE { MethodCall (Reference "this") $1 [] }
| simplename LBRACE argumentlist RBRACE { MethodCall (Reference "this") $1 $3 }
methodinvocation : name LBRACE RBRACE { let (exp, functionname) = extractFunctionName $1 in (MethodCall exp functionname []) }
| name LBRACE argumentlist RBRACE { let (exp, functionname) = extractFunctionName $1 in (MethodCall exp functionname $3) }
| primary DOT IDENTIFIER LBRACE RBRACE { MethodCall $1 $3 [] }
| primary DOT IDENTIFIER LBRACE argumentlist RBRACE { MethodCall $1 $3 $5 }
classinstancecreationexpression : NEW classtype LBRACE RBRACE { }
| NEW classtype LBRACE argumentlist RBRACE { }
classinstancecreationexpression : NEW classtype LBRACE RBRACE { ConstructorCall $2 [] }
| NEW classtype LBRACE argumentlist RBRACE { ConstructorCall $2 $4 }
conditionalandexpression : inclusiveorexpression { $1 }
fieldaccess : primary DOT IDENTIFIER { }
fieldaccess : primary DOT IDENTIFIER { BinaryOperation NameResolution $1 (Reference $3) }
unaryexpression : unaryexpressionnotplusminus { $1 }
| predecrementexpression { StatementExpressionExpression $1 }
@@ -318,8 +348,8 @@ inclusiveorexpression : exclusiveorexpression { $1 }
primarynonewarray : literal { $1 }
| THIS { Reference "this" }
| LBRACE expression RBRACE { $2 }
-- | classinstancecreationexpression { }
-- | fieldaccess { }
| classinstancecreationexpression { StatementExpressionExpression $1 }
| fieldaccess { $1 }
| methodinvocation { StatementExpressionExpression $1 }
unaryexpressionnotplusminus : postfixexpression { $1 }
@@ -366,16 +396,39 @@ multiplicativeexpression : unaryexpression { $1 }
{
data MethodOrFieldDeclaration = MethodDecl MethodDeclaration
| FieldDecls [VariableDeclaration]
data MemberDeclaration = MethodDecl MethodDeclarationWithOptionals
| ConstructorDecl ConstructorDeclaration
| FieldDecls [VariableDeclaration] deriving (Show)
data Declarator = Declarator Identifier (Maybe Expression)
convertDeclarator :: DataType -> Declarator -> VariableDeclaration
convertDeclarator dataType (Declarator id assigment) = VariableDeclaration dataType id assigment
convertDeclarator :: Modifier -> DataType -> Declarator -> VariableDeclaration
convertDeclarator modifier dataType (Declarator id assignment) = VariableDeclaration dataType modifier id assignment
data StatementWithoutSub = Statement
extractFunctionName :: Expression -> (Expression, Identifier)
extractFunctionName (BinaryOperation NameResolution exp (Reference functionname)) = (exp, functionname)
extractFunctionName (Reference functionname) = ((Reference "this"), functionname)
data OptionalParameter = OptionalParameter DataType Identifier Expression deriving (Show)
data MethodDeclarationWithOptionals = MethodDeclarationWithOptionals DataType String Identifier [ParameterDeclaration] [OptionalParameter] Statement deriving (Show)
convertMethodDeclarationWithOptionals :: MethodDeclarationWithOptionals -> [MethodDeclaration]
convertMethodDeclarationWithOptionals (MethodDeclarationWithOptionals returnType modifier id param [] stmt) = [MethodDeclaration returnType modifier id param stmt]
convertMethodDeclarationWithOptionals (MethodDeclarationWithOptionals returnType modifier id param (opt : optRest) stmt) = generateHelperMethod returnType modifier id param opt : convertMethodDeclarationWithOptionals (generateBaseMethod returnType modifier id param opt optRest stmt)
convertOptionalParameter :: OptionalParameter -> ParameterDeclaration
convertOptionalParameter (OptionalParameter dtype id exp) = ParameterDeclaration dtype id
generateHelperMethod :: DataType -> Modifier -> Identifier -> [ParameterDeclaration] -> OptionalParameter -> MethodDeclaration
generateHelperMethod returnType modifier methodName params (OptionalParameter dtype id exp) =
let references = ((map (\(ParameterDeclaration paramType ident) -> (Reference ident)) params) ++ [exp])
methodcall = (MethodCall (Reference "this") methodName references)
lastStatement = if returnType == "void" then StatementExpressionStatement methodcall else Return $ Just $ StatementExpressionExpression methodcall
in MethodDeclaration returnType modifier methodName params $ Block [lastStatement]
generateBaseMethod :: DataType -> Modifier -> Identifier -> [ParameterDeclaration] -> OptionalParameter -> [OptionalParameter] -> Statement -> MethodDeclarationWithOptionals
generateBaseMethod returnType modifier methodName params (OptionalParameter dtype id exp) optRest stmt = MethodDeclarationWithOptionals returnType modifier methodName (params ++ [ParameterDeclaration dtype id]) optRest stmt
parseError :: ([Token], [String]) -> a
parseError (errortoken, expected) = error ("parse error on token: " ++ show errortoken ++ "\nexpected one of: " ++ show expected)

View File

@@ -72,7 +72,7 @@ tokens :-
-- end keywords
$JavaLetter$JavaLetterOrDigit* { \s -> IDENTIFIER s }
-- Literals
[1-9]([0-9\_]*[0-9])* { \s -> case readMaybe $ filter ((/=) '_') s of Just a -> INTEGERLITERAL a; Nothing -> error ("failed to parse INTLITERAL " ++ s) }
[0-9]([0-9\_]*[0-9])* { \s -> case readMaybe $ filter ((/=) '_') s of Just a -> INTEGERLITERAL a; Nothing -> error ("failed to parse INTLITERAL " ++ s) }
"'"."'" { \s -> case (s) of _ : c : _ -> CHARLITERAL c; _ -> error ("failed to parse CHARLITERAL " ++ s) }
-- separators
"(" { \_ -> LBRACE }

View File

@@ -1,33 +1,82 @@
module Typecheck where
import Data.List (find)
import Data.Maybe
import Ast
typeCheckCompilationUnit :: CompilationUnit -> CompilationUnit
typeCheckCompilationUnit classes = map (`typeCheckClass` classes) classes
typeCheckCompilationUnit classes =
let
-- Helper function to add a default constructor if none are present
ensureDefaultConstructor :: Class -> Class
ensureDefaultConstructor (Class className modifier constructors methods fields) =
let
defaultConstructor = ConstructorDeclaration modifier className [] (Block [])
constructorsWithDefault = if null constructors then [defaultConstructor] else constructors
in Class className modifier constructorsWithDefault methods fields
-- Inject default constructors into all classes
classesWithDefaultConstructors = map ensureDefaultConstructor classes
in map (`typeCheckClass` classesWithDefaultConstructors) classesWithDefaultConstructors
typeCheckClass :: Class -> [Class] -> Class
typeCheckClass (Class className methods fields) classes =
typeCheckClass (Class className modifier constructors methods fields) classes =
let
-- Create a symbol table from class fields and method entries
-- TODO: Maybe remove method entries from the symbol table?
methodEntries = [(methodName, className) | MethodDeclaration _ methodName _ _ <- methods]
initalSymTab = ("this", className) : methodEntries
-- Fields and methods dont need to be added to the symtab because they are looked upon automatically under "this"
-- if its not a declared local variable. Also shadowing wouldnt be possible then.
initalSymTab = [("this", className)]
checkedConstructors = map (\constructor -> typeCheckConstructorDeclaration constructor initalSymTab classes) constructors
checkedMethods = map (\method -> typeCheckMethodDeclaration method initalSymTab classes) methods
in Class className checkedMethods fields
checkedFields = map (\field -> typeCheckVariableDeclaration field initalSymTab classes) fields
in Class className modifier checkedConstructors checkedMethods checkedFields
typeCheckMethodDeclaration :: MethodDeclaration -> [(Identifier, DataType)] -> [Class] -> MethodDeclaration
typeCheckMethodDeclaration (MethodDeclaration retType name params body) classFields classes =
typeCheckConstructorDeclaration :: ConstructorDeclaration -> [(Identifier, DataType)] -> [Class] -> ConstructorDeclaration
typeCheckConstructorDeclaration (ConstructorDeclaration modifier name params body) symtab classes =
let
-- Combine class fields with method parameters to form the initial symbol table for the method
methodParams = [(identifier, dataType) | ParameterDeclaration dataType identifier <- params]
initialSymtab = classFields ++ methodParams
constructorParams = [(identifier, dataType) | ParameterDeclaration dataType identifier <- params]
initialSymtab = symtab ++ constructorParams
className = fromMaybe (error "Constructor Declaration: 'this' not found in symtab") (lookup "this" symtab)
checkedBody = typeCheckStatement body initialSymtab classes
bodyType = getTypeFromStmt checkedBody
-- Check if the type of the body matches the declared return type
in if bodyType == retType || (bodyType == "void" && retType == "void") || (bodyType == "null" && isObjectType retType)
then MethodDeclaration retType name params checkedBody
else error $ "Return type mismatch in method " ++ name ++ ": expected " ++ retType ++ ", found " ++ bodyType
in if name == className
then if bodyType == "void"
then ConstructorDeclaration modifier name params checkedBody
else error $ "Constructor Declaration: Return type mismatch in constructor " ++ name ++ ": expected void, found " ++ bodyType
else error $ "Constructor Declaration: Constructor name " ++ name ++ " does not match class name " ++ className
typeCheckMethodDeclaration :: MethodDeclaration -> [(Identifier, DataType)] -> [Class] -> MethodDeclaration
typeCheckMethodDeclaration (MethodDeclaration retType modifier name params body) symtab classes =
let
methodParams = [(identifier, dataType) | ParameterDeclaration dataType identifier <- params]
initialSymtab = ("thisMeth", retType) : symtab ++ methodParams
checkedBody = typeCheckStatement body initialSymtab classes
bodyType = getTypeFromStmt checkedBody
in if bodyType == retType || (bodyType == "void" && retType == "void") || (bodyType == "null" && isObjectType retType) || isSubtype bodyType retType classes
then MethodDeclaration retType modifier name params checkedBody
else error $ "Method Declaration: Return type mismatch in method " ++ name ++ ": expected " ++ retType ++ ", found " ++ bodyType
typeCheckVariableDeclaration :: VariableDeclaration -> [(Identifier, DataType)] -> [Class] -> VariableDeclaration
typeCheckVariableDeclaration (VariableDeclaration dataType modifier identifier maybeExpr) symtab classes =
let
-- Ensure the type is valid (either a primitive type or a valid class name)
validType = dataType `elem` ["int", "boolean", "char"] || isUserDefinedClass dataType classes
-- Ensure no redefinition in the same scope
redefined = any ((== identifier) . snd) symtab
-- Type check the initializer expression if it exists
checkedExpr = fmap (\expr -> typeCheckExpression expr symtab classes) maybeExpr
exprType = fmap getTypeFromExpr checkedExpr
checkedExprWithType = case exprType of
Just "null" | isObjectType dataType -> Just (TypedExpression dataType NullLiteral)
_ -> checkedExpr
in case (validType, redefined, exprType) of
(False, _, _) -> error $ "Type '" ++ dataType ++ "' is not a valid type for variable '" ++ identifier ++ "'"
(_, True, _) -> error $ "Variable '" ++ identifier ++ "' is redefined in the same scope"
(_, _, Just t)
| t == "null" && isObjectType dataType -> VariableDeclaration dataType modifier identifier checkedExprWithType
| t /= dataType -> error $ "Type mismatch in declaration of '" ++ identifier ++ "': expected " ++ dataType ++ ", found " ++ t
| otherwise -> VariableDeclaration dataType modifier identifier checkedExprWithType
(_, _, Nothing) -> VariableDeclaration dataType modifier identifier checkedExprWithType
-- ********************************** Type Checking: Expressions **********************************
@@ -42,12 +91,13 @@ typeCheckExpression (Reference id) symtab classes =
Nothing ->
case lookup "this" symtab of
Just className ->
let classDetails = find (\(Class name _ _) -> name == className) classes
let classDetails = find (\(Class name _ _ _ _) -> name == className) classes
in case classDetails of
Just (Class _ _ fields) ->
let fieldTypes = [dt | VariableDeclaration dt fieldId _ <- fields, fieldId == id]
Just (Class _ _ _ _ fields) ->
let fieldTypes = [dt | VariableDeclaration dt modifier fieldId _ <- fields, fieldId == id]
-- this case only happens when its a field of its own class so the implicit this will be converted to explicit this
in case fieldTypes of
[fieldType] -> TypedExpression fieldType (FieldVariable id)
[fieldType] -> TypedExpression fieldType (BinaryOperation NameResolution (TypedExpression className (LocalVariable "this")) (TypedExpression fieldType (FieldVariable id)))
[] -> error $ "Field '" ++ id ++ "' not found in class '" ++ className ++ "'"
_ -> error $ "Ambiguous reference to field '" ++ id ++ "' in class '" ++ className ++ "'"
Nothing -> error $ "Class '" ++ className ++ "' not found for 'this'"
@@ -107,62 +157,88 @@ typeCheckStatementExpression (Assignment ref expr) symtab classes =
ref' = typeCheckExpression ref symtab classes
type' = getTypeFromExpr expr'
type'' = getTypeFromExpr ref'
typeToAssign = if type' == "null" && isObjectType type'' then type'' else type'
exprWithType = if type' == "null" && isObjectType type'' then TypedExpression type'' NullLiteral else expr'
in
if type'' == type' || (type' == "null" && isObjectType type'') then
TypedStatementExpression type'' (Assignment ref' expr')
if type'' == typeToAssign then
TypedStatementExpression type'' (Assignment ref' exprWithType)
else
error $ "Type mismatch in assignment to variable: expected " ++ type'' ++ ", found " ++ type'
error $ "Type mismatch in assignment to variable: expected " ++ type'' ++ ", found " ++ typeToAssign
typeCheckStatementExpression (ConstructorCall className args) symtab classes =
case find (\(Class name _ _) -> name == className) classes of
case find (\(Class name _ _ _ _) -> name == className) classes of
Nothing -> error $ "Class '" ++ className ++ "' not found."
Just (Class _ methods fields) ->
-- Constructor needs the same name as the class
case find (\(MethodDeclaration retType name params _) -> name == className && retType == className) methods of
Nothing -> error $ "No valid constructor found for class '" ++ className ++ "'."
Just (MethodDeclaration _ _ params _) ->
let
args' = map (\arg -> typeCheckExpression arg symtab classes) args
-- Extract expected parameter types from the constructor's parameters
expectedTypes = [dataType | ParameterDeclaration dataType _ <- params]
argTypes = map getTypeFromExpr args'
-- Check if the types of the provided arguments match the expected types
typeMatches = zipWith (\expected actual -> if expected == actual then Nothing else Just (expected, actual)) expectedTypes argTypes
mismatchErrors = map (\(exp, act) -> "Expected type '" ++ exp ++ "', found '" ++ act ++ "'.") (catMaybes typeMatches)
in
if length args /= length params then
error $ "Constructor for class '" ++ className ++ "' expects " ++ show (length params) ++ " arguments, but got " ++ show (length args) ++ "."
else if not (null mismatchErrors) then
error $ unlines $ ("Type mismatch in constructor arguments for class '" ++ className ++ "':") : mismatchErrors
else
TypedStatementExpression className (ConstructorCall className args')
Just (Class _ modifier constructors _ _) ->
let matchParams (ParameterDeclaration paramType _) arg =
let argTyped = typeCheckExpression arg symtab classes
argType = getTypeFromExpr argTyped
in if argType == "null" && isObjectType paramType
then Just (TypedExpression paramType NullLiteral)
else if argType == paramType
then Just argTyped
else Nothing
matchConstructor (ConstructorDeclaration constructorModifier name params _)
| constructorModifier == "public" = fmap (\checkedArgs -> (ConstructorDeclaration constructorModifier name params (Block []), checkedArgs)) (sequence $ zipWith matchParams params args)
| constructorModifier == "private" = if checkAccess className (lookup "this" symtab)
then fmap (\checkedArgs -> (ConstructorDeclaration constructorModifier name params (Block []), checkedArgs)) (sequence $ zipWith matchParams params args)
else Nothing
| otherwise = Nothing
validConstructors = filter (\(ConstructorDeclaration _ _ params _, _) -> length params == length args) $ mapMaybe matchConstructor constructors
expectedSignatures = [ map (\(ParameterDeclaration t _) -> t) params | ConstructorDeclaration _ _ params _ <- constructors ]
actualSignature = map (\arg -> getTypeFromExpr (typeCheckExpression arg symtab classes)) args
mismatchDetails = "Constructor not found for class '" ++ className ++ "' with given arguments.\n" ++
"Expected signatures:\n" ++ show expectedSignatures ++
"\nActual arguments:" ++ show actualSignature
in case validConstructors of
[(ConstructorDeclaration _ _ params _, checkedArgs)] ->
TypedStatementExpression className (ConstructorCall className checkedArgs)
[] -> error mismatchDetails
_ -> error $ "Multiple matching constructors found for class '" ++ className ++ "' with given arguments."
typeCheckStatementExpression (MethodCall expr methodName args) symtab classes =
let objExprTyped = typeCheckExpression expr symtab classes
in case objExprTyped of
TypedExpression objType _ ->
case find (\(Class className _ _) -> className == objType) classes of
Just (Class _ methods _) ->
case find (\(MethodDeclaration retType name params _) -> name == methodName) methods of
Just (MethodDeclaration retType _ params _) ->
let args' = map (\arg -> typeCheckExpression arg symtab classes) args
expectedTypes = [dataType | ParameterDeclaration dataType _ <- params]
argTypes = map getTypeFromExpr args'
typeMatches = zipWith (\expType argType -> (expType == argType, expType, argType)) expectedTypes argTypes
mismatches = filter (not . fst3) typeMatches
where fst3 (a, _, _) = a
in
if null mismatches && length args == length params then
TypedStatementExpression retType (MethodCall objExprTyped methodName args')
else if not (null mismatches) then
error $ unlines $ ("Argument type mismatches for method '" ++ methodName ++ "':")
: [ "Expected: " ++ expType ++ ", Found: " ++ argType | (_, expType, argType) <- mismatches ]
else
error $ "Incorrect number of arguments for method '" ++ methodName ++ "'. Expected " ++ show (length expectedTypes) ++ ", found " ++ show (length args) ++ "."
Nothing -> error $ "Method '" ++ methodName ++ "' not found in class '" ++ objType ++ "'."
case find (\(Class className _ _ _ _) -> className == objType) classes of
Just (Class className _ _ methods _) ->
let matchParams (ParameterDeclaration paramType _) arg =
let argTyped = typeCheckExpression arg symtab classes
argType = getTypeFromExpr argTyped
in if argType == "null" && isObjectType paramType
then Just (TypedExpression paramType NullLiteral)
else if argType == paramType
then Just argTyped
else Nothing
matchMethod (MethodDeclaration retType modifier name params _)
| modifier == "public" = fmap (\checkedArgs -> (MethodDeclaration retType modifier name params (Block []), checkedArgs)) (sequence $ zipWith matchParams params args)
| modifier == "private" = if checkAccess className (lookup "this" symtab)
then fmap (\checkedArgs -> (MethodDeclaration retType modifier name params (Block []), checkedArgs)) (sequence $ zipWith matchParams params args)
else Nothing
| otherwise = Nothing
validMethods = filter (\(MethodDeclaration _ _ name params _, _) -> name == methodName && length params == length args) $ mapMaybe matchMethod methods
expectedSignatures = [ map (\(ParameterDeclaration t _) -> t) params | MethodDeclaration _ _ name params _ <- methods, name == methodName ]
actualSignature = map (\arg -> getTypeFromExpr (typeCheckExpression arg symtab classes)) args
mismatchDetails = "Method not found for class '" ++ objType ++ "' with given arguments.\n" ++
"Expected signatures for method '" ++ methodName ++ "':\n" ++ unlines (map show expectedSignatures) ++
"Actual arguments:\n" ++ show actualSignature
in case validMethods of
[(MethodDeclaration retType _ _ params _, checkedArgs)] ->
TypedStatementExpression retType (MethodCall objExprTyped methodName checkedArgs)
[] -> error mismatchDetails
_ -> error $ "Multiple matching methods found for class '" ++ objType ++ "' and method '" ++ methodName ++ "' with given arguments."
Nothing -> error $ "Class for object type '" ++ objType ++ "' not found."
_ -> error "Invalid object type for method call. Object must have a class type."
typeCheckStatementExpression (PostIncrement expr) symtab classes =
let expr' = typeCheckExpression expr symtab classes
type' = getTypeFromExpr expr'
@@ -203,18 +279,23 @@ typeCheckStatementExpression (PreDecrement expr) symtab classes =
typeCheckStatement :: Statement -> [(Identifier, DataType)] -> [Class] -> Statement
typeCheckStatement (If cond thenStmt elseStmt) symtab classes =
let cond' = typeCheckExpression cond symtab classes
thenStmt' = typeCheckStatement thenStmt symtab classes
elseStmt' = case elseStmt of
Just stmt -> Just (typeCheckStatement stmt symtab classes)
Nothing -> Nothing
in if getTypeFromExpr cond' == "boolean"
then
TypedStatement (getTypeFromStmt thenStmt') (If cond' thenStmt' elseStmt')
else
error "If condition must be of type boolean"
let
cond' = typeCheckExpression cond symtab classes
thenStmt' = typeCheckStatement thenStmt symtab classes
elseStmt' = fmap (\stmt -> typeCheckStatement stmt symtab classes) elseStmt
typeCheckStatement (LocalVariableDeclaration (VariableDeclaration dataType identifier maybeExpr)) symtab classes =
thenType = getTypeFromStmt thenStmt'
elseType = maybe "void" getTypeFromStmt elseStmt'
ifType = if thenType == "void" || elseType == "void"
then "void"
else unifyReturnTypes thenType elseType
in if getTypeFromExpr cond' == "boolean"
then TypedStatement ifType (If cond' thenStmt' elseStmt')
else error "If condition must be of type boolean"
typeCheckStatement (LocalVariableDeclaration (VariableDeclaration dataType modifier identifier maybeExpr)) symtab classes =
-- Check for redefinition in the current scope
if any ((== identifier) . snd) symtab
then error $ "Variable '" ++ identifier ++ "' is redefined in the same scope"
@@ -222,13 +303,16 @@ typeCheckStatement (LocalVariableDeclaration (VariableDeclaration dataType ident
-- If there's an initializer expression, type check it
let checkedExpr = fmap (\expr -> typeCheckExpression expr symtab classes) maybeExpr
exprType = fmap getTypeFromExpr checkedExpr
checkedExprWithType = case (exprType, dataType) of
(Just "null", _) | isObjectType dataType -> Just (TypedExpression dataType NullLiteral)
_ -> checkedExpr
in case exprType of
Just t
| t == "null" && isObjectType dataType ->
TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType identifier checkedExpr))
| t == "null" && isObjectType dataType ->
TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType modifier identifier checkedExprWithType))
| t /= dataType -> error $ "Type mismatch in declaration of '" ++ identifier ++ "': expected " ++ dataType ++ ", found " ++ t
| otherwise -> TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType identifier checkedExpr))
Nothing -> TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType identifier checkedExpr))
| otherwise -> TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType modifier identifier checkedExprWithType))
Nothing -> TypedStatement dataType (LocalVariableDeclaration (VariableDeclaration dataType modifier identifier checkedExpr))
typeCheckStatement (While cond stmt) symtab classes =
let cond' = typeCheckExpression cond symtab classes
@@ -242,37 +326,48 @@ typeCheckStatement (While cond stmt) symtab classes =
typeCheckStatement (Block statements) symtab classes =
let
processStatements (accSts, currentSymtab, types) stmt =
let
checkedStmt = typeCheckStatement stmt currentSymtab classes
stmtType = getTypeFromStmt checkedStmt
in case stmt of
LocalVariableDeclaration (VariableDeclaration dataType identifier maybeExpr) ->
case stmt of
LocalVariableDeclaration (VariableDeclaration dataType modifier identifier maybeExpr) ->
let
alreadyDefined = any (\(id, _) -> id == identifier) currentSymtab
newSymtab = if alreadyDefined
then error ("Variable " ++ identifier ++ " already defined in this scope.")
else (identifier, dataType) : currentSymtab
checkedExpr = fmap (\expr -> typeCheckExpression expr currentSymtab classes) maybeExpr
newSymtab = (identifier, dataType) : currentSymtab
checkedStmt = typeCheckStatement stmt newSymtab classes
in (accSts ++ [checkedStmt], newSymtab, types)
If {} -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
While _ _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
Return _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
Block _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
_ -> (accSts ++ [checkedStmt], currentSymtab, types)
_ ->
let
checkedStmt = typeCheckStatement stmt currentSymtab classes
stmtType = getTypeFromStmt checkedStmt
in case stmt of
If {} -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
While _ _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
Return _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
Block _ -> (accSts ++ [checkedStmt], currentSymtab, if stmtType /= "void" then types ++ [stmtType] else types)
_ -> (accSts ++ [checkedStmt], currentSymtab, types)
-- Initial accumulator: empty statements list, initial symbol table, empty types list
(checkedStatements, finalSymtab, collectedTypes) = foldl processStatements ([], symtab, []) statements
-- Determine the block's type: unify all collected types, default to "Void" if none
-- Determine the block's type: unify all collected types, default to "void" if none (UpperBound)
blockType = if null collectedTypes then "void" else foldl1 unifyReturnTypes collectedTypes
in TypedStatement blockType (Block checkedStatements)
typeCheckStatement (Return expr) symtab classes =
let expr' = case expr of
Just e -> Just (typeCheckExpression e symtab classes)
let methodReturnType = fromMaybe (error "Method return type not found in symbol table") (lookup "thisMeth" symtab)
expr' = case expr of
Just e -> let eTyped = typeCheckExpression e symtab classes
in if getTypeFromExpr eTyped == "null" && isObjectType methodReturnType
then Just (TypedExpression methodReturnType NullLiteral)
else Just eTyped
Nothing -> Nothing
in case expr' of
Just e' -> TypedStatement (getTypeFromExpr e') (Return (Just e'))
Nothing -> TypedStatement "void" (Return Nothing)
returnType = maybe "void" getTypeFromExpr expr'
in if returnType == methodReturnType || isSubtype returnType methodReturnType classes
then TypedStatement returnType (Return expr')
else error $ "Return: Return type mismatch: expected " ++ methodReturnType ++ ", found " ++ returnType
typeCheckStatement (StatementExpressionStatement stmtExpr) symtab classes =
let stmtExpr' = typeCheckStatementExpression stmtExpr symtab classes
@@ -280,6 +375,17 @@ typeCheckStatement (StatementExpressionStatement stmtExpr) symtab classes =
-- ********************************** Type Checking: Helpers **********************************
isSubtype :: DataType -> DataType -> [Class] -> Bool
isSubtype subType superType classes
| subType == superType = True
| subType == "null" && isObjectType superType = True
| superType == "Object" && isObjectType subType = True
| superType == "Object" && isUserDefinedClass subType classes = True
| otherwise = False
isUserDefinedClass :: DataType -> [Class] -> Bool
isUserDefinedClass dt classes = dt `elem` map (\(Class name _ _ _ _) -> name) classes
isObjectType :: DataType -> Bool
isObjectType dt = dt /= "int" && dt /= "boolean" && dt /= "char"
@@ -297,8 +403,10 @@ getTypeFromStmtExpr _ = error "Untyped statement expression found where typed wa
unifyReturnTypes :: DataType -> DataType -> DataType
unifyReturnTypes dt1 dt2
| dt1 == dt2 = dt1
| otherwise = "Object"
| dt1 == dt2 = dt1
| dt1 == "null" = dt2
| dt2 == "null" = dt1
| otherwise = "Object"
resolveResultType :: DataType -> DataType -> DataType
resolveResultType "char" "char" = "char"
@@ -319,7 +427,7 @@ checkBitwiseOperation :: BinaryOperator -> Expression -> Expression -> DataType
checkBitwiseOperation op expr1' expr2' type1 type2
| type1 == "int" && type2 == "int" =
TypedExpression "int" (BinaryOperation op expr1' expr2')
| otherwise = error $ "Bitwise operation " ++ show op ++ " requires operands of type int"
| otherwise = error $ "Bitwise operation " ++ show op ++ " requires operands of type int or char"
checkComparisonOperation :: BinaryOperator -> Expression -> Expression -> DataType -> DataType -> Expression
checkComparisonOperation op expr1' expr2' type1 type2
@@ -329,9 +437,13 @@ checkComparisonOperation op expr1' expr2' type1 type2
checkEqualityOperation :: BinaryOperator -> Expression -> Expression -> DataType -> DataType -> Expression
checkEqualityOperation op expr1' expr2' type1 type2
| type1 == type2 =
| type1 == type2 || (type1 == "null" && isObjectType type2) || (type2 == "null" && isObjectType type1) =
TypedExpression "boolean" (BinaryOperation op expr1' expr2')
| otherwise = error $ "Equality operation " ++ show op ++ " requires operands of the same type"
| type1 /= type2 =
error $ "Equality operation " ++ show op ++ " requires operands of the same type. Found types: " ++ type1 ++ " and " ++ type2
| (type1 == "null" && not (isObjectType type2)) || (type2 == "null" && not (isObjectType type1)) =
error $ "Equality operation " ++ show op ++ " requires that null can only be compared with object types. Found types: " ++ type1 ++ " and " ++ type2
| otherwise = error $ "Equality operation " ++ show op ++ " encountered unexpected types: " ++ type1 ++ " and " ++ type2
checkLogicalOperation :: BinaryOperator -> Expression -> Expression -> DataType -> DataType -> Expression
checkLogicalOperation op expr1' expr2' type1 type2
@@ -343,12 +455,22 @@ resolveNameResolution :: Expression -> Expression -> [(Identifier, DataType)] ->
resolveNameResolution expr1' (Reference ident2) symtab classes =
case getTypeFromExpr expr1' of
objType ->
case find (\(Class className _ _) -> className == objType) classes of
Just (Class _ _ fields) ->
let fieldTypes = [dt | VariableDeclaration dt id _ <- fields, id == ident2]
in case fieldTypes of
[resolvedType] -> TypedExpression resolvedType (BinaryOperation NameResolution expr1' (TypedExpression resolvedType (FieldVariable ident2)))
case find (\(Class className _ _ _ _) -> className == objType) classes of
Just (Class _ _ _ _ fields) ->
let fieldDetails = [(dt, mod) | VariableDeclaration dt mod id _ <- fields, id == ident2]
className = objType
in case fieldDetails of
[(resolvedType, mod)] ->
if mod == "public" || (mod == "private" && checkAccess objType (lookup "this" symtab))
then TypedExpression resolvedType (BinaryOperation NameResolution expr1' (TypedExpression resolvedType (FieldVariable ident2)))
else error $ "Field '" ++ ident2 ++ "' is private and cannot be accessed outside its class."
[] -> error $ "Field '" ++ ident2 ++ "' not found in class '" ++ objType ++ "'"
_ -> error $ "Ambiguous reference to field '" ++ ident2 ++ "' in class '" ++ objType ++ "'"
Nothing -> error $ "Class '" ++ objType ++ "' not found"
resolveNameResolution _ _ _ _ = error "Name resolution requires object reference and field name"
checkAccess :: DataType -> Maybe DataType -> Bool
checkAccess objType (Just thisClass) = objType == thisClass
checkAccess _ Nothing = False