Compiler Project (Bali#) Grammar Spring 2008

File			-> (ClassDefinition | DelegateDefinition)*
DelegateDefinition	-> delegate type name ( [type (, type)*]);
ClassDefinition		-> class name { (classMember)* }

classMember		-> public (Method | MemberVariable)
	Method		-> type name ([type name (, type name)*]) statementBlock
	MemberVariable	-> type name ;

statement		-> (ifStatement | whileLoop | expressionStatement | statementBlock | returnStatement | (my localVariableDeclaration) )
statementBlock		-> {statement*}
ifStatement		-> if(expression) statement [else statement]
whileLoop		-> while(expression) statement
returnStatement		-> return expression ;
expressionStatement	-> [expression] ;
localVariableDeclaration	-> type name [= expression] ;

expression		-> e1
e1			-> e2 (= e2)*
e2			-> e3 ((||) e3)*
e3			-> e4 ((&&) e4)*
e4			-> e5 ((< | > | <= | =< | >= | => | == | !=) e5)*
e5			-> e6 ((+ | -) e6)*
e6			-> e7 ((* | / | %) e7)*
e7			-> e8 (functionCall)*
e8			-> (! | - | * | cast<type>)* e9
e9			-> literal | objectConstructor | reference | (expression) | builtInFunction

builtInFunction		-> sizeof(Type) | ((malloc|print)(expression))
objectConstructor		-> new name functionCall
reference			-> (name | this) (.name)*
functionCall		-> ([e1 (, e1)*] )
literal			-> number | string | character | true | false | null

type			-> name (*)*

Notes:

Comments:
Anything following a # on a line is considered to be a comment. This doesn't show up in the grammar above because it's handled by the scanner (Scanner212.java).
Keywords:
Keywords (those shown in bold-blue above) cannot be used as names. We use 'name' in the set of productions above to refer to an arbitrary word (i.e., a token whose kind is 'w'). Thus, something like the use of else as a variable name is a syntactic error.
The complete list of keywords:
public, class, delegate, my, cast, new, if, else, while, return, sizeof, print, malloc, null, true, false
The following words have special meaning, but they are not keywords. All of these special names are defined in the global namespace (See below for more on namespaces, also see below for what these predefined names mean).
  • Built In Types (in the global namespace): int, boolean, char
  • Built In Variables (in the global namespace): Main, ReadInt, ReadChar, ReadString
Bali Program:
All classes and delegate definitions of a program must be in one file.
In Bali, memory is not required to be freed. The compiler need not act as a garbage collector. It is acceptable to leave allocated memory on the heap on program exit.
When a program is executed it's "main" method is executed (see main method below).
Main Method
There must be a method called main that has an empty parameter list. It is a semantic error if this function does not exist (or there are multiple main methods).
The return type for the main function must be int; this integer value appears as the exit code when the SaM Simulator halts.
This main method is a member method of a class. When the Bali Program is executed, the an instance of the class that contains this main method is created from a no-args constructor (see classes below). It is a semantic error if a no-args constructor does not exist for this class. The main method of the class is then called with the this parameter set as the instance of the class created by the no-args constructor.
This initial instance of the class that contains the main method is stored in the global variable Main. (see keywords and predefined names)
Declarations
Variables need not be initialized to a default value. (This is implementation dependent).
Variables must be declared before use.
Variables declared at the class-level are fields of that class.
Variables declared at the function level (using the my keyword) are local to that function.
Namespaces
At any point in a program there may be many active namespaces: the global level, the class/function level, the method level and any nested statement block levels.
Any name can be used at most once per namespace. This means, for instance, that a member method name must be different from all class member variable names. Names can be re-used if they are at different levels.  For instance, a function can have a local variable that has the same name as a global variable; in this case, the global variable is inaccessible from within the function.
To determine the meaning of a name that is not a type, Bali first checks the method-level namespace, then the class/function-level namespace, and finally the global-level namespace.
A global variable can be inaccessible due to its name being re-used at a lower level, but a class-level name is always accessible via the keyword this.
A name does not have to be defined before it is used for the class level and the global level. This rule allows, for instance, class A and class B to each refer to the other. However, in the method/function namespace (ie: within a statementBlock) a name can not be used before being defined. A name always refers to the meaning within the current namespace.
Function parameters and variables declared in the method/function's statement block are to be considered in the same namespace.
Namespace rules for types
All type names in Bali reside in the global namespace. The types in Bali consist of the predefined types and any classes and delegate types created by the program.
For a type-name (a name that appears as syntactic type as specified in the grammar above), the name is looked up in the global namespace. All other names are looked up in the active namespaces as specified above.
Methods/Functions
The type of the expression in the return statement must match the type specified in the function header. To return early from a constructor, use return this.
If you "fall off the end" of the function body (i.e., you execute the last statement and that statement is not a return statement) then a default return is automatically executed.  This default-return returns the following default return values:
  • int, boolean: = 0 (integer)
  • char = '\0' (char)
  • Pointer types: = 0 (memory address)
  • Delegate: = 2 stack spaces, the first is 0 (memory address) and the second is 0 (program address)
  • class types can not be used as a "return type" for methods, though className* may be. className* is a pointer type and as such follows the rule for Pointer types.
Functions cannot be overloaded.
The declared type (i.e., the type that can be determined from declarations within the program) for an argument of a function call must be compatible with the corresponding parameter type.
Functions CAN be assigned to variables if the variable type is a delegate and it's signature matches that of the function
Method/Function Parameters
Parameters pass by value. (Note: All variables except for delegate types use only one stack location -- delegates use 2 stack locations)
Parameter names are local to the current function (see namespaces).
Parameters can NOT have type className but may be of type className*
Statements
Expressions in if and loop statements must be of type boolean.
Expressions
For assignment expressions, the types of the left-hand and right-hand sides must be compatible.
For an assignment statement to be semantically valid the target of the assignment statement (the reference on the left) must be something that can be assigned to (i.e., it must evaluate to something with an address).
This implies that you cannot assign to ReadInt etc. unless you redefine it as a local/class-member variable.
This also implies that you cannot assign to a function-call since the result of such a function-call is a value on top of the stack.  Note though that it's possible to assign to a function-call that is preceded by a * operator
This also implies that you can not assign to a function. (ie: you can not redefine a function with another).
Expressions can NOT have type className but may be of type className*
Different rules apply to each operator:
  • +If the two operands are integers, their sum is the result.
  • +If the one operand is an integer and the other a pointer, the result is (pointer + integer*sizeof(pointer-Type-with-one-less-star))
  • -If the two operands are integers, the result is integer1-integer2.
  • -If the two operands are pointers, the two pointers must be of the same type (and have the same level of indirection). The result is (pointer1-pointer2)/sizeof(pointer-Type-with-one-less-star).
  • For the other mathematical binary operators, the 2 operands must be of type int and they each return an int.
  • For the relational operators (<, >, <=, =<, >=, =>), operands must be of type int, and they return a boolean.
  • For the Equality operators (==, !=) operands must be of equal type and they return a boolean.
  • The cast operator takes the given expression and returns that expression with the type contained between the ankle brackets. The only checking that is done is to ensure that the type being casted to has the same "size" as the type of the given expression. Otherwise, no compiler-time nor run-time check occurs for this casting.
  • - The negation unary operator (as opposed to the subtraction operator) takes an expression of type int and returns an int.
  • ! The not unary operator takes an expression of type boolean and returns a boolean.
  • * The value-at opeartor (as opposed to the multiplication operator) takes an expression of any type T* and returns a type T.
Different rules apply to the built in functions:
  • The sizeof function takes in a type and returns the size of (the number of memory locations in the SaM engine) that a variable of that type would occupy. Below is a list that summarizes these values:
    • For int, char, boolean and all pointer types, this is 1.
    • For delegatetypes, this is 2.
    • For className types this equals the number of member variables that are defined for that class.
  • The malloc function takes in an expression of type int and returns a the memory address of the newly allocated memory (on the heap) of size given by the expression. The return type for malloc is int*. (Therefore, in general you need to cast it to the appropriate type).
  • The print function takes an expression of type char, int or char*. For a string (char*) to print properly, it must be null terminated.  The print function always returns the integer 1.
For assignment, expression types that can be used as the immediate left hand side of an expression are:
*, a member field reference (but not a member method/function), a local variable.
User Input
Rudimentary input is available using the predefined variables (in the global namespace): ReadInt, ReadChar, ReadString
The types of these built in variables are: int, boolean and char* respectively.
It is a semantic error to use one of these variables on the left side of an assignment statement. (Though note that this variable may be shadowed in a member method, in which case it refers to the shadowed variable and not this global "special" variable.
Predefined Names
The names int, boolean and char are predefined as global types. You can have local variables that use these names if you want; it's legal even though it's bad programming style.
The names ReadInt, ReadChar and ReadString are also predefined global variables.
The namespace rules (see above) imply that it is illegal to have delegate definitions or classes that use any of these names.
Classes
Fields and methods are all local to the class's namespace; they are accessible using their unqualified names from within constructors and methods in the same class.  If there is a local variable with the same name as a field then the field can be accessed using the notation this.fieldName.
There is no inheritance in Bali. Additionally, casting is NOT type checked.
All fields and methods of a class are public. (note the public keyword)
A field and a method cannot share the same name (this is different from Java and follows from the namespace rules for Bali).
There are a number of rules about constructors:
  • A constructor for a class is a method with the same name as the class in which it is contained.
  • There is only ever at most one declared constructor. (This follows from the namespace rules).
  • This constructor must have a return type of className*
  • If no constructor is declared, then there is a default no-args constructor (a constructor that takes no arguments). This constructor does not do anything.
  • The constructor of a class may NOT be used in a reference (see References) and my only be used when preceded with the new keyword. (This rule means that it is illegal to assign a constructor method to a delegate variable).
Delegates
this
As in Java, this refers to the current class instance. It is only valid within a member method or constructor.
The form this.name refers to a field of the current instance.
The form this.name(arguments) refers to a method of the current instance.
The form this (by itself) refers to the current instance.

Sample Programs

Hello World


class Main
{
	public int BaliMain()
	{
		print("hello world");
	}
}

Sring Example


delegate bool LinkedListCallback(int v, int* state);
class String
{
	private char* _InternalString = null;
	int GetPrimitiveStringLength(char * s)
	{
		if(s == null) return 0;
		char * c = s;
		int length = 0;
		while(*c != '\0') { c = c+1; length++; }
		return length;
	}
	char* ClonePrimitiveString(char* s)
	{
		if(s == null) return castnull;

		int l;
		char* NewPrimitiveString = malloc(sizeof(char) * (l=GetPrimitiveStringLength(s)));
		for(int i=0;i<l;i++) *(NewPrimitiveString + i) = *(s+i);
		return NewPrimitiveString;
	}

	String String(char *s)
	{
		_InternalString = ClonePrimitiveString(s);
	}
}

Linked List Example


delegate bool LinkedListCallback(int, int*);
class LinkedList
{
	public LinkedListNode *_FirstNode = null;
	public LinkedListNode *_LastNode = null;
	public int _Length = 0;
	public int Length() { return _Length; }
	public LinkedList LinkedList() { }
	public LinkedListNode Add(int v)
	{
		local LinkedListNode* n = new LinkedListNode(v);
		if(_FirstNode == null) _FirstNode = _LastNode = n;
		_LastNode.SetNext(n);
		_LastNode = n;
	}
	public bool RemoveDefaultCallback(int v, int* state)
	{
		return v == *(caststate);
	}
	public int RemoveValue(int v)
	{
		int* ValueToRemove = cast malloc(sizeof(int));
		*ValueToRemove = v;
		Remove(RemoveDefaultCallback, ValueToRemove);
		free(ValueToRemove);
	}
	public int Remove(LinkedListCallback cb, int* state)
	{
		# left as an excersize
	}
	public int PrintCallback(int v, int* state) { print(v); }
	public int Print() { Enumerate(PrintCallback); }
	public int Enumerate(LinkedListCallback cb)
	{
		local LinkedListNode* n = _FirstNode;
		while(n != castnull)
		{
			cb(n);
			n = n.Next();
		}
	}
}
class LinkedListNode
{
	public int _Value; public LinkedListNode _Next = null;
	public LinkedListNode Next() { return _Next; }
	public int SetNext(LinkedListNode n) { _Next = n; }
	public LinkedListNode LinkedListNode(int n)
	{
		_Value = n;
	}
}
class Main
{
	public int main()
	{
		print("hello world");
		local LinkedList* List = new LinkedList();
		List.Add(1);
		List.Add(1);
		List.Add(2);
		List.Add(1);
		List.Add(2);
		List.Add(1);
		List.Add(1);
		List.Add(3);
		List.Add(1);
		List.Add(3);

		List.Remove(1);
		List.Remove(3);
		print(List.Length);
		List.Print();
	}
}