Weekend interpreter

This is an interpreter for a simple programming language. It builds on a text editor I wrote earlier, extending it with the ability to run the file being edited, provided, of course, that the file contains a program. The interpreter is largely self-contained (the game.weekend.interpreter package). The editor supplies it only with the name of the current file and a panel for displaying messages and the text produced by the language's PRINT command.

The language is called Weekend Game Language, and WGL is the default file extension. The language description is in the project repository. Here I will briefly describe how the interpreter works. It is convenient to download the project, open it in Eclipse, and follow along to see exactly how each part is implemented.

The interpreter is started by the Runner.run() method. This method opens the editor's current file, creates an Interpreter object in a separate thread, and starts it by calling its execute() method.
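A minimal sketch of starting work on a separate thread, so the editor's UI stays responsive while a program runs. RunnerSketch and its run() signature are hypothetical; the Runnable stands in for the call to execute(), and the real Runner.run() may well be organized differently.

```java
// Hypothetical sketch: names and signature are assumptions, not the project's code.
final class RunnerSketch {
    // Starts the given work on its own thread and returns the thread,
    // so the caller can wait for it or ignore it.
    static Thread run(Runnable interpreterWork) {
        Thread worker = new Thread(interpreterWork, "interpreter");
        worker.start();
        return worker;
    }
}
```

A caller would pass something like () -> interpreter.execute() as the work to run.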

The Interpreter constructor creates the objects the interpreter needs; they exist for its entire run.

The Text object is a wrapper around the file being interpreted. It provides convenient methods for reading the program text, but it works at too low a level for interpretation, so it is used by TokenReader. TokenReader reads the program through Text and returns the next token. A token is an object of the Token class whose type field is one of: a delimiter (DELIMITER), a string (STRING), a number (NUMBER), a variable (VARIABLE), or a command (COMMAND). Because a token never changes its data, I decided to do without get methods: the fields are declared final and cannot be modified.

In the execute() method, the tokens are read sequentially. If the token is a command, the method that implements that command is called. All of these methods are in the Command class. If the token is a variable, I treat it as the start of an assignment statement.
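The idea of a token without get methods can be sketched roughly like this. The class name TokenSketch and the text field are my assumptions for illustration; only the five type names come from the description above.

```java
// Hypothetical sketch of an immutable token; not the project's Token class.
final class TokenSketch {
    enum Type { DELIMITER, STRING, NUMBER, VARIABLE, COMMAND }

    // Final fields: readable directly, assignable only in the constructor,
    // so getters would add nothing.
    final Type type;
    final String text;

    TokenSketch(Type type, String text) {
        this.type = type;
        this.text = text;
    }
}
```

Any attempt to assign to type or text after construction is rejected by the compiler, which is what makes direct field access safe here.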

Variables

As the language description says, the language supports only 26 variables, each named by a single letter of the Latin alphabet. The Variables class is responsible for working with them. It contains an array of 26 elements that stores the values of the variables, and two methods: one to get and one to assign the value of a given variable. It is as simple as that.
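A class like that might look as follows. This is only a sketch: the name VariablesSketch, the double value type, and the case-insensitive lookup are assumptions of mine, not details taken from the project.

```java
// Hypothetical sketch of 26 single-letter variables backed by an array.
final class VariablesSketch {
    private final double[] values = new double[26]; // slots for A..Z, initially 0

    // Maps a letter to its array slot; lower case is accepted as an assumption.
    private static int index(char name) {
        if (name >= 'A' && name <= 'Z') return name - 'A';
        if (name >= 'a' && name <= 'z') return name - 'a';
        throw new IllegalArgumentException("Not a variable name: " + name);
    }

    double get(char name) {
        return values[index(name)];
    }

    void set(char name, double value) {
        values[index(name)] = value;
    }
}
```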

Assignment

So, if the token is a variable, we read the next token and expect it to be a delimiter (DELIMITER), namely the = symbol. If it is not, an error is reported. If it is, we read the expression that follows using the Expressions.getExp() method and store its value in the variable. In general, whenever an expression is expected, we call Expressions.getExp().

This method is a little hard to follow if you are not familiar with recursive descent parsing. In simple terms: we read the next token and store it in the token field of the class, compute the value with the level2() method, hand the last token read back to the token reader so that it can be read again, and return the computed value to the caller.

Is something unclear? Complete nonsense? Well, I warned you.

level2() is responsible for the + and - operations, which have the lowest priority. In this method we read a number using level7(). Strictly speaking, we read it using level3(), but let's skip levels 3, 4, 5, and 6 to keep things simple. level7() returns the value of the token stored in the token field, which must be a number or a variable, and then replaces the contents of token with the next token.

Next, we look at the token field: is it + or -? If so, we read a new token, which replaces the one read before, read the second number using level7(), apply + or - to the two numbers, and return the result. If not, we simply return the value obtained at the lower level.

The methods level3(), level4(), level5(), and level6() are similar to level2() but handle *, /, %, ^, unary + and -, and brackets. Together they implement operator precedence. The highest priority belongs to a number or variable, then come brackets, unary + and -, ^, then *, /, and %, and finally + and -.
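To make the levels more concrete, here is a compact sketch of the same technique. It is not the project's Expressions class: it reads characters from a string instead of tokens from TokenReader, and the name ExprSketch, the double arithmetic, the upper-case-only variables, and the right-associative ^ are all assumptions. Only the method names and the order of priorities follow the description above.

```java
// Hypothetical recursive descent evaluator; each level handles one priority.
final class ExprSketch {
    private final String src;
    private final double[] vars; // 26 slots for A..Z
    private int pos = 0;

    private ExprSketch(String src, double[] vars) {
        this.src = src;
        this.vars = vars;
    }

    static double eval(String src, double[] vars) {
        ExprSketch e = new ExprSketch(src, vars);
        double result = e.level2();
        if (e.peek() != '\0') throw new IllegalStateException("Unexpected '" + e.peek() + "'");
        return result;
    }

    // Skips blanks and returns the next character without consuming it ('\0' at the end).
    private char peek() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Level 2: binary + and -, the lowest priority.
    private double level2() {
        double left = level3();
        for (char op = peek(); op == '+' || op == '-'; op = peek()) {
            pos++; // consume the operator
            double right = level3();
            left = (op == '+') ? left + right : left - right;
        }
        return left;
    }

    // Level 3: *, / and %.
    private double level3() {
        double left = level4();
        for (char op = peek(); op == '*' || op == '/' || op == '%'; op = peek()) {
            pos++;
            double right = level4();
            if (op == '*') left *= right;
            else if (op == '/') left /= right;
            else left %= right;
        }
        return left;
    }

    // Level 4: ^, treated as right associative (an assumption).
    private double level4() {
        double base = level5();
        if (peek() == '^') {
            pos++;
            return Math.pow(base, level4());
        }
        return base;
    }

    // Level 5: unary + and -.
    private double level5() {
        char op = peek();
        if (op == '+' || op == '-') {
            pos++;
            double value = level6();
            return (op == '-') ? -value : value;
        }
        return level6();
    }

    // Level 6: brackets restart the descent from the lowest priority.
    private double level6() {
        if (peek() == '(') {
            pos++;
            double value = level2();
            if (peek() != ')') throw new IllegalStateException("Missing ')'");
            pos++;
            return value;
        }
        return level7();
    }

    // Level 7: a number or a single-letter variable.
    private double level7() {
        char c = peek();
        if (c >= 'A' && c <= 'Z') {
            pos++;
            return vars[c - 'A'];
        }
        int start = pos;
        while (pos < src.length()
                && ((src.charAt(pos) >= '0' && src.charAt(pos) <= '9') || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalStateException("Number or variable expected at " + start);
        return Double.parseDouble(src.substring(start, pos));
    }
}
```

In this sketch, ExprSketch.eval("1 + 2 * 3", new double[26]) gives 7, because level3() has already consumed 2 * 3 by the time level2() applies the +.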

Now I will describe the implementation of commands.

REM command

I tell TokenReader to move on to the next line of the file by calling TokenReader.nextLine(). In other words, I skip everything up to the end of the current line.

PRINT command

I read tokens in a loop. If the token is a line feed or the end of the file, the loop ends. If it is a string, I output the string; otherwise I try to read and evaluate an expression and output the result. If the next token is ';' or ',', I continue the loop; otherwise I stop.

PRINTLN command

This is the same PRINT command, but at the end I output a line feed.

INPUT command

I read the next token. If it is a string (the prompt), I remember it and read the next token. At this point I expect to have read the variable that will receive the user's input. For the input itself I use JOptionPane.showInputDialog(). I realize this is not ideal, and it would be better to take input in the output panel, but this interpreter is just an exercise in writing interpreters, and I decided not to complicate the program.

GOTO command

I read an expression, assuming that GOTO is followed by the label (line number) to which control should be transferred. Using the Labels.goToLabel(label) method, I transfer control to that label. This calls for a digression on how labels are implemented.

Labels

The Labels object is responsible for working with labels. When it is created, it scans the program text and looks for a number at the beginning of each line. If it finds one, it stores the label number together with the number of the line where it was found in an ArrayList. With such a list it is easy to look up, by label number, the corresponding line number in the program text, and then to set the position for further token reading with TokenReader.setLine(line_number).
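The scan can be sketched like this. Note the differences from the description: this sketch uses a HashMap instead of an ArrayList, takes the program as a list of lines, and counts lines from 0. The name LabelsSketch and the lineOf() method are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a label table: label number -> 0-based line index.
final class LabelsSketch {
    private final Map<Integer, Integer> lineByLabel = new HashMap<>();

    LabelsSketch(List<String> programLines) {
        for (int line = 0; line < programLines.size(); line++) {
            String text = programLines.get(line).trim();
            int end = 0;
            // Collect the run of digits at the start of the line, if any.
            while (end < text.length() && text.charAt(end) >= '0' && text.charAt(end) <= '9') end++;
            if (end > 0) {
                // If a label is repeated, the last occurrence wins in this sketch.
                lineByLabel.put(Integer.parseInt(text.substring(0, end)), line);
            }
        }
    }

    // Returns the line index for a label, or fails if the label does not exist.
    int lineOf(int label) {
        Integer line = lineByLabel.get(label);
        if (line == null) throw new IllegalArgumentException("Unknown label: " + label);
        return line;
    }
}
```

A GOTO would then amount to looking up lineOf(label) and passing the result to something like TokenReader.setLine().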

IF command

I read an expression, the left operand. Then I read a token and expect it to be "<", ">", "=" or "#". Then I read the right expression. I compare the two values. If the result is TRUE, I read the next token, which must be THEN, and interpretation continues with the command after it. If the result is FALSE, I move to the next line, so interpretation continues from the line after the IF and the command after THEN is not executed.
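The comparison step on its own might be sketched as follows. I am assuming here that "#" means "not equal", which the text above does not spell out, and that values are doubles; CompareSketch and holds() are hypothetical names.

```java
// Hypothetical sketch of the IF comparison; "#" as "not equal" is an assumption.
final class CompareSketch {
    static boolean holds(double left, String op, double right) {
        switch (op) {
            case "<": return left < right;
            case ">": return left > right;
            case "=": return left == right;
            case "#": return left != right;
            default: throw new IllegalArgumentException("Comparison operator expected, got: " + op);
        }
    }
}
```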

FOR command

I read a token and check that it is a variable; this will be the loop variable. Next there must be an '=' token, followed by an expression giving the start value of the loop variable. Then there must be a 'TO' token, followed by another expression, the end value of the loop variable. If any of this is missing, it is an error. I create a ForItem object with three fields (the name of the loop variable, its end value, and the line number where the loop body starts, that is, the line immediately following the FOR command) and push it onto a stack. That's it. The stack is what makes nested loops work. I expect the FOR command to be followed by the commands of the loop body and then by the NEXT command, which marks the end of the loop.

NEXT command

I pop the ForItem object from the stack and increment the loop variable by one (yes, in this implementation the loop variable is always incremented, and always by exactly one). If its value is still less than or equal to the end value, I push the ForItem object back onto the stack and set the current line to the line where the loop body begins. If not, I do nothing, so the tokens following NEXT continue to be processed. In other words, the loop is finished.
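The interplay of FOR and NEXT through the stack can be sketched in isolation. All names here (ForLoopSketch, ForItemSketch, forCommand, nextCommand) are hypothetical, line numbers are passed in by the caller instead of coming from TokenReader, and I assume FOR assigns the start value to the loop variable.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of FOR/NEXT bookkeeping with a stack of loop records.
final class ForLoopSketch {
    static final class ForItemSketch {
        final char variable;   // loop variable, 'A'..'Z'
        final double endValue; // inclusive end value
        final int bodyLine;    // first line of the loop body

        ForItemSketch(char variable, double endValue, int bodyLine) {
            this.variable = variable;
            this.endValue = endValue;
            this.bodyLine = bodyLine;
        }
    }

    private final Deque<ForItemSketch> stack = new ArrayDeque<>();
    private final double[] vars = new double[26];

    // FOR: set the start value and remember where the body begins.
    void forCommand(char variable, double start, double end, int bodyLine) {
        vars[variable - 'A'] = start;
        stack.push(new ForItemSketch(variable, end, bodyLine));
    }

    // NEXT: returns the line to continue from, either the body start
    // (loop continues) or the line after NEXT (loop finished).
    int nextCommand(int lineAfterNext) {
        if (stack.isEmpty()) throw new IllegalStateException("NEXT without FOR");
        ForItemSketch item = stack.pop();
        int slot = item.variable - 'A';
        vars[slot] += 1; // the step is always exactly one
        if (vars[slot] <= item.endValue) {
            stack.push(item);
            return item.bodyLine;
        }
        return lineAfterNext;
    }

    double value(char variable) {
        return vars[variable - 'A'];
    }
}
```

Because FOR itself compares nothing, the body in this sketch always runs at least once, even when the start value is already greater than the end value.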

GOSUB command

After the GOSUB token, I read an expression. Yes, in this implementation subroutines are identified by numbers, not names. I then take the number of the line following the GOSUB command and push it onto the subroutine stack (an ordinary stack). I go to the line marked by the given label (the subroutine number). I expect the subroutine to end with the RETURN command.

RETURN command

I pop from the stack the line number that GOSUB pushed there (the line following the GOSUB command) and set it as the current line for TokenReader. In other words, I end the subroutine and return control to the point where it was called.
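The GOSUB/RETURN pair reduces to a stack of return lines, which can be sketched as follows. SubroutineSketch and its method names are hypothetical, and the target line is assumed to have been resolved from the label beforehand.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the call stack behind GOSUB and RETURN.
final class SubroutineSketch {
    private final Deque<Integer> returnLines = new ArrayDeque<>();

    // GOSUB: remember where to come back to, then continue at the target line.
    int gosub(int lineAfterGosub, int targetLine) {
        returnLines.push(lineAfterGosub);
        return targetLine;
    }

    // RETURN: continue at the most recently remembered line.
    int ret() {
        if (returnLines.isEmpty()) throw new IllegalStateException("RETURN without GOSUB");
        return returnLines.pop();
    }
}
```

Since the most recent call is returned from first, a subroutine can itself call GOSUB without losing the outer return line.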

END command

The Commands class does not have a method corresponding to this token. When this token is encountered, the interpreter simply terminates.

That's it. For more detail, you will have to read the source code.