This article is part 9 of the translated Creating JVM Language series. Inconsistencies between the original article and the original code have been corrected in the new code repository, so we suggest referring to the new repository.

The source code

Github

1. Change the grammar rules

Let's create a new rule, returnStatement. Why isn't it called returnExpression? After all, an expression is something that evaluates to a value. Doesn't a return statement evaluate to a value too? It may sound confusing, but return itself does not evaluate to a value. In Java, the code int x = return 5; makes no sense, and the same holds in Enkel. An expression, by contrast, can always be assigned to a variable. That is why return is a statement, not an expression.

statement : variableDeclaration
           //other statements rules
           | returnStatement ;

variableDeclaration : VARIABLE name EQUALS expression;
printStatement : PRINT expression ;
returnStatement : 'return' #RETURNVOID
                | ('return')? expression #RETURNWITHVALUE;

Return statements take two forms:

  • RETURNVOID – used in methods that have no return value. The return keyword is required; no expression is required after it
  • RETURNWITHVALUE – used in methods that return a value. The return keyword is not required, but an expression is required

Therefore, a method can return either explicitly or implicitly:

SomeClass {
    fun1 {
       return  //explicitly return from void method
    }
    
    fun2 {
        //implicitly return from void method
    }
    
    int fun2 {
        return 1  //explicitly return "1" from int method
    }
    
    int fun3 {
        1  //implicitly return "1" from int method
    }
}

After the above code is parsed, the AST graph is displayed as follows:

As you can see, the implicit return in the void fun2 is not represented in the AST. This is because the method body is an empty block, and matching an empty block as a return statement is not a good idea. The missing return statements are therefore added manually during the bytecode generation phase.

2. Map the ANTLR context object

After parsing, the return statement is converted from the ANTLR context object into the POJO class ReturnStatement. The goal of this step is to extract only the data needed for bytecode generation. Reading that data directly from ANTLR-generated objects during generation would make the code ugly.

public class StatementVisitor extends EnkelBaseVisitor<Statement> {

    //other stuff
    
    @Override
    public Statement visitRETURNVOID(@NotNull EnkelParser.RETURNVOIDContext ctx) {
        return new ReturnStatement(new EmptyExpression(BultInType.VOID));
    }
    
    @Override
    public Statement visitRETURNWITHVALUE(@NotNull EnkelParser.RETURNWITHVALUEContext ctx) {
        Expression expression = ctx.expression().accept(expressionVisitor); 
        return new ReturnStatement(expression);
    }
}
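
The article does not show the ReturnStatement POJO itself. The following is a minimal, self-contained sketch of what such classes might look like, assuming a ReturnStatement only needs to carry the returned expression. The names ReturnStatementSketch, Type, IntValue, returnVoid and returnWithValue are hypothetical stand-ins for illustration, not the actual Enkel classes.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins for the Enkel domain classes.
class ReturnStatementSketch {
    enum Type { VOID, INT }

    interface Expression {
        Type getType();
    }

    /** An expression that carries only a type and produces no value. */
    static final class EmptyExpression implements Expression {
        private final Type type;
        EmptyExpression(Type type) { this.type = Objects.requireNonNull(type); }
        @Override public Type getType() { return type; }
    }

    /** An integer literal such as the 1 in "return 1". */
    static final class IntValue implements Expression {
        final int value;
        IntValue(int value) { this.value = value; }
        @Override public Type getType() { return Type.INT; }
    }

    /** The POJO handed to the bytecode generator: just the returned expression. */
    static final class ReturnStatement {
        private final Expression expression;
        ReturnStatement(Expression expression) {
            this.expression = Objects.requireNonNull(expression);
        }
        Expression getExpression() { return expression; }
    }

    // Mirrors visitRETURNVOID: a bare "return" wraps an empty VOID expression.
    static ReturnStatement returnVoid() {
        return new ReturnStatement(new EmptyExpression(Type.VOID));
    }

    // Mirrors visitRETURNWITHVALUE: the already-mapped expression is wrapped as-is.
    static ReturnStatement returnWithValue(Expression expression) {
        return new ReturnStatement(expression);
    }

    public static void main(String[] args) {
        System.out.println(returnVoid().getExpression().getType());
        System.out.println(returnWithValue(new IntValue(1)).getExpression().getType());
    }
}
```

Because both forms end up as the same POJO, the generator only has to look at the type of the wrapped expression to decide which return instruction to emit.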

3. Detect implicit void returns

If a void method returns implicitly, no return statement is produced during the parsing phase. We therefore need to detect this case and add a return statement ourselves during the bytecode generation phase.

public class MethodGenerator {
    //other stuff
    private void appendReturnIfNotExists(Function function, Block block, StatementGenerator statementScopeGenrator) {
        //an empty block has no last statement, so it needs a return as well
        boolean isLastStatementReturn = !block.getStatements().isEmpty()
                && block.getStatements().get(block.getStatements().size() - 1) instanceof ReturnStatement;
        if (!isLastStatementReturn) {
            EmptyExpression emptyExpression = new EmptyExpression(function.getReturnType());
            ReturnStatement returnStatement = new ReturnStatement(emptyExpression);
            returnStatement.accept(statementScopeGenrator);
        }
    }
}

The above method checks whether the last statement of the method is a return statement and adds a return instruction if it is not.
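
The same check can be tried in isolation. The sketch below is a simplified model, not the Enkel implementation: it appends a return to a plain list of statements instead of emitting bytecode, and ImplicitReturnSketch, PrintStatement and withTrailingReturn are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model: statements are plain objects, and "adding a return"
// means appending to a list rather than emitting bytecode.
class ImplicitReturnSketch {
    interface Statement {}

    static final class PrintStatement implements Statement {}

    static final class ReturnStatement implements Statement {
        final String returnType;
        ReturnStatement(String returnType) { this.returnType = returnType; }
    }

    /** Returns a copy of the statements that is guaranteed to end with a return. */
    static List<Statement> withTrailingReturn(List<Statement> statements, String returnType) {
        List<Statement> result = new ArrayList<>(statements);
        // An empty body has no last statement, so it also needs a return.
        boolean endsWithReturn = !result.isEmpty()
                && result.get(result.size() - 1) instanceof ReturnStatement;
        if (!endsWithReturn) {
            result.add(new ReturnStatement(returnType));
        }
        return result;
    }

    public static void main(String[] args) {
        // An empty void method body gains exactly one return statement.
        System.out.println(withTrailingReturn(new ArrayList<>(), "void").size()); // prints 1
    }
}
```

A body that already ends with a return is left unchanged, so an explicit return is never duplicated.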

4. Generate bytecode

public class StatementGenerator {
    //other stuff
    public void generate(ReturnStatement returnStatement) {
        Expression expression = returnStatement.getExpression();
        Type type = expression.getType();
        expression.accept(expressionGenrator); //generate bytecode for expression itself (puts the value of expression onto the stack)
        if (type == BultInType.VOID) {
            methodVisitor.visitInsn(Opcodes.RETURN);
        } else if (type == BultInType.INT) {
            methodVisitor.visitInsn(Opcodes.IRETURN);
        }
    }
}

Therefore, return 5 goes through the following phases:

  • Get the expression from the return statement (in this case 5, which is of type Value)
  • Generate the bytecode for 5: expression.accept(expressionGenrator) calls the expression generator's generate method for a Value
  • That generated bytecode pushes the value 5 onto the operand stack
  • The IRETURN instruction pops the value off the top of the operand stack and returns it

The resulting bytecode:

 bipush        5
 ireturn
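
To make the stack behaviour concrete, here is a toy interpreter for just the instructions that appear in this article. It is only an illustration of the operand-stack idea, not how the JVM is implemented. OperandStackSketch and execute are hypothetical names, and the instructions are passed as plain strings for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the operand stack for three instructions: bipush, iadd, ireturn.
class OperandStackSketch {
    /** Executes the instructions in order and returns the value handed back by ireturn. */
    static int execute(String... instructions) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        for (String instruction : instructions) {
            String[] parts = instruction.trim().split("\\s+");
            switch (parts[0]) {
                case "bipush":
                    // Push the constant operand onto the stack.
                    operandStack.push(Integer.parseInt(parts[1]));
                    break;
                case "iadd":
                    // Pop two ints and push their sum.
                    operandStack.push(operandStack.pop() + operandStack.pop());
                    break;
                case "ireturn":
                    // Pop the int on top of the stack and return it to the caller.
                    return operandStack.pop();
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + instruction);
            }
        }
        throw new IllegalStateException("no ireturn reached");
    }

    public static void main(String[] args) {
        System.out.println(execute("bipush 5", "ireturn")); // prints 5
    }
}
```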

5. Example

Suppose we have the following Enkel code:

SumCalculator {

    void main(string[] args) {
        print sum(5,2)
    }

    int sum(int x,int y) {
        x+y
    }
}

The resulting bytecode is as follows:

$ javap -c  SumCalculator
public class SumCalculator {
  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #12 //get static field java/lang/System.out:Ljava/io/PrintStream;
       3: bipush        5
       5: bipush        2
       7: invokestatic  #16 // call method sum (with the values on operand stack 5,2)
      10: invokevirtual #21 // call method println (with the value on stack - the result of method sum)
      13: return                           //return

  public static int sum(int, int);
    Code:
       0: iload_0
       1: iload_1
       2: iadd
       3: ireturn //return the value from operand stack (result of iadd)
}
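
For comparison, a roughly equivalent program written by hand in Java might look like the sketch below. This is an assumption about the closest Java counterpart, not code from the Enkel project. Compiling it with javac and inspecting it with javap -c should give a similar instruction sequence, although javac typically chooses the shorter iconst_5 and iconst_2 instructions for these small constants instead of bipush.

```java
// Hand-written Java counterpart of the Enkel SumCalculator example.
class SumCalculator {
    public static void main(String[] args) {
        // Enkel's "print sum(5,2)" corresponds to System.out.println here.
        System.out.println(sum(5, 2)); // prints 7
    }

    public static int sum(int x, int y) {
        // Java requires the explicit "return" that Enkel lets you omit.
        return x + y;
    }
}
```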