Generating an Abstract Syntax Tree for Java source code using ANTLR

Solution 1

The steps to generate a Java source AST using antlr4 are:

  1. Install antlr4; you can use this link to do that.
  2. After installation, download the Java grammar from here.
  3. Now generate Java8Lexer and Java8Parser using the command:

    antlr4 -visitor Java8.g4

  4. This will generate several files, such as Java8BaseListener.java, Java8BaseVisitor.java, Java8Lexer.java, Java8Lexer.tokens, Java8Listener.java, Java8Parser.java, Java8.tokens and Java8Visitor.java.

Use this code to generate the AST:

import java.io.File;
import java.io.IOException;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.tree.ParseTree;

public class ASTGenerator {

    // Reads the Java source file to be parsed into a single UTF-8 string.
    public static String readFile() throws IOException {
        File file = new File("path/to/the/test/file.java");
        byte[] encoded = Files.readAllBytes(file.toPath());
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String inputString = readFile();
        // CharStreams (ANTLR 4.7+) replaces the deprecated ANTLRInputStream.
        CharStream input = CharStreams.fromString(inputString);
        Java8Lexer lexer = new Java8Lexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Java8Parser parser = new Java8Parser(tokens);
        // classDeclaration parses a single class; for a file that starts with
        // package or import statements, the grammar's compilationUnit rule
        // is the usual entry point.
        ParserRuleContext ctx = parser.classDeclaration();

        printAST(ctx, false, 0);
    }

    private static void printAST(RuleContext ctx, boolean verbose, int indentation) {
        boolean toBeIgnored = !verbose && ctx.getChildCount() == 1 && ctx.getChild(0) instanceof ParserRuleContext;

        if (!toBeIgnored) {
            String ruleName = Java8Parser.ruleNames[ctx.getRuleIndex()];
            for (int i = 0; i < indentation; i++) {
                System.out.print("  ");
            }
            System.out.println(ruleName + " -> " + ctx.getText());
        }
        for (int i = 0; i < ctx.getChildCount(); i++) {
            ParseTree element = ctx.getChild(i);
            if (element instanceof RuleContext) {
                printAST((RuleContext) element, verbose, indentation + (toBeIgnored ? 0 : 1));
            }
        }
    }
}
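
The toBeIgnored check in printAST is what makes the printed output look more like an AST than a raw parse tree: a rule node whose only child is another rule node is skipped, and its child is printed at the same indentation. The sketch below isolates that idea without ANTLR on the classpath. The Node class, the rule names and the sample tree are hypothetical stand-ins for ANTLR's rule contexts, not part of the answer's code, and every child here is treated as a rule node.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Node is a hypothetical stand-in for an ANTLR rule context.
class CollapsedTreeDemo {

    static final class Node {
        final String rule;                       // rule name, e.g. "expression"
        final String text;                       // source text matched by the rule
        final List<Node> children = new ArrayList<>();

        Node(String rule, String text, Node... kids) {
            this.rule = rule;
            this.text = text;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Same shape as printAST: a node with exactly one child is skipped unless
    // verbose is set, and its child is rendered at the same indentation level.
    static void render(Node node, boolean verbose, int indent, StringBuilder out) {
        boolean skip = !verbose && node.children.size() == 1;
        if (!skip) {
            for (int i = 0; i < indent; i++) {
                out.append("  ");
            }
            out.append(node.rule).append(" -> ").append(node.text).append('\n');
        }
        for (Node child : node.children) {
            render(child, verbose, indent + (skip ? 0 : 1), out);
        }
    }

    static String render(Node root, boolean verbose) {
        StringBuilder out = new StringBuilder();
        render(root, verbose, 0, out);
        return out.toString();
    }

    // A made-up chain: expression -> assignmentExpression -> additiveExpression,
    // where only additiveExpression has more than one child.
    static Node sample() {
        Node left = new Node("primary", "1");
        Node right = new Node("primary", "2");
        Node add = new Node("additiveExpression", "1+2", left, right);
        Node assign = new Node("assignmentExpression", "1+2", add);
        return new Node("expression", "1+2", assign);
    }

    public static void main(String[] args) {
        System.out.print(render(sample(), false));
        System.out.println("--- verbose ---");
        System.out.print(render(sample(), true));
    }
}
```

With verbose set to false the two single-child wrapper nodes disappear and additiveExpression becomes the first printed line; with verbose set to true all five nodes are printed, one indentation level per nesting depth.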

Once the code is written, you can build the project with Gradle, or download antlr-4.7.1-complete.jar into your project directory and compile against it.

If you want the output in a DOT file so that you can visualise the AST, you can refer to this QnA post, or directly to this repository, in which I have used Gradle to build the project.
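
As a rough idea of what DOT output involves, here is a minimal sketch that walks a tree depth-first, gives every node a numeric id, and writes one label line per node plus one edge line per parent/child pair. It is not the code from the linked post or repository. The Node class, the sample labels and the graph name AST are assumptions made for illustration; with ANTLR you would walk the rule contexts instead.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Node is a hypothetical stand-in for a parse-tree node.
class DotExportDemo {

    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Builds a complete DOT digraph for the tree rooted at root.
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph AST {\n");
        emit(root, new int[] {0}, sb);
        sb.append("}\n");
        return sb.toString();
    }

    // Writes the node, then each child followed by the edge to it.
    // Returns the id assigned to this node; ids follow pre-order.
    private static int emit(Node node, int[] counter, StringBuilder sb) {
        int id = counter[0]++;
        sb.append("  n").append(id)
          .append(" [label=\"").append(escape(node.label)).append("\"];\n");
        for (Node child : node.children) {
            int childId = emit(child, counter, sb);
            sb.append("  n").append(id).append(" -> n").append(childId).append(";\n");
        }
        return id;
    }

    // Backslashes and double quotes must be escaped inside a quoted DOT label.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static Node sample() {
        return new Node("classDeclaration",
                new Node("identifier"),
                new Node("classBody"));
    }

    public static void main(String[] args) {
        System.out.print(toDot(sample()));
    }
}
```

If the printed text is saved as ast.dot, Graphviz can render it with a command such as dot -Tpng ast.dot -o ast.png.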

Hope this helps. :)

Solution 2

OK, here are the steps:

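  1. Go to the ANTLR site and download ANTLR v3. The tree grammar (JavaTreeParser.g) and the classes used below, such as ANTLRStringStream and CommonTree, belong to ANTLR v3 and are not available in v4.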
  2. Download the Java.g and the JavaTreeParser.g files from here.
  3. Run the following commands:

    java -jar antlrTool Java.g
    java -jar antlrTool JavaTreeParser.g
    
  4. Five files will be generated:

    1. Java.tokens
    2. JavaLexer.java
    3. JavaParser.java
    4. JavaTreeParser.java
    5. JavaTreeParser.tokens

Use this Java code to generate the Abstract Syntax Tree and print it:

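import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
import org.antlr.runtime.RuleReturnScope;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.CommonTreeNodeStream;

// ANTLR v3 API. JavaLexer, JavaParser and JavaTreeParser are the classes
// generated in step 3; the class name ASTPrinter is only illustrative.
public class ASTPrinter {

    public static void main(String[] args) throws RecognitionException {
        String input = "public class HelloWord {" +
                       "public void print(String r){" +
                       "for(int i = 0;true;i+=2)" +
                       "System.out.println(r);" +
                       "}" +
                       "}";

        CharStream cs = new ANTLRStringStream(input);
        JavaLexer jl = new JavaLexer(cs);

        CommonTokenStream tokens = new CommonTokenStream();
        tokens.setTokenSource(jl);
        JavaParser jp = new JavaParser(tokens);
        // The start rule returns a scope object that holds the AST root.
        RuleReturnScope result = jp.compilationUnit();
        CommonTree t = (CommonTree) result.getTree();

        CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
        nodes.setTokenStream(tokens);

        // The tree parser is constructed but not invoked here; printTree
        // below walks the CommonTree directly.
        JavaTreeParser walker = new JavaTreeParser(nodes);

        System.out.println("\nWalk tree:\n");
        printTree(t, 0);

        System.out.println(tokens.toString());
    }

    // Prints every descendant of t, one per line, indented by depth.
    public static void printTree(CommonTree t, int indent) {
        if (t != null) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < indent; i++)
                sb.append("   ");
            for (int i = 0; i < t.getChildCount(); i++) {
                System.out.println(sb.toString() + t.getChild(i).toString());
                printTree((CommonTree) t.getChild(i), indent + 1);
            }
        }
    }
}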

Author: Aboelnour

Updated on June 04, 2022

Comments

  • Aboelnour almost 2 years
    How can I generate an AST from Java source code using ANTLR? Any help?
  • Makan Tayebi over 9 years
    Thanks @Aboelnour. But the page you are referring to does not exist anymore. Any help?
  • Makan Tayebi over 9 years
    Another question: if JavaTreeParser.java is not among the generated files, where does it come from?
  • ConductedClever over 6 years
    @MakanTayebi I think this would help: github.com/antlr/grammars-v4
  • ConductedClever over 6 years
    @Aboelnour, would you please update this answer for antlr v4?
  • Aboelnour over 6 years
    Sorry guys, my answer is too old, and I don't have enough current knowledge of antlr v4...
  • Aboelnour over 6 years
    If anyone can update the question with a good new answer for v4, I will accept their answer instead of mine.
  • isnvi23h4 over 6 years
    The link to the Java grammar seems to be broken.
  • Davide about 5 years
    Even if this answer is old, I put the link back: habelitz.com/images/downloads/javagrammars/…