Lexer written in JavaScript?


Solution 1

Something like http://jscc.phorward-software.com/, maybe?

JS/CC is the first available parser development system for JavaScript and ECMAScript derivatives. It has been developed both with the intention of building a productive compiler development system and with the intention of creating an easy-to-use academic environment for people interested in how parse table generation is done in general in bottom-up parsing.

The platform-independent software combines both: a regular expression-based lexical analyzer generator matching individual tokens from the input character stream and a LALR(1) parser generator, computing the parse tables for a given context-free grammar specification and building a stand-alone, working parser. The context-free grammar fed to JS/CC is defined in a Backus-Naur-Form-based meta language, and allows the insertion of individual semantic code to be evaluated on a rule's reduction.

JS/CC itself has been entirely written in ECMAScript so it can be executed in many different ways: as platform-independent, browser-based JavaScript embedded on a Website, as a Windows Script Host Application, as a compiled JScript.NET executable, as a Mozilla/Rhino or Mozilla/Spidermonkey interpreted application, or a V8 shell script on Windows, *nix, Linux and Mac OSX. However, for productive execution, it is recommended to use the command-line versions. These versions are capable of assembling a complete compiler from a JS/CC parser specification, which is then stored to a .js JavaScript source file.

Solution 2

If you want to build JavaScript parsers and code generators, check out the MetaII implementation in JavaScript.

A MetaII Compiler tutorial walks you through building a completely self-contained compiler system that can translate itself and other languages:

MetaII Compiler Tutorial

This is all based on an amazing little 10-page technical paper by Val Schorre: META II: A Syntax-Oriented Compiler Writing Language, from honest-to-god 1964. The MetaII compiler's complete self-description is about 30 lines! I learned how to build compilers from this back in 1970. There's a mind-blowing moment when you finally grok how the compiler can regenerate itself....

The tutorial explains MetaII, how it works, and implements MetaII compiling MetaII into JavaScript. You can easily modify this compiler to parse other languages and produce different JavaScript.

I know the website author from my college days, but have nothing to do with the website.

Solution 3

Jison is probably the best and most active lexer & parser generator out there for JavaScript. It mimics Bison and Yacc.

Jison: http://zaach.github.io/jison/

If you just want a lightweight lexer (~100 sloc), you can take a look at Lexed.js: https://github.com/tantaman/lexed.js
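To give a feel for how little code a lightweight lexer needs, here is a minimal sketch of a regex-driven tokenizer in plain JavaScript. It is not the API of Lexed.js or Jison; the `RULES` table, the token type names, and the `tokenize` function are all made up for this illustration.

```javascript
// Hypothetical token rules for illustration. Each pattern uses the sticky
// flag (y), so it can only match exactly at the current position.
const RULES = [
  { type: "WS",     pattern: /\s+/y, skip: true },
  { type: "NUMBER", pattern: /\d+(?:\.\d+)?/y },
  { type: "IDENT",  pattern: /[A-Za-z_]\w*/y },
  { type: "OP",     pattern: /[+\-*\/=()]/y },
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    // Try the rules in order; the first one that matches at `pos` wins.
    for (const rule of RULES) {
      rule.pattern.lastIndex = pos;
      const m = rule.pattern.exec(input);
      if (m) {
        if (!rule.skip) {
          tokens.push({ type: rule.type, value: m[0], offset: pos });
        }
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(
        `Unexpected character '${input[pos]}' at offset ${pos}`
      );
    }
  }
  return tokens;
}
```

Calling `tokenize("x = 3.5 + y")` yields five tokens (IDENT, OP, NUMBER, OP, IDENT), each carrying its offset in the input. Rule order matters in this sketch: earlier rules take priority, so keywords would have to be listed before the generic identifier rule.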

Solution 4

Here is an example of a parser for a "pseudo" natural language of instructions, implemented in pure JavaScript with the Chevrotain Parsing DSL:

https://github.com/SAP/chevrotain/blob/master/examples/parser/inheritance/inheritance.js

This example even includes support for multiple natural languages (English & German) using grammar inheritance.

Chevrotain falls under the category of "libraries out there for parsing that are 100% javascript" as it performs no code generation. Using Chevrotain is similar to "hand crafting" a recursive descent parser, only without most of the headaches, such as:

  • Lookahead function creation (deciding which alternative to take)
  • Automatic Error Recovery.
  • Left recursion detection
  • Ambiguity Detection.
  • Position information.
  • ...

as Chevrotain handles that automatically.
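For comparison, this is roughly what the hand-crafted version looks like. The sketch below is a plain JavaScript recursive descent parser, not Chevrotain code; the tiny instruction grammar (`show`, `hide`, `set`) and the `parseProgram` function are invented for the example. Note how the lookahead decision and the position reporting have to be written by hand.

```javascript
// Hypothetical instruction grammar, for illustration only:
//   program := command (";" command)*
//   command := "show" NAME | "hide" NAME | "set" NAME "=" NUMBER
function parseProgram(input) {
  // Crude tokenizer: names, integers, or any single non-space character.
  const tokens = [...input.matchAll(/[A-Za-z_]\w*|\d+|\S/g)]
    .map((m) => ({ text: m[0], offset: m.index }));
  let pos = 0;

  const peek = () => tokens[pos];

  function fail(expected) {
    const tok = peek();
    const where = tok ? `'${tok.text}' at offset ${tok.offset}` : "end of input";
    throw new SyntaxError(`Expected ${expected} but found ${where}`);
  }

  function consume(test, expected) {
    const tok = peek();
    if (!tok || !test(tok.text)) fail(expected);
    pos++;
    return tok.text;
  }

  const isName = (t) => /^[A-Za-z_]\w*$/.test(t);
  const isNumber = (t) => /^\d+$/.test(t);
  const literal = (s) => (t) => t === s;

  function command() {
    const tok = peek();
    // Hand-written lookahead: one token decides which alternative to take.
    switch (tok && tok.text) {
      case "show":
      case "hide":
        pos++;
        return { action: tok.text, target: consume(isName, "a name") };
      case "set": {
        pos++;
        const target = consume(isName, "a name");
        consume(literal("="), "'='");
        const value = Number(consume(isNumber, "a number"));
        return { action: "set", target, value };
      }
      default:
        return fail("'show', 'hide' or 'set'");
    }
  }

  const commands = [command()];
  while (peek() && peek().text === ";") {
    pos++;
    commands.push(command());
  }
  if (peek()) fail("';' or end of input");
  return commands;
}
```

`parseProgram("show panel; set width = 40; hide menu")` returns an array of three command objects, while `parseProgram("set width 40")` throws a `SyntaxError` that names the offending token and its offset. This sketch has no error recovery or ambiguity detection at all, which is exactly the kind of work a library takes off your hands.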

Solution 5

For simple parsing tasks I'm quite fond of using a variant of Pratt's Top Down Operator Precedence parser. While Pratt wrote the original paper using an old Lisp dialect, the same concepts can easily be used in most any language. In fact, Douglas Crockford wrote an excellent article on Top Down Operator Precedence parsing in JavaScript, which might be just what you need.
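As a rough illustration of the idea, here is a minimal Pratt-style sketch in JavaScript. It is not Crockford's code: the `INFIX` table, the binding power values, and the `evaluate` function are assumptions made for this example, and it evaluates arithmetic directly instead of building a syntax tree.

```javascript
// Binding powers are assumptions chosen for this example; a higher number
// binds tighter. "^" is right-associative, the rest are left-associative.
const INFIX = {
  "+": { bp: 10, apply: (a, b) => a + b },
  "-": { bp: 10, apply: (a, b) => a - b },
  "*": { bp: 20, apply: (a, b) => a * b },
  "/": { bp: 20, apply: (a, b) => a / b },
  "^": { bp: 30, rightAssoc: true, apply: (a, b) => a ** b },
};
const PREFIX_BP = 25; // unary minus: tighter than "*", looser than "^"

function evaluate(source) {
  // Tokens are numbers or single non-space characters.
  const tokens = source.match(/\d+(?:\.\d+)?|\S/g) || [];
  let pos = 0;

  // Core loop: parse a prefix expression, then keep absorbing infix
  // operators as long as they bind tighter than `minBp`.
  function expression(minBp) {
    let left = prefix();
    for (;;) {
      const op = INFIX[tokens[pos]];
      if (!op || op.bp <= minBp) return left;
      pos++;
      // Right-associative operators let an equal-strength operator
      // on the right win by lowering the threshold by one.
      const right = expression(op.rightAssoc ? op.bp - 1 : op.bp);
      left = op.apply(left, right);
    }
  }

  // Tokens that can start an expression (Pratt's "nud" position).
  function prefix() {
    const tok = tokens[pos++];
    if (tok === undefined) throw new SyntaxError("Unexpected end of input");
    if (/^\d/.test(tok)) return Number(tok);
    if (tok === "-") return -expression(PREFIX_BP);
    if (tok === "(") {
      const value = expression(0);
      if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
      return value;
    }
    throw new SyntaxError(`Unexpected token '${tok}'`);
  }

  const result = expression(0);
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token '${tokens[pos]}'`);
  }
  return result;
}
```

With this table, `evaluate("1 + 2 * 3")` gives 7 and `evaluate("2 ^ 3 ^ 2")` gives 512. Adding an operator is a one-line change to the table, which is a large part of why this approach is pleasant for small instruction languages.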

Author: VJ2013

BY DAY: Chief Technology Officer of Virteom BY NIGHT: I'm a coder and software architect for various projects, products, and communities.

Updated on August 11, 2020

Comments

  • VJ2013 almost 4 years

    I have a project where a user needs to define a set of instructions for a UI that is written entirely in JavaScript. I need to be able to parse a string of instructions and then translate them into instructions. Are there any libraries out there for parsing that are 100% JavaScript? Or a generator that will generate JavaScript? Thanks!