Context-free grammar to Sublime-syntax file

Overview

Context-free grammar to Sublime-syntax file

This project produces sublime-syntax highlighting files from a description of a context-free grammar.

It implements a "Generalised Recursive Descent Parser" as described in Generalised recursive descent parsing and follow-determinism by Adrian Johnstone and Elizabeth Scott. It's essentially a non-deterministic LL(1) parser. If the grammar happens to be LL(1) then no backtracking will happen and it's just an LL(1) parser. If the grammar is not LL(1), then alternatives will be tried in sequence, backtracking until one succeeds.

IMPORTANT: The grammar must be non-left-recursive, and also must be follow-determined. If the grammar is left-recursive then the program will complain and alert the user, but I don't know an algorithm to detect whether the grammar is follow-determined. IF THE GRAMMAR IS NOT FOLLOW-DETERMINED, THEN THE LANGUAGE RECOGNIZED BY THE GENERATED SYNTAX WILL SIMPLY NOT MATCH THE INPUT GRAMMAR.

A grammar is follow-determined if whenever a nonterminal X produces both and y <...> , then y is not in the follow set of X. Intuitively, a grammar is follow-determined whenever a single lookahead token is enough to tell us whether it's okay to pop out of the context for X or if we should keep going within X. (i.e. if we've just consumed the prefix and need to decide whether to finish with X or to keep consuming, then the presence or absence of the next token in the follow set of X had better be enough to tell us which option to take, because once we pop out of X then we can't backtrack to try other options anymore.)

Implementation

Sublime syntax files allow one to define contexts; within each context one can match against any number of regular expressions (including lookaheads) and then perform actions like pushing other contexts onto the context stack, pop out of the context, set the scope of the consumed tokens (i.e. instruct Sublime Text that a token is e.g. a function definition and highlight it appropriately), and others. One can also set a branch point and try multiple branches in sequence; if an action taken is to fail that branch point, then the syntax engine backtracks and tries the next branch in the sequence.

See the Wikipedia page on LL parsers for more details on how LL parsers work in general. What I do here is always indicate "success" by pop: 2; i.e. popping twice out of the current context, and failure by pop: 1. Contexts for a given production are pushed onto the stack interleaved by a pop2! context which always pops 2 contexts off the stack. Therefore a failure, which pops once, moves into the "always pop 2" stream until it hits a failure context (to backtrack and try a different branch) or pops all the way out of the current stack.

TO-DO:

  • Detect whether the input grammar is follow-determined. This may be undecidable for all I know.
  • Accept a convenient text description of a grammar rather than require constructing a Python object by hand. Benjamin Schaaf's sbnf is a project with essentially the same goals as this one, and has a very nice syntax for defining grammars so it'd be nice to allow inputs in that format.
  • Also add other conveniences from sbnf such as a prototype context, and embedding other grammars.
You might also like...
VAST - Visualise Abstract Syntax Trees for Python

VAST VAST - Visualise Abstract Syntax Trees for Python. VAST generates ASTs for a given Python script and builds visualisations of them. Install Insta

UdemyPy is a bot that hourly looks for Udemy free courses and post them in my Telegram Channel: Free Courses.

UdemyPy UdemyPy is a bot that hourly looks for Udemy free courses and post them in my Telegram Channel: Free Courses. How does it work? For publishing

an opensourced roblox group finder writen in python 100% free and virus-free

Roblox-Group-Finder an opensourced roblox group finder writen in python 100% free and virus-free note : if you don't want install python or just use w

Gba-free-fonts - Free font resources for GBA game development
Gba-free-fonts - Free font resources for GBA game development

gba-free-fonts Free font resources for GBA game development This repo contains m

MiniJVM is simple java virtual machine written by python language, it can load class file from file system and run it.

MiniJVM MiniJVM是一款使用python编写的简易JVM,能够从本地加载class文件并且执行绝大多数指令。 支持的功能 1.从本地磁盘加载class并解析 2.支持绝大多数指令集的执行 3.支持虚拟机内存分区以及对象的创建 4.支持方法的调用和参数传递 5.支持静态代码块的初始化 不支

A Python3 script to decode an encoded VBScript file, often seen with a .vbe file extension

vbe-decoder.py Decode one or multiple encoded VBScript files, often seen with a .vbe file extension. Usage usage: vbe-decoder.py [-h] [-o output] file

This app converts an pdf file into the audio file.

PDF-to-Audio This app takes an pdf as an input and convert it into audio, and the library text-to-speech starts speaking the preffered page given in t

Fetch data from an excel file and create HTML file

excel-to-html Problem Statement! - Fetch data from excel file and create html file Excel.xlsx file contain the information.in multiple rows that is ne

JD-backup is an advanced Python script, that will extract all links from a jDownloader 2 file list and export them to a text file.

JD-backup is an advanced Python script, that will extract all links from a jDownloader 2 file list and export them to a text file.

Comments
Owner
Haggai Nuchi
https://humantoanimal.com
Haggai Nuchi
Sublime Text 2/3 style auto completion for ST4

Hippie Autocompletion Sublime Text 2/3 style auto completion for ST4: cycle through words, do not show popup. Simply hit Tab to insert completion, hit

Alexander Schepanovski 20 May 19, 2022
Basic Clojure REPL for Sublime Text

Basic Clojure REPL for Sublime Text Goals: Decomplected: just REPL, nothing more Zero dependencies: works directly with pREPL Compact: Display code ev

Nikita Prokopov 23 Dec 24, 2021
[x]it! support for working with todo and check list files in Sublime Text

[x]it! for Sublime Text This Sublime Package provides syntax-highlighting, shortcuts, and auto-completions for [x]it! files. Features Syntax highlight

Jan Heuermann 18 Sep 19, 2022
Statically typed BNF with semantic actions; A frontend of frontend frameworks; Use your grammar everywhere.

Statically typed BNF with semantic actions; A frontend of frontend frameworks; Use your grammar everywhere.

Taine Zhao 56 Dec 14, 2022
A simple string parser based on CLR to check whether a string is acceptable or not for a given grammar.

A simple string parser based on CLR to check whether a string is acceptable or not for a given grammar.

Bharath M Kulkarni 1 Dec 15, 2021
Pygments is a generic syntax highlighter written in Python

Welcome to Pygments This is the source of Pygments. It is a generic syntax highlighter written in Python that supports over 500 languages and text for

null 1.2k Jan 6, 2023
laTEX is awesome but we are lazy -> groff with markdown syntax and inline code execution

pyGroff A wrapper for groff using python to have a nicer syntax for groff documents DOCUMENTATION Very similar to markdown. So if you know what that i

Subhaditya Mukherjee 27 Jul 23, 2022
This package tries to emulate the behaviour of syntax proposed in PEP 671 via a decorator

Late-Bound Arguments This package tries to emulate the behaviour of syntax proposed in PEP 671 via a decorator. Usage Mention the names of the argumen

Shakya Majumdar 0 Feb 6, 2022
A complex language with high level programming and moderate syntax.

zsq a complex language with high level programming and moderate syntax.

an aspirin 6 Jun 25, 2022
Syntax highlighting for yarn.lock and bun.lockb files

Yarn.lock Syntax Highlighting Syntax highlighting for yarn.lock and bun.lockb files Installation Plugin is not publushed yet on Package Control, to in

Alexander Kuznetsov 4 Jul 6, 2022