API

Symbols, alphabets, words, and languages

The main focus of this library is formal languages, with particular attention to parsing.

It is hence important, first of all, to establish how symbols are represented. To keep things simple (and to make the library more interoperable with the rest of Python's data structures), LibLet has no specific type for symbols: they are always represented by strings (possibly longer than one character). Observe in passing that Python has no type for characters either, and uses strings of length one to represent them.

It is straightforward then to conclude that alphabets are represented by sets of strings.

On the other hand, one must pay particular attention to words, which are represented as sequences of strings, most commonly tuples or lists. A LibLet word is never a Python string! In the very particular case in which all the symbols have length one (as Python strings), one can use the shortcuts list(s) and ''.join(w) to go from a string s to a word, and from a word w back to a string.
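The two shortcuts can be illustrated with plain Python, without importing LibLet. This is only a small sketch: the variable names are made up for the example, and the second half shows why the shortcut stops working as soon as a symbol is longer than one character.

```python
# Single-character symbols: the string/word conversion is lossless.
s = "abba"
word = list(s)          # string -> word (a list of one-character strings)
back = "".join(word)    # word -> string

# Multi-character symbols: joining discards the symbol boundaries,
# so the original word cannot be recovered with list().
tokens = ("if", "x", "then", "y")
joined = "".join(tokens)
recovered = tuple(joined)

print(word)       # ['a', 'b', 'b', 'a']
print(back)       # abba
print(recovered == tokens)  # False
```

When symbols may be longer than one character, it is safer to build words directly as tuples or lists of strings.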

Finally, languages are represented by sets of sequences of strings.
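As a plain-Python illustration of these conventions (no LibLet class is involved, and the helper name over_alphabet is hypothetical), note that a word stored in a set must be hashable, so tuples rather than lists are needed for the elements of a language.

```python
# An alphabet is a set of strings (symbols).
alphabet = {"a", "b"}

# A language is a set of words; each word is a tuple of symbols.
# Lists cannot be used here because they are not hashable.
language = {(), ("a",), ("a", "b"), ("a", "b", "b")}


def over_alphabet(language, alphabet):
    """Return True if every symbol of every word belongs to the alphabet."""
    return all(symbol in alphabet for word in language for symbol in word)


print(over_alphabet(language, alphabet))        # True
print(over_alphabet({("a", "c")}, alphabet))    # False
print(("a", "b") in language)                   # True
```

The empty tuple () plays the role of the empty word in this representation.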

Productions, Grammars, and Derivations

The basic building block of a grammar is a production represented by the following class.

For the purpose of presenting the Knuth Automaton in the context of LR(0) parsing, productions are extended to include a dot in the following class.

A grammar can be represented by the following class, which can be instantiated from the usual formal definition of a grammar as a tuple.

Once a grammar has been defined, one can build derivations with the help of the following class.

Derivations can be displayed using a ProductionGraph.

Transitions and Automata

Although finite state automata and regular grammars are not central to parsing (the main focus of this library), a couple of classes to handle them are provided.

First of all, a basic implementation of a transition is provided; as in the case of grammars, the symbols and terminals are simply strings.

From transitions one can obtain a representation of a (nondeterministic) finite state automaton using the following class.

Automata can be displayed using StateTransitionGraph.from_automaton.

Instantaneous Descriptions

During the analysis of some parsing algorithms, it can be convenient to keep track of the computation of the related (nondeterministic) pushdown automaton. The following classes represent an instantaneous description of the automaton, given by its stack, the tape content, and the position of the reading head.
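As a library-independent illustration of the idea, an instantaneous description can be modelled as an immutable record holding the three components just listed. The class SketchID and its method names below are hypothetical and are not LibLet's own classes; this is only a minimal sketch, assuming the top of the stack is the last element of a tuple.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SketchID:
    """Hypothetical instantaneous description: tape, stack, head position."""
    tape: Tuple[str, ...]
    stack: Tuple[str, ...] = ()
    head_pos: int = 0

    def is_done(self) -> bool:
        # The whole tape has been read.
        return self.head_pos >= len(self.tape)

    def head(self) -> str:
        # Symbol currently under the reading head.
        return self.tape[self.head_pos]

    def advance(self) -> "SketchID":
        # Move the reading head one symbol to the right.
        return replace(self, head_pos=self.head_pos + 1)

    def push(self, symbol: str) -> "SketchID":
        # Push a symbol on top of the stack (end of the tuple).
        return replace(self, stack=self.stack + (symbol,))

    def pop(self) -> "SketchID":
        # Remove the symbol on top of the stack.
        return replace(self, stack=self.stack[:-1])


start = SketchID(tape=("a", "b"))
step = start.push(start.head()).advance()   # shift the first symbol
```

Because every operation returns a new record, earlier descriptions remain available, which is handy when a nondeterministic computation has to explore several alternatives from the same point.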

ANTLR support

This module provides a convenience class to deal with ANTLR for the Python 3 target.

LLVM support

This very experimental module provides a convenience class to experiment with the LLVM IR language.

Rich display

Utilities and decorators