DynParser

From thelas.dk

DynParser is a library for defining parsers and tokenizers.

It is written in C++ and has a simple object-oriented design.
Tokens are defined by regular expressions, and languages are defined by BNF formulas.
It has only been tested on Linux, but should work on any OS with a standard C++ compiler.
Only alpha and beta releases are currently available, but development is progressing steadily. Only a few nearly trivial functions remain to be implemented before the API is stable and the first real release is ready.
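
As an illustration of what defining tokens by regular expressions means, the following standalone sketch tokenizes a string with std::regex from the C++ standard library. It is not DynParser code: TokenDef and regex_tokenize are hypothetical names introduced here, and DynParser's own matching strategy may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token definition (not DynParser's API): a name and a pattern.
struct TokenDef {
  std::string name;
  std::regex pattern;
};

// At each position, try the definitions in order and take the first one that
// matches a non-empty prefix of the remaining input.
// Returns (token name, matched text) pairs.
std::vector<std::pair<std::string, std::string>>
regex_tokenize(const std::string &input, const std::vector<TokenDef> &defs)
{
  std::vector<std::pair<std::string, std::string>> tokens;
  std::string::const_iterator it = input.begin();
  while (it != input.end()) {
    bool matched = false;
    for (const TokenDef &def : defs) {
      std::smatch m;
      // match_continuous anchors the match at the current position
      if (std::regex_search(it, input.end(), m, def.pattern,
                            std::regex_constants::match_continuous)
          && m.length(0) > 0) {
        tokens.emplace_back(def.name, m.str(0));
        it += m.length(0);
        matched = true;
        break;
      }
    }
    assert(matched && "no token definition matches the input here");
    if (!matched) break; // avoid looping forever when asserts are disabled
  }
  return tokens;
}
```

With definitions for whitespace, numbers and "+", the string "12 + 7" would be split into five tokens: a number, whitespace, "+", whitespace and a number.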

Features

  • A simple but powerful API (tokens can be defined by regular expressions, and types can be defined by BNF formulas)
  • Portable (currently tested only on Linux, but there should be no problems on other platforms)
  • Dynamic design (no pre-compilation is required)
  • Unparsing feature: A parsed tree can be unparsed to the original string
  • Parsing and unparsing preserve indentation and whitespace (so parsing and unparsing really are each other's inverse)
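
The whitespace-preserving round trip in the last two points can be illustrated with a small standalone sketch. It does not use DynParser and makes no claim about how DynParser stores whitespace internally; Token, split_keeping_whitespace, join_tokens and check_round_trip are hypothetical names introduced for illustration. The idea shown is that if every token remembers the whitespace in front of it, concatenating the tokens gives back the original string exactly.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type: each token keeps the whitespace that preceded it,
// so no input character is lost.
struct Token {
  std::string leading_ws; // whitespace before the token
  std::string text;       // the token itself
};

static bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

// Split the input into digit runs and single-character symbols.
// Whitespace after the last token is returned through trailing_ws.
std::vector<Token> split_keeping_whitespace(const std::string &input,
                                            std::string &trailing_ws)
{
  std::vector<Token> tokens;
  std::size_t pos = 0;
  while (true) {
    std::string ws;
    while (pos < input.size() && is_space(input[pos])) ws += input[pos++];
    if (pos == input.size()) { trailing_ws = ws; break; }
    std::string text;
    if (is_digit(input[pos])) {
      while (pos < input.size() && is_digit(input[pos])) text += input[pos++];
    } else {
      text += input[pos++];
    }
    tokens.push_back(Token{ws, text});
  }
  return tokens;
}

// The inverse operation: glue whitespace and token text back together.
std::string join_tokens(const std::vector<Token> &tokens,
                        const std::string &trailing_ws)
{
  std::string out;
  for (const Token &t : tokens) out += t.leading_ws + t.text;
  return out + trailing_ws;
}

// Debug helper: asserts that splitting and joining is the identity.
void check_round_trip(const std::string &input)
{
  std::string trailing;
  std::vector<Token> tokens = split_keeping_whitespace(input, trailing);
  assert(join_tokens(tokens, trailing) == input);
}
```

Calling check_round_trip("5+3 * 2") passes because the spaces around "*" are stored with the tokens that follow them.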

Missing features

  • Efficiency (DynParser is less efficient than most parser-generators because of the missing pre-compilation)
  • Generality (DynParser is not as expressive as SLR and other parsing algorithms)
  • No consistency check (Normally it is required that a grammar can yield at most one way to parse a string, but this is not currently checked). This will probably not be implemented until version 2.0, which is some time away.
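
To see why the missing consistency check matters, consider a rule of the form eqn ::= eqn - eqn | number. It allows "7-3-2" to be parsed as either (7-3)-2 or 7-(3-2), which evaluate to different numbers. The standalone sketch below (not DynParser code; all_values is a hypothetical helper) enumerates every parse tree that such a rule permits and collects the distinct results.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// All values obtainable from nums[lo..hi] joined by '-', over every parse
// tree allowed by the ambiguous rule  eqn ::= eqn - eqn | number.
std::set<int> all_values(const std::vector<int> &nums,
                         std::size_t lo, std::size_t hi)
{
  assert(lo <= hi && hi < nums.size());
  std::set<int> result;
  if (lo == hi) { // a single number has exactly one parse
    result.insert(nums[lo]);
    return result;
  }
  for (std::size_t split = lo; split < hi; ++split) {
    // Treat the '-' between nums[split] and nums[split + 1] as the tree root
    std::set<int> left = all_values(nums, lo, split);
    std::set<int> right = all_values(nums, split + 1, hi);
    for (int l : left)
      for (int r : right)
        result.insert(l - r);
  }
  return result;
}
```

For the numbers 7, 3, 2 the returned set contains both 2 and 6, one for each grouping. A consistency check would reject or warn about a grammar for which such a set can hold more than one parse.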

Versions / Download

Requirements

The latest version of DynParser depends on the regular expression parsing in rcp. This library must be installed before DynParser can compile.
Please follow the instructions on the RCP page.

Older versions (<1.0.0) of DynParser depend on the PCRE++ library instead of rcp for regular expression matching. Compiling these older versions requires the PCRE++ development package.
On Ubuntu the required package is libpcre++-dev.

Installation Instructions

DynParser is available from my deb repository in the package libdpl.

The manual compilation procedure is as follows:

tar -xvzf dpl-0.0.3.tgz
cd dpl-0.0.3
make config
make build
sudo make install

There are two sample applications, xmlparser and calc, in the examples directory. Run the following commands to try them out:

cd examples
make config
make build
./xmlparser "<xml></xml>"
./calc "7-3*2"

Example

We will give a simple example of a calculator.

Source Code

// calc.cpp
#include <iostream>
#include <sstream>
#include <dpl/parser.hpp>
using namespace dpl;
using namespace std;

int a2i(const string &a) // {{{
{ // Might be overkill, but it works
  stringstream ss;
  ss << a;
  int i = 0; // Stays 0 if the string does not hold a number
  ss >> i;
  return i;
} // }}}

int calc(const parsed_tree *tree) // {{{
{ // Do the actual calculation
  if (tree->type_name == "number") return a2i(tree->root.content);
  if (tree->type_name == "eqnm" && tree->case_name == "case1") return a2i(tree->content[0]->root.content);
  if (tree->type_name == "eqnm" && tree->case_name == "case2") return calc(tree->content[1]);
  if (tree->type_name == "eqnm" && tree->case_name == "case3") return calc(tree->content[0]) * calc(tree->content[2]);
  if (tree->type_name == "eqnm" && tree->case_name == "case4") return calc(tree->content[0]) / calc(tree->content[2]);
  if (tree->type_name == "eqnm" && tree->case_name == "case5") return -calc(tree->content[1]);
  if (tree->case_name == "case1") return calc(tree->content[0]) + calc(tree->content[2]);
  if (tree->case_name == "case2") return calc(tree->content[0]) - calc(tree->content[2]);
  if (tree->case_name == "case3") return calc(tree->content[0]);
  cout << "Calc error unexpected case: " << tree->type_name << "." << tree->case_name << endl;
  return 0;
} // }}}

int main(int argc, char **argv) // {{{
{ // Main program
  if (argc<2)
  { // check for argument
    cout << "Syntax: calc \"<expression>\"" << endl;
    return 1;
  }

  string exp = argv[1]; // Copy argument
  Parser parser; // Define parser
  parser.DefToken("","[ \t\r\n][ \t\r\n]*");
  parser.DefGeneralToken("number", "0123456789");
  parser.DefKeywordToken("(");
  parser.DefKeywordToken(")");
  parser.DefKeywordToken("+");
  parser.DefKeywordToken("-");
  parser.DefKeywordToken("~");
  parser.DefKeywordToken("*");
  parser.DefKeywordToken("/");

  parser.DefType("eqn ::= eqn + eqn | eqn - eqn | eqnm");
  parser.DefType("eqnm ::= number | ( eqn ) | eqnm * eqnm | eqnm / eqnm | ~ eqnm");
  
  parsed_tree *tree=parser.Parse(exp); // Parse argument
  // Calculate and print result
  cout << parser.Unparse(*tree) << " = " << calc(tree) << endl;
  // Clean up
  delete tree;
  return 0;
} // }}}

Compilation

g++ calc.cpp -o calc -ldpl

Execution

>./calc "5+3 * 2"
5+3 * 2 = 11
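
The result is 11 rather than 16 because the grammar has two levels: * and / live in eqnm, which sits below eqn, so "3 * 2" has to be grouped before it can be added to 5. The same layering can be sketched without DynParser as a hand-written recursive-descent evaluator. This is a different technique from the one the library uses, shown only to make the precedence visible; Evaluator and evaluate are hypothetical names. As simplifications, ~ applies to a single operand here, and repeated operators are grouped from left to right, a choice the BNF above leaves open.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hand-written evaluator mirroring the two grammar levels of calc:
// eqn handles + and -, eqnm handles * and /, atom handles the rest.
struct Evaluator {
  std::string src;
  std::size_t pos;

  explicit Evaluator(const std::string &s) : src(s), pos(0) {}

  bool at_digit() const {
    return pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos]));
  }

  void skip_ws() {
    while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
  }

  // Consume c if it is the next non-whitespace character.
  bool accept(char c) {
    skip_ws();
    if (pos < src.size() && src[pos] == c) { ++pos; return true; }
    return false;
  }

  // atom ::= number | ( eqn ) | ~ atom
  int atom() {
    if (accept('(')) {
      int v = eqn();
      bool closed = accept(')');
      assert(closed && "expected ')'");
      (void)closed;
      return v;
    }
    if (accept('~')) return -atom();
    assert(at_digit() && "expected a number"); // accept() already skipped whitespace
    int v = 0;
    while (at_digit()) v = v * 10 + (src[pos++] - '0');
    return v;
  }

  // eqnm level: * and / bind tighter than + and -
  int eqnm() {
    int v = atom();
    while (true) {
      if (accept('*')) {
        v *= atom();
      } else if (accept('/')) {
        int d = atom();
        assert(d != 0 && "division by zero");
        v /= d;
      } else {
        return v;
      }
    }
  }

  // eqn level: combines whole eqnm results
  int eqn() {
    int v = eqnm();
    while (true) {
      if (accept('+')) v += eqnm();
      else if (accept('-')) v -= eqnm();
      else return v;
    }
  }
};

int evaluate(const std::string &expression) { return Evaluator(expression).eqn(); }
```

Here evaluate("5+3 * 2") yields 11 and evaluate("(5+3)*2") yields 16, matching the grouping that the calc grammar produces.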