Wednesday, April 23, 2014

Parsing Problems

Last week I mentioned that I'm going to be working on a parser for the programming language I designed. I've been trying to refine and simplify the language's grammar before starting that. For the most part, the language is C-like and most of the grammar can be summed up with a precedence chart. However, a few of the operators work strangely and throw a wrench into that.

The best example I can give is the colon operator. The colon operator fills the roles of both constructors and type casts. It is typically used like this:

foo int: bar + 10,

Which declares a variable, foo, of type int and initializes it to bar + 10. When the colon operator is used for casts, it might look something like this:

foo = bar + int: baz + 1,

Which is parsed like this:

foo = (bar + (int: (baz + 1))),

Basically, everything to the right of a colon 'belongs' to it. However, this creates a problem. It doesn't fit neatly into the precedence chart. The left side of the colon has a different precedence than the right. This makes the language's grammar more complex.
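One way to picture the asymmetry is a precedence-climbing (Pratt-style) parser in which every operator carries two numbers instead of one: how tightly it binds on its left and how tightly it binds on its right. The Python sketch below is only an illustration under assumptions. The binding-power values, the function names, and the tiny token set are all made up for this example and are not the language's real grammar. It also ignores the trailing comma and does no error handling.

```python
import re

# Hypothetical (left, right) binding powers; the numbers are illustrative only.
BINDING = {
    "=": (2, 1),    # right-associative assignment
    "+": (10, 11),  # left-associative addition
    ":": (20, 0),   # tight on the left, swallows everything on the right
}

def tokenize(src):
    # Identifiers, integer literals, and the three operators used here.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+:]", src)

def parse_expr(tokens, pos, min_bp):
    # Parse one operand, then keep absorbing operators whose left
    # binding power is at least min_bp.
    left = tokens[pos]
    pos += 1
    while pos < len(tokens):
        op = tokens[pos]
        left_bp, right_bp = BINDING[op]
        if left_bp < min_bp:
            break
        pos += 1
        # The right operand is parsed with the operator's *right* power.
        right, pos = parse_expr(tokens, pos, right_bp)
        if op == ":":
            left = f"({left}: {right})"
        else:
            left = f"({left} {op} {right})"
    return left, pos

def parse(src):
    tree, _ = parse_expr(tokenize(src), 0, 0)
    return tree

print(parse("foo = bar + int: baz + 1"))
# prints (foo = (bar + (int: (baz + 1))))
```

Apart from the extra outer parentheses, the grouping matches the cast example. Because the colon's left power is high, it only claims int as its left operand; because its right power is zero, nothing to its right can end the operand early. That pair of numbers is exactly what a single-column precedence chart can't express.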

The behavior of the colon operator is usually pretty intuitive when actually writing code. Likewise, it's not particularly hard on the parser. However, it does substantially complicate learning the language. Instead of a simple precedence chart, I have an unwieldy spreadsheet.



The colon operator is not the only thing complicating the grammar. Right now, I'm trying to change things to reduce weird parsing rules. In cases like the colon operator, I would like to preserve its general usage without making the user type a lot more.
