What is parser generator without even simplest form of repetition? I had to add it to NLT as well:
namespaces -> ns:namespace_+ { new Namespaces(currCoords(),ns) };
Behind the scenes NLT creates a list — List<T>
— where “T
” is either “object
” or the type defined for given symbol (here: “Namespace
” for “ns
”). So when I pass “ns
” in code all it takes is making “Namespaces
” class accept “IEnumerable<Namespace>
”.
When I tried to write the production for parsing parameters I ran into a problem:
// INCORRECT params_ -> (COMMA p:param)* ...
Of course this is incorrect, it says that the list of parameters starts with the comma.
// INVALID IN NLT params_ -> p:param (COMMA p:param)* ...
I didn’t like it very much — not only I would have to make an exception of allowing duplicate names, but the user would have to repeat some of the symbols. Redundancy is bad. So I came up with an idea:
params_ -> (COMMA- p:param)* { new Parameters(currCoords(),p) };
Minus character tells NLT to exclude such symbol from initial repetition and the rule is written in compact way. I don’t know if this syntax will scale up to other problems, but for now it works.
One thing I miss here is local projection — the list of the parameters is created with raw parameters but on-fly transformations are helpful to express more complex scenarios:
// SUGGESTED SYNTAX params_ := (COMMA- p:param -> {p.ToUpperCase()})* { new Parameters(currCoords(),p) };
The main action is the same, but the rule contains transformation for each parameter which is added to the list. I wrote it next to shuffle mode on my “to-do” list.
Another Skila-related issue emerged while working on NLT — there are a lot of constraints which are too complex to handle them in parser rules, however they do not entirely fit into semantic analysis (consider checking if all of the elements inside the group are not marked with minus sign).
Let‘s say I add another line to action — after creating an object I check it, and if it is invalid I raise an exception. It works, but it also means that an error during check actually stops all subsequent checks. NLT provides a mechanism for that — pass an extra argument, ParseControl
, and set its state accordingly to the error. This approach has two problems — it requires more code and it does not prevent from using incorrect object.
I solved the first issue in the simplest way — I introduced “IParseControl
” interface and implemented it in desired classes. This way I can create an object, validate its state in constructor and set state of the “IParseControl
” part accordingly to the error. The action code is clean as before — just creating an object.
The remaining issue is how to prevent from using invalid object? Because it is really interesting not only for parsing I restate this problem fully:
How to create an object, validate it and if it is invalid forbid using it except of passing reference of it?
Maybe I am wrong but in C# world the solution is quite elaborate — let’s assume we are about to validate class “Foo
”:
- make all constructors private in “
Foo
”, - make all fields private in “
Foo
”, - inherit from “
Foo
” a new type “InvalidFoo
” which overrides every property and every method with single statement — throwing an exception, - add static method “
Create
” to “Foo
” — it creates instance of “Foo
”, validates it, if everything is OK it returns it, but if not — it returns an instance of “InvalidFoo
” instead.
I don’t see simpler solution in any language which does not support objects validation natively. I have in mind something similar to Ruby’s taint checking. And so this issue makes an entry in Skila “to-do” list.
Update 2013-11-06: while maybe it is interesting from theoretical point of view, I am puzzled that I didn’t saw a tree while standing in the forest. In NLT I simply even don’t need that mechanism, because once user returns invalid object no other user action is executed. In other words invalid object won’t make its way into any other object. Passing back and forth parse control object might look cleaner, but throwing an exception enforce more tight control and requires less code (1KB).