Monthly Archives: May 2013

If-else statement

A minor thing, but anyway: I have just finished the “if-else” statement. That’s right, “if-else” is not an expression, it is a statement, but if time shows that an expression is needed I will add it as an expression as well (I already have the syntax in my head). But first, the statement:

if cond1 then
  ...
else cond2 then
  ...
else
  ...
end

For some time I was wondering whether I should introduce “else if”, “elif” or something else. As for “elif” (and its variations), I prefer to stick to English. I opted against “else if” because with it I would want to collapse the chain of closing “end” keywords into a single one (doable with no sweat), and that would leave a slight ambiguity when reading the code (though not for the compiler). With the presented form there is a double safeguard at each point. Consider the tail:

else cond3
  ...
end

It is a human error, but in order to fool the compiler you would also have to add “then” or a semicolon. Thus it takes a double mistake.
As for the middle section:

else cond2
  ...

Again a human error (a dropped “then”), but to get it past the compiler you would also have to add a semicolon (not to mention the error about the unused expression result). A double mistake is needed here as well.

And there is no ambiguity when reading code with an inner “if”, because there is no such instruction as “else if”:

if cond1 then
  ...
else if cond2 then
  ...

Regardless of the formatting, you don’t have to look down to count how many “end” keywords close such code, because the second “if” cannot be part of the first one.

OK, that’s it for what is done. One word about the expression-like syntax (on hold):

a = if cond1 : simple_expr1
    else cond2 : simple_expr2
    else         simple_expr3
    end

I admit, I hate reading “if-else” in any language I know, because of the formatting, and I seriously (no kidding) considered changing “if” to “when”.

a = when cond1 : simple_expr1
    else cond2 : simple_expr2
    else         simple_expr3
    end

Isn’t it a beauty?


Using context of method call

I am reading “The Ruby Programming Language” by David Flanagan and Yukihiro Matsumoto, specifically a piece about indexing a string. In Ruby, to get a character from a string counting from its end, you can use a negative index, for example:

str[-5]

I am not a fan of such a design, because in every language there is some valid range of indices and this approach doubles it. In Ruby you have to be twice as thorough to make sure a negative index is not a side effect of some computation. And negative numbers don’t come for free: the sign costs one bit that could otherwise be used for the digits.

And since just a day before I had wished for some reference to the caller when chaining calls, together this gave me an idea: why not simply add a call context to Skila? Indexing a string could look like this:

str[\length-5]

A backslash (off the top of my head) would switch the context to the caller, so any expression valid for the caller would be valid inside the call as well. Such a feature could bring more flexibility to call chaining (I love this style, except when I have to debug it).
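The idea can be approximated in an existing language by letting the index be a function of the object being indexed. The following Python sketch is only an illustration under that assumption; the helper name `at` is hypothetical and says nothing about how Skila would implement the backslash.

```python
from typing import Callable, Sequence, Union

# Hypothetical: an index is either a plain int or a function of the receiver.
Index = Union[int, Callable[[Sequence], int]]


def at(seq: Sequence, index: Index):
    """Return seq[index]; a callable index is evaluated with seq as its context."""
    position = index(seq) if callable(index) else index
    # Negative positions are rejected, so they can never silently wrap around.
    if position < 0 or position >= len(seq):
        raise IndexError("index " + str(position) + " is out of range")
    return seq[position]


text = "hello world"
# Rough equivalent of the proposed str[\length-5]
print(at(text, lambda s: len(s) - 5))
```

With this shape the valid index range stays non-negative, and counting from the end is spelled out explicitly in terms of the receiver.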

I am not sure about it, but it is at least worth remembering, thus I posted it.


Function result — how to read it?

Lately I have spent quite some time thinking about the perfect “else if” syntax (don’t confuse it with “if-else”), and I just wonder how much time I will spend on this subject.

Let’s start with the easy one. In all C-like languages a function call combined with a condition would look like this:

if ((idx = str.indexOf("hello")) != -1)
  ...

This is ugly. Not only do we have to add an extra scope (not shown here) to avoid a variable leak, but we are also using a magic number. Worse, what about values such as “-2”? They are valid for “idx” but invalid in the context of “indexOf”.

Before you even warm up, consider a more difficult case: reading values from a map (associative array, dictionary):

value = dict["hello"];

This won’t fly: if there is no such key, we will get an exception. So maybe like this:

if (dict.ContainsKey("hello"))
  value = dict["hello"];

OK, this is safe, but now we hit the map twice. In the case of C#, the following is the best approach:

if (dict.TryGetValue("hello",out value))
  // we have valid value here

One can say “not so bad”, but! First of all, you have to mark your “trying” functions in some way to distinguish them from “regular” functions. For me it is not appealing to have a bunch of “TryRead”, “TryConcat” and the like. Secondly, such syntax is inconsistent with the already existing notion:

result ← expression

I have nothing against switching the flow (to left-to-right); what bothers me here is the inconsistency. I wouldn’t like to see this happening to Skila.

What are the other options? “Options” indeed:

if (!(value = dict["hello"]).isEmpty())
  Console.WriteLine(value.get());

Such code is efficient (kind of), with the result kept on the left, but it is too elaborate. And every time you use the value, you have to call the “get” method on it (or introduce a temporary variable).
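To make the trade-off concrete, here is a minimal Python sketch of the option-style lookup. The names `Option` and `lookup` are hypothetical, chosen for this illustration only; the point is that the map is hit once and the result stays on the left, at the price of calling `get` on every use.

```python
from typing import Any, Dict, Hashable

_MISSING = object()  # private sentinel, distinguishes "no key" from a stored None


class Option:
    """Hypothetical container that either holds a value or is empty."""

    def __init__(self, value: Any = _MISSING) -> None:
        self._value = value

    def is_empty(self) -> bool:
        return self._value is _MISSING

    def get(self) -> Any:
        if self.is_empty():
            raise ValueError("get() called on an empty Option")
        return self._value


def lookup(mapping: Dict[Hashable, Any], key: Hashable) -> Option:
    """Hit the map exactly once and wrap the outcome."""
    return Option(mapping.get(key, _MISSING))


words = {"hello": 5}
value = lookup(words, "hello")
if not value.is_empty():
    print(value.get())
```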

I was struggling with some mad ideas, such as a dual variable which is partially and implicitly converted to “bool”, or solving the issue with tuples (Skila style):

if ((success:@,value) = dict["hello"]).success then
  // we have direct access to value
  // success is dropped

But the concept of failure from Icon struck me as really elegant. The map indexer could “return” failure in case of a missing key, and this would cause the “if” to fail.

I haven’t written even “hello world” in Icon, so you had better read about Icon somewhere else; Wikipedia may be a good starting point.

At first glance it looks, oh, so charming. I am afraid, though, that this beauty might come at the cost of problematic maintenance and a workflow which is hard to understand, not to mention a complex implementation.

But this approach will be the first I investigate. As usual, if you have better, or simply other, ideas worth considering, I am all ears.


How Google ruined my evening

I wish I could spend my entire day fruitfully, not bothering with irrelevant issues, but there are not many days like that in real life. Today, when uploading a newer NLT package to the Google Project Hosting site, I was notified that starting on January 15, 2014 uploading binary packages will be disabled. This translates to picking another server, then finding and updating all the links… I am thrilled. Sure, it is their server, free of charge, but it would be at least decent to give an honest reason for such a call, not some pathetic excuse about some users abusing the upload feature.

I am planning a gentle switch: keeping both sites in parallel and, if Google does not cancel this new policy, eventually killing the Google address. The problem is which hosting site to choose as a replacement. I tried Bintray (complex, no tickets), CodePlex (the ticket system is a disaster), LaunchPad, GitHub and BitBucket (all three have no support for binary packages). Visually much less appealing than GPH (no Markdown), there are SourceForge and BerliOS; most likely one of these two will be my choice.

Let’s leave the story of killing a good product aside for now. This weekend, as I intended, I wrote the resolver for function overloading. And also:

  • I added extra syntax and logic for recursive function calls,
  • I introduced the infinite “loop-end”; no big deal, but writing “loop” instead of “while true do” is quicker and more elegant,
  • I noticed Skila didn’t have boolean operators (and, or, exor), so I added them.

I also changed the concept of the function and procedure distinction. Now there are only functions, and the only difference is whether the result has to be read, which by default it does.

def square(x : Int) : @ Int = x*x;

The “@” character before the type indicates it is OK to drop the result. One keyword saved (proc), and the syntax is now a bit more consistent with the existing sink variable. This will also help in introducing lightweight interfaces.

And to keep me busy, all those changes in syntax made the NLT parser fail, so I had a reason and an opportunity to improve it.

OK, it is Tuesday already, so I went way overboard with the “weekend work” idea. See you next week then; I am going to add strict control of execution flow and rich strings.


Emptiness — don’t rely on it

Looking through my old notes on how I should design Skila and what pitfalls to avoid, I found a piece about whitespace. Probably the most famous language designed to rely on whitespace is Python. But other languages follow the trend as well, like Scala or Ruby. They seem elegant and clean, but there is a dark side when you have to read the code or, worse, fix it.

For me it is surprising, because time has already proved that such a design leads to frustrating errors: you see one thing, but the compiler sees something else, for example:

       hello

What do you see? The word “hello” prefixed with seven spaces, right? No, wait, eight spaces. No, wait, there is a tab character in the middle. Can’t you see it?

And here exactly is the problem, because I cannot, yet the computer can, and it will wreck me with precision just because I am not a whitespace sharpshooter.

Writing programs is a demanding craft. The code is information, a blueprint, and whitespace is merely the background for that blueprint; I cannot understand how someone could make code rely on something that does not really exist. This feature alone, together with my experience of writing Python code and later seeing a flood of whitespace-related errors, rendered this language a dead option for me.

I was kidding, there is a non-breaking space in the middle, not a tab.
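Since the eye cannot tell these characters apart, the only way to know is to ask the machine. A small Python sketch, with a hypothetical helper name `describe_indent` and a made-up sample line (three spaces, a non-breaking space, three more spaces), that names every leading whitespace character:

```python
import unicodedata
from typing import List


def describe_indent(line: str) -> List[str]:
    """Name each leading whitespace character of a line."""
    names = []
    for ch in line:
        if not ch.isspace():
            break
        # Control characters such as tab have no Unicode name, so fall back to the code point.
        names.append(unicodedata.name(ch, "U+%04X" % ord(ch)))
    return names


sample = "   \u00a0   hello"
for name in describe_indent(sample):
    print(name)
```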

But other languages are no better. Consider this simplified Scala piece:

if (cond1())
  action1();
else (cond2())
  action2();

A pretty typical “if-else-if” case. The code is so trivial that I spent an embarrassing part of the evening fixing it. Don’t get me wrong, it was not about finding a bug somewhere in the entire program. I had already pinpointed it and I knew the bug was right here, in front of my eyes, but I looked and to me the code was flawless, and the compiler concurred. How can it be flawless and erroneous at the same time?

What went wrong with Scala here? When computing the boundaries of expressions it relies on the end of the line: if EOL can end an expression, the compiler assumes it is the end of the expression. So “(cond2())”, with its “if” missing, became the complete “else” branch, and “action2();” an independent statement. Without this fancy feature the compiler could have saved my time in a split second by demanding a semicolon. A freaking single character.

And I thought that computers were made to save our, human, time. That’s why Skila comes with reliability: if it can set a safeguard, good. If the safeguard can be doubled, even better. If it can be tripled, perfect. I don’t want to hear the story about a comma mistaken for a dot in the 21st century.

The presented code would give you errors in Skila for the following reasons: a missing semicolon, an ignored function result, and a misplaced “then” keyword, which serves as the condition terminator. Those guards do not limit your expressiveness, yet even drunk or dead you still couldn’t ignore the compiler error.

Function is not an expression

Coming from the SML world, I killed the notion of a function as an expression, because implicit evaluation of an expression (and thus of a function) can lead to undesired behaviour. Probably the best examples come from Scala forums, where people ask what happened with their if-without-else expression. It is not proof, but it is a solid hint that some people were taken by surprise.

One could say it is solely a Scala issue, caused by a snowball effect of the aforementioned function-as-expression notion and inference of the result type, but the fewer surprises in code, the better. Besides, not only writing code matters, but also maintaining it. And actually seeing “here, here, this is the result of the function” is much clearer than doing manual parsing and evaluation in your head.

However, there is one exception… when there is no big body of code, because the entire function is a simple expression. In the case of such a one-liner, a function could be an expression without loss of readability.

Thus in Skila a function is either a single expression:

def foo() : Int = addThis(5,2);

or a list of statements:

def bar() : Int
do
  var x : Int = addThis(7,3);
  return x+10;
end

but in both cases the result type is not inferred and has to be given explicitly.


Recursive and ancestor calls

The problem with recursive calls is that they look the same as regular calls to a function, i.e. you have to specify the name of the function and the arguments. The obvious shortcoming shows when writing a recursive lambda: in languages like C# you cannot do this directly. The second one looks trivial but distracts me: when I have a not too elaborate function, let’s say 3–5 lines, and I have to write another, very similar one. Creating a unified, more general function for such a short form is overkill, so in all the cases I remember I copy and paste the code, changing the important bits. But every time I do this, I have to pay special attention to whether I changed the recursive call; otherwise my program would compile without any error, yet this would introduce a bug into the code.

Skila forbids recursive calls by function name: you have to call “self” for a recursive call. “self” does not mean “function(s) with this name”, it means “this function”.

This improves the distinction when reading the code: if you see “self”, you know immediately that recursion is involved. If not, it is a regular call. In the second case the workflow could still be recursive (via cross calls), but checking this is out of the scope of the compiler (and there is no purpose in it anyway).

There is also another type of call with an improved visual factor: a call to the closest base method in the class inheritance tree. When writing a descendant “SayIt” method, instead of using the name:
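The effect of a “self” call can be imitated in Python with a small decorator that hands the function to itself as its first argument. This is only a sketch of the idea; the decorator name `recursive` and the parameter name `self_` are hypothetical, and Python, unlike Skila, does not forbid the call by name.

```python
from functools import wraps
from typing import Any, Callable


def recursive(func: Callable[..., Any]) -> Callable[..., Any]:
    """Pass the wrapped function to itself as the first argument."""
    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return func(wrapper, *args, **kwargs)
    return wrapper


@recursive
def factorial(self_, n: int) -> int:
    # The recursive call goes through self_, never through the name "factorial",
    # so a copied and renamed version keeps calling itself correctly.
    return 1 if n <= 1 else n * self_(n - 1)


# The same trick makes a recursive lambda possible.
fib = recursive(lambda self_, n: n if n < 2 else self_(n - 1) + self_(n - 2))

print(factorial(5))
print(fib(10))
```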

base.SayIt("hello world");

call “super”:

super("hello world");

and the “same” method will be invoked, but as implemented in the ancestor class.


Operator — what is it?

Either I am missing something or not much ink has been spent on the issue of picking an operator from a sequence of symbols. Sad to say, I am still guessing and improving my parser by trial and error when it comes to shift/reduce operator selection. Consider this case:

5 + 4 | * 3

where the “|” character denotes the boundary between the stack and the input. Assuming “*” is defined with higher priority than “+”, it is easy to say we should shift. Sure, but how can the parser tell that “+” is the operator to consider? Why, for example, does “*” not stand for the reduce operator as well? Or what if we have the same sequence but written as:

NUM + NUM | * NUM

and “NUM” is defined in the precedence table as well, thus leading to the problem of resolving priority between “NUM” and “*” instead of between “+” and “*”? The bigger the precedence table, the more pressing the issue is.

So many questions and so few answers…

And since I was just bitten by this, I solved the problem of choosing the right operator for reduce by considering the last global operator on the stack (within the symbols considered for reduction). A global operator is one defined without specifying for which productions it should be applied. An example? Usually “+” is defined as-is, without worrying about productions, thus I would call it a global operator. In contrast to this, take a look at these productions:

fq_identifier := IDENTIFIER
fq_identifier := fq_identifier COLON COLON IDENTIFIER
named_argument := IDENTIFIER COLON expression

The first two productions are for C++-like syntax for accessing a static property of a class (simplified here); the last one is for passing a named expression to a function. The way they are written now, they will cause a shift/reduce conflict on “COLON”. One solution would be to define a global operator “COLON”, however this is crude. A more precise way would be to define “COLON” locally, for “fq_identifier” and “named_argument” only. This translates to ignoring the “COLON” operator when searching on the stack.
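The selection rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NLT’s code: the table `GLOBAL_PRECEDENCE`, the function names, and the choice to reduce on equal priority (left associativity) are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical table of global operators: a higher number binds tighter.
# "COLON" is assumed to be declared locally, so it is absent here and is
# therefore ignored when scanning the stack.
GLOBAL_PRECEDENCE: Dict[str, int] = {"+": 1, "*": 2}


def last_global_operator(stack: List[str], reduce_length: int) -> Optional[str]:
    """Find the last global operator among the symbols considered for reduction."""
    candidates = stack[len(stack) - reduce_length:]
    for symbol in reversed(candidates):
        if symbol in GLOBAL_PRECEDENCE:
            return symbol
    return None


def decide(stack: List[str], reduce_length: int, lookahead: str) -> str:
    """Resolve a shift/reduce conflict by comparing stack operator and lookahead."""
    stack_operator = last_global_operator(stack, reduce_length)
    if stack_operator is None or lookahead not in GLOBAL_PRECEDENCE:
        return "unresolved"  # precedence cannot settle this conflict
    if GLOBAL_PRECEDENCE[lookahead] > GLOBAL_PRECEDENCE[stack_operator]:
        return "shift"
    return "reduce"  # assumption: equal priority means left associativity


# NUM + NUM | * NUM
print(decide(["NUM", "+", "NUM"], 3, "*"))
# NUM * NUM | + NUM
print(decide(["NUM", "*", "NUM"], 3, "+"))
```

Because “NUM” is not in the table, the scan skips it and lands on “+”, which is the comparison the first example calls for.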

Will it work reliably? I don’t know. If not, I have two more refinements in my pocket:

  • setting sets of related operators explicitly, so the user could separate “+” from “COLON”,
  • adding info for each operator about where it should be looked for: on the stack, in the input, or both.

But! I am not a big fan of reinventing the wheel. So if you, dear reader, know any good resource on this subject, I would be grateful if you let me know about it.


Prolonged weekend

I was supposed to spend only weekends on Skila, but writing all the posts and setting up WordPress and finding a domain and… took about a day. And it is not all tied up the way I wanted, since “skila.org” was taken, “skila.wordpress.com” was taken as well (just my luck), and since WordPress uses absolute URLs I had to jump through some hoops to make the proper domain address somehow stick. As a result, readers cannot really bookmark any post unless they open it in a new tab or window. Sorry about that.

I am happy that I changed my mind and this blog went live. I realize only now that if I had waited longer, until my project was 100% ready (my usual modus operandi), writing about all the features without missing any would have been an ordeal. And who knows, maybe this way I will find a job related to compilers.

This is all for a start. It was a fruitful time: I managed to finish enforced named parameters in Skila. See you next week; I intend to write the resolver for function overloading in the backend layer.

Parsing rules with constraints

While adding a rule for the enforced named parameter in Skila, I wished I could add a simple expression to a parsing rule, like:

param := IDENT=named IDENT COLON IDENT

which would mean that I can have two identifiers in a row, but the first has to be equal to “named”. Since NLT is not a generator suite, instead of this straightforward approach I added an extra lambda where you can make such a comparison yourself. As a bonus I added the ability to set a custom error message and to control whether parsing should continue or stop (check the NLT package for details).
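The general shape of a rule carrying a constraint function might look as follows. This Python sketch is not NLT’s API; `Token`, `Rule`, `try_match` and `first_is_named` are hypothetical names, and the “continue or stop” control is left out for brevity.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    kind: str
    text: str


# A constraint inspects the matched tokens and returns an error message,
# or None when it is satisfied.
Constraint = Callable[[List[Token]], Optional[str]]


class Rule(NamedTuple):
    name: str
    kinds: List[str]
    constraint: Optional[Constraint] = None


def try_match(rule: Rule, tokens: List[Token]) -> Tuple[bool, Optional[str]]:
    """Return (matched, error_message) for one rule against one run of tokens."""
    if [token.kind for token in tokens] != rule.kinds:
        return False, None  # wrong shape, the constraint is not even consulted
    if rule.constraint is not None:
        error = rule.constraint(tokens)
        if error is not None:
            return False, error
    return True, None


def first_is_named(tokens: List[Token]) -> Optional[str]:
    if tokens[0].text == "named":
        return None
    return "expected 'named' but found '" + tokens[0].text + "'"


param = Rule("param", ["IDENT", "IDENT", "COLON", "IDENT"], first_is_named)

good = [Token("IDENT", "named"), Token("IDENT", "x"), Token("COLON", ":"), Token("IDENT", "Int")]
bad = [Token("IDENT", "nmaed"), Token("IDENT", "x"), Token("COLON", ":"), Token("IDENT", "Int")]
print(try_match(param, good))
print(try_match(param, bad))
```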

Relaxed (job done), I prepared to publish the package: I added a small new example for this feature, I checked the old examples, and bang! An infinite loop in the lexer, what the devil? Apparently the one that loves details: I didn’t validate whether lexer rules match empty input. Such a lexer rule:

IDENT = [a-zA-Z]*

is asking for trouble, because some day it will work as a fallback rule, matching zero characters and moving forward by zero characters. Not any longer, fixed!
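A basic version of such a validation is easy to sketch with Python’s `re` module: try every rule against the empty string before the lexer ever runs. The function name and the second rule (`NUMBER`) are hypothetical, and NLT’s own check may well work differently.

```python
import re
from typing import Dict, List


def rules_matching_empty_input(rules: Dict[str, str]) -> List[str]:
    """Return the names of lexer rules whose pattern accepts the empty string."""
    return [name for name, pattern in rules.items()
            if re.fullmatch(pattern, "") is not None]


rules = {
    "IDENT": r"[a-zA-Z]*",   # the problematic rule from the text
    "NUMBER": r"[0-9]+",     # hypothetical rule, requires at least one character
}
for name in rules_matching_empty_input(rules):
    print("rule " + name + " matches empty input")
```

Changing the rule to `[a-zA-Z]+` makes the check pass, since the pattern then has to consume at least one character.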