Tag Archives: performance

Skila3 — starting again…

Well, the first Skila attempt is dead and maybe luckily so. The compiler was way too slow (it needed for entire testsuite around 40 minutes) and I tried to fix it, then to start from scratch but I stopped the work — the design of the language itself was doomed to fail from the beginning. You simply cannot ignore performance of the language (not compiler) itself — in C# if you rely on exceptions instead of methods TryXXX you can face x220 slowdown. In Python if you use regular arrays you will soon hear “use NumPy instead”, in JS if you need speed you have to shove your code into GPU (“ConvNet.js is great but it is written in JS, so now we have DeepLearn.js…”). It is the same story no matter what the language is.

And so on the 28th of December, 2016 I started Skila3, initially as only a good-bye exercise in type matching. In time the work gained some speed as I added more features.

This time I dropped parser, at this point it is only a distraction. This decision allowed me to gain a new perspective — why not treat semantic analyser and code generator as engine, and parser as exchangeable skin? Something like C# and VB for .Net but on different level.

Anyway, syntax is so ahead of me that I don’t think about it at all. As for the old code, instead of refactoring it I started clean, I don’t even bring it in pieces. Currently I have rough basics — notion of deep immutability, conditionals, loops, generics, concurrency and channels (as in Go). A lot of ideas from initial Skila wait for being brought to this new project.

To speed working on compiler at first I wrote interpreter instead to check whether I can add numbers together, and well — I can.

The code is more straightforward than before — there two passes of computation, evaluation and validation. The former does name binding, computes the types of the expressions and so on, the latter one checks whether the outcomes of the expressions are read or whether the variables are assigned before they are used. I put the code at GitHub, so take a look at spare time.

Tagged , , ,

Performance crisis

The last days were really tough for me because everything started to fall apart — compiler was way too slow to experiment with, programs written in Skila were too slow to run and allocated too much memory. To build DOM tree out of 1MB web page — single web page is taking nowadays over one million bytes? that’s insane — I needed over 0.5GB of memory.

Without solving those issues I would have to stop development of Skila, using other backend for speed and memory has no sense, since I am still experimenting with language features. So I had to blend in with smarter internal wiring.

First I dropped basic types from the backend — Char, Int and Real. They were just wrappers and because of their ubiquity they took a lot of extra space in total. Currently I use native PHP values and redirect them to appropriate static classes — when I know in advance I have for example Int type in hand I call static method for it. When I don’t know what I have I call small dispatcher which checks the types in runtime and makes appropriate call. Luckily PHP makes such task trivial. I saved around 60% memory with this move.

It was promising but not enough — I still had the bottleneck and its name was String (another commonly used type) which was implemented as Skila Array of Chars. When you split 210KB string into an PHP array of characters (since PHP does not have Char type it is an array of strings really) you will get 14MB array — in other words single character takes around 60 bytes. You can reduce the memory a little by converting an array to splFixedArray, but there is no enough gain here (not mentioning time required for all back and forth conversions). So I decided to simply wrap the native string with my type and rewrite all String methods. That move was big time winner — from initial 0.5GB the memory requirement dropped to 7MB and the speed is close to native PHP code, because now in fact it is almost native PHP code.

Currently I am going to make two micro optimizations — postponing parameter default initialization and adding inline functions. Both for sole purpose — speeding up assert execution.

With compiler I was not so successful — I use rich AST with pointers to parent’s node. It is really comfortable to work with such tree (and worse — I got used to it) but there is penalty in performance, you have to clone nodes whenever you would like to share some subtree. I don’t know how I will get rid of it completely, nevertheless the diet is already implemented with positive effect — compile time was reduced by 50%.

A lot of work ahead of me, but at least I managed to get out of trouble — I can continue my journey…

Tagged , ,

All the things that can bite you…

The summer is rather slow when it comes to development (hopefully we passed 35°C zone) but I am doing good progress with cleaning the code anyway.

One thing that stopped me was C# struct. Most of the time I work with classes and despite I know what the struct is, I simply don’t have vast practice with it, and so I was caught off-guard.

I was implementing Option which should work as Nullable for all types. Since it should express strictly the notion of an option, it has to be a struct (otherwise you would have to deal with missing option, unset option and set option with missing value — tad too much). Implementing it was no brainer, thanks to C# properties next to SetValue I added nice Value setter which changes HasValue as well. Concise, easy to understand and… completely wrong.

I added it to Skila and to my surprise on one end I called SetValue just to observe no value was ever set on the other end. What on the Earth? Well, struct is always passed by value, which means that if Option is returned from property getter or a method, there is a copy returned of real data. I could call SetValue million times and it would not change a thing for stored data (only local, temporary, copy of Option would be changed).

So rule of thumb for C# developer — by default make your structs immutable. For a language/framework designer there is a lesson too — don’t make risky features implicit.

I‘ve recently watched “Essential Truths Everyone Should Know about Performance in a Large Managed Codebase” by Dustin Campbell. It is longer version of an old saying “profiler is your friend” — but what struck me while watching it is that in C# you don’t see bottlenecks. Boxing? Implicit. Unboxing? Implicit. Method wrapping (with boxing)? Implicit.

Add to this a framework which uses object too often, and profiler becomes necessity. And I thought implicit conversion constructors in C++ were bad — boy, you can at least make them explicit by yourself.

Tagged , , , , , ,

Hard time with optimization

It is about time to think about NLT performance — the results I saw in profiler output really surprised me. I thought that parser is the culprit of poor performance (I tested regular LALR), but except for creating huge logs I was mistaken. Running regular expressions against input (lexer) takes most of the time.

Another surprise is how a tiny line can be a major drag — converting enum to int takes about 9% of total time. This is actually due to flaws in C# — firstly lack of enum constraint on generic types. All I have is some struct (despite I know it is an enum really) so I have to cast data to object and then back to int. In effect I have unboxing just after boxing, which should lead to no-op — but it does not ¹.

All in all, I managed to shave running time to around 30% of original performance — I will keep enum as it is now, for regular expressions I am thinking about writing my own engine capable of matching multiple regexes at the same time (the whole purpose of the lexer is to pick up the longest match). Since it is a challenge not to be taken lightly (time) I moved to other issues.

I am not a fan of using ad-hoc symbols, but I see its purpose — it is easier to explain some production without introducing extra code. NLT grammar looked too heavy for my taste, so I added support for literals used as terminals in parser productions. One can write:

expr Expr -> e1:expr "+" e2:expr
             ...  

instead of introducing beforehand appropriate terminal like “PLUS”.

¹ C# non-boxing conversion of generic enum to int?.

Tagged , , ,