Tag Archives: string

Strings — performance hurts

I barely have time to write what happens with Skila but this one is so off I cannot resist — strings. Wide Unicode or UTF-8? The latter right, because they take less space so they require less fetches from the memory. How do we index them? By character or by byte? The former will not give us constant time thus by byte.

I know there are currently languages going this path — like Rust or Julia — but I don’t feel comfortable with it. For one, there is no abstraction layer here, the implementation details leak right into API. Secondly it forbids having common IString interface with UnicodeString because it would be indexed differently. Oh boy…

This choice even pushed me to a strange at first glance decision for having reverse methods lastIndexOf starting from exclusive index rather than inclusive. Oddly this is somewhat in sync with ranges (inclusive to exclusive).

UTF-8 string, where will you lead me…?

Advertisement
Tagged , ,

Arbitrary quotations

While working on html code I realized (for the n-th time) that having just verbatim and non-verbatim strings is way too little. Ruby did it right, so I simply copied proven solution:

let s = %s<<p>quotation counts "parentheses"</p>>;

In the process I reverted the syntax for characters literals back to C#:

let c = 'b';

at the expense of adding backtick as meta switch. In Ada apostrophe serves dual role but I don’t see how to make it in Skila grammar:

••• > ' •••

would be ambiguous (accessing meta information about generic type or comparing characters).

Tagged , , , , , ,

Self type and pinned spiral

It is amazing how different ideas can all of the sudden click and create one coherent blueprint for a cool feature. Yes, I know, I should be doing generator — but this is exactly where the problem began.

Here is how it went — I was implementing rich `Iterable` interface and stumbled on `concat` method. On the first glance it is obvious — it takes another `Iterable` and also returns `Iterable`, no problem. But think about using it — when you have some arbitrary `Iterable` it works as expected, but if you have for example `String`… Do you really expect to concatenate two strings and get some `Iterable` in result?

OK, so let’s make a correction ¹ — the outcome should be `Self`. It will work perfectly with `String`, we can also define `prepend` and `append`. But… wait a second, `Set` is `Iterable` as well, right? How can you prepend anything to a set and get a set back? If you don’t see a problem — you can add something to set, but not concatenate sets, or append anything to it, because except for one special case, set is not a sequence (there is no order in a set).

I found a solution, it waits right now for implementation (I needed a break from coding, thus this post) — `Set` is not an `Iterable`, it can be treated as one (thanks to implicit conversion) — subtle difference, but making things consistent. You can call `append` on `String` and get back a `String` and at the same time you can `append` on `Set` and get an `Iterable`, because it will be silently converted to this type on call.

Implicit conversions were easy — they are just a parameterless methods overloaded on result type. In Skila with “have to read the result of the function” feature it simply fitted right in.

One thing was still missing — how can you ensure that regular method returns `Self` type? You can track whether the returned value is really pure `Self`, but sometimes you don’t work with `Self` type in the first place. The only alternative would be relying on the fact that every descendant type will override such method and produce its own type, making effectively given method a factory of `Self`. And this concept was already present in Skila — `pinned` regular methods waited long for such cause, but now it is done. So either you write one general method with `Self` or you will guarantee with `pinned` every type will have its own implementation.


¹ If I am not mistaken, Scala solves similar kind of problems with what-can-build-what types. This part seems too convoluted for me so I even didn’t dig deeper.

Tagged , , , ,