Speed of UPARSE

I’m just wondering if any benchmarking has been done on UPARSE. How does it compare to PARSE in Red or in Rebol? Or to parser combinators in Haskell? For that matter, how does it compare to an ordinary recursive-descent parser handwritten in Ren-C?

(Yes, I know UPARSE is unoptimised and slow. But it would be interesting to know how slow.)

Several orders of magnitude slower. When I first tinkered with nativizing it, it was about 700x slower just on some arbitrary case:

Progress on Nativizing Parser Combinators

But, an hour of tinkering at the time got it to 250x slower.

However, predictably that tinkering was quickly out of date, so that early nativization is inactive. Things aren't set up to maintain nativized code in parallel to the usermode code, and it would be too much work to be justified until fully stable.

The baseline will be even slower today... because binding is slower, and because there's more hooking in it which was added for the debugging demo.

If you watch the debugging video, it explains that the debugger works because each call to a combinated parser can be hooked... such that the hook is responsible for invoking the frame. It can inspect the frame beforehand, and examine the multi-return result after it does the invocation. (Or it can skip the invocation entirely, or duplicate the frame and invoke it twice, or mutate the frame before it runs it, etc.) Each combinator just directly calls its subparser, but the subparser has been ENCLOSE'd with a wrapper:

 wrapper: func [
    "Enclosing function for hooking all combinators"
    return: [pack?]
    f [frame!]
][
    return either f.state.hook [
        run f.state.hook f
    ][
        run f
    ]
]

So that's overhead on each call to a parser that any other parser makes. Of course the pattern of enclose :combinator :wrapper itself could be partially nativized in a semi-generic way, as something like:

hookify :combinator 'f [f.state.hook]

But these things are patterns which should inform how to design the system in a more general sense. I am trying to get some kind of story together for how dialects are debugged... as I've said, it's sort of like you make a choice if you want to take "assembly level stepping" (e.g. debug the Rebol instructions) or if you want to debug at the higher level of the dialect. So I expect this hook to be a sunk cost of some kind.

Right now I'm doing other things, though (as well as non-Rebol-related life stuff, which is going to mean Rebol development will get a bit more sporadic than it was in the first couple of months of the year.)

Nativizing Plan

Ultimately, what I plan to do is make it so that all the combinators are in their own C files, where the usermode form is in a comment, something like:

//
// File: %src/core/parse/optional-combinator.c
//

/* <BEGIN USERMODE COMBINATOR>

'optional combinator [
    "If applying parser fails, succeed and return NULL; don't advance input"
    return: "PARSER's result if it succeeds, otherwise NULL"
        [any-value? pack?]
    parser [action?]
    <local> result'
][
    [^result' remainder]: parser input except e -> [
        remainder: input  ; succeed on parser fail but don't advance input
        return null
    ]
    return unmeta result'  ; return successful parser result
]

</END USERMODE COMBINATOR> */

//
//  optional-combinator: native/combinator [
//
//  "If applying parser fails, succeed and return NULL; don't advance input"
//
//      return: "PARSER's result if it succeeds, otherwise NULL"
//          [any-value? pack?]
//      parser [action?]
//  ]
//
DECLARE_NATIVE(optional_combinator)
{
    INCLUDE_PARAMS_OF_OPTIONAL_COMBINATOR;

    Value* remainder = ARG(remainder);  // output (combinator implicit)

    Value* input = ARG(input);  // combinator implicit
    Value* parser = ARG(parser);
    UNUSED(ARG(state));  // combinator implicit

    enum {
        ST_OPT_COMBINATOR_INITIAL_ENTRY = STATE_0,
        ST_OPT_COMBINATOR_RUNNING_PARSER
    };

    switch (STATE) {
      case ST_OPT_COMBINATOR_INITIAL_ENTRY :
        goto initial_entry;

      case ST_OPT_COMBINATOR_RUNNING_PARSER :
        goto parser_result_in_out;

      default : assert(false);
    }

  initial_entry: {  //////////////////////////////////////////////////////////

    Push_Parser_Sublevel(OUT, remainder, parser, input);

    STATE = ST_OPT_COMBINATOR_RUNNING_PARSER;
    return CATCH_CONTINUE_SUBLEVEL(SUBLEVEL);

} parser_result_in_out: {  ///////////////////////////////////////////////////

    if (not Is_Raised(OUT))  // parser succeeded...
        return OUT;  // so return its result

    Set_Var_May_Fail(remainder, SPECIFIED, input);  // convey no progress made
    return Init_Nulled(OUT);  // null result
}}

This way the two can be maintained in parallel, and a debug mode could switch between them to make sure they produce identical results in the tests.

Speaking of optimizations, this doesn't even need the result' temporary. Nor does it care what the raised error is, so the branch doesn't need to be a lambda:

return [@ remainder]: parser input except [
    remainder: input  ; succeed on parser fail but don't advance input
    return null
]

(Truly being equivalent requires the @ to not decay synthesized pack results, which should possibly be the same behavior you get if you'd named a variable, I need to look into this.)

Instead of a preemptive RETURN inside a RETURN, you could write that instead so the branch gives back the result and remainder as a pack (antiform block):

return [@ remainder]: parser input except [
    pack [null input]
]

But I happen to be certain that would be slower... you pay to reduce [null input], and then you have to pay for SET-BLOCK! to unpack the multi-return.

Neat as all that is (and I think it is neat), worrying over such details isn't really where I want to focus right now.


Note that usermode micro-optimization can get fairly crazy, e.g. you know that NULL needs to do a WORD! lookup to fetch the value of NULL... but the quasiform ~null~ has a rote evaluative rule of just setting the quote byte of the quasiform from 2 to 0 to be an antiform... so return ~null~ is faster than return null. Many historical Rebol programmers were obsessed with this kind of detail...

I’m wondering, then… to what extent would it be possible to optimise it without nativising? It would be very interesting to see how far we could get with just Ren-C.

(Of course, it’s perfectly possible to get good performance by writing the important bits in a fast language — it’s what Python does. But I have a great dislike of Python, and this is one of the reasons why.)

I’ll be blunt: this, to me, sounds unusably slow. Beyond the very simplest tasks, I can’t imagine a situation where I’d deliberately choose to take a 250× performance drop. UPARSE is such a key part of Ren-C that I think time spent optimising it would be time well spent.

UPARSE is a flagship dialect, where getting the design correct--including making sure all the necessary infrastructure for debug stepping and proper error delivery--is time well spent...

Rules of program optimization:

  1. Don't do it.
  2. (Experts Only) Don't do it... yet.

Optimizing before nativizing is certainly a good first pass.

But to give you a sense of the tip of the iceberg: UPARSE uses raised ERROR! object antiforms on each unsuccessful match. This means that other values are "in-band" as potential synthesized combinator results (including packs). But it also means that for a combinator to fail, it has to make an object.

It is possible for the objects to be created once and then reused, but that wouldn't permit parametrization. An error can say "didn't match" or it can say "didn't match the character #a" with a parameter, and the latter is more informative.

Generating an error object isn't particularly cheap, even natively. And if you look at the usermode RAISE function (was FAIL, which is now built on top of RAISE, so comments a bit out of date there), you can see that the process for going from a text string to an error isn't trivial:

ren-c/src/mezz/base-funcs.r at d8212c6d2de996f33f2142ebce40c328088f24bf · metaeducation/ren-c · GitHub

Does this mean that the concept to use custom raised errors as a signal for parser mismatching is fundamentally flawed? I don't think so. As I say, this is a flagship for exploring the concepts of what's possible in the medium. It's supposed to have the bells and the whistles and show what you can really accomplish if you stretch.

Making a crappier version of UPARSE that cheaps out on features and can't be stepwise debugged will be a trivial exercise for those who want to do it. Red has every opportunity to go after the market of people who thirst for that school of thought. I'm happy to let them.

Perhaps it's time for you to get your hands dirty and read to understand what UPARSE is doing in terms of how it is implemented, to get a sense of the kinds of power I'm trying to put in the hands of people using a new kind of technology...and why I'm interested in this kind of prototyping ability regardless of its initial costs.

This is a first-of-its-kind design. And if you don't understand why I'm building things the way I'm building them, then...to be blunt...there's not much point in me listening to your feedback regarding the order in which I do things.

And that’s how you end up with slow software!

To make myself entirely clear, this is certainly a position I can sympathise with. My main language is Haskell, which is difficult to optimise, so often I don’t bother. But Haskell still makes it possible to write fast programs, and eventually there comes a time when I do need to dig into the internals and speed up my code. It’s hard, but certainly worth the effort.

This is interesting… to me, this sounds more like an issue with the interpreter being too slow, rather than UPARSE itself. What makes objects so slow to construct?

I have been doing so, on and off, when I get the time. It’s difficult code to understand, and I can’t say I understand precisely how it works yet, but I do at least have some vague idea of what the major components are.

1 Like