The PARSE of /PROGRESS

There has been a lot of fiddling over time with PARSE's return value. :violin:

In order to make it play nicely with ELSE and THEN, a failed parse should return NULL. But a successful parse could return many things.

  1. Just returning #[true] makes the output of PARSE easier to read in tutorials. This isn't overwhelmingly important, given the pervasiveness of DID as the programming analogue to NOT.

  2. Returning the input value makes it easy to use PARSE as a validator for data.

    if parse data [integer! integer! end] [  ; exactly two integers
       call-routine data
    ] else [fail]
    
    call-routine (parse data [integer! integer! end] else [fail])  ; nicer
    
    call-routine non null parse data [integer! integer! end]  ; even nicer :-)
    
  3. Returning how far a successful parse got is strictly more informative, as the position reached by a partial match is difficult to reconstruct otherwise.

It seemed that #3 had won out, due to its flexibility. But that required changing the semantics of PARSE so that it no longer failed automatically when the rules matched only part of the input. This made it necessary to stick an END on every set of parse rules to get the legacy behavior.

But the need to tack on the END made some things seem less concise and elegant. (Consider the "END" in the above example on matching two integers.) And surveying how other languages do "destructuring" made me feel that PARSE was the best answer in the Redbol world. When you're matching a structure against [tag! tag!] it feels somewhat wrong for [<x> <y> <z>] to "match" when it seems "over the limit".

Best of All Worlds: /PROGRESS Multi-Return Output

Why choose? We have a very concise way of asking for the parse position now, and the very act of requesting a parse position could be enough to remove the requirement of reaching END.

>> parse "abc" ["ab"]
; null

>> [original progress]: parse "abc" ["ab"]
== "abc"

>> progress
== "c"

You don't need to name a variable for the original value if you don't want to: [_ progress]: ... works. I'm also going to make it so that multi-returns let you pick which output becomes the overall evaluation result, if you don't want it to be the main one (e.g. [_ @pos]: ...). There will be some way of doing that without naming a variable at all. I'm still working through it, but [_ @]: ... could well be how to say that, if we figure out what @ is (an empty email?):

>> [_ @]: parse "abc" ["ab"]
== "c"

Once again, pretty slick. The real "a ha" of this is the idea that PARSE can implicitly switch into a mode that doesn't require reaching the end by the mere act of requesting a result for how much progress it made.
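
To make the mode switch concrete, here is a toy model in Python. It is only a sketch of the semantics described above, not Ren-C and not how PARSE is actually implemented: `toy_parse` and its `want_progress` flag are hypothetical names, and the "rules" are just literal strings matched in sequence.

```python
def toy_parse(data, rules, want_progress=False):
    """Toy model of the proposed behavior (hypothetical, not the real PARSE).

    `rules` is a list of literal strings that must match `data` in order.
    Returns None if any rule fails to match.
    Without want_progress: succeeds only if the rules consume all of `data`,
    and returns the input.
    With want_progress: reaching the end is not required, and the result is
    a pair of (input, unconsumed remainder).
    """
    pos = 0
    for literal in rules:
        if not data.startswith(literal, pos):
            return None  # a rule did not match at the current position
        pos += len(literal)

    if want_progress:
        # Asking for progress lifts the "must reach the end" requirement.
        return data, data[pos:]

    # Default mode: leftover input counts as a failed parse.
    return data if pos == len(data) else None


print(toy_parse("abc", ["ab"]))                      # None
print(toy_parse("abc", ["ab"], want_progress=True))  # ('abc', 'c')
print(toy_parse("abc", ["ab", "c"]))                 # abc
```

The point of the model is that the caller never has to say "partial matches are okay" explicitly. Asking for the second result is what changes the rule about reaching the end.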
