The PARSE of /PROGRESS

There has been a lot of fiddling over time with PARSE's return value. :violin:

It was long believed that a failed PARSE should return NULL. This would make it play nicely with ELSE and THEN. The question was thus what to return on success:

  1. Just returning ~okay~ makes the output of PARSE easier to read in tutorials. This isn't overwhelmingly important.

  2. Returning the input value would make it easy to use PARSE as a validator for data.

    if parse data [integer! integer!] [  ; exactly two integers
        call-routine data
    ] else [fail]
    
    call-routine (parse data [integer! integer!] else [fail])  ; nicer
    
    call-routine non null parse data [integer! integer!]  ; even nicer :-)
    
  3. Returning how far a successful parse got would be strictly more informative, since how far a partial parse progressed is difficult to reconstruct otherwise.

For at least some time, @rgchris favored #3, because many sophisticated tasks are helped by knowing how far PARSE got. But that required changing the semantics of PARSE so it would not automatically fail on partial inputs, which meant rules had to explicitly ask to hit an <end>.

But the need to tack on <end> made some things seem less concise and elegant. And surveying how other languages do "destructuring" made me feel that PARSE requiring completion was the best answer in the Redbol world. When you're matching a structure against [tag! tag!] it feels somewhat wrong for [<x> <y> <z>] to "match" when it seems "over the limit".

UPARSE Offers The Best Of All Worlds

Everything changed with UPARSE.

First of all, if a PARSE doesn't match it raises a definitional error. This provides a welcome safety net.

>> parse "abc" ["ab"]
** Error: PARSE partially matched the input, but didn't reach the tail

You can use TRY PARSE if you like and get NULL, though that possibly conflates a failed parse with a successful one whose rules synthesized NULL. You can use EXCEPT to specifically handle the error in a postfix manner. Or META/EXCEPT will give you a plain ERROR! on a definitional error, and a META'd value otherwise.
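
As a rough sketch of the first two options (untested here, and the exact console output varies between builds, so none is shown): the GROUP! around TRY PARSE and the use of a plain block as the EXCEPT branch are my assumptions about how to spell this, not something the post confirms.

    ; TRY turns the definitional error into NULL, so ELSE can react to it.
    ; (Assumption: the GROUP! just keeps it obvious what ELSE applies to.)
    (try parse "abc" ["ab"]) else [
        print "no match"
    ]

    ; EXCEPT reacts only to the error, in postfix position.
    ; (Assumption: a plain block is accepted as the branch.)
    parse "abc" ["ab"] except [
        print "parse did not reach the tail"
    ]

The difference matters when a rule can legitimately synthesize NULL: the ELSE form cannot tell that apart from a failed match, while the EXCEPT form only runs on the error.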

All rules synthesize a result, and you can end the parse at any time with ACCEPT:

>> parse "abc" ["ab", accept <input>]
== "abc"

>> parse "abc" ["ab", accept <here>]
== "c"

You can even pack up multi-return values and give them back. The possibilities are pretty much endless, and so the policy of returning the synthesized result has won out.
