UPARSE's spin on RETURN: "ACCEPT"

hostilefork · April 18, 2021, 7:53am

R3-Alpha reused the word RETURN as a PARSE keyword. When a rule reached it, the overall PARSE expression would evaluate to the result of the GROUP! you passed to it.

Conflating the word "return" bugged me, because it felt too easy to confuse with a function's return. Of course, other keywords in PARSE were conflated as well - you had to know you were writing parse rules to know what the semantics were. But RETURN felt worse, for some reason.

RETURN Deleted in 2018, Brought Back as ACCEPT in 2021

Despite having dropped the R3-Alpha feature, I thought to give it another shot in UPARSE... to see if my opinion about it had changed.

I noticed an annoyance: seemingly convenient casual uses of it require you to specify END, or the parse can succeed while leaving input unmatched:

>> uparse [1 2 <gotcha!>] [accept collect [some keep integer!]]
== [1 2]

So I was (rightfully) wary of how easily it could be used incorrectly.

BUT... when I was writing and obsessing over this issue, PARSE wasn't returning the synthesized result of its block. So getting values out of a parse would have to be done with assignments. People would thus be tempted to reach for RETURN even when they intended the match to make it to the end--resulting in mistakes like the above.

But UPARSE Started Returning Rule Results By Default

Once the natural behavior of a rule would be to synthesize out of the block, why would you "reach for ACCEPT" when it's so obviously cleaner to not use it?

>> parse data [collect [...]]
== [...]

People can understand that this works and checks the rules to the end... and that this is what you should be doing if you want the rules to be checked to the end.

With the clean answer sitting right in front of you for how to avoid a variable... the only time you would use ACCEPT was when it did what you meant: terminate the parse now, with this result!
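
To make the difference concrete, here's a toy model in Python. It is not how UPARSE is implemented, and `toy_parse`, `accept`, `collect_integers`, and the `Accept` exception are all names made up for this sketch. The idea it tries to capture: a rule's ordinary result only counts if the input was consumed to the end, while an accept ends the parse on the spot with no end check. `None` stands in for whatever a failed parse gives back.

```python
class Accept(Exception):
    """Hypothetical signal: stop the whole parse now, with this value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def collect_integers(items, pos):
    """Toy analogue of `collect [some keep integer!]`.

    Returns (collected, new_pos), or None if no integer matched at pos.
    """
    out = []
    while pos < len(items) and isinstance(items[pos], int):
        out.append(items[pos])
        pos += 1
    return (out, pos) if out else None


def accept(rule):
    """Wrap a rule so that a successful match ends the parse immediately."""
    def wrapped(items, pos):
        matched = rule(items, pos)
        if matched is None:
            return None
        raise Accept(matched[0])
    return wrapped


def toy_parse(items, rule):
    """Run a rule; a normal result only counts if all input was consumed."""
    try:
        matched = rule(items, 0)
    except Accept as early:
        return early.value  # accepted: no end-of-input check happens
    if matched is None or matched[1] != len(items):
        return None  # the rule failed, or input was left over
    return matched[0]
```

In this model, `toy_parse([1, 2, "<gotcha!>"], accept(collect_integers))` gives `[1, 2]` even though the tag was never looked at, while `toy_parse([1, 2, "<gotcha!>"], collect_integers)` gives `None` because input is left over.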

So the functionality was deemed to be worth keeping!

hostilefork · September 9, 2022, 3:26pm

If you want, you can give back a definitional error as the overall parse result with accept (raise "some error").

But calling that an "accept" seems a bit misleading.

I've wondered if there should be a REJECT combinator:

rule: [
     some [integer! | accept tag!] reject ("No tag found")
     || reject ("Not solely an INTEGER!-TAG! block")
]

>> parse [1 2 <tag> 3 4] rule
== <tag>

>> parse [1 2 3 4] rule
** Error: No tag found

>> parse [1 2 3 4] rule else [print "Intercepted rejection"]
Intercepted rejection

>> parse [1 2 3 4 a b c] rule
** Error: Not solely an INTEGER!-TAG! block
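
For anyone who finds it easier to read as ordinary control flow, here is a rough Python sketch of the outcomes in that transcript. It only mirrors the results shown, not how the combinators would actually evaluate the rule. `Rejected`, `is_tag`, and `scan_for_tag` are hypothetical names, and tags are modeled as plain strings like `"<tag>"`.

```python
class Rejected(Exception):
    """Hypothetical stand-in for the error a REJECT would produce."""


def is_tag(item):
    """Model a TAG! as a string wrapped in angle brackets."""
    return isinstance(item, str) and item.startswith("<") and item.endswith(">")


def scan_for_tag(items):
    """Return the first tag found after zero or more integers.

    Raises Rejected if the input ends without a tag, or if something
    other than an integer or a tag shows up first.
    """
    for item in items:
        if isinstance(item, int):
            continue
        if is_tag(item):
            return item  # like ACCEPT: stop here, ignore the rest
        raise Rejected("Not solely an INTEGER!-TAG! block")
    raise Rejected("No tag found")
```

The `else` interception in the transcript would correspond to catching the exception:

```python
try:
    scan_for_tag([1, 2, 3, 4])
except Rejected:
    print("Intercepted rejection")
```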

And I'm thinking PACK could have an analogue in PARSE, to make multi-returns:

>> [a b]: parse [1 2 <tag> 3 4] [some [integer! | accept pack [tag! <here>]]]
== <tag>

>> b
== [3 4]

This would enable returning a result -and- returning a position.
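
As a loose analogy only, here is the same "result plus position" shape in Python, where a function hands back a tuple that the caller unpacks. `first_tag_and_rest` is a made-up name, tags are modeled as strings like `"<tag>"`, and the "position" is modeled as the remaining slice of the input.

```python
def first_tag_and_rest(items):
    """Skip leading integers; return (tag, remaining_items) at the first tag.

    Returns None if the input ends or a non-integer, non-tag item appears
    before any tag is found.
    """
    for index, item in enumerate(items):
        if isinstance(item, int):
            continue
        if isinstance(item, str) and item.startswith("<") and item.endswith(">"):
            # Two results at once: the tag, and where the scan stopped
            return item, items[index + 1:]
        break
    return None
```

Calling `a, b = first_tag_and_rest([1, 2, "<tag>", 3, 4])` leaves `a` as `"<tag>"` and `b` as `[3, 4]`.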

If you're in a SUBPARSE, I kind of feel like the ACCEPT and REJECT should apply to the subparse. If you need to bypass that and end the outer parse, I think I could live with that being done with a THROW of some kind.

There were previous usages of ACCEPT and REJECT that didn't really make sense to me. This seems a lot more powerful.

A FUNCTION's job is to RETURN value(s). Use RETURN.
A CATCH's job is to intercept THROWN values. Use THROW.
A GENERATOR's job is to YIELD values. Use YIELD.
A PARSE(er)'s job is to ACCEPT or REJECT input. Use one or the other... or use the default behavior to ACCEPT the result of the last combinator if the end of input is reached.