Could PARSE Have a "MAYBE" Combinator?

hostilefork · April 7, 2023, 12:14am

By design, nulls are handled noisily--right at the moment of fetching the word!--in UPARSE (and PARSE3):

>> prefix: null, suffix: ")"

>> parse "aaa)" [prefix, some "a", suffix]
** Error: (prefix is null, and we raise errors for that in parse)

If we didn't raise an error it seems there are only two other options:

1. Make null always succeed, keeping the parse position where it is (synonym for [])
2. Make null always be an unsuccessful combinator match (a plain mismatch), but not raise an error (synonym for BYPASS)

I think (1) feels like a pretty obvious bad idea, because null is supposed to represent a soft failure. Succeeding without advancing is already the behavior for void, e.g. parse "ab" ["a" void "b"] would work.

I'm not too pleased with the idea of (2), and prefer the error as the default.
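
To make the three candidate behaviors concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how UPARSE is implemented: the names match_rule, match_sequence, and null_policy are hypothetical, None stands in for a null rule, and the literal "aaa" stands in for some "a".

```python
# Toy model only: not how UPARSE is implemented. A "rule" here is either
# a literal string or None (standing in for a null rule variable).

class NullRuleError(Exception):
    """Raised when a null rule is treated as a hard error."""


def match_rule(text, pos, rule, null_policy="error"):
    """Return the new position on a match, or None on a mismatch."""
    if rule is None:
        if null_policy == "error":       # the default: complain loudly
            raise NullRuleError("rule is null")
        if null_policy == "succeed":     # succeed without advancing, like []
            return pos
        if null_policy == "mismatch":    # quiet non-match, no error raised
            return None
        raise ValueError(f"unknown policy: {null_policy}")
    return pos + len(rule) if text.startswith(rule, pos) else None


def match_sequence(text, rules, null_policy="error"):
    """Match rules in order from position 0; True if all input is consumed."""
    pos = 0
    for rule in rules:
        pos = match_rule(text, pos, rule, null_policy)
        if pos is None:
            return False
    return pos == len(text)


rules = [None, "aaa", ")"]   # None plays the role of the null prefix
print(match_sequence("aaa)", rules, "succeed"))    # True
print(match_sequence("aaa)", rules, "mismatch"))   # False
```

With the "error" policy the same call raises NullRuleError, which mirrors the noisy default described above.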

...that said... it seems there should be some operators or combinators that let you get the other behaviors.

What About a "MAYBE" Combinator To Use With Null?

In standard code, the policy of "void-in-null-out" has worked well, with MAYBE transforming soft-failure nulls to voids:

 ; non-PARSE handling of NULL via MAYBE

 >> append [a b c] null
 ** Error: cannot append ~null~ isotope to a block

 >> append [a b c] maybe null
 == [a b c]

 >> block: null

 >> append maybe block [d e]
 == ~null~  ; anti

So if we imagine applying this to the parse example, it would presumably do this:

>> prefix: null, suffix: ")"

>> parse "aaa)" [maybe prefix, some "a", maybe suffix]
== ")"

For the above parse to succeed, the combinator made by maybe prefix would have to succeed and not advance the input.

But It Doesn't Combine Well In Larger Rules

What if what you intended was "if there's a prefix, match some non-zero number of instances, but if prefix is null then don't worry about matching it"?

You might try doing that by COMPOSE'ing your rules. But UPARSE actually lets us write that out literally using GET-GROUP! rule synthesis:

>> parse "aaa)))" [:(if prefix '[some prefix]), some "a", :(if suffix '[some suffix])]
== ")"

But what if we tried to do that with MAYBE...could it work?

>> parse "aaa)))" [some maybe prefix, some "a", some maybe suffix]
; infinite loop!

No dice. We've said maybe prefix just succeeds and doesn't advance the input when prefix was null. But if you combine that with some the null case will just match nothing in perpetuity, causing an infinite loop.

This may look familiar, because if you write some opt [...anything...] you'll always get an infinite loop. But in that case it's just wrong thinking: you know that the repetitive nature of some looking for an eventual non-match meant you must have intended some [...anything...] (at least one) or opt some [...anything...] (zero or more).
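
The looping behavior can be illustrated with a toy Python model of SOME. This is only a sketch, not the actual combinator code: a rule is assumed to be a function from (text, pos) to a new position or None, and literal, no_op, and some are hypothetical names. Because these toy rules are pure functions, a match that doesn't advance would repeat forever, so the sketch raises an error instead of hanging.

```python
# Toy model only: a rule is a function (text, pos) -> new position, or
# None on a mismatch. This is not UPARSE's actual combinator interface.

def literal(s):
    """Rule that matches the string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def no_op(text, pos):
    """Rule that always succeeds without advancing the input."""
    return pos


def some(rule):
    """One or more matches of rule, ending at the first mismatch."""
    def run(text, pos):
        pos = rule(text, pos)
        if pos is None:
            return None                  # at least one match is required
        while True:
            nxt = rule(text, pos)
            if nxt is None:
                return pos               # the eventual mismatch ends the loop
            if nxt == pos:
                # A pure rule that matched here without advancing will do
                # so again forever; raise instead of hanging the demo.
                raise RuntimeError("SOME would loop forever")
            pos = nxt
    return run


print(some(literal("a"))("aaa)", 0))     # 3
try:
    some(no_op)("aaa)", 0)
except RuntimeError as err:
    print(err)                           # SOME would loop forever
```

The point of the model is that SOME only terminates on a mismatch, and a rule that succeeds without consuming input never produces one.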

NOTE THAT HISTORICAL PARSE HAS NO GOOD ANSWER FOR THIS

Rebol2 treats NONE! as a no-op which just succeeds but doesn't advance the input. So the following gives you an infinite loop:

rebol2>> prefix: none suffix: ")"

rebol2>> parse "aaa)))" [some prefix some "a" some suffix]
; infinite loop

The hackish "must make progress" rules in R3-Alpha actually make the above "work as intended", because the SOME will bail out after one non-advancing match. I don't consider that a "good" answer--more a random effect.
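
A rough Python model of the progress rule as described above might look like the following. It is a sketch based only on that description, not on R3-Alpha's source: the names literal, no_op, seq, and some_with_progress are hypothetical, and a rule is assumed to be a function from (text, pos) to a new position or None.

```python
# Toy model of a "must make progress" SOME. Not R3-Alpha's real code.

def literal(s):
    """Rule that matches the string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def no_op(text, pos):
    """Rule that succeeds without advancing (like Rebol2's NONE! rule)."""
    return pos


def seq(*rules):
    """Run rules one after another; fail if any of them fails."""
    def run(text, pos):
        for rule in rules:
            pos = rule(text, pos)
            if pos is None:
                return None
        return pos
    return run


def some_with_progress(rule):
    """One or more matches, but stop after a match that doesn't advance."""
    def run(text, pos):
        count = 0
        while True:
            nxt = rule(text, pos)
            if nxt is None:
                break                    # ordinary mismatch ends the loop
            count += 1
            if nxt == pos:
                break                    # matched, but made no progress
            pos = nxt
        return pos if count > 0 else None
    return run


rule = seq(
    some_with_progress(no_op),           # stands in for: some prefix
    some_with_progress(literal("a")),    # some "a"
    some_with_progress(literal(")")),    # some suffix
)
print(rule("aaa)))", 0))                 # 6, i.e. the whole input matched
```

In this model the null-like prefix counts as one successful match and then the loop exits, which is why the overall parse happens to succeed.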

Another Problem: MAYBE is a very similar word to OPTIONAL

Imagine looking at this code:

>> prefix: "(", suffix: ")"

>> parse "aaa)" [maybe prefix, some "a", maybe suffix]
== ~null~  ; anti

"But wait"... I can imagine someone saying... "shouldn't MAYBE mean that if it's not there, you skip the rule"?

No... MAYBE is speaking about the optionality (nullability) of the rule itself, not the optionality of the (non-null) rule succeeding. That's a fairly fine point of distinction that might not be obvious to people.

So perhaps it should go by another name in parsing. There's the shorthand of ?, and that could be learnable as "this rule may be NULL, and if so then just ignore it and keep going".

An extra barrier to creating MAYBE is mechanical

It's a bit of a trick, because when you "combinate" a ~null~, it has to fail abruptly. If it returned a definitional error instead, that would look to all the other constructs like a combinator that simply didn't match, and they wouldn't promote it to a hard failure. They'd just treat it like anything else that didn't match.

So the only way I can see a null-disabling MAYBE parse combinator working would be by quoting its argument, doing the rule fetch itself, and turning into a failing combinator if it fetched null. This breaks the model somewhat.

Anyway, we've lived without the MAYBE combinator, and there are workarounds (as I mentioned, conditional code inside a splicing GROUP! construct). Perhaps it isn't necessary. But I wanted to write it up.

hostilefork · June 23, 2023, 1:00am

A Quirky MAYBE Combinator Is Probably Bad News

Not everything in the evaluator universe is going to have a PARSE parallel. If you have a null rule, I guess you may just have to use a GET-GROUP! and call the evaluator's MAYBE.

>> c-rule: null

>> parse [a b] ['a 'b :(maybe c-rule)]
== b

This will keep you from erroring on the null by turning the null into a void.

UPARSE has richer mechanisms to help with higher-order rules, which let you express the R3-Alpha progress rule more intentionally. One you could use is FURTHER:

>> prefix: null, suffix: ")"

>> parse "aaa)))" [
        opt some further :(maybe prefix)
        some "a"
        opt some further :(maybe suffix)
     ]
== ")"
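
To see why this combination terminates, here is a toy Python model of the opt some further pattern. It is a sketch under two assumptions: that FURTHER only succeeds when its rule advances the input, and that a void rule succeeds without advancing. The function names and the (text, pos) rule interface are hypothetical and are not UPARSE's real combinator API.

```python
# Toy model only: a rule is a function (text, pos) -> new position, or
# None on a mismatch.

def literal(s):
    """Rule that matches the string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def no_op(text, pos):
    """Stands in for a void rule: succeeds without advancing."""
    return pos


def further(rule):
    """Succeed only if rule matched AND moved the position forward."""
    def run(text, pos):
        nxt = rule(text, pos)
        return nxt if nxt is not None and nxt > pos else None
    return run


def some(rule):
    """One or more matches of rule, ending at the first mismatch."""
    def run(text, pos):
        pos = rule(text, pos)
        if pos is None:
            return None
        while True:
            nxt = rule(text, pos)
            if nxt is None:
                return pos
            pos = nxt
    return run


def opt(rule):
    """If rule fails, succeed anyway and leave the position unchanged."""
    def run(text, pos):
        nxt = rule(text, pos)
        return pos if nxt is None else nxt
    return run


def seq(*rules):
    """Run rules one after another; fail if any of them fails."""
    def run(text, pos):
        for rule in rules:
            pos = rule(text, pos)
            if pos is None:
                return None
        return pos
    return run


rule = seq(
    opt(some(further(no_op))),           # opt some further :(maybe prefix)
    some(literal("a")),                  # some "a"
    opt(some(further(literal(")")))),    # opt some further :(maybe suffix)
)
print(rule("aaa)))", 0))                 # 6, i.e. the whole input matched
```

Since further turns a non-advancing match into a mismatch, some always hits a stopping point, and opt then absorbs the case where nothing matched at all.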

If you don't want PREFIX to be "ornery" when it's used in PARSE, then initialize it to void instead of null and this cleans up a bit:

>> prefix: void, suffix: ")"

>> parse "aaa)))" [
        opt some further prefix
        some "a"
        opt some further suffix
     ]
== ")"

There are a lot of tools at one's disposal, and I don't think we need anything crazier than this. I'm content enough with it, I think!