UPARSE Value-Bearing Control vs. Rule Success/Failure Control

hostilefork · April 13, 2021, 6:56pm

When @(...) was value-bearing, and (...) was not, I had the interesting idea that @(...) would consider the rule to have failed if it produced a NULL.

Because the only reason it could fail was producing NULL, you could hygienically turn it into a rule that succeeded on null just by saying opt @(...)

Cool as that is, I didn't feel comfortable transferring it over to plain GROUP! when it took over the role of "bearing a value" in a rule. What if a group incidentally returns NULL, like this?

>> verbosity: 1

>> did uparse "aaa" [some "a" (if verbosity > 2 [print "found an A"])]
== #[false]

Oops... you had an IF whose condition was false, so it returned NULL. But then that triggered failure for the (...) rule. That seems unfriendly. While I'm okay with saying that the BLOCK! rule's evaluative result is NULL because the group made a null, it doesn't seem the rule should fail to match.
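
To make the two behaviors concrete, here is a small Python model (not Ren-C, and not how UPARSE is implemented). It assumes a rule is a function from (text, pos) to either a FAIL sentinel or a (value, new_pos) pair, with Python's None standing in for NULL. The names group_strict and group_lenient are made up for this sketch.

```python
FAIL = object()  # sentinel for "rule did not match"; None models NULL

def lit(s):
    """Match the literal string s; its value is s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else FAIL
    return rule

def some(inner):
    """Match inner one or more times; bear the value of the last match."""
    def rule(text, pos):
        result = inner(text, pos)
        if result is FAIL:
            return FAIL
        while True:
            nxt = inner(text, result[1])
            if nxt is FAIL:
                return result
            result = nxt
    return rule

def seq(*rules):
    """Match rules in order; bear the value of the last one."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is FAIL:
                return FAIL
            value, pos = result
        return (value, pos)
    return rule

def group_strict(fn):
    """Old @(...) idea: a None result makes the rule fail."""
    def rule(text, pos):
        value = fn()
        return FAIL if value is None else (value, pos)
    return rule

def group_lenient(fn):
    """Group that always matches, even when it bears None."""
    def rule(text, pos):
        return (fn(), pos)
    return rule

def parse(text, rule):
    """True only if the rule matches and consumes the whole input."""
    result = rule(text, 0)
    return result is not FAIL and result[1] == len(text)

verbosity = 1

def maybe_log():
    # Mirrors `if verbosity > 2 [...]`: yields None when the test is false.
    if verbosity > 2:
        print("found an A")
        return "logged"
    return None

strict_ok = parse("aaa", seq(some(lit("a")), group_strict(maybe_log)))
lenient_ok = parse("aaa", seq(some(lit("a")), group_lenient(maybe_log)))
print(strict_ok, lenient_ok)  # False True
```

The strict group turns an incidental None into a failed parse, which is the unfriendly case described above; the lenient group lets the parse succeed.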

But I feel like something got lost in the shuffle here. Like what I'd managed with generators:

gen: func [<static> n (0)] [
    if n < 3 [return n: n + 1]
    return null
]


>> uparse "a" ["a", data: some @(gen)]  ; old @(...) semantics (non-matching)
== "a"

>> data
== [1 2 3]
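
As a rough Python model of that generator pattern (again not Ren-C): the group fails on None, and the loop stops at the first failure. Collecting the values into a list is an assumption made only to mirror the [1 2 3] shown above; make_gen, group_strict and some_collect are hypothetical names.

```python
FAIL = object()  # sentinel for "rule did not match"; None models NULL

def make_gen(limit=3):
    """Counterpart of GEN: returns 1, 2, ..., limit and then None."""
    n = 0
    def gen():
        nonlocal n
        if n < limit:
            n += 1
            return n
        return None
    return gen

def group_strict(fn):
    """Old @(...) idea: a None result makes the rule fail."""
    def rule(text, pos):
        value = fn()
        return FAIL if value is None else (value, pos)
    return rule

def some_collect(inner):
    """One or more matches of inner; bears the list of their values."""
    def rule(text, pos):
        values = []
        while True:
            result = inner(text, pos)
            if result is FAIL:
                break
            value, pos = result
            values.append(value)
        return (values, pos) if values else FAIL
    return rule

# Start at position 1, as if the leading "a" had already been matched.
data, end = some_collect(group_strict(make_gen()))("a", 1)
print(data)  # [1, 2, 3]
```

The group never advances the input here, so the failure on None is the only thing that ends the loop.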

How could you make something like a GROUP! that would fail if it were NULL, but return its value otherwise?

This also highlights a design choice: constructs like WHILE react to whether a rule succeeds or fails, not to the value the rule produces. This made me wonder if we could do it the other way...if all succeeding rules returned non-null, and failing rules returned null...then WHILE/SOME could be driven by the result and not by whether the rule matched. Then you could have a rule that matched and produced NULL, but stopped a WHILE or SOME anyway.

But there's a problem with that if any interesting matching rules return NULL, e.g. OPT:

>> did uparse "aaa" [while ["a" opt "b"]]
== #[false]

The problem is that if OPT "b" succeeds and returns NULL, and WHILE looks at that NULL instead of the success, the rule is run just once. So OPT would have to succeed and bear a value other than NULL (perhaps the elusive NULL-2...that seems like it might work).

That solves things like that OPT, but still throws a wrench in:

>> did uparse "aaa" [while ["a" opt "b" (if debug [print "Still have this."])]]
== #[false]
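
The difference between the two ways of driving the loop can be sketched in Python (a model only, not Ren-C). The stop_on_none flag is a made-up switch for this sketch: when set, the loop also stops once a successful match bears None.

```python
FAIL = object()  # sentinel for "rule did not match"; None models NULL

def lit(s):
    """Match the literal string s; its value is s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else FAIL
    return rule

def opt(inner):
    """Always succeeds; bears None when inner does not match."""
    def rule(text, pos):
        result = inner(text, pos)
        return (None, pos) if result is FAIL else result
    return rule

def seq(*rules):
    """Match rules in order; bear the value of the last one."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is FAIL:
                return FAIL
            value, pos = result
        return (value, pos)
    return rule

def while_loop(inner, stop_on_none):
    """Zero or more matches of inner; always succeeds."""
    def rule(text, pos):
        value = None
        while True:
            result = inner(text, pos)
            if result is FAIL:
                return (value, pos)
            value, pos = result
            if stop_on_none and value is None:
                return (value, pos)  # value-driven loop gives up here
    return rule

body = seq(lit("a"), opt(lit("b")))
_, end_by_value = while_loop(body, stop_on_none=True)("aaa", 0)
_, end_by_match = while_loop(body, stop_on_none=False)("aaa", 0)
print(end_by_value, end_by_match)  # 1 3
```

The value-driven loop stops after the first "a" because the OPT bore None, leaving input unconsumed; the match-driven loop reaches the end of "aaa".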

That detour examined...we can conclude that WHILE and SOME probably should not be looking at the "value-bearing" result. They should only look at whether the rule matched or not. Whether it returned NULL is relevant only to what the loop returns as its own value, if it bears the value of its last matching run.

>> did uparse "aaa" [x: while ["a" opt "b"]]
== #[true]

>> x
; null - because `opt "b"` was null so ["a" opt "b"] as a whole was null

>> did uparse "aaa" [x: while ["a" elide opt "b"]]
== #[true]

>> x
== "a"  ; because you ELIDE'd the opt

>> did uparse "aaa" [x: while ["a" opt "b" (1020)]]
== #[true]

>> x
== 1020  ; arbitrary value from GROUP!
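
Those three results can be reproduced with a small Python model (not Ren-C). It assumes a loop that is driven only by match/fail and bears the value of its last matching run, and it models ELIDE with a made-up INVISIBLE marker that a sequence skips when picking its value.

```python
FAIL = object()       # sentinel for "rule did not match"; None models NULL
INVISIBLE = object()  # marks a match whose value a sequence should skip

def lit(s):
    """Match the literal string s; its value is s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else FAIL
    return rule

def opt(inner):
    """Always succeeds; bears None when inner does not match."""
    def rule(text, pos):
        result = inner(text, pos)
        return (None, pos) if result is FAIL else result
    return rule

def elide(inner):
    """Match inner but hide its value."""
    def rule(text, pos):
        result = inner(text, pos)
        return FAIL if result is FAIL else (INVISIBLE, result[1])
    return rule

def group(fn):
    """Always matches; bears whatever fn returns."""
    def rule(text, pos):
        return (fn(), pos)
    return rule

def seq(*rules):
    """Match rules in order; bear the last visible value."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is FAIL:
                return FAIL
            if result[0] is not INVISIBLE:
                value = result[0]
            pos = result[1]
        return (value, pos)
    return rule

def while_loop(inner):
    """Zero or more matches; bears the value of the last matching run."""
    def rule(text, pos):
        value = None
        while True:
            result = inner(text, pos)
            if result is FAIL:
                return (value, pos)
            value, pos = result
    return rule

x_plain, _ = while_loop(seq(lit("a"), opt(lit("b"))))("aaa", 0)
x_elided, _ = while_loop(seq(lit("a"), elide(opt(lit("b")))))("aaa", 0)
x_group, _ = while_loop(seq(lit("a"), opt(lit("b")), group(lambda: 1020)))("aaa", 0)
print(x_plain, x_elided, x_group)  # None a 1020
```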

But still, I feel there should be some way to branch or react on the value-bearing bit. Maybe it's a refinement to WHILE, like WHILE/VALUE? Maybe it's a new operator, as in while [non-null (gen)]? Or an arity-2 operator, as in while [non (null) (gen)]?
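
For what it's worth, the operator version is easy to picture in a Python model (not Ren-C; non_null is just the name floated above, not an existing combinator). The wrapper turns "matched but bore None" into a failure, so a plain match-driven loop stops without the group itself needing to fail on None.

```python
FAIL = object()  # sentinel for "rule did not match"; None models NULL

def group(fn):
    """Always matches; bears whatever fn returns, even None."""
    def rule(text, pos):
        return (fn(), pos)
    return rule

def non_null(inner):
    """Fail when inner fails or bears None; else pass its result through."""
    def rule(text, pos):
        result = inner(text, pos)
        if result is FAIL or result[0] is None:
            return FAIL
        return result
    return rule

def while_loop(inner):
    """Zero or more matches; bears the value of the last matching run."""
    def rule(text, pos):
        value = None
        while True:
            result = inner(text, pos)
            if result is FAIL:
                return (value, pos)
            value, pos = result
    return rule

def make_gen(limit=3):
    """Counterpart of GEN: returns 1, 2, ..., limit and then None."""
    n = 0
    def gen():
        nonlocal n
        if n < limit:
            n += 1
            return n
        return None
    return gen

# Opting in: the loop ends when the generator runs dry.
last_value, end = while_loop(non_null(group(make_gen())))("", 0)
print(last_value)  # 3

# Not opting in: an incidental None still counts as a match.
incidental_matches = group(lambda: None)("", 0) is not FAIL
print(incidental_matches)  # True
```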

I'm not sure. Just wanted to document my feeling something was lost when @(...) failing on NULL was folded into a (...) that doesn't do that.

Maybe the answer is that we just get more careful with (...) and understand that if it evaluates to NULL the rule fails, so you throw in some non-null thing at the end of your (...):

uparse "aaa" [some "a" (if verbosity > 2 [print "found an A"], true)]

But that seems to bias it the wrong way.

Maybe if you want the optionality you say /(rule) and that's a signal to fail if it's NULL?

Or an operator like must (rule)?

Not sure. But still, I think matters are heading in a positive direction for the flexibility of PARSE.