ANY vs. WHILE... and NOT END

The distinction between ANY and WHILE in R3-Alpha (and Red) is subtle: there is an implicit NOT END built into ANY.

r3-alpha> parse "aa" [any [opt "a"]]
== #[true]

r3-alpha> parse "aa" [while [opt "a"]]
; infinite loop

Note here that the OPT is critical to the distinction. If the rule would fail at the end, then you don't see a difference. It's not "any number of a" that's running infinitely since the As are limited, it's "any number of optional a" which is unlimited. For comparison:

r3-alpha> parse "aa" [any ["a"]]
== #[true]

r3-alpha> parse "aa" [while ["a"]]
== #[true]
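To make the difference concrete outside of any particular PARSE implementation, here is a toy model in Python. It is only a sketch of the behavior as described above, not how R3-Alpha or Red implement it: the names `lit_a`, `opt_a`, `loop_any`, and `loop_while` are made up for illustration, and an iteration cap stands in for the infinite loop.

```python
# Toy model, not R3-Alpha's implementation. A "rule" takes (text, pos) and
# returns the new position on success, or None on failure.

def lit_a(text, pos):
    """Match a literal "a"; fail if it is not there."""
    return pos + 1 if text.startswith("a", pos) else None

def opt_a(text, pos):
    """Match an optional "a"; succeeds without advancing if it is absent."""
    new_pos = lit_a(text, pos)
    return pos if new_pos is None else new_pos

def loop_while(rule, text, pos=0, cap=1000):
    """Repeat `rule` until it fails. Returns (pos, iterations, hit_cap)."""
    count = 0
    while count < cap:
        new_pos = rule(text, pos)
        if new_pos is None:
            return pos, count, False
        pos = new_pos
        count += 1
    return pos, count, True  # cap reached: stands in for "infinite loop"

def loop_any(rule, text, pos=0, cap=1000):
    """Like loop_while, but with an implicit NOT END check before each pass."""
    count = 0
    while count < cap:
        if pos >= len(text):  # the implicit NOT END
            return pos, count, False
        new_pos = rule(text, pos)
        if new_pos is None:
            return pos, count, False
        pos = new_pos
        count += 1
    return pos, count, True

print(loop_any(opt_a, "aa"))    # (2, 2, False)
print(loop_while(opt_a, "aa"))  # (2, 1000, True)
print(loop_any(lit_a, "aa"))    # (2, 2, False)
print(loop_while(lit_a, "aa"))  # (2, 2, False)
```

Only the `opt_a` case separates the two loops: an optional rule keeps succeeding at the end of input, so nothing but the NOT END check (or the cap, in this model) stops it.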

I tentatively removed the distinction in UPARSE, saying that ANY and SOME will keep running so long as the rule does not fail. The reasoning is that if you are iterating and want to stop at the end, you can do so by writing any [not end, ...].

One might wonder why you would want a parse rule that kept going even at the end. For one thing, so long as rules can modify the content, the end might not stay at the end...you could imagine a rule repeating and inserting new material.

UPARSE in particular is something that can pull data from multiple sources...the INPUT is just one of those sources.

>> number: func [<static> counter (4)] [if counter > 0 [counter: counter - 1]]

>> uparse "abc" [return collect [any [opt keep skip, keep @(number)]]]
== [#a 3 #b 2 #c 1 0]

Here we see it kept on collecting so long as the NUMBER generator would give back a non-null result. So it had an additional collect step even when the input provided no source data.

Note: I wanted a way of saying to keep going so long as one or the other, or both, were true...but keep them in order. I don't see a good way to do that. Here is a clumsy way:

any [
    (keepgoing: false)
    opt [keep skip, (keepgoing: true)]
    opt [keep @(number), (keepgoing: true)]
    keepgoing
]

I feel like it should be easier. :-/ It's too bad SOME is taken, because some [keep skip, keep @(number)] could arguably mean this. In DO we could use an analogue to this as well, as a non-short-circuit version of ANY.
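For what it's worth, the control flow I'm after in the clumsy version is easy to state in an ordinary language. Here is a minimal Python sketch of it, assuming a counter-style generator like NUMBER above; `make_number` and `collect_while_either` are hypothetical names for illustration, not anything in UPARSE.

```python
def make_number(start=4):
    """Stand-in for NUMBER: counts down, returning None once exhausted."""
    counter = start

    def number():
        nonlocal counter
        if counter > 0:
            counter -= 1
            return counter
        return None

    return number


def collect_while_either(text, number):
    """Keep going while the input, the generator, or both still produce."""
    out = []
    chars = iter(text)
    while True:
        keepgoing = False
        ch = next(chars, None)      # like: opt [keep skip, (keepgoing: true)]
        if ch is not None:
            out.append(ch)
            keepgoing = True
        n = number()                # like: opt [keep @(number), (keepgoing: true)]
        if n is not None:
            out.append(n)
            keepgoing = True
        if not keepgoing:           # neither source produced: stop
            break
    return out


print(collect_while_either("abc", make_number(4)))  # ['a', 3, 'b', 2, 'c', 1, 0]
print(collect_while_either("abc", make_number(2)))  # ['a', 1, 'b', 0, 'c']
```

The second call is the case the plain `any [opt keep skip, keep @(number)]` rule doesn't cover: the generator runs dry before the input does, and collection still continues in order.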

In any case, the point is that the END-disregarding WHILE exists in R3-Alpha/Red parse for a reason of generality. You could be in a circumstance where a rule is at the end yet does not fail, and you want to keep iterating regardless.

Note also that there's no SOME-like form of WHILE: one that mandates at least one match, but that END does not stop.

All things considered, is it worth it to have the nuance of END behavior...and a separate construct, or should you be expected to put not end in your rule if you need end to terminate a set of conditions that wouldn't otherwise fail on it anyway?

So the nuance turns out to be a little bit harder to deal with when END is one of the conditions you want to act upon:

some [
    ...
  | ...
  | end (print "I'm doing the END stuff")
]

I ran across this style in rebolek's markdown processor. The SOME uses the "no progress" rule to determine when to stop...so even if there's an END at the start it's willing to run one time through.

If all you have are WHILE-style primitives, you can't use not end to terminate it, you have to use end break.

some [
    ...
  | ...
  | end (print "I'm doing the END stuff") break
]

Note: Something about BREAK, ACCEPT, and REJECT has historically seemed confusing to me. Red only offers BREAK and REJECT...where BREAK returns success from the repeating rule and REJECT returns failure.

I might prefer the two be STOP (quit iterating but don't consider anything bad to have happened, so "accept") and BREAK (disrupt with incomplete status). That might line up better. Not sure.

This remains a subtle question, and I'm still very much on the fence about whether having a more "nuanced" default does more harm than good, just because it makes it less likely you'll get in an infinite loop.

What I think we know:

  • Having a construct be called "WHILE" and then having it keep applying until the rule you give it fails is the easiest looping PARSE rule to comprehend.

  • A rule with the WHILE name and behavior was introduced in R3-Alpha due to a belief in its necessity. Red, though somewhat conservative about accepting the necessity of R3-Alpha-isms, added it as well.

  • The meaning of ANY in DO code (pick the first of these expressions that "matches") corresponds very closely to something that in PARSE would be the alternate done with vertical bar |. It does not evoke the idea of "looping".

  • If SOME and ANY are deemed to be "must make progress" then only WHILE will have an analogue with ANY. There will be no "loop at least once with no progress requirement".

I feel there's clear value, with a number of concrete advantages, in having WHILE and SOME be the two clearly explainable looping instructions: they keep going until the rule doesn't match or until they break.

What you lose in this bargain is the likes of:

>> parse "aa" [any ["a" (print "found A") | end (print "hit end!")]]
found A
found A
hit end!  ; finished

You would instead get:

>> parse "aa" [while ["a" (print "found A") | end (print "hit end!")]]
found A
found A
hit end!
hit end!
hit end!... ; infinite loop

So you would have to find some way to get such a rule to stop. Historically:

>> parse "aa" [while ["a" (print "found A") | end (print "hit end!") break]]
found A
found A
hit end!
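The same three transcripts can be mimicked with a toy Python loop, which may help show why the END branch needs an explicit exit once the loop itself no longer checks for the end. This is a sketch under assumptions: `run_loop` is a made-up name, the `cap` argument stands in for the infinite loop, and the `use_break` flag plays the role of BREAK.

```python
def run_loop(text, use_break, cap=5):
    """Model of: while ["a" (...) | end (...) <break?>], capped at `cap` passes."""
    log = []
    pos = 0
    for _ in range(cap):
        if text.startswith("a", pos):   # first alternate: "a"
            log.append("found A")
            pos += 1
        elif pos >= len(text):          # second alternate: end
            log.append("hit end!")
            if use_break:               # explicit BREAK leaves the loop
                break
        else:
            break                       # no alternate matched: rule fails, loop stops
    return log


print(run_loop("aa", use_break=False))
# ['found A', 'found A', 'hit end!', 'hit end!', 'hit end!']  (cut off by cap)

print(run_loop("aa", use_break=True))
# ['found A', 'found A', 'hit end!']
```

Without the break, the END alternate succeeds on every pass without consuming anything, so only the cap ends the run.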