When something doesn't sit right in my head, I notice. Like how I could never remember in the beginning what GET-WORD! or SET-WORD! in PARSE did. (e.g. is a GET-WORD! GET-ing the parse position to SET a variable, or GET-ting a variable's value to use to SET the parse position?)
And I never really understood ACCEPT, REJECT, BREAK, and FAIL.
I Currently Consider the FAIL Confusion "Solved"
Things have settled nicely, in that LOGIC! is used to get a pure "keep going" or "stop". So FAIL is replaced simply by FALSE. It means IF (expression-returning-logic) is written in UPARSE as :(expression-returning-logic), and FAIL can keep the meaning of "raise error".
This generalization has the pleasing property that we don't need to go introducing "parse switch" or "parse case" or any such things. Since NULL means the same thing as #[true] for a GET-GROUP! splice, you have every option at your disposal.
BREAK and REJECT Seem Too Similar
The problem I have is that BREAK sounds a lot like "this didn't work". In fact, I've enforced that loops return NULL if-and-only-if you BREAK them:
>> repeat 3 [break]
; null
>> repeat 3 [null]
== ~null~ ; isotope
A NULL is used as a signal of "soft failure", e.g. it causes ELSE to run, while a ~null~ isotope does not:
>> repeat 3 [break] else [print "soft failure"]
soft failure
>> repeat 3 [null] else [print "soft failure"]
== ~null~ ; isotope
So the distinction between BREAK and REJECT seems a thin one. I feel like I'd rather that BREAK meant you decided the iterated rule isn't working out...and some other signal indicated that you want to accept it and go on.
But ACCEPT doesn't really hint at ceasing iteration. Perhaps STOP? As a word, it hints more at the ceasing of an iteration...and that's used in CYCLE.
>> cycle [stop]
== ~void~ ; isotope
Unlike BREAK (which always returns NULL) it is able to return a non-NULL...and a NULL will be isotopified so it won't be seen as a "soft failure" by ELSE:
>> cycle [stop 10]
== 10
>> cycle [stop null]
== ~null~ ; isotope
Similarly, if you're going to say that an iterative construct in PARSE should stop and let the parse keep going, then you should have an opportunity to say what the value synthesized from that rule will be. This requires "endable" rules (because we want a plain STOP with no argument to work). I think that's doable.
So I guess I'm saying I'd prefer BREAK to mean the rule failed (return NULL), and STOP to mean the rule succeeded. STOP would default to returning a void isotope if no argument is given, but allow an argument. The argument would be a rule, so you could actually make the STOP a rule.
>> uparse "aaab" [while ["a" (print "A") | stop ["b" (1020)]]]
A
A
A
== 1020
What About CONTINUE?
If your loop is only one deep in alternates, then all an alternate needs to do to continue is succeed:
>> uparse "aaab" [while ["a" comment "continue" | "b" comment "continue"]]
But if you're deeper than that, it is trickier. And I don't see any particular reason why you shouldn't be able to ask a rule to CONTINUE a loop.
>> uparse "abbbaccc" [while [
"a" [some "bbb" (print "BBB"), continue | some "ccc" (print "CCC")]
(print "like this!")
]]
BBB
CCC
like this!
And CONTINUE could also take an argument, which would matter only if it was the final iteration:
>> uparse "bba" [repeat (3) ["a" continue (<like this>) | "b"]]
== <like this>
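To make the proposal a bit more concrete, here is a rough model in Python rather than Ren-C. It is only a sketch of the proposed semantics, not how UPARSE combinators are implemented. Every name in it (lit, action, seq, alt, signal, brk, while_, VOID) is made up for illustration, and it assumes that a STOP or CONTINUE whose argument rule doesn't match simply fails that alternate. Loop signals are modeled as exceptions that unwind to the nearest enclosing loop rule.

```python
VOID = "~void~"  # stand-in for the void isotope


class Break(Exception):
    """Proposed BREAK: the iterated rule failed."""


class Carry(Exception):
    """Base for signals that carry a synthesized value and a position."""

    def __init__(self, value, pos):
        super().__init__()
        self.value, self.pos = value, pos


class Stop(Carry):
    """Proposed STOP: the iterated rule succeeded."""


class Continue(Carry):
    """Proposed CONTINUE: go straight to the next iteration."""


# A rule is a function (text, pos) -> (value, new_pos), or None on mismatch.

def lit(s):
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule


def action(fn):
    """Like a GROUP! in a rule: run code, consume no input."""
    return lambda text, pos: (fn(), pos)


def seq(*rules):
    def rule(text, pos):
        value = VOID
        for r in rules:
            match = r(text, pos)
            if match is None:
                return None
            value, pos = match
        return (value, pos)
    return rule


def alt(*rules):
    def rule(text, pos):
        for r in rules:
            match = r(text, pos)
            if match is not None:
                return match
        return None
    return rule


def signal(kind, arg=None):
    """Build an "endable" STOP or CONTINUE rule; `arg` is an optional rule."""
    def rule(text, pos):
        match = (VOID, pos) if arg is None else arg(text, pos)
        if match is None:
            return None  # assumption: a mismatching argument fails the alternate
        raise kind(*match)
    return rule


def brk(text, pos):
    """Proposed BREAK as a rule: abandon the enclosing loop as failed."""
    raise Break()


def while_(body):
    """Iterate `body` until it mismatches (no guard against zero-width matches)."""
    def rule(text, pos):
        value = VOID
        while True:
            try:
                match = body(text, pos)
            except Break:
                return None                    # loop rule fails
            except Stop as s:
                return (s.value, s.pos)        # loop rule succeeds with STOP's value
            except Continue as c:
                match = (c.value, c.pos)       # rest of this iteration is skipped
            if match is None:
                return (value, pos)
            value, pos = match
    return rule


def demo_stop():
    """Models: uparse "aaab" [while ["a" (print "A") | stop ["b" (1020)]]]"""
    log = []
    rule = while_(alt(
        seq(lit("a"), action(lambda: log.append("A"))),
        signal(Stop, seq(lit("b"), action(lambda: 1020))),
    ))
    return rule("aaab", 0), log
```

In this model the STOP value becomes the loop's result, a BREAK makes the loop rule itself a mismatch, and a CONTINUE only matters for the result if it happens on the final iteration.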
Would That Be an Improvement?
I think CONTINUE is pretty obviously useful.
One thing that's a bit weird about what I suggest is that when a BREAK happens in a non-PARSE loop, the code after the loop still runs.
But the idea that "failure" stops progression is a cross-cutting design aspect in PARSE. It seems consistent to me.
Yet another issue is that STOP is not currently offered by plain WHILE or REPEAT or FOR-EACH or other loops. The reason is that if you try to write your own iterator in terms of other iterators, you cannot tell from the outside whether a "cease iterating" intention happened.
Consider this:
>> opaque-code: [print "looping", 1000 + 20]
>> repeat 2 (opaque-code) then [repeat 2 (opaque-code)]
looping
looping
looping
looping
== 1020
That's nice because if the opaque-code has a break, the whole thing will break:
>> opaque-code: [print "entering", break]
>> repeat 2 (opaque-code) then [repeat 2 (opaque-code)]
entering
; null
But if you permit STOP to return a value, the stopping intent is lost:
>> opaque-code: [print "entering", stop 1020]
>> repeat 2 (opaque-code) then [repeat 2 (opaque-code)]
entering
entering
== 1020
When you're trying to write compound looping expressions that are built up of smaller loops, this really matters. CYCLE is an oddball because you know the only way it ever terminates with a value is if there was a stopping intent...which is why it allows STOP.
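The same point can be modeled outside of Ren-C. What follows is a small Python sketch, not Ren-C code: None stands in for NULL, the strings "~null~" and "~void~" stand in for the isotopes, and every name in it (repeat, then, twice_twice, the Break and Stop exceptions) is invented for the illustration. It assumes the loop rule described above, that a loop returns NULL if and only if it was broken.

```python
NULL_ISOTOPE = "~null~"  # stand-in for the non-NULL result of a body that returned NULL
VOID = "~void~"          # stand-in for the result of a STOP with no argument


class Break(Exception):
    """Models BREAK: the loop returns None (NULL)."""


class Stop(Exception):
    """Models a hypothetical value-bearing STOP in an ordinary loop."""

    def __init__(self, value=VOID):
        super().__init__()
        self.value = value


def isotopify(value):
    """Turn a None result into the isotope so it is not mistaken for BREAK."""
    return NULL_ISOTOPE if value is None else value


def repeat(count, body):
    """Run body `count` times (count >= 1 assumed); None only on Break."""
    result = None
    for _ in range(count):
        try:
            result = isotopify(body())
        except Break:
            return None
        except Stop as stop:
            return isotopify(stop.value)  # looks just like a normal result
    return result


def then(left, make_right):
    """Models THEN: run the right side only if the left side is not NULL."""
    return make_right() if left is not None else None


def twice_twice(body):
    """The compound loop: repeat 2 (body) then [repeat 2 (body)]."""
    return then(repeat(2, body), lambda: repeat(2, body))


def plain(log):
    def body():
        log.append("looping")
        return 1020
    return body


def breaking(log):
    def body():
        log.append("entering")
        raise Break()
    return body


def stopping(log):
    def body():
        log.append("entering")
        raise Stop(1020)
    return body


def demo(body_maker):
    """Run the compound loop with a fresh log; return (result, log)."""
    log = []
    return twice_twice(body_maker(log)), log
```

With these definitions, demo(breaking) enters the body once and yields None, while demo(stopping) enters it twice: the first loop's 1020 is indistinguishable from a normal result, so the second loop runs anyway.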
Maybe ACCEPT and REJECT Should Be Used and No BREAK?
...but this kind of runs into the same problem that non-PARSE WHILE doesn't have ACCEPT or REJECT. So why get worked up about it having STOP when non-PARSE WHILE doesn't have STOP, if it makes everything line up?
Or maybe non-PARSE WHILE can have STOP...you just understand that STOP has limits when it comes to loop abstraction. Not everything works all the time. So STOP can carry a warning that you can't tell from outside whether the stopping intent happened, unless the loop is CYCLE...
What Do You Think?
Are the needs of PARSE different, or the same? Should BREAK make the overall expression evaluate to NULL but keep going? Are ACCEPT and REJECT the right answer?
It's hard to say. I have to work out the mechanism by which such things could work in usermode combinators whatever you call them...so there's time to think about it.