WHILE [Cold Feet]

I'm really certain that ANY should not be a looping construct in PARSE. Rebol's use of ANY everywhere else means "any one of", not "any number of". That applies to the ANY short-circuit-OR operation, to the ANY-XXX! types, and it can come up in PARSE such as:

parse block [some any-value!]

I like the shorthand for this that works across series types with the TAG! combinator:

uparse block [some <any>]

Here, <any> means any one element. It gets at that English concept in a way that operators like * (or <*>) just don't.

Plus, the "zero-or-more matches of a rule" interpretation doesn't jibe with how we use ANY in English:

  • "Do you have ANY bananas?"
  • "Yes."
  • "Cool. Can I have one, then?"
  • "No, sorry. I don't have ANY."

But I'm Not Happy With Bending WHILE For This

It seemed appealing at first to say that WHILE would be standardized in the language as arity-1, both in PARSE and in ordinary code loops. This would make UNTIL and WHILE line up, and LOOP could take the arity-2 role that WHILE used to have.

But I've been lamenting just how universally WHILE is arity-2 in pretty much every language and that LOOP doesn't really quite cut it while reading. :-/

Sorry for the flux, but I want to move back to while [condition] [body] as it was. However, going through the process has spurred thought...

An Observation: OPT SOME <=> WHILE

It has in the past occurred to me that PARSE's WHILE (or ANY) was really OPT SOME. It's three more characters to say it:

while pattern
opt some pattern

(Note: This is only true in modern Ren-C, as previously the progress requirement differentiated these...that is now broken out into FURTHER.)

...but although it's more characters, "optionally some number of occurrences of the pattern" is pretty literally what you are talking about. In the UPARSE model of synthesized values it's kind of less confusing, because it's clearer what it returns in the case of nothing...the same thing OPT always returns when a rule doesn't match: NULL.
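To make the equivalence concrete, here is a minimal sketch in Python rather than Ren-C. It models a rule as a function from (text, pos) to a (value, new_pos) pair, or None on failure. The names lit, some, opt and while_ are invented for this illustration and are not UPARSE's actual combinator interface; a None in the value slot stands in for NULL.

```python
# Toy model only: these helpers are hypothetical, not Ren-C's implementation.

def lit(s):
    """Match the literal string s at the current position."""
    def parser(text, pos):
        if text.startswith(s, pos):
            return (s, pos + len(s))
        return None
    return parser

def some(p):
    """One or more matches of p; the value is that of the last match.
    No progress check is made, so a p that matches nothing would loop forever."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return parser

def opt(p):
    """Always succeeds; the value is None (standing in for NULL) if p fails."""
    def parser(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return parser

def while_(p):
    """Zero or more matches of p, written as its own primitive."""
    def parser(text, pos):
        value = None
        while True:
            result = p(text, pos)
            if result is None:
                return (value, pos)
            value, pos = result
    return parser

# In this model the two spellings agree on every input tried here.
for text in ["", "aaa", "aab", "b"]:
    assert while_(lit("a"))(text, 0) == opt(some(lit("a")))(text, 0)
```

In this model the "nothing matched" case falls out of OPT alone, which is why the result in that case is just whatever OPT gives back.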

Anyway, I'm feeling remorse and a wish to go back to WHILE for arity-2 loops in the language. But I don't want to go back to ANY in PARSE.

Is OPT SOME really so bad?

I've gotten to wondering if there is a reason we don't have a separate word for "zero or more" in English. You actually have to write out "zero or more" to convey that intent... maybe because the intent is too weird for a single word.

When you just write WHILE it may be that you have a case that's actually supposed to be a SOME but it hasn't really bitten you yet. If you're willing to tolerate between 1 and a million of something, the case of no things being there is distinguished...and calling attention to the fact that the rule you have may not match at all can be an asset.

I actually think OPT SOME offers an advantage, because it encourages you to look at it and decide if the OPT belongs there or not. It may feel kind of like a wart, but maybe it's a helpful wart.

(It reminds me a bit of the UNLESS vs. IF NOT situation. Many people felt UNLESS is actually obfuscating nearly everywhere it's used, and that it's better to break it apart even if that means two words instead of one.)

Trying Out The Change, I Noticed...

I actually did find a difference in how I read the code. "This entire next section may not be relevant... none of it could match and it would go on." That weight of the OPT is felt more heavily when the word is there than the WHILE...which if you frequently expect the thing to be there, you may assume it will always be there for at least one instance.

You also can see redundancy in OPT more clearly. Things like:

opt [
    while [...]
]

Stand out more if they look like:

opt [
    opt some [...]
]

I think some things really do read more clearly. You can look at this as removing 0 or more newlines at the head of a series via a WHILE:

parse series [
    remove [while newline]
    ...
]

Or rephrase that with OPT SOME:

parse series [
    remove [opt some newline]
    ...
]

But I think it reads clearest when you bring the OPT outside, to say you're optionally removing some newlines:

parse series [
    opt remove [some newline]
    ...
]

More Distinct

ANY and WHILE both had the problem that they had analogues in imperative code. SOME has no such analogue, so if it remains a PARSE keyword, it helps you intuit the difference...and the parse rules look more distinct from ordinary code.

Compression Is Possible By Other Avenues

I noticed a particularly laborious substitution in %make-zlib.r, which uses PARSE to extract the headers and code for zlib. Because it is parsing C code, it often looks for the pattern while whitespace, on multiple lines in a row and multiple times per line. When that became opt some whitespace, it got more annoying.

But this is kind of a problem anytime you repeat something over and over. Maybe that pattern should have been ws*: [opt some whitespace] and then it would just be ws* to mean "any number of whitespace characters here, including zero".
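The same idea of naming the repeated rule once can be sketched in Python (not Ren-C). The helpers char_in, lit, some, opt and seq are hypothetical stand-ins written for this example, and ws_star plays the role that ws* would play in the parse rules; the "call" rule is an invented bit of C-like syntax, not something taken from %make-zlib.r.

```python
# Toy model only: every name below is made up for illustration.

def char_in(chars):
    """Match exactly one character drawn from chars."""
    def parser(text, pos):
        if pos < len(text) and text[pos] in chars:
            return (text[pos], pos + 1)
        return None
    return parser

def lit(s):
    """Match the literal string s at the current position."""
    def parser(text, pos):
        if text.startswith(s, pos):
            return (s, pos + len(s))
        return None
    return parser

def some(p):
    """One or more matches of p; the value is that of the last match."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return parser

def opt(p):
    """Always succeeds; the value is None if p fails."""
    def parser(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return parser

def seq(*parsers):
    """Match each parser in order; fail as soon as any one fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return (values, pos)
    return parser

whitespace = char_in(" \t\n")
ws_star = opt(some(whitespace))  # defined once, like ws*: [opt some whitespace]

# The short name keeps a whitespace-heavy rule readable.
call = seq(lit("f"), ws_star, lit("("), ws_star, lit(")"))

assert call("f()", 0)[1] == 3        # zero whitespace is accepted
assert call("f ( )", 0)[1] == 5      # so is whitespace in either gap
assert call("f(\n\t)", 0)[1] == 5
```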

A Motivated Individual Can Overrule It

Remember, UPARSE is going to let you be the judge. If you want your own keywords, you can have them. Maybe you like MANY (some parser combinators seem to think that 0...N is "many" and 1...N is "some"). Maybe you don't care if WHILE is different. Maybe you don't want to use the ANY parse abstraction that I think is more interesting.

I'm Trying It Out

One can argue there's a bit of a 1984-newspeak to it ("you don't need words like better or worse, use plus-good and un-good and double-plus-ungood"). But we're sort of asking a programming language to be more "nuanced" in its wording than English, which has evolved to be pretty much where the brain is at. I've shown some concrete benefits here to breaking out the OPT so you can see its relationship to the other OPTs you have and move it around.

I do know I'm getting cold feet on the WHILE <=> LOOP change. And I don't think the arity of WHILE in PARSE should be different from the arity of WHILE in the language; that would be jarring.

I'm giving it a shot in the bootstrap and rebmake to see what kind of thoughts it inspires. So far it seems to be around equally good and bad...and since the bad is just largely unfamiliarity which should wear off...that points to a win, especially since it means retaking WHILE.

I want to emphasize that there are a lot of thinking points that come out of this.

Here's a little section of code in HELP (that needs revisiting, just in general), where it's breaking down parameters and refinements of a function:

uparse parameters of :value [
    args: across while [word! | meta-word! | get-word! | quoted-word!]
    refinements: across while path!
] else [
    fail [...]
]

When we rewrite the WHILE as OPT SOME it shows us something interesting:

uparse parameters of :value [
    args: across opt some [word! | meta-word! | get-word! | quoted-word!]
    refinements: across opt some path!
] else [
    fail [...]
]

Since our ACROSS goes over something effectively OPT, we could wind up with an empty block. But an empty block isn't as cleanly differentiated as a null. What if we move the OPT outside the across?

uparse parameters of :value [
    args: opt across some [word! | meta-word! | get-word! | quoted-word!]
    refinements: opt across some path!
] else [
    fail [...]
]

Now we know that args and refinements are either null, or non-empty. So testing "are there args" becomes just if args and not the more laborious if not empty? args.
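A small Python model (not Ren-C) of that difference, under loudly stated assumptions: parameters are a plain list of strings, strings starting with "/" stand in for refinements, and item, some, opt and across are hypothetical helpers written for this sketch. None stands in for NULL and an empty list for the empty block.

```python
# Toy model only: these helpers are invented, not UPARSE's combinators.

def item(pred):
    """Match one list element that satisfies pred."""
    def parser(items, pos):
        if pos < len(items) and pred(items[pos]):
            return (items[pos], pos + 1)
        return None
    return parser

def some(p):
    """One or more matches of p."""
    def parser(items, pos):
        result = p(items, pos)
        if result is None:
            return None
        while True:
            nxt = p(items, result[1])
            if nxt is None:
                return result
            result = nxt
    return parser

def opt(p):
    """Always succeeds; the value is None if p fails."""
    def parser(items, pos):
        result = p(items, pos)
        return result if result is not None else (None, pos)
    return parser

def across(p):
    """If p succeeds, the value is the slice of input that p covered."""
    def parser(items, pos):
        result = p(items, pos)
        if result is None:
            return None
        return (items[pos:result[1]], result[1])
    return parser

word = item(lambda x: not x.startswith("/"))  # stand-in for the word! types

opt_inside = across(opt(some(word)))   # like: across opt some [...]
opt_outside = opt(across(some(word)))  # like: opt across some [...]

no_args = ["/only", "/deep"]
assert opt_inside(no_args, 0) == ([], 0)      # empty, but still a list
assert opt_outside(no_args, 0) == (None, 0)   # cleanly "nothing there"

with_args = ["series", "value", "/only"]
assert opt_inside(with_args, 0) == (["series", "value"], 2)
assert opt_outside(with_args, 0) == (["series", "value"], 2)
```

Both spellings consume the same input; only the value in the no-match case differs, which is what makes the plain truthiness test sufficient when the OPT sits outside.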

I think it's interesting to see how these transformations jump off the page when you use OPT SOME instead of the atomic-seeming WHILE.

OPT isn't an English word. If you wanted, one could try MAYBE.

It's not? It's listed as a verb in most English dictionaries that I'm aware of.

We have accepted it as one, given that "opt in" and "opt out" are used pervasively.

There's the full word OPTIONAL which we could offer as an alternative spelling, though we really want it to be read as OPTIONALLY:

option some space  ; hmmm

optionally some space  ; what we actually mean

MAYBE is shorter and more appealing, but it does have another use. And we're starting to stretch the limits of how annoying the change is:

any rule
=>
while rule
=>
opt some rule
=>
maybe some rule

I've kind of made peace with OPT, myself.

So actually...something very interesting has happened now that void assignments are no-ops, and MAYBE is becoming more universal (vs. a weird enfix kludge for opting out of variable assignments, sometimes).

OPT SOME and MAYBE SOME are actually different intentions!

; OPT gives you NULL if there's no match, and continues the rules

>> uparse "aaa" [(x: 10) x: opt some "b", some "a"]
== "a"

>> x
; null

; MAYBE gives you VOID if there's no match, and continues the rules

>> uparse "aaa" [(x: 10) x: maybe some "b", some "a"]
== "a"

>> x
== 10
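The two transcripts can be mimicked with a Python model (not Ren-C). Everything here is an assumption made for illustration: None stands in for NULL, a VOID sentinel object stands in for void, variables live in a plain dict, and assign is a hypothetical combinator that skips the assignment when the value is VOID.

```python
# Toy model only: VOID, assign and the other helpers are invented names.

VOID = object()  # sentinel standing in for a void result

def lit(s):
    """Match the literal string s at the current position."""
    def parser(text, pos):
        if text.startswith(s, pos):
            return (s, pos + len(s))
        return None
    return parser

def some(p):
    """One or more matches of p; the value is that of the last match."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return parser

def opt(p):
    """Always succeeds; the value is None (NULL) if p fails."""
    def parser(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return parser

def maybe(p):
    """Always succeeds; the value is VOID if p fails."""
    def parser(text, pos):
        result = p(text, pos)
        return result if result is not None else (VOID, pos)
    return parser

def assign(env, name, p):
    """Store p's value in env[name], unless that value is VOID (a no-op)."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        if result[0] is not VOID:
            env[name] = result[0]
        return result
    return parser

env = {"x": 10}
assign(env, "x", opt(some(lit("b"))))("aaa", 0)
assert env["x"] is None      # OPT: x is overwritten with the NULL stand-in

env = {"x": 10}
assign(env, "x", maybe(some(lit("b"))))("aaa", 0)
assert env["x"] == 10        # MAYBE: the assignment is skipped, x keeps 10
```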

The arguments for separating into two components just keep getting stronger. As I look at the code, I've noticed that it's chronic for people to historically have used Rebol2 ANY when they really meant SOME. Since "zero or more" will always work on situations where you have "one or more", there was no incentive to be clear about the intent...so lazy people would maybe even have favored ANY just because it was one less character. When you make the "zero" part explicit, it means laziness favors writing better code--why type the OPT or MAYBE if you don't actually need it?

This is solidifying pretty clearly IMO. And it's great how UPARSE doesn't have dark corners where you find you have to use a block, or can only use simple rules. I'm finding it satisfying to rewrite things and think it comes off clearer.

uparse "..." [remove [while newline]]
=>
uparse "..." [maybe remove some newline]

So logical. :vulcan_salute: It's really playing to the uniqueness of the medium.

Waitanotherminute.

It occurs to me that there is an arity-1 looping construct... CYCLE. It was chosen to replace FOREVER, since FOREVER loops usually broke (it was a misnomer).

But CYCLE is the same number of characters as WHILE.

>> uparse "aaaccc" [some "a" cycle "b" cycle "c"]
== "c"

It's a bit different semantically because CYCLE in the main language doesn't end until you STOP or BREAK. It's not like an UNTIL where the body result itself can make it stop.

Although PARSE is a bit different in semantics anyway. So stopping the cycle on a failed rule might not be that inconsistent under its rules.

Though...CYCLE could be the anti-UNTIL:

>> n: 1, cycle [print [n], n: n + 1, n < 4]
1
2
3

We could restore FOREVER for the infinite loop (despite its semantic pitfall).

Just some thoughts. I don't know that CYCLE implies "do this as long as it is true", however...more like "do it until I say to stop". It's interesting to remember that we do have another arity-1 looping construct in the mix though.
