WHILE [Cold Feet]

I'm really certain that ANY should not be a looping construct in PARSE. Rebol's use of ANY everywhere else means "any one of", not "any number of". That applies to the ANY short-circuit-OR operation, to the ANY-XXX! types, and it can come up in PARSE such as:

parse block [some any-value!]

I like the shorthand for this that works across series types with the TAG! combinator:

uparse block [some <any>]

This means any one element. It gets at that English concept that operators like * (or <*>) just don't have.

Plus, the "zero-or-more matches of a rule" interpretation doesn't jibe with how we use ANY in English:

  • "Do you have ANY bananas?"
  • "Yes."
  • "Cool. Can I have one, then?"
  • "No, sorry. I don't have ANY."

But I'm Not Happy With Bending WHILE For This

It seemed appealing at first to say that WHILE would be standardized in the language as arity-1, both in PARSE and in ordinary code loops. This would make UNTIL and WHILE line up, and LOOP could take the arity-2 role that WHILE used to have.

But I've been lamenting just how universally WHILE is arity-2 in pretty much every language, and that LOOP doesn't quite cut it when you're reading code. :-/

Sorry for the flux, but I want to move back to while [condition] [body] as it was. However going through the process has spurred thought...

An Observation: OPT SOME <=> WHILE

It has in the past occurred to me that PARSE's WHILE (or ANY) was really OPT SOME. It's three more characters to say it:

while pattern
opt some pattern

(Note: This is only true in modern Ren-C, as previously the progress requirement differentiated these...that is now broken out into FURTHER.)

...but although it's more characters, "optionally some number of occurrences of the pattern" is pretty literally what you are talking about. In the UPARSE model of synthesized values it's kind of less confusing, because it's clearer what it returns in the case of nothing...the same thing OPT always returns when a rule doesn't match: NULL.
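To make the equivalence concrete, here is a minimal sketch in Python rather than Ren-C. It's a toy model and not how UPARSE is implemented: the helper names `lit`, `some` and `opt` are hypothetical, a rule is assumed to be a function returning `(value, new_position)` on success and `None` on failure, and Python's `None` as a value stands in for NULL.

```python
def lit(s):
    """Rule matching the literal string s (hypothetical helper)."""
    def rule(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None  # failure
    return rule

def some(p):
    """One or more matches of p; yields the value of the last match.
    Assumes p consumes input, so there is no progress check here."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None  # zero matches is a failure for SOME
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return rule

def opt(p):
    """Succeed either way; yield None (standing in for NULL) on no match."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return rule

zero_or_more = opt(some(lit("a")))      # the old PARSE WHILE / ANY

print(zero_or_more("aaab", 0))          # ('a', 3)
print(zero_or_more("bbb", 0))           # (None, 0): matched nothing, still succeeded
print(some(lit("a"))("bbb", 0))         # None: SOME alone fails
```

The only thing `opt` adds is the "succeed with a null result" branch, which is exactly the part the single word WHILE hides.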

Anyway, I'm feeling remorse and a wish to go back to WHILE for arity-2 loops in the language. But I don't want to go back to ANY in PARSE.

Is OPT SOME really so bad?

I've gotten to wondering if there is a reason we don't have a separate word for "zero or more" in English. You actually have to write out "zero or more" to convey that intent... maybe because the intent is too weird for a single word.

When you just write WHILE, it may be that you have a case that's actually supposed to be a SOME, but it hasn't bitten you yet. If you're willing to tolerate between 1 and a million of something, the case of no things being there is distinguished...and calling attention to the fact that the rule you have may not match at all can be an asset.

I actually think OPT SOME offers an advantage, because it encourages you to look at it and decide if the OPT belongs there or not. It may feel kind of like a wart, but maybe it's a helpful wart.

(It reminds me a bit of the UNLESS vs. IF NOT situation. Many people felt UNLESS is actually obfuscating nearly everywhere it's used, and that it's better to break it apart even if that means two words instead of one.)

Trying Out The Change, I Noticed...

I actually did find a difference in how I read the code. "This entire next section may not be relevant... none of it could match and it would go on." The weight of the OPT is felt more heavily when the word is there than it is with WHILE. If you frequently expect the thing to be there, WHILE lets you assume it will always be there for at least one instance.

You also can see redundancy in OPT more clearly. Things like:

opt [
    while [...]
]

Stand out more if they look like:

opt [
    opt some [...]
]

I think some things really do read more clearly. You can look at this as removing 0 or more newlines at the head of a series via a WHILE:

parse series [
    remove [while newline]
    ...
]

Or rephrase that with OPT SOME:

parse series [
    remove [opt some newline]
    ...
]

But I think it reads clearest when you bring the OPT outside, to say you're optionally removing some newlines:

parse series [
    opt remove [some newline]
    ...
]

More Distinct

ANY and WHILE both had the problem that they had analogues in imperative code. But if SOME remains a PARSE-only keyword, then OPT SOME helps you better intuit the difference...parse rules look more distinct from ordinary code.

Compression Is Possible By Other Avenues

I noticed a particularly laborious substitution in %make-zlib.r, which extracts the headers and code for zlib using parse. Because it was often parsing C code, it kept looking for the pattern while whitespace. This would happen on multiple lines in a row, and multiple times on a line. When it became opt some whitespace it got more annoying.

But this is kind of a problem anytime you repeat something over and over. Maybe that pattern should have been ws*: [opt some whitespace] and then it would just be ws* to mean "any number of whitespace characters here, including zero".
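The same "name it once" idea can be sketched in a toy Python model (not Ren-C, and not the actual %make-zlib.r code). The helpers `lit`, `one_of`, `seq`, `some` and `opt` are hypothetical, rules are assumed to return `(value, new_position)` or `None` on failure, and the `decl` pattern is just an invented stand-in for a bit of C.

```python
def lit(s):
    """Rule matching the literal string s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule

def one_of(chars):
    """Rule matching a single character from chars."""
    def rule(text, pos):
        if pos < len(text) and text[pos] in chars:
            return text[pos], pos + 1
        return None
    return rule

def seq(*rules):
    """Run rules in order; fail if any fails, else yield the last value."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is None:
                return None
            value, pos = result
        return value, pos
    return rule

def some(p):
    """One or more matches of p (p is assumed to consume input)."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return rule

def opt(p):
    """Succeed either way; yield None on no match."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return rule

whitespace = one_of(" \t\r\n")
ws_star = opt(some(whitespace))   # defined once, like ws*: [opt some whitespace]

# The repeated pattern now costs one short name per use
decl = seq(lit("int"), ws_star, lit("foo"), ws_star, lit(";"))

print(decl("int  foo ;", 0))      # (';', 10)
print(decl("intfoo;", 0))         # (';', 7)
```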

A Motivated Individual Can Overrule It

Remember, UPARSE is going to let you be the judge. If you want your own keywords, you can have them. Maybe you like MANY (some parser combinators seem to think that 0...N is "many" and 1...N is "some"). Maybe you don't care if WHILE is different. Maybe you don't want to use the ANY parse abstraction that I think is more interesting.

I'm Trying It Out

One can argue there's a bit of a 1984-newspeak to it ("you don't need words like better or worse, use plus-good and un-good and double-plus-ungood"). But we're sort of asking a programming language to be more "nuanced" in its wording than English, which has evolved to be pretty much where the brain is at. I've shown some concrete benefits here to breaking out the OPT so you can see its relationship to the other OPTs you have and move it around.

I do know I'm getting cold feet on the WHILE <=> LOOP change. And I don't think the arity of WHILE in PARSE should be different from the arity of WHILE in the language, it's jarring.

I'm giving it a shot in the bootstrap and rebmake to see what kind of thoughts it inspires. So far it seems to be around equally good and bad...and since the bad is just largely unfamiliarity which should wear off...that points to a win, especially since it means retaking WHILE.


I want to emphasize that there are a lot of thinking points that come out of this.

Here's a little section of code in HELP (that needs revisiting, just in general), where it's breaking down parameters and refinements of a function:

uparse parameters of :value [
    args: across while [word! | meta-word! | get-word! | quoted-word!]
    refinements: across while path!
] else [
    fail [...]
]

When we rewrite the WHILE as OPT SOME it shows us something interesting:

uparse parameters of :value [
    args: across opt some [word! | meta-word! | get-word! | quoted-word!]
    refinements: across opt some path!
] else [
    fail [...]
]

Since our ACROSS goes over something effectively OPT, we could wind up with an empty block. But an empty block isn't as cleanly differentiated as a null. What if we move the OPT outside the across?

uparse parameters of :value [
    args: opt across some [word! | meta-word! | get-word! | quoted-word!]
    refinements: opt across some path!
] else [
    fail [...]
]

Now we know that args and refinements are either null, or non-empty. So testing "are there args" becomes just if args and not the more laborious if not empty? args.
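Here's a rough Python model of why the position of OPT matters (a toy, not Ren-C and not the HELP code). It assumes rules are functions returning `(value, new_position)` or `None` on failure, uses a string instead of a block of parameters, and the helpers `lit`, `some`, `opt` and `across` are hypothetical names. Python's `None` as a value stands in for NULL.

```python
def lit(s):
    """Rule matching the literal string s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule

def some(p):
    """One or more matches of p (p is assumed to consume input)."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return rule

def opt(p):
    """Succeed either way; yield None (standing in for NULL) on no match."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return rule

def across(p):
    """Yield the span of input that p covered, whatever p's own value was."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        return text[pos:result[1]], result[1]
    return rule

inner = across(opt(some(lit("a"))))   # args: across opt some ...
outer = opt(across(some(lit("a"))))   # args: opt across some ...

print(inner("aab", 0), outer("aab", 0))   # ('aa', 2) ('aa', 2)
print(inner("bbb", 0))                    # ('', 0): empty span
print(outer("bbb", 0))                    # (None, 0): null
```

When something matches, the two forms agree. They only differ in the nothing-matched case, where the outer OPT gives a null that can be tested directly instead of an empty series.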

I think it's interesting to see how these transformations jump off the page when you use OPT SOME instead of the atomic-seeming WHILE.


OPT isn't an English word. If you wanted, you could try MAYBE.

It's not? It's listed as a verb in most English dictionaries that I'm aware of.


We have accepted it as one, given that "opt in" and "opt out" are used pervasively.

There's the full word OPTIONAL, which we could offer as an alternative spelling, though we really want it to be read as OPTIONALLY:

option some space  ; hmmm

optionally some space  ; what we actually mean

MAYBE is shorter and more appealing, but it does have another use. And we're starting to stretch the limits of how annoying the change is:

any rule
=>
while rule
=>
opt some rule
=>
maybe some rule

I've kind of made peace with OPT, myself.


Waitaminute.

What if PARSE has WHILE and it's arity-2?

So these two things would be synonyms:

 while rule1 rule2   <=>  opt some [rule1 rule2]

I'm sure I've had the idea before (?) but I don't remember writing up why you would actually want that. It's actually quite neat.

It would make it cleaner to pair up code in a GROUP! with a rule:

GROUP! rules always run their side effect and succeed. So:

opt some [rule (code to run on each match)]

Could instead be written as:

while rule (code to run on each match)

I would use this frequently!

It helps psychologically divide a process into two parts: trigger and response.

You can of course write things as:

opt some [
     thing1 thing2 [
        thing3 thing4
     |  thing5 thing6
     ]
]

Or:

opt some [thing1 thing2 [
    thing3 thing4
        |
   thing5 thing6
]]

But I think the WHILE structuring into a control half and response half helps you see this better:

while [thing1 thing2] [
    thing3 thing4
        |
    thing5 thing6
]
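As a sanity check on the proposed synonym, here is a toy Python model (not Ren-C, and not an actual UPARSE combinator). It assumes rules are functions returning `(value, new_position)` or `None` on failure. The names `while2`, `group`, `seq`, `some`, `opt` and `lit` are hypothetical, with `while2` standing in for the arity-2 WHILE and `group` for a GROUP! that always succeeds without consuming input.

```python
def lit(s):
    """Rule matching the literal string s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule

def seq(*rules):
    """Run rules in order; fail if any fails, else yield the last value."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is None:
                return None
            value, pos = result
        return value, pos
    return rule

def some(p):
    """One or more matches of p (p is assumed to consume input)."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return rule

def opt(p):
    """Succeed either way; yield None on no match."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return rule

def group(fn):
    """Like a GROUP! rule: run fn for its side effect, always succeed."""
    def rule(text, pos):
        return fn(), pos
    return rule

def while2(trigger, response):
    """Arity-2 WHILE, defined literally as opt some [trigger response]."""
    return opt(some(seq(trigger, response)))

hits = []
rule = while2(lit("ab"), group(lambda: hits.append("hit")))

print(rule("ababx", 0)[1], hits)   # 4 ['hit', 'hit']
print(rule("xyz", 0)[1], hits)     # 0 ['hit', 'hit']  (trigger never fired)
```

The response only runs when the trigger matched, once per match, which is the trigger/response split described above.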

This Pushes My Vote Over The Edge :ballot_box_with_check:

The existing WHILE will be deprecated in PARSE, and the non-PARSE construct LOOP will be changed back to WHILE when that deprecation has propagated. (Sorry for the churn, but this is how improvements are made!)

Because OPT SOME is a full synonym in PARSE3, I'm going to remove WHILE from PARSE3 now...just to help along with the transition to UPARSE...

When all the WHILEs are likely gone, I'll add the new instruction to UPARSE. Perhaps it will exist as UWHILE just so it can be tried out.

We can keep looking for a good single word that matches the intent of opt some. But this thread lays out my reasoning for why we don't have a word that means "zero or more". "optionally some" is a pretty good capture of the intent, and there are better ways to focus energy than on micro-optimizing it.

(Certainly people who are interested in character micro-optimization will think the entirety of parse is too verbose...)


So actually...something very interesting has happened now that void assignments are no-ops, and MAYBE is becoming more universal (vs. a weird enfix kludge for opting out of variable assignments, sometimes).

OPT SOME and MAYBE SOME are actually different intentions!

; OPT gives you NULL if there's no match, and continues the rules

>> uparse "aaa" [(x: 10) x: opt some "b", some "a"]
== "a"

>> x
; null

; MAYBE gives you VOID if there's no match, and continues the rules

>> uparse "aaa" [(x: 10) x: maybe some "b", some "a"]
== "a"

>> x
== 10
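The difference can be modeled in a few lines of Python (a toy sketch, not Ren-C and not how UPARSE implements it). It assumes rules are functions returning `(value, new_position)` or `None` on failure. `VOID` is a hypothetical sentinel, `assign` is a hypothetical stand-in for a SET-WORD! in a rule, and Python's `None` as a value stands in for NULL.

```python
VOID = object()   # hypothetical sentinel for "no value at all"

def lit(s):
    """Rule matching the literal string s."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule

def seq(*rules):
    """Run rules in order; fail if any fails, else yield the last value."""
    def rule(text, pos):
        value = None
        for r in rules:
            result = r(text, pos)
            if result is None:
                return None
            value, pos = result
        return value, pos
    return rule

def some(p):
    """One or more matches of p (p is assumed to consume input)."""
    def rule(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        while True:
            nxt = p(text, result[1])
            if nxt is None:
                return result
            result = nxt
    return rule

def opt(p):
    """On no match, succeed and yield None (standing in for NULL)."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (None, pos)
    return rule

def maybe(p):
    """On no match, succeed and yield VOID."""
    def rule(text, pos):
        result = p(text, pos)
        return result if result is not None else (VOID, pos)
    return rule

def assign(env, name, p):
    """Store p's value in env[name], except that VOID is a no-op."""
    def rule(text, pos):
        result = p(text, pos)
        if result is not None and result[0] is not VOID:
            env[name] = result[0]
        return result
    return rule

env = {"x": 10}
print(seq(assign(env, "x", opt(some(lit("b")))), some(lit("a")))("aaa", 0))    # ('a', 3)
print(env["x"])   # None: OPT overwrote x with null

env = {"x": 10}
print(seq(assign(env, "x", maybe(some(lit("b")))), some(lit("a")))("aaa", 0))  # ('a', 3)
print(env["x"])   # 10: the void assignment left x alone
```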

The arguments for separating into two components just keep getting stronger. As I look at the code, I've noticed that it's chronic for people to historically have used Rebol2 ANY when they really meant SOME. Since "zero or more" will always work on situations where you have "one or more", there was no incentive to be clear about the intent...so lazy people would maybe even have favored ANY just because it was one less character. When you make the "zero" part explicit, it means laziness favors writing better code--why type the OPT or MAYBE if you don't actually need it?

This is solidifying pretty clearly IMO. And it's great how UPARSE doesn't have dark corners where you find you have to use a block, or can only use simple rules. I'm finding it satisfying to rewrite things and think it comes off clearer.

uparse "..." [remove [while newline]]
=>
uparse "..." [maybe remove some newline]

So logical. :vulcan_salute: It's really playing to the uniqueness of the medium.
