ANY vs. MANY in PARSE...

At least one Haskell parser combinator library uses some to mean one or more matches, and many to mean zero or more matches.
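To make the some/many distinction concrete, here is a minimal sketch in Python (not Haskell or Ren-C). The names `char`, `many` and `some` and the `(text, pos)` rule signature are assumptions made up for this illustration, not any library's API. The no-progress guard in `many` is also an addition of this sketch.

```python
# A rule is a function (text, pos) -> (value, new_pos) on a match, or None on failure.

def char(c):
    """Rule matching the literal string `c` at the current position."""
    def rule(text, pos):
        if text.startswith(c, pos):
            return c, pos + len(c)
        return None
    return rule

def many(rule):
    """Zero or more matches: never fails, may produce an empty list."""
    def run(text, pos):
        values = []
        while True:
            result = rule(text, pos)
            # Stop on failure; also stop if nothing was consumed, a guard
            # added for this sketch so the loop cannot spin forever.
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return run

def some(rule):
    """One or more matches: fails unless `rule` matches at least once."""
    def run(text, pos):
        values, end = many(rule)(text, pos)
        return (values, end) if values else None
    return run

print(many(char("a"))("aab", 0))  # (['a', 'a'], 2)
print(many(char("a"))("b", 0))    # ([], 0)
print(some(char("a"))("b", 0))    # None
```

The only difference between the two is what happens on zero matches: `many` succeeds with an empty result, `some` fails.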

I can see why ANY makes more sense...to mean "any number of matches" (including 0). But a disadvantage is that it looks a lot like the common ANY construct in regular code... which kind of has the opposite meaning (non-PARSE ANY means "at least one of the following things, go with the first one that's truthy, else return null").
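For contrast, the non-PARSE ANY described above can be modelled roughly like this in Python. `first_truthy` is a hypothetical name, the lambdas stand in for unevaluated expressions in a block, and Python's truthiness rules are not the same as Rebol's, so treat it only as a sketch of the short-circuiting shape.

```python
def first_truthy(*thunks):
    """Model of non-PARSE ANY: run each thunk in order, stop at the first truthy result."""
    for thunk in thunks:
        value = thunk()
        if value:
            return value
    return None  # stands in for Rebol's null

# The third thunk would raise ZeroDivisionError, but it is never evaluated.
print(first_truthy(lambda: None, lambda: "second", lambda: 1 / 0))  # second
print(first_truthy(lambda: None, lambda: False))                    # None
```

Nothing here loops over input; it picks one alternative and stops, which is the opposite feel from "any number of matches".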

Because we're sort of dealing in a gray area of learned behavior here, I wonder if the benefit of going with MANY to make parse rules look different is enough to prefer it.

For me, MANY says more than one. Even more than two if you are strict, as in the saying about counting: one, two, many.
SOME says yes, there should be one or more of these present. ANY within PARSE is just fine for me as a way of expressing any positive number, or zero. But indeed, with ANY [condition1 condition2] at least one of the conditions has to be true.
It's a small difference indeed, but if the alternative is having to use something like OPTIONAL within PARSE to work around it, I say I can live with the difference in meaning.

I'd rather rename the ANY function in regular code to SOME (FIRST-IF, FIRST-DID, FIRST-TRUE etc.). :) I think ANY is the right word for UPARSE; the standard symbol in pattern matching is *.

While it's good to have literate keywords for pattern matching, I think most devs (including newbies) coming to Ren-C will be minimally familiar with the symbols which have been around forever: * (match 0 or more), ? (match 1 item), # (match one digit), and ! (not).

The quoting proposal seems okay to me. Could be an adjustment for some, but not a huge leap for me.

It's worth thinking about a replacement for non-PARSE ANY. But I think it would need to be short. ONE is a possibility, though it sounds like it might evaluate all the conditions and check that one-and-only-one is true.

one [thing1, thing2, thing3] then [...]

If we were willing to say that Rebol's disposition is prefix, ANY could be OR with ALL as AND.

and [thing1, thing2] then [...]

or [thing1, thing2] then [...]

But I don't think that's a good idea.

Note: I like the direction of AND and OR as weird infix operations right now...so I think we should stick with that. I've even been considering that x and y should be allowed so long as Y is not a function with arguments; it can short circuit across the word if it quotes it.

The real question is just how nasty parameter-gathering conventions are willing to get to make your source level experience more comfortable. That irregularity makes the functions harder to reuse...e.g. if you MAKE FRAME! for :AND, you have to realize that you're giving it code that it will short-circuit, and you have to know all the rules for that.

I'm not suggesting I necessarily agree with the need to change, but if I were, I'd maybe go for ANY-OF and ALL-OF.

In the past, I've thought we might make the PARSE rule convention for ANY just be OPT SOME.

Unfortunately, combinators break that idea as this would be semantically different in capturing...under the rules I'm thinking of.

For example, just thinking about the idea that INTEGER! might transcode from strings:

parse "10 20 30" [numbers: any integer!]
>> numbers
== [10 20 30]

parse "xxx" [numbers: any integer!]
>> numbers
== []

parse "xxx" [numbers: opt some integer!]
>> numbers
; null

The idea is that OPT will set its result to NULL if the rule does not succeed...and give you the combinator product if it does. But ANY would give you an empty block in the case it doesn't succeed at all.
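Here is a rough Python model of that difference in products. Everything in it is hypothetical (the names `integer`, `some`, `opt`, `any_`, and the `(text, pos)` rule signature); it only mirrors the behavior sketched in the examples above, with `None` standing in for NULL.

```python
import re

_INT = re.compile(r"\s*(\d+)")

def integer(text, pos):
    """Match an integer (skipping leading spaces); the product is a Python int."""
    m = _INT.match(text, pos)
    return None if m is None else (int(m.group(1)), m.end())

def some(rule):
    """One or more matches; product is the list of products, or failure (None)."""
    def run(text, pos):
        values = []
        while (result := rule(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return (values, pos) if values else None
    return run

def opt(rule):
    """Always succeeds; the product is None when the inner rule fails."""
    def run(text, pos):
        result = rule(text, pos)
        return (None, pos) if result is None else result
    return run

def any_(rule):
    """Always succeeds; the product is an empty list when nothing matches."""
    def run(text, pos):
        result = some(rule)(text, pos)
        return ([], pos) if result is None else result
    return run

print(any_(integer)("10 20 30", 0))  # ([10, 20, 30], 8)
print(any_(integer)("xxx", 0))       # ([], 0)
print(opt(some(integer))("xxx", 0))  # (None, 0)
```

On input that matches at least once the two agree; they only diverge on the zero-match case, which is exactly where a captured variable would see `[]` versus null.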

The COPY (or ACROSS) that just gets the span of input wouldn't help smooth that over in this particular case, because copy opt some integer! gives you a span of the input series, which is text, whereas the combinator product for INTEGER! here on text input is an INTEGER!.

Unfortunately it can't be stylized the other way as some opt integer!, by terminating on NULL... if we are to have some work with "rules that have no products", like some "a". (I've been assuming that no-product rules exist, where the only thing you can do with them is COPY across their consumed input).

Instead of ANY, one could say that a certain item is allowed to be present, no matter how many times it actually appears, so ALLOW comes to mind.

In a separate post I'm explaining the historical difference between ANY and WHILE in PARSE...and the question of whether they should be the same.

If they are the same, then we might adopt WHILE as the name. It's an alternative way of thinking of it to use WHILE to mean "keep running this rule as long as it matches":

parse "aaa" [while "a"]

It seems to be more consistent, because WHILE means "keep doing this so long as it is true" in both plain DO code and PARSE, whereas ANY's meaning in DO is "match one of these things and then stop".

e.g. ANY [RULE1, RULE2, RULE3] in PARSE would be more consistent as a synonym for RULE1 | RULE2 | RULE3.

(ALL [RULE1, RULE2, RULE3] comes from the implicit sequencing being ALL, e.g. [RULE1, RULE2, RULE3])
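A small Python sketch of that parallel, with ANY as alternation and ALL as sequencing. The helper names `lit`, `any_of` and `all_of` are invented for the illustration and are not proposed PARSE keywords.

```python
# A rule is a function (text, pos) -> (value, new_pos) on a match, or None on failure.

def lit(s):
    """Rule matching the literal string `s`."""
    def rule(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return rule

def any_of(rules):
    """Alternation, like RULE1 | RULE2 | RULE3: the first rule that matches wins."""
    def run(text, pos):
        for rule in rules:
            result = rule(text, pos)
            if result is not None:
                return result
        return None
    return run

def all_of(rules):
    """Sequence, like [RULE1 RULE2 RULE3]: every rule must match, in order."""
    def run(text, pos):
        values = []
        for rule in rules:
            result = rule(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return run

print(any_of([lit("a"), lit("b")])("bc", 0))   # ('b', 1)
print(all_of([lit("a"), lit("b")])("abc", 0))  # (['a', 'b'], 2)
print(all_of([lit("a"), lit("b")])("ac", 0))   # None
```

Read this way, ANY and ALL in a parse rule would behave like their DO counterparts: one stops at the first success, the other stops at the first failure.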

Maybe this verbalizes a little more what my problem is with the reuse of the word...it isn't so much that ANY is in both places, but that there is a fairly clear parallel for what the meaning of ANY would be if it applied in both...and that's not what it is.

I like WHILE as described here and the improved consistency of ANY and ALL.

If you can, please read the most recent summary remarks on ANY vs. WHILE… and NOT END. I'm increasingly feeling certain that WHILE and SOME with no progress requirement are the right primitives. It's running up against my lack of love for the BREAK/ACCEPT/REJECT naming as those kinds of things would need to be invoked more often, but maybe that just needs to get cleaned up too.

Also consider that if ANY were retaken in PARSE, it could potentially offer a use that parallels the one in ordinary code, like:

any ([integer! decimal! block! text!])

vs.

[integer! | decimal! | block! | text!]

The GROUP! to get the BLOCK! as a synthesized product is necessary. If ANY took the BLOCK! as a rule it would not have the power to override the default interpretation as a sequence. I've written elsewhere about this.

I think that this "non-looping-feeling" which the language cultivates about ANY gets at the core of why the word never really sat right with me. It means "pick one thing out of a set of alternatives" everywhere else... ANY-VALUE!, any [even? x, x > 20]... not "loop".

Plus the "progress requirement" slipstreamed into the rule makes the curve from simple cases to harder ones more difficult...where you wind up having to learn WHILE anyway.

(Related thought: Maybe it is interesting to make a SOME loop in the non-PARSE world, which errors if it doesn't run the body at least once? Though having a few words that are common in parse rules but not in plain code can help cue differentiation of which is which, so that's another thing to consider.)

I think the die is cast, and WHILE + SOME with the help of generically enforcing parse progress with FURTHER feels clear and complete. You get all 4 possibilities and it's very clear.

  • Potentially 0 matches, progress not enforced: while [...]

  • 1 or more matches, progress not enforced: some [...] (previously not an option)

  • Potentially 0 matches, must consume parse input to count as a match: while further [...] (previously ANY)

  • 1 or more matches, must consume parse input to count as a match: some further [...] (previously SOME)

FURTHER should not be necessary most of the time.
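The four combinations in the list above can be modelled in a few lines of Python. This is only a sketch under my own assumptions: rules are functions `(text, pos)` returning the new position or `None`, and `lit`, `opt`, `further`, `while_` and `some` are stand-in names, not the native implementations.

```python
def lit(s):
    """Rule matching the literal string `s`; returns the new position or None."""
    def rule(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

def opt(rule):
    """Always succeeds; consumes nothing when the inner rule fails."""
    def run(text, pos):
        end = rule(text, pos)
        return pos if end is None else end
    return run

def further(rule):
    """Succeeds only if `rule` matches AND advances the input position."""
    def run(text, pos):
        end = rule(text, pos)
        return end if end is not None and end > pos else None
    return run

def while_(rule):
    """Zero or more matches; never fails and does no progress check itself."""
    def run(text, pos):
        while (end := rule(text, pos)) is not None:
            pos = end
        return pos
    return run

def some(rule):
    """One or more matches; fails if the very first attempt fails."""
    def run(text, pos):
        first = rule(text, pos)
        return None if first is None else while_(rule)(text, first)
    return run

print(while_(lit("a"))("b", 0))                 # 0
print(some(lit("a"))("b", 0))                   # None
print(while_(further(opt(lit("a"))))("aab", 0)) # 2
print(some(further(opt(lit("a"))))("b", 0))     # None
```

Note that `while_(opt(lit("a")))` is deliberately not called: in this model it would never terminate, because `opt` keeps succeeding without consuming input. That is the case FURTHER exists for.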

It tightens up when we make ordinary WHILE arity-1...and looks in good shape.

@BlackATTR points out that this might make it easier when generating alternate rules, since you wouldn't have to worry about sticking in the | during generation.

Either way, I'm going to kill off ANY in Ren-C's existing native PARSE, to begin the reclaiming process. I'll add FURTHER to native parse, too.

Customizing the combinator list is how Redbol compatibility parse will work on top of UPARSE. So if anyone wanted ANY back they could get it that way.

I'm thinking about how maybe PARSE would allow macros, so you could say things like:

any: macro [] [return [while further]]

(But you'd want to put it in the combinator list, vs. overriding the global ANY)
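As a rough picture of "put it in the combinator list", here is a Python sketch where the combinator list is just a dictionary and the macro is a composition of two existing entries. The table layout and the names `DEFAULT_COMBINATORS` and `with_redbol_any` are hypothetical; this is not how UPARSE actually stores its combinators.

```python
def further(rule):
    """Succeeds only if `rule` matches AND advances the input position."""
    def run(text, pos):
        end = rule(text, pos)
        return end if end is not None and end > pos else None
    return run

def while_(rule):
    """Zero or more matches; never fails."""
    def run(text, pos):
        while (end := rule(text, pos)) is not None:
            pos = end
        return pos
    return run

# Hypothetical default table: no ANY keyword at all.
DEFAULT_COMBINATORS = {"while": while_, "further": further}

def with_redbol_any(combinators):
    """Return a copy of the table with ANY defined as WHILE FURTHER."""
    table = dict(combinators)
    table["any"] = lambda rule: table["while"](table["further"](rule))
    return table

def lit(s):
    """Rule matching the literal string `s`; returns the new position or None."""
    def rule(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

redbol = with_redbol_any(DEFAULT_COMBINATORS)
print("any" in DEFAULT_COMBINATORS)       # False
print(redbol["any"](lit("a"))("aab", 0))  # 2
```

Because the copy is what gets extended, the default table stays free of ANY, and only code that opts into the compatibility table sees the old keyword.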
