Resurrecting the "IF" Combinator... as WHEN

hostilefork · September 5, 2024, 8:19pm

R3-Alpha had an idea--carried forward by Red--of an arity-1 IF combinator.

red>> num: 1020

red>> parse [a a a] [if (even? num) some 'a]
== true

red>> parse [a a a] [if (odd? num) some 'a]
== false

As you see, if the expression you give it turns out to be "falsey" then it doesn't continue matching. It skips to the next alternate--if there is one.

red>> parse [a a a] [if (odd? num) some 'b | some 'a]
== true
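
For readers who don't have Red handy, here is a rough Python sketch of the behavior shown above. It is only a model under stated assumptions: the step encoding (`("if", bool)` and `("some", item)`) and the helper names `match_alternate` and `parse` are made up for illustration and are not how Red implements PARSE.

```python
# Hypothetical model of Red's arity-1 IF inside PARSE.
# A rule is a list of alternates (the parts separated by |);
# each alternate is a list of steps: ("if", bool) or ("some", item).

def match_alternate(data, steps):
    """Return True if the steps consume all of data, else False."""
    pos = 0
    for kind, arg in steps:
        if kind == "if":
            if not arg:            # falsey guard: abandon this alternate
                return False
        elif kind == "some":
            start = pos
            while pos < len(data) and data[pos] == arg:
                pos += 1
            if pos == start:       # SOME needs at least one match
                return False
    return pos == len(data)

def parse(data, alternates):
    """Try each alternate in order from the start of the input."""
    return any(match_alternate(data, steps) for steps in alternates)

num = 1020
even = num % 2 == 0
data = ["a", "a", "a"]

print(parse(data, [[("if", even), ("some", "a")]]))                       # True
print(parse(data, [[("if", not even), ("some", "a")]]))                   # False
print(parse(data, [[("if", not even), ("some", "b")], [("some", "a")]]))  # True
```

Note that the guard has no branch of its own: a failed guard simply abandons everything up to the next alternate.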

But I always thought the arity-1 IF was a pretty alien thing that would confuse people. You might think there's a branch, but there's no "branch"... just continuing along with the variadic list of everything that follows until the next | or end of BLOCK!.

I also wondered "where does it end?" With an IF combinator, why not a CASE combinator, or SWITCH combinator?

So when I came up with GET-GROUP! doing arbitrary substitutions of the rule it evaluates to, I thought "hey, that's a lot more general!" We could just say that ~true~ and ~void~ antiforms would continue the parse, ~false~ would stop it, and ~null~ antiforms would trigger an error in case you didn't mean to do that.

What That `:(GET-GROUP!)` Concept Looked Like

(Note that if condition '[...] is equivalent to if condition [[...]]. This is called "soft-quoted branching")

>> num: 1020, rule: null

; generated [some 'b] rule is treated as if it had been written there
>> parse [a a a b b b] [some 'a :(if even? num '[some 'b])]
== b 

; generated ~void~ from non-taken IF gets ignored, and it kept parsing
>> parse [a a a b b b] [some 'a :(if odd? num '[some 'c]) some 'b]
== b

; generated ~true~ signal continues parse, just as ~void~ did
>> parse [a a a b b b] [some 'a :(even? num) some 'b]
== b

; generated ~false~ skips to next alternate (isn't one, so parse fails)
>> parse [a a a b b b] [some 'a :(odd? num) some 'b]
** Error: PARSE BLOCK! combinator did not match input

; treat ~null~ conservatively, use :(maybe rule) for ~void~ to keep going 
>> parse [a a a b b b] [some 'a :(rule) some 'b]
** Error: ~null~ antiform generated by GET-GROUP! in PARSE
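
The five outcomes above can be summarized as a small classification function. This is a Python sketch, not the real implementation: `VOID`, `ParseError`, and `splice` are hypothetical names, Python's `None` stands in for the ~null~ antiform, and `True`/`False` stand in for the old ~true~ and ~false~ antiforms.

```python
VOID = object()   # stand-in for the ~void~ antiform (assumption of this sketch)

class ParseError(Exception):
    """Stand-in for an abrupt failure raised by PARSE."""

def splice(value):
    """Classify what a substitution produced under the older model."""
    if value is None:                     # ~null~: likely a forgotten variable
        raise ParseError("~null~ antiform generated by GET-GROUP! in PARSE")
    if value is VOID or value is True:    # nothing to match, keep parsing
        return "continue"
    if value is False:                    # skip to the next alternate
        return "mismatch"
    return ("rule", value)                # anything else is spliced in as a rule

num = 1020
rule = None

print(splice(["some", "b"] if num % 2 == 0 else VOID))  # ('rule', ['some', 'b'])
print(splice(["some", "c"] if num % 2 == 1 else VOID))  # continue
print(splice(num % 2 == 0))                             # continue
print(splice(num % 2 == 1))                             # mismatch
try:
    splice(rule)
except ParseError as err:
    print("error:", err)
```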

Flexible Logic Kills `[~true~ ~false~]`... Breaks That Idea

In the flexible logic model, [TRUE FALSE ON OFF YES NO] are WORD!s, and hence indiscriminately trigger taking the branch in something like an IF when used directly. The ~null~ antiform is the "branch inhibitor", and it's what conditional expressions return when they don't match the condition.

>> 10 > 20
== ~null~  ; anti

I don't think it's a good idea to make substitutions via GET-GROUP! (or whatever comes to replace it) silently continue on NULL. If you forgot to set a variable that was supposed to hold something (as in rule above), that should give you an error. But I don't think you should have to write :(maybe even? num)

So Having A Conditional Logic Combinator Makes Sense

I just think that IF is a rather lousy name for it.

So I'll suggest WHEN.

>> parse [a a a b b b] [some 'a, when (even? num), some 'b]
== b

It would be against the premise of flexible logic to have WHEN be biased and assume things like TRUE, YES, or NO should mean it continues or not. I like the idea that you could hold a completely arbitrary word in a variable and say when (word), that means "continue matching when word is set to a non-null value".

Hence you'd have to say when (true? flag) or when (off? toggle) etc. I'm not merely comfortable with this... I am gung-ho about it!

(Of course people can make their own combinators and build in biases of their choosing, the core just doesn't pick sides.)
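
To make the "no bias" point concrete, here is a minimal Python sketch. The names are assumptions for illustration: `None` stands in for ~null~, plain strings stand in for WORD!s, `TRIGGER` stands in for whatever the canon branch trigger ends up being, and `is_true` is a hypothetical analogue of TRUE?.

```python
TRIGGER = "#"   # assumed stand-in for the canon branch trigger

def when(value):
    """Continue matching only when value is non-null (not None)."""
    return value is not None

def is_true(word):
    """Hypothetical analogue of TRUE?: only accepts the words true/false."""
    if word not in ("true", "false"):
        raise ValueError(f"expected 'true' or 'false', got {word!r}")
    return TRIGGER if word == "true" else None

flag = "false"
print(when(flag))            # True: the word itself is non-null, so parsing continues
print(when(is_true(flag)))   # False: the word was interpreted explicitly
```

The first call is the case to watch for: passing the word directly never stops the parse, whatever the word spells.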

BYPASS Can Be A Synonym For `[when (null)]`

I didn't like using FAIL for saying when to stop a rule chain and go to the next alternate, because that is used for causing "abrupt failures" in the system.

So I'd been using the quasiform ~false~ for that state in source (and the antiform if in a variable).

>> parse [a a a b b b] [some 'a, :(if even? num [false]), some 'b]
** Error: PARSE BLOCK! combinator did not match input

>> parse [a a a b b b] [some 'a, ~false~, some 'b]
** Error: PARSE BLOCK! combinator did not match input

But that isn't the model anymore. There is no ~false~ or ~true~ antiform. And honestly it wasn't that literate anyway. when (...) makes it clearer when you're using a variable. And the quasiform just looks confusing.

Searching for a good word that doesn't run into something serving other purposes (e.g. BREAK), I asked Claude.ai for suggestions, and one of those was BYPASS.

I like it. So for example you could write:

>> parse [a a a b b b] [some 'a [:(if even? num ['bypass]) some 'c | some 'b]]
== b

Although that particular case is clearer as [when (odd? num) ...], but sometimes you have to throw in a bypass rule.

(Amusingly, in Rebol2 the idiom for BYPASS was [end skip], which was a rule guaranteed to mismatch at any position: either you weren't at the tail and the END wouldn't match, or you were at the tail and the END would match but then you couldn't SKIP.)
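
The reasoning behind [end skip] can be checked mechanically. The sketch below is a Python model of the two steps (the function name `end_skip_matches` is invented here), tried at every position of a small input including the tail.

```python
def end_skip_matches(data, pos):
    """Model Rebol2's [end skip] tried at index pos of data."""
    if pos != len(data):
        return False              # END fails: there is still input left
    # END matched, so we are at the tail; SKIP now needs one more item
    return pos + 1 <= len(data)   # never true at the tail

data = ["a", "b", "c"]
print([end_skip_matches(data, pos) for pos in range(len(data) + 1)])
# [False, False, False, False]
```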

Where Does It Stop?

I also wondered "where does it end?" With an IF combinator, why not a CASE combinator, or SWITCH combinator?

So I think it's good to just say WHEN.

You don't technically need WHEN if you have BYPASS to skip to next alternate, and ~void~ to keep going (or empty block, if you like... [] will keep going too).

 when (cond) => :(if not cond ['bypass])  ; or :(if not cond 'bypass)

But that forces you to reverse the sense of your logic and write out something longer (and slower). And if you've got logic as complex as a CASE or SWITCH, I think writing it out as a splicing rule is fine: there would be negligible benefit in trying to shoehorn it into a combinator.
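
As a sanity check on that equivalence, here is a small Python sketch comparing the two forms. Everything in it is a stand-in: `None` plays ~null~, `VOID` and `BYPASS` are sentinel objects invented for the sketch, and `continues` is a hypothetical helper describing whether a step lets the rule chain keep going.

```python
VOID = object()     # stand-in for ~void~: nothing to splice, keep going
BYPASS = object()   # stand-in for the proposed BYPASS rule

def when(cond):
    """WHEN form: continue when cond is non-null."""
    return cond is not None

def spliced(cond):
    """Splice form, :(if not cond ['bypass]), with the logic reversed."""
    return BYPASS if cond is None else VOID

def continues(step):
    """A step lets the chain continue unless it is False or BYPASS."""
    return step is not False and step is not BYPASS

for cond in (None, "#", "false", 0):
    print(repr(cond), continues(when(cond)), continues(spliced(cond)))
```

Both columns agree for every input; the splice form just makes you write the negation yourself.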

A Potential Weak Spot In `#` for Canon Branch Trigger

It's a given that the ~null~ antiform is the canon "Branch Inhibitor". It may well be the only branch inhibitor (though I'm considering ~NaN~ antiforms might also not trigger branches).

What's more up in the air is what the canon branch trigger is.

Before considering WHEN--I was looking at the impacts of using # on the GET-GROUP! substitution rules that had been in place.

Previously you could do this:

>> parse #{000000FFFFFF} [zeros: tally #{00} :(odd? zeros) some #{FF}]
== #{FF}

>> parse #{000000FFFFFF} [zeros: tally #{00} :(even? zeros) some #{FF}]
** Error: PARSE BLOCK! combinator did not match input

But this becomes fully broken with things like EVEN? and ODD? returning either ~null~ or #.

>> parse #{000000FFFFFF} [zeros: tally #{00} :(odd? zeros) some #{FF}]
** Error: PARSE BLOCK! combinator did not match input

>> parse #{000000FFFFFF} [zeros: tally #{00} :(even? zeros) some #{FF}]
** Error: ~null~ antiform generated by GET-GROUP! in PARSE

The first case didn't match because, since things like #a are character literals, # has been used to represent the 0 codepoint. So in BINARY! it matches that.

>> append #{DECAFBAD} #
== #{DECAFBAD00}

>> parse #{000000} [some #]
== #

(Having the # combinator synthesize # vs 0 is debatable. But it wouldn't be #{00}. We don't want matching combinators to make new series--they only return their own series argument which is already allocated.)

And if you had a BLOCK! it would match #

>> data: [a a a # b b b]

>> parse data [some 'a :(true? trailing-b) some 'b]
== b  ; great!  we know the block is all As and Bs!  (oh, WHOOPS!)

This is part of why I was saying the canon branch trigger should be an antiform--because it gets pushed out of band for things like this.
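
Here is a Python sketch of that in-band collision, under loudly stated assumptions: the string "#" plays the branch trigger, `None` plays ~null~, and the helpers `odd`, `some`, `substitute`, and `run` are invented to model a rule shaped like [some 'a :(condition) some 'b].

```python
TRIGGER = "#"   # the in-band canon branch trigger discussed in the post

def odd(n):
    """Logic result under the new model: TRIGGER, or None for ~null~."""
    return TRIGGER if n % 2 == 1 else None

def some(data, pos, item):
    """Match one or more of item starting at pos; return new pos or None."""
    end = pos
    while end < len(data) and data[end] == item:
        end += 1
    return end if end > pos else None

def substitute(data, pos, value):
    """Splice a non-null value as a literal that must match at pos."""
    if value is None:
        raise ValueError("~null~ antiform generated by substitution")
    if pos < len(data) and data[pos] == value:
        return pos + 1
    return None

def run(data, condition):
    """Model of [some 'a :(condition) some 'b] that must reach the tail."""
    pos = some(data, 0, "a")
    if pos is None:
        return False
    pos = substitute(data, pos, condition)
    if pos is None:
        return False
    pos = some(data, pos, "b")
    return pos == len(data)

print(run(["a", "a", "a", "b", "b", "b"], odd(3)))       # False
print(run(["a", "a", "a", "#", "b", "b", "b"], odd(3)))  # True
```

A "true" condition fails on clean data and succeeds on data containing a stray #, which is the opposite of what the rule's author wanted in both cases.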

But the inconvenient truth is that tradeoffs are inevitable. Here (and elsewhere) the problem can be addressed by not trying to mix conditional logic with substitution. Substitution needs to be either a legal array element, or a ~void~ antiform to consciously opt out. Conditional logic is now fully driven by non-nullity, meaning you need different instructions to contrast it with full-band substitution.

It still makes me uneasy that the canon branch trigger isn't an antiform. That will inevitably cause confusion... be accepted where it shouldn't, or have unintended meanings.

>> num: 304

>> compose [flag: (odd? num)]
** Error: Cannot compose ~null~ antiform into array slot

>> compose [flag: (even? num)]
== [flag: #]  ; we allowed something that is likely not what you meant

So perhaps people can be empathetic to why I thought NOTHING would be a better choice for the canon branch trigger!

But this might be an unwinnable fight. The consequences of reusing the NOTHING antiform are greater than those of getting the occasional # substituted where it should not be, and inventing a whole new antiform wouldn't give the payoff that putting another part in the mix needs to have.

bradrn · September 6, 2024, 1:13am

A good name — Haskell calls it this too. It’s a very useful combinator, so good to have.

Would you consider adding it to parse dialect too? It has a natural interpretation: parse the next thing only when the condition is true. (Again, this is something Haskell has.)

hostilefork · September 6, 2024, 1:15am

I really should have clocked more time using Haskell to parse stuff.

Problem is when I go off and experiment I tend to tackle too many things at once. I tried to do Haskell on Arch Linux and got bored trying to figure it out.

Hm? This is the PARSE dialect... what do you mean?

bradrn · September 6, 2024, 1:46am

Ah-ha. Do not use Haskell libraries from the Arch package repositories. They are famously broken, mostly because GHC doesn’t play well with dynamic linking. (I say this as someone using Arch myself.)

Instead, I strongly recommend using GHCup.

Hmm, maybe I’m just struggling to understand your PARSE examples. Where in that post did you show an example of WHEN in PARSE?

hostilefork · September 6, 2024, 1:49am

This is the example of using it in PARSE.

hostilefork:

So Having A Conditional Logic Combinator Makes Sense

I just think that IF is a rather lousy name for it.

So I'll suggest WHEN.

>> num: 1020

>> parse [a a a b b b] [some 'a, when (even? num), some 'b]
== b 

I didn't add the failing case, but:

>> parse [a a a b b b] [some 'a, when (odd? num), some 'b]
** Error: PARSE BLOCK! combinator did not match input

I don't know how it would be applied in the main evaluator...

bradrn · September 6, 2024, 2:00am

Oh… whoops, OK, not sure how I missed that. For some reason I thought you were talking about WHEN as a top-level combinator. (Which is also a very useful thing to have!)

hostilefork · September 6, 2024, 2:02am

Well now I'm confused, what would a "top level" WHEN combinator be?

bradrn · September 6, 2024, 3:54am

Basically, when condition block ≡ if condition block []. That is, it would run the given block if the condition is true, and do nothing otherwise.

hostilefork · September 6, 2024, 4:09am

Ah, so the "actual" IF combinator.

We can accomplish the most typical desire with :(if condition block)

When the condition is not null, the block is evaluated and its product used as a rule.
When the condition is null, a ~void~ antiform is returned from the IF, and the evaluator skips it and keeps going.

If we made an IF combinator, you'd have to put your conditional code in a GROUP!. Unless you meant to actually match content with a rule as your condition, like this:

>> parse [hello world 304] [if opt 'hello [word! integer!]]
== 304

So what you used as a condition was what was synthesized by [opt 'hello] ... in this case the WORD! hello that matched (had it not matched it would be null). Then since it was non-null, it matched the next block vs. skipping it.

That's something you can't do with putting the code in a GET-GROUP!. I don't know how frequent that desire is, since most of the desire has been to control a rule based on a guard flag of some kind.

>> worldnumber: 'true

>> parse [world 304] [if (true? worldnumber) [word! integer!]]
== 304

If that's the case, then your advantage can be small...as small as two characters for having the combinator.

:(if true? worldnumber '[word! integer!])
if (true? worldnumber) [word! integer!]

But...I mean, I guess an IF combinator of that sort would be all right to have. Trivial to write. Just perhaps a little leery of the slippery slope of recreating every control construct as a combinator.

Were it to exist, I'd probably find uses for it.

bradrn · September 6, 2024, 4:14am

Oh, looks like there’s been more confusion, sorry… I meant, outside PARSE altogether! That is, at the ‘top level’ of the program.

hostilefork · September 6, 2024, 4:15am

You'll have to use an example then.

In the ordinary evaluator, IF does nothing if the condition is false...so you'll have to illustrate what you mean.

hostilefork · September 13, 2024, 4:50am

Having spent some time with this in practice, I'm wondering if it's doing more harm than good to not allow NULL in GET-GROUP! substitutions to mean "rule doesn't match".

It's particularly painful with WHILE:

while :(mode = 'read) [some chunk]

; vs.

while when (mode = 'read) [some chunk]

I am uncertain that the safety advantage is really worth it. NULL variables aren't unset variables. Having the state mean "don't keep going" seems useful enough to outweigh the occasional "I meant to have a rule here but forgot it" accident.

So I'm changing splicing to be null tolerant (well, tolerant in the sense that it won't abruptly fail, but triggers a rule mismatch).

The bigger issue is that the leading colon is likely not going to remain the notation for this, given the new general semantics of "optionality" that a leading colon implies. So I'm thinking that's going to be doubled groups again.

while ((mode = 'read)) [some chunk]
