Rethinking IF and IF* - IF/ONLY, IF/OPT, safety vs. complexity

hostilefork · July 19, 2017, 11:42am

UPDATE: This thread is retained for historical purposes, but the issues discussed are now being attacked through various mechanisms... such as quoted branches:
>> if true '[print "hi"] else [print "hi"]
== [print "hi"]

>> if false '[print "hi"] else [print "hi"]
hi
The techniques have grown as the available datatypes have grown, even leading to things like branches that are reduced if they are GET-BLOCK!
>> if true :[1 + 2 10 + 20]
== [3 30]
Hence this thread just shows the genesis of the thoughts that pushed from the R3-Alpha status quo into that direction.

An experimental feature that some people have wound up using--while others have not--is the ability to use non-blocks in the branch slots of conditionals. The rationale was that Rebol's desire for expressiveness exceeded its desire for boilerplate.

For instance, in terms of "character economy", you might prefer to see if (x = blah) 3 foo instead of if x = blah [3] foo. This might make it clearer that 3 is not an argument to blah with foo the body, but rather 3 is the desired result. Or you could write x: default [if condition 4] and compact things. It seems the decision to be permissive runs with other core beliefs--including why one does not need to put the condition of IF in a BLOCK! in the first place.

In the early days of open sourcing, Carl was persuaded, and it was committed to GitHub rebol/rebol.

Yet given the fact that blocks would be executed instead of handled as raw values, it seemed a parallel situation to things like APPEND was arising. If you were just writing if condition value, and you didn't realize the variance in behavior between when value was a block vs not...you might be surprised when trying to write generic code...when it worked fine until all of a sudden you used a block. Adding an /ONLY refinement seemed like how APPEND dealt with this, so it seemed to make sense to give generic code authors a similar tool.
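
To make the hazard concrete, here is a toy model in Python rather than Rebol (so it can be run anywhere). It is only a sketch under assumptions: Block, rebol_if and pick_if are hypothetical names invented for illustration, a block is modeled as a wrapped Python callable, and only=True stands in for the /ONLY refinement.

```python
class Block:
    """Toy stand-in for a Rebol BLOCK!: wraps a thunk that "runs" the block."""
    def __init__(self, source, thunk):
        self.source = source  # text of the block, for display only
        self.thunk = thunk    # what executing the block would compute

    def __repr__(self):
        return f"[{self.source}]"


def rebol_if(condition, branch, only=False):
    """Model of the permissive IF: blocks execute, other values pass through.

    With only=True (modeling /ONLY) the branch is always returned as-is.
    Returns None when the condition is false (modeling "no result").
    """
    if not condition:
        return None
    if isinstance(branch, Block) and not only:
        return branch.thunk()
    return branch


def pick_if(condition, value):
    """'Generic' helper written by someone who only tested with non-blocks."""
    return rebol_if(condition, value)


ten = Block("5 + 5", lambda: 5 + 5)

print(pick_if(True, 10))               # 10
print(pick_if(True, "ten"))            # ten
print(pick_if(True, ten))              # 10  (the block was executed)
print(rebol_if(True, ten, only=True))  # [5 + 5]
```

The third call is the surprise: the helper behaved like a pass-through until the value happened to be a block.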

Since that time, conditionals like IF have gathered more features. One of them is a protection on the condition against use of literal blocks; any blocks must come from an evaluation in the condition slot:

if [x = 1] [print "you can't write this"]

var: [x = 1]
if var [print "okay, but won't run code--just treat the block as truthy"]

if identity [x = 1] [print "workaround with function that returns its input"]
; ^-- may be useful if you're dealing with code generation
;     https://en.wikipedia.org/wiki/Identity_function

Complexity Must Be Balanced Against Benefit

One thing I have come to think is that /ONLY is a hard thing to remember to use if you need it; you don't get an obvious error if you've omitted it. Also, if/only condition [x] is not that much better than if condition [[x]], arguably worse. So the question is whether there's a simpler, more effective, and more helpful way to get the desired help with the if/only condition var case.

The compromise I have in mind is similar to other recent compromises: a kind of "expert mode" and a "casual/convenient" mode. Casual mode would include a check on branch bodies similar to the one for the condition, except it would only permit evaluated elements if they were blocks. Hence should you ever see if condition var, it would check that var was a BLOCK! and not some other value. It would still tolerate literals like if condition 3.

The "expert mode" would not have the checks on either the condition or the body, and assume you knew what you were doing.

It might seem this would only affect those who have bothered to try using the feature so far, of which there aren't too many instances. But there's one situation it would change with CASE...you could still write case [false 1 | true 2] but not one: 1 | two: 2 | case [false one | true two], because only blocks would be allowed in evaluative branches unless you used CASE*. On the other hand, one: [1] | two: [1 + 1] | case [false one | true two] would be legal.
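
The two modes can be sketched as a toy model in Python (not Ren-C code, and not how the interpreter implements it). Everything here is an assumption made for illustration: Block, Arg, literal, fetched, if_casual and if_expert are hypothetical names, and each argument simply carries a flag saying whether it was written literally or came from an evaluation such as a variable fetch.

```python
class Block:
    """Toy stand-in for a Rebol BLOCK!: wraps a thunk that "runs" the block."""
    def __init__(self, thunk):
        self.thunk = thunk


class Arg:
    """An argument plus a flag: did it come from an evaluation?"""
    def __init__(self, value, evaluated):
        self.value = value
        self.evaluated = evaluated


def literal(value):
    return Arg(value, evaluated=False)


def fetched(value):
    return Arg(value, evaluated=True)


def is_truthy(value):
    # In this model only False and None are falsey, so a block is truthy.
    return value is not None and value is not False


def if_expert(condition, branch):
    """Expert mode: no checks; blocks execute, other values pass through."""
    if not is_truthy(condition.value):
        return None
    value = branch.value
    return value.thunk() if isinstance(value, Block) else value


def if_casual(condition, branch):
    """Casual mode: reject literal blocks as conditions, and reject
    evaluated branches unless they turn out to be blocks."""
    if isinstance(condition.value, Block) and not condition.evaluated:
        raise TypeError("literal block used as condition")
    if branch.evaluated and not isinstance(branch.value, Block):
        raise TypeError("evaluated branch must be a block")
    return if_expert(condition, branch)


one = 1
two = Block(lambda: 1 + 1)

print(if_casual(fetched(True), literal(3)))   # 3  (literal non-block is fine)
print(if_casual(fetched(True), fetched(two))) # 2  (evaluated block is fine)
print(if_expert(fetched(True), fetched(one))) # 1  (expert mode does not check)
try:
    if_casual(fetched(True), fetched(one))
except TypeError as err:
    print("rejected:", err)  # rejected: evaluated branch must be a block
```

The last call mirrors the CASE example: a variable holding 1 is refused in casual mode, while a variable holding a block is accepted.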

hostilefork · July 19, 2017, 1:57pm

I should point out that this protects against one of the bug classes I saw pop up with the original idea of not enforcing blocks...people forgetting to add an either branch, or dropping one and forgetting they did. Note this bug was caught by the change:

https://github.com/metaeducation/ren-c/blob/18962283d5ec9a60a2ee86b53d281ed449314d00/src/os/host-start.r#L681

        import do compose [module (code)]
    ][
        sys/do-needs first+ code
        do intern code
    ]
    quit ;ignore user script and "--do" argument
]


; Evaluate any script argument, e.g. `r3 test.r` or `r3 --script test.r`
;
either file? o/script [
    trap/with [
        do/only/args o/script script-args ;-- /ONLY so QUIT/WITH exit code bubbles out
    ] func [error <with> return] [
        print error
        return 1
    ]
]
host-start: 'done


; Evaluate the DO string, e.g. `r3 --do "print {Hello}"`

The EITHER was presumably an IF that lost its FALSEY? branch. host-start: 'done would always be run, and a failed condition would just mean an expression would evaluate to the WORD! done and fall off into the ether. So no actual observable problem, but it goes to show how one plays a bit with fire when boilerplate requirements are thrown out.

Of course, that's a typical day in Rebol for you...but, still I think it's a good compromise to raise sensitivity regarding executable slots.

draegtun · July 19, 2017, 3:27pm

I'd wrongly assumed that the IF/ONLY & EITHER/ONLY change only added this behaviour...

>> ten: [5 + 5]
== [5 + 5]

>> if true ten
== 10

>> if/only true ten
== [5 + 5]

>> if/only true {ten}
== "ten"

>> if/only true 10
== 10

This makes sense. But I hadn't realised that the change also added this behaviour...

>> if true {ten}
== "ten"

>> if true 10
== 10

I was expecting those last two would throw an error because they're not using a BLOCK!

hostilefork · July 19, 2017, 5:13pm

I like the elegance of allowing literals. Consider here, where it says:

 unspaced ["(CONSOLE " unless/only proto-skin/updated? {not } "updated)"]

That gets nicer if we start by using SPACED of course, and dropping the /ONLY is an improvement. But if you were to choose where delimiters would help improve it, are you better off with:

 spaced ["(CONSOLE" unless proto-skin/updated? [{not}] "updated)"]

...or...

 spaced ["(CONSOLE" (unless proto-skin/updated? {not}) "updated)"]

...or just having no delimiters at all?

 spaced ["(CONSOLE" unless proto-skin/updated? {not} "updated)"]

In any case, I think having the ability to make these choices is in the spirit of Rebol's "most freeform language" ethos. But having implemented the proposal I give above, it feels ultimately more effective to chop it to only two modalities and add a little more safety to the common one. I am pleased. It feels like getting closer to the essential complexity limits of these kinds of problems, in a generalized way.

hostilefork · July 19, 2017, 8:56pm

There is, however, one drawback with this approach of detection. It shows up if you want to wrap something and inherit the protective checks of IF.

Consider the classic example of wanting to write IF-NOT (pretending UNLESS was not in the box):

if-not: func [
    return: [<opt> any-value!]
    condition [any-value!]
    body [any-value!]
][
    if not :condition :body
]

It would work if the body was a block. But if you tried to use it with something passed literally to IF-NOT, the inner IF would see :body as a point of evaluation, so you would not be able to write if-not condition 3. Also, this IF-NOT construct wouldn't have the safety check for a literal block in the condition slot, so this would be accepted silently: if-not [x = 1] [print "This code would never run, and not warn you about that"]

I'll point out that this problem of transmitting the evaluated/unevaluated ("semiquoted?") bit isn't a particularly new class of problem. Any function which quotes its arguments has similar problems in chaining...

my-quote: func [:value [any-value!]] [
    print "Do some extra MY-QUOTE stuff"
    quote :value
]

That will always return :value, instead of doing "whatever QUOTE would have done in the calling context".
One might imagine that you could work around this by not using the parameter to make a QUOTE call directly, but to build source through composition...to mimic the callsite pattern, and then DO it.

my-quote: func [:value [any-value!]] [
    print "Do some extra MY-QUOTE stuff"
    do compose/only [quote (:value)]
]

More generally to do such a bypass, one would use APPLY. APPLY has you build the FRAME! as you meant it, rather than going through the evaluator. This completely bypasses the callee's parameter conventions...even hard quoting.

my-quote: func [:value [any-value!]] [
    print "Do some extra MY-QUOTE stuff"
    apply 'quote [value: :value]
]

Those tricks won't work for propagating the evaluated or unevaluated bit. My point is just to notice some powerful features put a burden on the person trying to mimic those powers in a wrapper, and they are problems that can be reasoned about.

One answer might be that fetching a function parameter (through a GET-WORD!, GET, etc.) inherits the evaluated or unevaluated status of that parameter, until it is changed locally. This means that although you couldn't write x: 10 | if condition x without getting a complaint, you could write foo: func [x] [if condition x] | foo 10 and have it work (but not foo: func [x] [x: x + 20 | if condition x] | foo 10).
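
As a rough illustration of that idea, here is a toy Python model (again not Ren-C code). All names are hypothetical: Arg carries the evaluated/unevaluated bit, checked_if models only the branch check of the casual-mode IF, and the three wrappers model IF-NOT without inheritance, with inheritance, and with a locally modified parameter.

```python
class Block:
    """Toy stand-in for a Rebol BLOCK!: wraps a thunk that "runs" the block."""
    def __init__(self, thunk):
        self.thunk = thunk


class Arg:
    """A value plus the evaluated/unevaluated bit discussed above."""
    def __init__(self, value, evaluated):
        self.value = value
        self.evaluated = evaluated


def checked_if(condition, branch):
    """Casual-mode IF (branch check only): evaluated branches must be blocks."""
    if branch.evaluated and not isinstance(branch.value, Block):
        raise TypeError("evaluated branch must be a block")
    if not condition.value:
        return None
    value = branch.value
    return value.thunk() if isinstance(value, Block) else value


def if_not_plain(condition, body):
    """Wrapper without inheritance: fetching `body` counts as an evaluation."""
    return checked_if(Arg(not condition.value, True), Arg(body.value, True))


def if_not_inheriting(condition, body):
    """Wrapper under the idea above: an unmodified parameter keeps the
    caller's evaluated/unevaluated bit when it is fetched."""
    return checked_if(Arg(not condition.value, True),
                      Arg(body.value, body.evaluated))


def if_not_modified(condition, body):
    """Like `x: x + 20`: the parameter was reassigned locally, so the new
    value is the product of an evaluation and the literal status is lost."""
    changed = Arg(body.value + 20, True)
    return checked_if(Arg(not condition.value, True), changed)


cond = Arg(False, True)
three = Arg(3, False)  # models a literal 3 at the callsite

print(if_not_inheriting(cond, three))  # 3
for wrapper in (if_not_plain, if_not_modified):
    try:
        wrapper(cond, three)
    except TypeError as err:
        print(wrapper.__name__, "rejected:", err)
```

Only the inheriting wrapper lets the literal 3 through; the other two are rejected because the branch reaches the check flagged as evaluated.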

Anyway, it seems worth a try. Regardless of how the experiments go, I think the old IF/ONLY behavior turned out to be fairly bad so it should go. Beyond that, it doesn't hurt to have conditional callsites obey a couple more rules that might get relaxed later.

draegtun · July 20, 2017, 1:21pm

My comment is dishing on IF (and therefore UNLESS) far too much. Instead it's really aimed at the group of errors that EITHER can introduce!

So EITHER currently works like this...

>> either true ["yes"] ["no"]
== "yes"

>> either/only true ["yes"] ["no"]
== ["yes"]

>> either/only true "yes" "no" 
== "yes"

All good. However, my compromise suggestion is that the below should fail (instead of returning "yes")...

>> either true "yes" "no"  
== "yes"

So EITHER sans /ONLY always requires two BLOCK!s

I think this adds extra pragmatism (back) to EITHER. And we lose no expressiveness because we still have IF & ELSE....

>> if true "yes" else "no" 
== "yes"

What d'ya think?

hostilefork · July 20, 2017, 4:06pm

It's elegant to be able to choose between literals in common cases without a refinement. Think of:

spaced ["(CONSOLE" (either proto-skin/updated? {was} {wasn't}) "updated)"]

That expression gets by with two delimiters, and it seems to me it gives you more than you get from the four delimiters in:

spaced ["(CONSOLE" either proto-skin/updated? [{was}] [{wasn't}] "updated)"]

By contrast, note that either true yes no already wouldn't be legal under my proposed rules (nor would if true yes else no), because YES and NO are WORD!s, hence evaluative, and they evaluate to non-blocks. A non-block branch would have to be a plain literal, which would be an uncommon thing to just leave stray after an EITHER where you forgot your branch (whereas a word could be a function call sitting after it or, as we've seen with the stray assignment problem, something picked up accidentally as a branch).

Generically in DO, throwaway literals in midstream are uncommon, as are throwaway blocks (which is what provided the previous "pragmatism"--if we're on the same page about that). "Dialects" like SPACED are different, but the consumption of stray elements will have observable impact. I feel the benefit is worth the risk.

Also, by the rules I'm proposing, either proto-skin/updated? msg1: {was} msg2: {wasn't} would also not be legal unless you used the refinement-formerly-known-as-/OPT. This puts some extra safety on the construct for missed second-branches due to its commonality. But I'll point out that the safety is not there for really much anything else in the system, which is part of why the original permissive proposal was endorsed by Carl.

The Real Agenda is Embracing the Ability to Reject Boilerplate

It just feels like literals as conditional branches is something the average user should be using, to appreciate the freedom/elegance.

I'm interested to see how more runtime analysis through the evaluated/unevaluated bit can help. I also think excessive caution in this area would be remedied a lot by a step-by-step automatable debugger, which as I've mentioned, might not be too far out of reach...

draegtun · July 21, 2017, 4:48pm

Now that I know /ONLY is superfluous then that is definitely nicer. However I may go for IF / ELSE instead, especially if we did enforce BLOCK!s on EITHER

spaced ["(CONSOLE" if proto-skin/updated? {was} else {wasn't} "updated)"]

In fact I wouldn't even rule out using PICK here...

spaced ["(CONSOLE" pick [{was} {wasn't}] proto-skin/updated? "updated)"]

Anyway we digress.

Can I clarify your proposal, so the following would become illegal...

>> ten: [5 + 5]

>> if true ten
== 10

>> either true 1 + 1 4 + 4   
== 2

>> either false x: 1 y: 2    
== 2

The above all currently work in Ren/C. And they would continue to work if the refinement-formerly-known-as-/OPT is used?

BTW - x is set to 1 in last test! Bug?

hostilefork · July 21, 2017, 4:59pm

Good thinking, though remember that ELSE hadn't been solved as a fully viable either replacement until... about ten hours ago.

The others would become illegal, but the first one (if true ten) would stay legal. Passing evaluated blocks as branches is part of the name of the abstraction game. We like being able to say if condition body.

My complaint is just that if we get too abstract about it, like if condition value for ANY-VALUE!, people might start using that for arbitrary values that aren't blocks...and then suddenly get bit when it is a block and gets executed. The idea is to break them of the habit with normal IF by only letting them use literals for expedience. Then whenever they see an evaluation in a branch slot, they know it will make sure it's a block.

Correct. We assume people using the expert-mode refinement know what they're doing, and are aware that blocks execute and other things don't. That kind of awareness exists elsewhere; people who use APPEND have to know that blocks splice and other things don't. Though there, the risks are slightly lower, as it doesn't (immediately) involve code execution.

hostilefork · May 19, 2022, 3:33am