The Simple-yet-Powerful Magic of The Loop Result Protocol

Note: This post has been updated 17-Nov-2018 to reflect the current state of the rules.

Loops return null if-and-only-if they break

One very good reason this rule exists is to make it easy to build custom loop constructs out of several other loop constructs. You can tell "from the outside" whether one of your sub-loops was BREAK'd...this way the higher-level construct knows it shouldn't run any more of its component loop phases.

(If this rule did not exist, then implementing a loop out of several other loops would have to invasively hook and rebind BREAK in each loop body to its own, and handle that. Even if it were possible--which it probably should be--this would be complex and inefficient. So the simpler rule is better!)
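
To make the composition argument concrete, here is a minimal sketch written in Python rather than Ren-C, so it is a model of the idea and not the actual implementation. The names `Break`, `COMPLETED`, `for_each`, `for_both` and `stop_at_three` are all hypothetical, and Python's `None` stands in for NULL. The only point is that the outer construct can tell a sub-loop broke by looking at its result, without rebinding BREAK:

```python
class Break(Exception):
    """Raised by a loop body to model BREAK."""


COMPLETED = "completed"  # stand-in returned when there is no body result to pass on


def for_each(items, body):
    """Call body(item) for each item; return None (modeling NULL) only on a break."""
    result = COMPLETED
    for item in items:
        try:
            result = body(item)
        except Break:
            return None
        if result is None:
            result = COMPLETED  # keep None reserved for "the loop broke"
    return result


def for_both(first, second, body):
    """Hypothetical composite loop: one pass over `first`, then one over `second`."""
    result = for_each(first, body)
    if result is None:
        return None  # the first phase broke, so the second phase must not run
    return for_each(second, body)


def stop_at_three(x):
    if x == 3:
        raise Break()
    return x


print(for_both([1, 2], [4, 5], stop_at_three))  # 5
print(for_both([1, 3], [4, 5], stop_at_three))  # None
```

Because `for_both` itself returns `None` only when one of its phases broke, it obeys the same rule and can be nested inside yet another composite loop.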

That's the only "hard rule" of loop return results, however...

...many common loops return a BLANK! if the body never ran

>> loop 0 [<unreturned>]
== _

>> for-each x [] [<unreturned>]
== _

To distinguish this case (and the breaking case) from normal running, such loops also "voidify" a body result of blank or null:

>> for-each x [1 2 3] [null]
// #[void] 

>> loop 1 [_]
// #[void]

>> x: true while [x] [x: false] // evaluating to logic false still works
== #[false] // so you can get a "falsey" result even if the body runs to completion

Remember that voids may be cantankerous values, but they are nonetheless "ANY-VALUE!"s...hence they will trigger THEN and not trigger ELSE. But if you try to use void with conditional logic and test it for truth or falsehood, you'll get an error.
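
The three outcomes can be modeled in a few lines. This is a Python sketch of the rules as described in this post, not Ren-C code: `None` stands in for NULL, and `Marker`, `BLANK`, `VOID`, `loop`, `triggers_then`, `is_truthy` and `do_break` are hypothetical names made up for the illustration.

```python
class Break(Exception):
    """Raised by a loop body to model BREAK."""


class Marker:
    """A named stand-in value for this model."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


BLANK = Marker("_")
VOID = Marker("#[void]")


def loop(count, body):
    """Model of LOOP: None (NULL) on break, BLANK if the body never ran."""
    result = BLANK
    for _ in range(count):
        try:
            result = body()
        except Break:
            return None
        if result is None or result is BLANK:
            result = VOID  # "voidify" so NULL and BLANK stay unambiguous
    return result


def triggers_then(value):
    """THEN reacts to any value, which includes BLANK and VOID."""
    return value is not None


def is_truthy(value):
    """Conditional test in this model: void is an error; NULL, BLANK, False are falsey."""
    if value is VOID:
        raise TypeError("void cannot be tested for truth or falsehood")
    return value is not None and value is not BLANK and value is not False


def do_break():
    raise Break()


print(loop(0, lambda: "unreturned"))  # _
print(loop(1, lambda: BLANK))         # #[void]
print(loop(1, lambda: False))         # False
print(loop(1, do_break))              # None
```

In this model only a break yields `None`, only a body that never ran yields `BLANK`, and a completed body can still hand back `False`.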

Reacting to BREAKs and empty loops is easy!

Loop aggregators aren't the only place that benefits from being able to tell what happened with a loop from its result. Plain user code reaps the benefits as well.

Right off the bat, if your loop body always returns a truthy thing, you can leverage the result to make sure at least one body ran and there wasn't a break:

all [
    for-each x block [
    if some-test x [break]
    <truthy-result>
    ]
    // ^-- falsey to interrupt ALL if block is empty, or BREAKs
    ...
]

If you don't want to count not having the body run as an equivalent result to a BREAK in a situation like this, leverage the fact that BLANK! is a value to trigger THEN and truthyify that case:

all [
    for-each x block [
    if some-test x [break]
    <truthy-result>
    ] then [true]
    // ^-- only on BREAK is it now falsey to interrupt the ALL
    ...
]

If you're only concerned with whether a loop BREAKs, then ELSE is the ticket:

for-each x block [
    if some-test x [break]
    <truthy-result>
] else [
    // This code runs only if the loop breaks
    // ...so it still runs even if block is []
]

You can combine that with THEN to segregate code when the loop doesn't break:

for-each x block [
    if some-test x [break]
    <truthy-result>
] then [
    // This code runs only if the loop doesn't break
] else [
    // This code runs only if the loop breaks
]

You can use AND and OR to test the logic cases:

for-each x block [
    if some-test x [break]
    <truthy-result>
] or [
    // This code runs if the loop breaks -or- DIDN'T
    // run the body at least once
]

for-each x block [
    if some-test x [break]
    <truthy-result>
] and [
    // This code runs if the loop doesn't break and ran
    // the body at least once
]

Practical example?

Here's a very cool real world case from the console code:

pos: molded: mold/limit :v 2048
loop 20 [
    pos: next (find pos newline else [break])
] then [
    insert clear pos "..."
]

You have up to 2048 characters of data coming back from the mold, ok. Now you want just the first 20 lines of that. If truncation is necessary, put an ellipsis on the end.

loop 20 obviously will always try and run the body at least once. (So the loop will never return blank here.)

FIND will return NULL if it can't find the thing you asked for, so the ELSE runs when you can't get the position. If it makes it up to 20 without breaking, then the THEN clause runs.

So there you go. The first 20 lines of the first 2048 characters of a mold, truncating with "...". I think the THEN really follows the idea of completion; it makes sense (to me) that a BREAK would bypass a THEN (or an ALSO, which is similar) clause.
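
For readers coming from other languages, Python's `for ... else` has the same "runs only if the loop did not break" meaning as the THEN clause here. Below is a rough Python rendering of the console snippet, as a sketch under assumptions: `first_lines` is a hypothetical name, and plain string slicing stands in for MOLD/LIMIT, whose exact behavior is not modeled.

```python
def first_lines(text, max_lines=20, limit=2048):
    """Keep at most `max_lines` lines of the first `limit` characters of text."""
    molded = text[:limit]  # plain slicing stands in for MOLD/LIMIT here
    pos = 0
    for _ in range(max_lines):
        newline_at = molded.find("\n", pos)
        if newline_at == -1:
            break  # fewer than max_lines newlines: nothing to truncate
        pos = newline_at + 1  # like NEXT: step past the newline
    else:
        # Reached only when the loop did not break, like the THEN clause
        molded = molded[:pos] + "..."
    return molded


print(repr(first_lines("a\nb\nc\n", max_lines=2)))  # 'a\nb\n...'
print(repr(first_lines("a\nb", max_lines=2)))       # 'a\nb'
```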

I encourage people to get creative, to look at ways to use this to write clearer/shorter/better code.

I really like it. I'm curious to see what others have to say about it. I'm not the most clever ren-c coder, but I definitely see some of these new constructs enabling me to sidestep some of the bulkier expressions I often write.

I've updated the post to convey a new--and I believe less "meddling"--version of the rules.

The previous idea of formalizing that a loop which never ran would return NULL arose from historical Rebol--as well as a desire to avoid fabricating a value when there was none. So when it came time to think of how a loop would signal to the outside that it had been broken, NULL was "already taken". That's how the idea of blank when a BREAK happened came up.

But I thought to look at it from a fresh frame of mind, after the recent reflection on why having failed conditionals return BLANK! is a bad idea. BLANK! is a legitimate value, and doing mutations of it to push such an in-band value to become some other in-band value...just to be able to use BLANK! to signal something else...is bad.

NULL is different. It can't be stored in a block, it's neither true nor false, and historical Rebol wouldn't even let you use it to unset a variable without a special refinement to SET. It is supposed to be "edgy". So should a conditional branch evaluate to a null, or a loop body evaluate to a null, it's not such a terrible sin to convert it to a BLANK!. But it's much worse to convert another value, especially to convert a LOGIC! false into a (truthy) BAR!.

So this new spin on loop rules reassigns NULL to loops getting a BREAK. A loop that never runs gives a BLANK!, which is also a legal value for the loop body to return. But a loop user who wants to be complicit in distinguishing a loop that never ran from one that did may do so, simply by saying:

 while [...] [
     ...
     true
 ]

That gives null if it breaks, true if it runs at least once, and blank if it never runs. So you get the three states, albeit having to get a little more involved. (Code golfers will generally find whatever the body returned naturally was already truthy, but I think explicitly putting the TRUE is wiser for the common codebase).

I think this is on the whole a better plan--giving basically all the same benefits, but in a clearer way.

Sounds good. You've thought it through very carefully, now we just need to try it on for size and see if it's a good fit.

I must admit I felt a bit uneasy about barification, so I like the new turn.

I must admit I felt a bit uneasy about barification, so I like the new turn.

Me too...and hopefully you will also like the end of blankification (now a more limited form of result-mutating, known as voidification).

...which makes me feel less uneasy, also--especially in being able to cut out all those sprawling refinements and *-specializations...

While the premise behind the "loop result protocol" has remained mostly consistent, there has been some shuffling to "perfect" it. I've updated the first post of this thread to convey the subtleties.

TL;DR

  1. BREAK causes NULL for all loops, even MAP-EACH (which previously would keep the in-progress results, e.g. map-each x [1 2 3 4] [if x = 3 [break] x] would give you [1 2]). That feature was simply not worth making it impossible to tell from the outside if the loop broke.

  2. Many loops return BLANK! if they never run the body. MAP-EACH is an exception to this (it returns an empty block if the input block is empty).

  3. The loops that follow (2) distinguish the case where their body-running-loop-result would be BLANK! or NULL by returning void ("voidification"). This means BLANK! is returned if-and-only-if the loop body never runs, and NULL is returned if-and-only-if the loop breaks. Both are falsey, but can still be discerned from each other, e.g. ELSE reacts only to null, while OR reacts to null and blank (and false).

When you compare something like "barification of false, null, or blank" to "voidification of null and blank", I think there's significantly less harm. Also, the void has very high odds of stepping in to warn you of misunderstandings--so it "teaches the protocol".

It's two years later and with the dawn of NULL isotopes, voidification is dying.

So it's time to review the loop result protocol and where it stands at this point.

With isotopes, a loop whose body returns NULL would return the "heavy" isotope NULL-2.

>> loop 2 [print "Hi", null]
Hi
Hi
; null-2

So that loop would trigger a THEN, but not trigger an ELSE.

>> loop 2 [print "Hi", null] then [print "THEN!"] else [print "ELSE!"]
Hi
Hi
THEN!
== ~void~

If you BREAK it, you'd get the "light" isotope, and trigger ELSE:

>> loop 2 [print "Hi", break] then [print "THEN!"] else [print "ELSE!"]
Hi
ELSE!
== ~void~

Either way, it's a NULL.

>> null? loop 2 [print "Hi", null]
Hi
Hi
== #[true]

>> null? loop 2 [print "Hi", break]
Hi
== #[true]

What About Knowing If The Loop Ran At Least Once or Not?

The theory was that as long as we were taking away NULL, we might try using some other value to indicate the body never ran. BLANK! was used, and then blank was voidified if the body returned it.

I haven't used this in practice...partially because the voiding made me uneasy, and called the whole feature into question. One would think that if it were really useful, I would have likely overcome my uneasiness to try applying it.

But let's get back to top-level motivation. Imagine you have a pipeline of loops, maybe inside an ALL or ANY:

 any [
     while [b < 1020] [...]
     while [j > 304] [...]
     ...
 ]

The question was what loops that don't run their body even once should return, when NULL is reserved specifically for breaking.

They could prime themselves with a named void of their name, so you'd at least know where your non-useful-value came from. :-/

>> while [false] [<not run>]
== ~while~

BUT if they return VOID!, then they become ornery to work with in chains like this. The theory was that priming them with another value would make it easier...and BLANK! was the offered choice. So if you had a bunch of loops with conditions and they were looking to calculate values you could just skip the ones whose conditions blocked them from even trying.

We now have the alternative non-ELSE-triggering option of NULL-2 instead of BLANK! to throw in the mix. Since it's not a value you might want to put in a block, it's probably a better choice than BLANK! is.

Either way, with voidification being dead, we need to get rid of the voidification of loop body values. I don't think reserving a value for "if and only if the loop body never ran" is justifiable. If you want to make that convention arise yourself by never returning the value loops are primed with, that's up to you.

So no voidification, let's try priming with NULL-2 and see how it pans out.

>> while [false] [<not run>]
; null-2

>> while [false] [<not run>] then [print "Counts as 'didn't break'"]
Counts as 'didn't break'
== ~void~
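
Here is a small Python model of the priming idea, to show how the pieces fit together. It is a sketch of the behavior described in this post, not Ren-C itself: the `Null` class, `NULL`, `NULL_2`, `is_null`, `while_loop`, `then`, `otherwise` and `do_break` are hypothetical names, and the "light" and "heavy" isotopes are modeled as two distinct objects.

```python
class Break(Exception):
    """Raised by a loop body to model BREAK."""


class Null:
    """Model of a NULL isotope: 'light' (plain NULL) or 'heavy' (NULL-2)."""

    def __init__(self, heavy):
        self.heavy = heavy

    def __repr__(self):
        return "null-2" if self.heavy else "null"


NULL = Null(heavy=False)
NULL_2 = Null(heavy=True)


def is_null(value):
    """Both isotopes answer true, as with NULL? in the transcripts above."""
    return isinstance(value, Null)


def while_loop(condition, body):
    """Model of WHILE: light NULL only on break, primed with NULL-2."""
    result = NULL_2  # what you get if the body never runs
    while condition():
        try:
            result = body()
        except Break:
            return NULL
        if result is NULL:
            result = NULL_2  # a body result of NULL comes out "heavy"
    return result


def then(value, branch):
    """Run branch unless value is the light NULL, which passes through."""
    return value if value is NULL else branch()


def otherwise(value, branch):
    """Model of ELSE: run branch only for the light NULL."""
    return branch() if value is NULL else value


def do_break():
    raise Break()


never_ran = while_loop(lambda: False, lambda: "<not run>")
print(never_ran, is_null(never_ran))            # null-2 True
print(then(never_ran, lambda: "didn't break"))  # didn't break

broke = while_loop(lambda: True, do_break)
print(broke)                                                      # null
print(otherwise(then(broke, lambda: "THEN!"), lambda: "ELSE!"))  # ELSE!
```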

My instincts have been that it would be a mistake to conflate "condition didn't run the body" with "the body had a BREAK occur", because a condition being false is a normal exit condition of the loop.

If NULL-2 doesn't seem to have any particular usefulness then maybe just switch it to prime with void.


Precedent note: Rebol2 and R3-Alpha chose NONE here:

rebol2/r3-alpha>> while [false] []
== none

Red is using UNSET!:

>> unset? while [false] []
== true

The advantages around UNSET! (VOID!) would be safety-motivated, where you are assigning a variable from the result of the loop and you think that erroring is better than continuing with a value that was never set by running a body.

I don't suppose anyone has a strong opinion...

This topic should be informed somewhat now by "What should DO [] do".

Red is likely changing its behavior here. This GitHub ticket has an inventory of Red's compiled and interpreted behaviors.

Boris says:

I haven't found any explanation of why Rebol chose that loop 0 [1] (and other non-evaluated loops) returns none rather than unset. I have however found numerous examples of relying on loops returning last result of their body evaluation. And that makes none more helpful than unset as we can chain loops into ifs like unless result: loop n [stuff] [handle empty case].

Gregg says:

Agreed on returning none consistently where the body is not processed.

Ren-C is currently taking the tactic of returning NULL-2:

>> while [false] []
; null-2

This means that a THEN clause will consider the loop to have "run", as distinct from the case where a BREAK occurred. The logic behind that is that all loops will--at some point--reach a condition which causes the loop to terminate, so if having a termination condition be met was enough to trigger an ELSE then one could argue the ELSE should always run. That's not useful.