The Simple-yet-Powerful Magic of The Loop Result Protocol


#1

Note: This post has been updated 17-Nov-2018 to reflect the current state of the rules.

Loops return null if-and-only-if they break

One very good reason this rule exists is to make it easy to build custom loop constructs out of several other loop constructs. You can tell “from the outside” whether one of your sub-loops was BREAK’d…so the higher-level construct knows it shouldn’t run any more of its component loop phases.

(If this rule did not exist, then implementing a loop out of several other loops would have to invasively hook and rebind BREAK in each loop body to its own, and handle that. Even if it were possible–which it probably should be–this would be complex and inefficient. So the simpler rule is better!)
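To make the composition argument concrete, here is a rough model in Python rather than Ren-C. Everything in it is an assumption made for illustration: `None` stands in for NULL, and `Break`, `BLANK`, `for_each` and `for_both` are hypothetical names. It also assumes loop bodies return ordinary values (never `None` or `BLANK`).

```python
class Break(Exception):
    """Raised by a loop body to model BREAK (hypothetical stand-in)."""


BLANK = object()  # stand-in for BLANK!, returned when the body never runs


def for_each(items, body):
    """Model of a loop: None (NULL) if-and-only-if the body broke."""
    result = BLANK
    for item in items:
        try:
            result = body(item)
        except Break:
            return None
    return result


def for_both(first, second, body):
    """Composite loop: run body over `first`, then over `second`.

    It never hooks or rebinds BREAK. It only inspects sub-loop results.
    """
    result = for_each(first, body)
    if result is None:  # first phase broke, so skip the second phase
        return None
    second_result = for_each(second, body)
    if second_result is None:
        return None
    return result if second_result is BLANK else second_result


def stop_at_three(x):
    if x == 3:
        raise Break()
    return x


print(for_both([1, 2], [4, 5], stop_at_three))  # 5
print(for_both([1, 3], [4, 5], stop_at_three))  # None: second phase never ran
```

The point of the sketch is that `for_both` needs no knowledge of how the body signals a break; the sub-loop result carries that information.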

That’s the only “hard rule” of loop return results, however…

…many common loops return a BLANK! if the body never ran

>> loop 0 [<unreturned>]
== _

>> for-each x [] [<unreturned>]
== _

To distinguish this case (and the breaking case) from normal running, such loops also “voidify” a body result of blank or null:

>> for-each x [1 2 3] [null]
// #[void] 

>> loop 1 [_]
// #[void]

>> x: true while [x] [x: false] // evaluating to logic false still works
== #[false] // so you can get a "falsey" result even if the body runs to completion

Remember that voids may be cantankerous values, but they are nonetheless "ANY-VALUE!"s…hence they will trigger THEN and not trigger ELSE. But if you try to use void with conditional logic and test it for truth or falsehood, you’ll get an error.
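The interaction between voidification, THEN/ELSE, and truth testing can be sketched as a small Python model. This is not Ren-C: `None` plays the role of NULL, and `Sentinel`, `BLANK`, `VOID`, `voidify`, `then_else` and `truthy` are hypothetical names chosen only for this illustration.

```python
class Sentinel:
    """Named marker object used for the BLANK! and void stand-ins."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


BLANK = Sentinel("_")
VOID = Sentinel("#[void]")


def voidify(body_result):
    """Mask a NULL or BLANK! body result so it can't be mistaken for
    'the loop broke' (NULL) or 'the body never ran' (BLANK!)."""
    if body_result is None or body_result is BLANK:
        return VOID
    return body_result


def then_else(value, then_branch, else_branch):
    """THEN fires for any value (void, BLANK!, and false included);
    ELSE fires only for NULL."""
    return else_branch() if value is None else then_branch()


def truthy(value):
    """Conditional test: void raises, while NULL, BLANK! and false are falsey."""
    if value is VOID:
        raise TypeError("void cannot be tested for truth or falsehood")
    return value is not None and value is not BLANK and value is not False


print(then_else(voidify(None), lambda: "then", lambda: "else"))  # then
print(then_else(None, lambda: "then", lambda: "else"))           # else
```

Note that `voidify(False)` leaves the value alone, mirroring the WHILE example above where a logic false result survives.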

Reacting to BREAKs and empty loops is easy!

Loop aggregators aren’t the only place that benefits from being able to tell what happened with a loop from its result. Plain user code reaps the benefits as well.

Right off the bat, if your loop body always returns a truthy thing, you can leverage the result to make sure at least one body ran and there wasn’t a break:

all [
    for-each x block [
         if some-test x [break]
         <truthy-result>
    ]
    // ^-- falsey to interrupt ALL if block is empty, or BREAKs
    ...
]

If you don’t want to count not having the body run as an equivalent result to a BREAK in a situation like this, leverage the fact that BLANK! is a value, so it triggers THEN, and use that to make the never-ran case truthy:

all [
    for-each x block [
         if some-test x [break]
         <truthy-result>
    ] then [true]
    // ^-- only on BREAK is it now falsey to interrupt the ALL
    ...
]

If you’re only concerned with whether a loop BREAKs, then ELSE is the ticket:

for-each x block [
   if some-test x [break]
   <truthy-result>
] else [
    // This code runs only if the loop breaks
    // ...so it still runs even if block is []
]

You can combine that with THEN to segregate code when the loop doesn’t break:

for-each x block [
   if some-test x [break]
   <truthy-result>
] then [
    // This code runs only if the loop doesn't break
] else [
    // This code runs only if the loop breaks
]

You can use AND and OR to test the logic cases:

for-each x block [
    if some-test x [break]
    <truthy-result>
] or [
    // This code runs if the loop breaks -or- DIDN'T
    // run the body at least once
]

for-each x block [
    if some-test x [break]
    <truthy-result>
] and [
    // This code runs if the loop doesn't break and ran
    // the body at least once
]

Practical example?

Here’s a very cool real world case from the console code:

pos: molded: mold/limit :v 2048
loop 20 [
    pos: next (find pos newline else [break])
] then [
    insert clear pos "..."
]

You have up to 2048 characters of data coming back from the mold, ok. Now you want just the first 20 lines of that. If truncation is necessary, put an ellipsis on the end.

LOOP 20 will obviously always try to run the body at least once. (So the loop will never return blank here.)

FIND will return NULL if it can’t find the thing you asked for, so the ELSE runs when you can’t get the position. If it makes it up to 20 without breaking, then the THEN clause runs.

So there you go. The first 20 lines of the first 2048 characters of a mold, truncating with “…”. I think the THEN really follows the idea of completion; it makes sense (to me) that a BREAK would bypass a THEN (or an ALSO, which is similar) clause.
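For comparison, the same control flow can be sketched in Python, whose `for`/`else` runs the `else` block only when the loop finishes without a `break`. That is the role THEN plays in the console snippet. The function name `truncate_lines` and its parameters are hypothetical, and the 2048-character mold limit is left out of this sketch.

```python
def truncate_lines(text, max_lines=20, marker="..."):
    """Keep at most `max_lines` lines, appending `marker` if truncated."""
    pos = 0
    for _ in range(max_lines):
        newline = text.find("\n", pos)
        if newline == -1:
            break  # ran out of newlines early: leave the text alone
        pos = newline + 1  # step just past the newline, like NEXT
    else:
        # Reached only when the loop did NOT break, i.e. all
        # `max_lines` newlines were found, so cut and mark the text.
        text = text[:pos] + marker
    return text


print(repr(truncate_lines("a\nb\nc\n", max_lines=2)))  # 'a\nb\n...'
print(repr(truncate_lines("a\nb", max_lines=2)))       # 'a\nb'
```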

I encourage people to get creative, to look at ways to use this to write clearer/shorter/better code.


#2

I really like it. I’m curious to see what others have to say about it. I’m not the most clever ren-c coder, but I definitely see some of these new constructs enabling me to sidestep some of the bulkier expressions I often write.


#4

I’ve updated the post to convey a new–and I believe less “meddling”–version of the rules.

The previous idea of formalizing that a loop which never ran would return NULL arose from historical Rebol–as well as a desire to avoid fabricating a value when there was none. So when it came time to think of how a loop would signal to the outside that it had been broken, NULL was “already taken”. That’s how the idea of blank when a BREAK happened came up.

But I thought to look at it from a fresh frame of mind, after the recent reflection on why having failed conditionals return BLANK! is a bad idea. BLANK! is a legitimate value, and doing mutations of it to push such an in-band value to become some other in-band value…just to be able to use BLANK! to signal something else…is bad.

NULL is different. It can’t be stored in a block, it’s neither true nor false, and historical Rebol wouldn’t even let you use it to unset a variable without a special refinement to SET. It is supposed to be “edgy”. So should a conditional branch evaluate to a null, or a loop body evaluate to a null, it’s not such a terrible sin to convert it to a BLANK!. But it’s much worse to convert another value, especially to convert a LOGIC! false into a (truthy) BAR!.

So this new spin on loop rules reassigns NULL to loops getting a BREAK. A loop that never runs gives a BLANK!, which is also a legal value for the loop body itself to return. But a loop user who wants to be complicit in distinguishing a loop that never ran from one that did may do so, simply by saying:

 while [...] [
     ...
     true
 ]

That gives null if it breaks, true if it runs at least once, and blank if it never runs. So you get the three states, albeit having to get a little more involved. (Code golfers will generally find whatever the body returned naturally was already truthy, but I think explicitly putting the TRUE is wiser for the common codebase).
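Those three states can be modeled in Python as a sanity check. This is a sketch under stated assumptions, not Ren-C: `None` stands in for null, and `Break`, `BLANK`, `while_loop`, `describe` and `run` are hypothetical names.

```python
class Break(Exception):
    """Raised by a loop body to model BREAK (hypothetical stand-in)."""


BLANK = object()  # stand-in for BLANK!, returned when the body never runs


def while_loop(condition, body):
    """None if the body broke, BLANK if it never ran, else its last result."""
    result = BLANK
    while condition():
        try:
            result = body()
        except Break:
            return None
    return result


def describe(result):
    if result is None:
        return "broke"
    if result is BLANK:
        return "never ran"
    return "ran to completion"


def run(n, break_at=None):
    """Loop up to n times; optionally BREAK on iteration `break_at`."""
    state = {"i": 0}

    def condition():
        return state["i"] < n

    def body():
        state["i"] += 1
        if state["i"] == break_at:
            raise Break()
        return True  # the explicit TRUE at the end of the body

    return describe(while_loop(condition, body))


print(run(3))              # ran to completion
print(run(3, break_at=2))  # broke
print(run(0))              # never ran
```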

I think this is on the whole a better plan–giving basically all the same benefits, but in a clearer way.


#5

Sounds good. You’ve thought it through very carefully, now we just need to try it on for size and see if it’s a good fit.


#6

I must admit I felt a bit uneasy about barification, so I like the new turn.


#7

“I must admit I felt a bit uneasy about barification, so I like the new turn.”

Me too…and hopefully you will also like the end of blankification (now a more limited form of result-mutating, known as voidification).

…which makes me feel less uneasy, also–especially in being able to cut out all those sprawling refinements and *-specializations…


#8

While the premise behind the “loop result protocol” has remained mostly consistent, there has been some shuffle to “perfect” it. I’ve updated the first post for this thread to convey the subtleties.

TL;DR

  1. BREAK causes NULL for all loops, even MAP-EACH (which previously would keep the in-progress results, e.g. map-each x [1 2 3 4] [if x = 3 [break] x] would give you [1 2]). That feature was simply not worth making it impossible to tell from the outside if the loop broke.

  2. Many loops return BLANK! if they never run the body. MAP-EACH is an exception to this (it returns an empty block if the input block is empty).

  3. The loops that follow (2) distinguish the case where their body-running-loop-result would be BLANK! or NULL by returning void (“voidification”). This means BLANK! is returned if-and-only-if the loop body never runs, and NULL is returned if-and-only-if the loop breaks. Both are falsey, but can still be discerned from each other… e.g. ELSE is triggered only by null, while OR is triggered by null, blank, and false.

When you compare something like “barification of false, null, or blank” to “voidification of null and blank”, I think there’s significantly less harm. Also, the void has very high odds of stepping in to warn you of misunderstandings–so it “teaches the protocol”.