The Simple-yet-Powerful Magic of The Loop Result Protocol

hostilefork · May 8, 2018, 6:56pm

Loops return null if-and-only-if they break

One very good reason this rule exists is to make it easy to build custom loop constructs out of several other loop constructs. You can tell "from the outside" whether one of your sub-loops was BREAK'd...this way the higher-level construct is aware it shouldn't run any more of its component loop phases.

(If this rule did not exist, then implementing a loop out of several other loops would have to invasively hook and rebind BREAK in each loop body to its own, and handle that. Even if it were possible--which it probably should be--this would be complex and inefficient. So the simpler rule is better!)

To distinguish this case from normal loop results, a NULL loop body evaluation will be turned into a "boxed" NULL, i.e. a null isotope in a parameter pack:

>> for-each x [1 2 3] [null]
; first in pack of length 1
== ~null~  ; anti

>> meta for-each x [1 2 3] [null]
== ~[~null~]~  ; anti

Parameter packs containing NULL cannot be stored in variables, and will "decay" to a normal NULL when assigned to a variable.

...many common loops return void if the body never ran

>> repeat 0 [<unreturned>]
== ~void~  ; anti

>> for-each x [] [<unreturned>]
== ~void~  ; anti

This is also a unique result...you get void in a pack if the loop runs a body that evaluates to void:

>> repeat 1 [comment "hi"]
; first in pack of length 1
== ~void~  ; anti

Note that some loops do not fit this pattern...e.g. an empty MAP-EACH gives an empty block:

>> map-each x [] [print "never runs"]
== []
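The outcomes described above (null only on BREAK, void when the body never ran, and a "boxed" null or void when the body itself produced one) can be modeled roughly in Python. This is only an analogy, not how Ren-C implements it: `VOID`, `Boxed`, `Break`, `for_each`, `for_both` and `stop_at_three` are hypothetical stand-ins, and Python's `None` plays the role of NULL. The point is that `for_both` can be composed from two `for_each` calls just by inspecting their results.

```python
class _Sentinel:
    """Named marker object (hypothetical helper for this sketch)."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


VOID = _Sentinel("VOID")  # stand-in for "the body never ran"


class Break(Exception):
    """Raised by a loop body to model BREAK."""


class Boxed:
    """Stand-in for a null or void wrapped in a pack."""
    def __init__(self, value):
        self.value = value


def for_each(items, body):
    """Return None only on Break, VOID if the body never ran."""
    result = VOID
    for item in items:
        try:
            value = body(item)
        except Break:
            return None  # None models NULL, reserved for "the loop broke"
        # A body result of None/VOID is boxed so it can't be mistaken
        # for "broke" or "never ran".
        result = Boxed(value) if value is None or value is VOID else value
    return result


def for_both(first, second, body):
    """Compose two loops without hooking Break: just test the results."""
    result = for_each(first, body)
    if result is None:
        return None  # first phase broke, so skip the second phase
    second_result = for_each(second, body)
    if second_result is None:
        return None
    # If the second phase never ran, fall back to the first phase's result.
    return result if second_result is VOID else second_result


seen = []


def stop_at_three(x):
    """Example body: breaks on 3, otherwise records and returns x."""
    if x == 3:
        raise Break
    seen.append(x)
    return x


broke = for_both([1, 2, 3, 4], [5, 6], stop_at_three)
```

Here `broke` is `None` and the second list is never visited, because the outer construct saw the first phase's null result and stopped.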

Reacting to BREAKs is easy!

Loop aggregators aren't the only place that benefits from being able to tell what happened with a loop from its result. Plain user code reaps the benefits as well.

Right off the bat, if your loop body always returns a truthy thing, you can leverage the result to make sure at least one body ran and there wasn't a break:

all [
    for-each x block [
        if some-test x [break]
        <truthy-result>
    ]
    ; ^-- falsey to interrupt ALL if block is empty, or BREAKs
    ...
]

If you're only concerned with whether a loop BREAKs, then ELSE is the ticket:

for-each x block [
    if some-test x [break]
    <truthy-result>
] else [
    ; This code runs only if the loop breaks
    ; ...so it still runs even if block is []
]

You can combine that with THEN to segregate code when the loop doesn't break:

for-each x block [
    if some-test x [break]
    <truthy-result>
] then [
    ; This code runs only if the loop doesn't break
] else [
    ; This code runs only if the loop breaks
]

Practical example?

Here's a very cool real world case from the console code:

pos: molded: mold/limit :v 2048
repeat 20 [
    pos: next any [find pos newline, break]
] then [
    insert clear pos "..."
]

You have up to 2048 characters of data coming back from the mold, ok. Now you want just the first 20 lines of that. If truncation is necessary, put an ellipsis on the end.

repeat 20 obviously will always try to run the body at least once. (So the loop will never return pure void here; that would only happen if you said repeat 0.)

FIND will return NULL if it can't find the thing you asked for, so the ANY will run the BREAK when you can't get the position. If it makes it up to 20 without breaking, the THEN clause runs.

So there you go. The first 20 lines of the first 2048 characters of a mold, truncating with "...". I think the THEN really follows the idea of completion; it makes sense (to me) that a BREAK would bypass a THEN (or an ALSO, which is similar) clause.
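For readers more familiar with Python, the same "ran to completion vs. broke out" distinction exists there as the `else` clause on a `for` loop, which is skipped when the loop hits `break`. The following is a rough port of the console snippet under that analogy; `truncate_lines` and its parameter names are hypothetical, and plain string slicing stands in for MOLD/LIMIT and series positions.

```python
def truncate_lines(text, max_chars=2048, max_lines=20):
    """Keep the first max_lines lines of the first max_chars characters."""
    molded = text[:max_chars]  # plays the role of mold/limit
    pos = 0
    for _ in range(max_lines):
        idx = molded.find("\n", pos)
        if idx == -1:
            break  # no more newlines: fewer than max_lines lines, keep all
        pos = idx + 1  # like NEXT: step past the newline that was found
    else:
        # Reached only when the loop did not break, like the THEN clause:
        # drop everything after the last counted line and add an ellipsis.
        molded = molded[:pos] + "..."
    return molded


sample = "".join(f"line {i}\n" for i in range(30))
shortened = truncate_lines(sample)
```

The analogy only holds for a positive line count: Python's `else` clause also runs when the loop body never executes at all.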

I encourage people to get creative, to look at ways to use this to write clearer/shorter/better code.

BlackATTR · May 8, 2018, 8:42pm

I really like it. I'm curious to see what others have to say about it. I'm not the most clever ren-c coder, but I definitely see some of these new constructs enabling me to sidestep some of the bulkier expressions I often write.

IngoHohmann · June 11, 2018, 2:17pm

I must admit I felt a bit uneasy about "barification", so I like the new turn [of putting null and void in a pack, and having THEN consider a packed null or void a branch trigger].

hostilefork · January 25, 2021, 9:09am

What Do Rebol2 and Red Do Here?

Rebol2 used NONE, but Red is using UNSET! when a loop doesn't run its body:

>> unset? while [false] []
== true

This GitHub ticket has an inventory of Red's compiled and interpreted behaviors.

Boris says:

I haven't found any explanation of why Rebol chose that loop 0 [1] (and other non-evaluated loops) returns none rather than unset. I have however found numerous examples of relying on loops returning last result of their body evaluation. And that makes none more helpful than unset as we can chain loops into ifs like unless result: loop n [stuff] [handle empty case].

Gregg says:

Agreed on returning none consistently where the body is not processed.

Note: Their GitHub issue is from 2016, and still open in 2024...

hostilefork · September 10, 2024, 2:49pm

I was pondering the edge case of getting two voids into the FOR-PARALLEL example, and what it would return given the particular choice of how it was implemented.

That meant it would return whatever WHILE returned when it never ran its body, which was VOID. Currently that is distinct from what WHILE returns when its condition is void...which is NULL, according to VOID-in-NULL-out.

But is that right? Let's revisit the question of what a void input should do for loops:

>> condition: null

>> for-each item (if condition [block]) [...]
== ???

Whether we return NULL or VOID here affects things like ANY or ALL... NULL will count as a branch inhibitor and stop either construct. On the other hand, ELSE considers both pure NULL and pure VOID to be cause to run.

You could make the case this is not a void-in-null-out situation. That was invented for functions that were looking for a particular answer and couldn't give it, so instead give a "soft failure". e.g. LENGTH OF VOID is NULL, because it couldn't get the length. But loops aren't asking any particular question that fails. They're just a conduit to someone else who might be asking a question. It's kind of like how eval [void] doesn't give you back NULL just because you evaluated and got void.

Loop result protocol says we reserve pure NULL for saying the loop encountered a BREAK. Why would we want to conflate opting out of the loop with breaking, here? One answer is that it could be useful in cases where you think of "the question" of the loop as actually having encountered a soft failure.

What About "Weird" Loops?

What MAP-EACH should do is a curious question. If a MAP-EACH never runs its body, it gives back an empty block... not void as other constructs do:

>> for-each x [] [fail "this code should never run"]
== ~void~  ; anti

>> map-each x [] [fail "this code should never run"]
== []

Given current thoughts on the semantics of BLANK!, I'm thinking a blank input should give back an empty block as well... because an empty string or binary or array all give back an empty block. So it doesn't break any rules for blank to do the same:

>> map-each x _ [fail "this code should never run"]
== []

In order to reason about this, we have to think about the applications.

If someone is writing code like:

all [
    ...
    map-each x (whatever) [...]
    ...
]

We can imagine them having the wish to telegraph "opt out" vs. "null" to the ALL. We know that BREAK in the body of the loop telegraphs null--and that is by design. But should you be able to telegraph NULL from the (whatever) slot, or only telegraph VOID?

Let's make it a little more concrete by saying they're using an IF in the slot:

all [
    ...
    map-each x (if condition [whatever]) [...]
    ...
]

Transformation-wise, they could have written that as:

all [
    ...
    if condition [
        map-each x (whatever) [...]
    ]
    ...
]

So getting void from a conditional expression is easy in such a case. Getting NULL is trickier.

The other likely instance is a calculated expression that may be null.

all [
    ...
    map-each x (maybe some long expression) [...]
    ...
]

You'd have to create a variable in order to pull that out, and not repeat the expression. Can we say anything about how likely it is that someone wants to telegraph NULL vs. VOID out of that?

And furthermore: because MAP-EACH is intrinsically a value-synthesizing construct, is this different from the likely needs of FOR-EACH or similar? Does "I made a block, or if I couldn't, NULL" make more sense than "I made a block, or if I couldn't, VOID"?

I'll add that MAYBE, which turns NULL into void, offers a good way to turn "I couldn't do it" into "I don't care":

all [
    ...
    maybe map-each x (maybe some long expression) [...]
    ...
]

Perhaps VOID-in-NULL-out should be heeded for loops, too

It conflates void input with BREAK. But since BLANK! acts the same way as empty series, you have that other choice if you want to generate a VOID out of something like a FOR-EACH.

I think I'm going to stick with NULL as the answer for void input, for now. I'll keep an eye out for how well it is working.