Full-Band Return Values from Generators/Yielders

Definitional errors came into PARSE as the answer to how it could return any value: simply make a failed parse a definitional error. Not only does that mean NULL can be a legitimate synthesized product that isn't conflated with failure, it also means a parse failure can carry more information in the error about what went wrong (farthest parse point reached, etc.)...

Looking at another problem where a sentinel NULL had been used... there are Generators and Yielders:

>> g: generator [print "A" yield 1, print "B" yield 2, print "C"]

>> g
A
== 1

>> g
B
== 2

>> g
C
== ~null~  ; anti

>> g
== ~null~  ; anti

>> g
== ~null~  ; anti

But Generators and Yielders predate antiforms in general... they came at a time when there were just two forms of NULL: "heavy" and "light". So the thought was that if you wanted to actually yield null, you would yield the heavy form.

>> g: generator [print "Start", yield null, print "Done"]

>> g
Start
== ~[~null~]~  ; anti

>> g
Done
== ~null~  ; anti

But this isn't nearly as good as using a specific raised error. If NULL is truly out of band, then you could use try g to get null when there are no more values.

If you try to YIELD or RETURN a raised error, then it would become an abrupt failure. So you'd be assured that the only time you'd ever get a definitional error back from a generator or yielder would be if it had been exhausted.
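
To make the idea concrete outside of Ren-C, here is a rough Python analogue. It is only a sketch: `generator`, `try_`, `Exhausted` and `YieldedError` are names invented for this illustration, and Python exceptions are just a stand-in for raised errors. The point it shows is that `None` can be an ordinary yielded value because exhaustion is reported separately, and that the caller opts in to a null-like result.

```python
class Exhausted(Exception):
    """Hypothetical stand-in for the 'generator exhausted' definitional error."""


class YieldedError(Exception):
    """Hypothetical stand-in for the abrupt failure when an error is yielded."""


def generator(body):
    """Wrap a Python generator function so each call produces the next value."""
    it = body()

    def g():
        try:
            value = next(it)
        except StopIteration:
            # Exhaustion is reported out of band, never as a None result.
            raise Exhausted("Generator/Yielder exhausted") from None
        if isinstance(value, Exception):
            # An error is not allowed as a yielded product; fail abruptly.
            raise YieldedError("cannot yield an error")
        return value

    return g


def try_(g):
    """Rough analogue of `try g`: turn exhaustion (and only that) into None."""
    try:
        return g()
    except Exhausted:
        return None


def body():
    yield 1
    yield None  # None is an ordinary product here, not an end marker
    yield 2


g = generator(body)
results = [g(), g(), g()]  # the middle None came from the generator itself
after_end = try_(g)        # None here means "no more values"
```

A caller that never yields `None` can use `try_(g)` in a loop and stop on `None`; a caller that does yield `None` has to handle `Exhausted` directly, which mirrors the trade-off described above.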

PACK returns for things like KEY and VALUE in objects?

As the enumeration routines are retooled to be based on generator functions, it struck me that it would be very cool if generators returned packs, which could be decayed:

>> obj: make object! [a: 10 b: 20 c: 30]

>> g: each obj

>> g
; first in pack of 2 items
== a

>> meta g
== ~['b '20]~

>> [key {val}]: g
== 30

>> key
== c

>> g
** Error: Generator/Yielder exhausted

>> try g
== ~null~  ; anti

That's really cool, and seems like it would go along with decaying:

for 'key each obj [...]

for [key val] each obj [...]

This looks pretty nice. However, a block in a FOR-EACH scenario like this hasn't meant "unpack"; it has meant "keep getting items and assign one to each variable".

It seems like it would be unfortunate if you had to write the object unpacking as:

for [~[key val]~] each obj [...]

But that does seem sort of like what you'd have to do in order to make it generalized. It doesn't look that bad, and it does have a theory behind it...

>> obj: make object! [a: 10 b: 20 c: 30]

>> for [~[key val]~ :otherkey] each obj [  ; leading colon for optionality
       print ["key:" key]
       print ["val:" val]
       if otherkey [
           print ["otherkey:" otherkey]
       ]
   ]

key: a
val: 10
otherkey: b
key: c
val: 30

The ~[key val]~ notation assumes that plain blocks might be taken for some other intent in the FOR. If they were not, you could write:

for [[key val] :otherkey] each obj [...]

But I had been thinking that plain BLOCK! would likely be good for type checking (@hiiamboris has some experiments in this vein):

for [x [integer!] y [tag!]] each [1 <a> 2 <b>] [....]
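
For comparison, the mixed pattern `for [~[key val]~ :otherkey] each obj [...]` from earlier can be approximated in Python. This is a sketch under my own assumptions: `grouped` is a helper made up for the illustration, tuples play the role of packs, and taking `second[0]` stands in for decaying a pack to its first item.

```python
from itertools import islice


def grouped(iterable, size):
    """Yield tuples of `size` consecutive items, padding a short tail with None."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        chunk += [None] * (size - len(chunk))
        yield tuple(chunk)


obj = {"a": 10, "b": 20, "c": 30}

lines = []
for first, second in grouped(obj.items(), 2):  # two generator results per pass
    key, val = first                           # unpack the whole pack
    lines.append(f"key: {key}")
    lines.append(f"val: {val}")
    if second is not None:                     # the second slot is optional
        otherkey = second[0]                   # "decay": keep only the first item
        lines.append(f"otherkey: {otherkey}")

print("\n".join(lines))
```

The two behaviors are visibly separate here: `grouped` decides how many results each pass consumes, while the tuple unpacking decides how much of each result is kept.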

Big Design Space, but Full-Band Returns Opens It Up

I'd let the Generator and Yielder experiments atrophy while working on other things. But I stopped today to try and get the basics working again, and it's interesting to see how isotopes bring a lot to the table...as they seem to have done with nearly everything.


I Described The Design of YIELD To Claude:

I have a special out-of-band state that I use to let you say you are done yielding, and this state is a "raised error". It can't be stored in variables like plain errors can, but it can be returned by functions.

So you can write yield done where done is a function that returns such a raised error. And yet, a generator can still return the non-raised state.

For performance, I've also added a variant called yield:final which lets you pass a value, but say that it's the last value you are giving back. So yield:final 1 is equivalent to (yield 1, yield done).

The concept was not to overrule the return word in a generator, so you can construct a generator inside a function and in the generator's body the return would still refer to the function's return, allowing you to execute that return inside the generator's body (which is considered to complete the body of the generator, so it will not yield more values).

But if you want the word return to mean return from the generator, you can define that by saying /return: yield:final/ which asks for a local definition of return that means the same thing as yield:final.

So that seems to get pretty good coverage of interesting intents. I'm curious what other languages might have for covering these scenarios, can you compare/contrast?


Claude Gets It... Mostly...

Let me break down how different languages handle these scenarios:

Python's Approach:

def generator():
    yield 1        # regular yield
    return 2       # StopIteration(2) - the 2 can be captured
    # unreachable  # code after return is unreachable

# To get the return value:
gen = generator()
try:
    while True:
        next(gen)
except StopIteration as e:
    final_value = e.value  # gets 2

JavaScript's Generator Returns:

function* generator() {
    yield 1;
    return 2;      // {value: 2, done: true}
    // unreachable
}

// for...of only sees yielded values; the return value is dropped:
const gen = generator();
for (const value of gen) {
    console.log(value);  // only sees 1, not 2
}

// Calling next() by hand on a fresh generator exposes the return value:
const gen2 = generator();
let result;
while (!(result = gen2.next()).done) {
    console.log(result.value);
}
console.log(result.value);  // gets 2

Ruby's Enumerators

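Ruby has no dedicated syntax for a final value, but the value of the enumerator's block comes back as StopIteration#result once the enumerator is exhausted:

enum = Enumerator.new do |y|
    y << 1
    y << 2
    3  # block's value; available as StopIteration#result
end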

Your approach has some unique and interesting characteristics:

  1. The separation of yield done vs yield:final gives explicit control over whether you want to pass a final value or just stop

    • Most languages don't distinguish these cases clearly

    • Python and JavaScript bundle the "done" signal with an optional final value

  2. Your handling of return is novel:

    • Keeping return bound to the outer function

    • Providing /return: yield:final/ for local rebinding

    • This gives more flexibility than other languages where return in a generator body always means "return from generator"

  3. The "raised error" state as an out-of-band signal is interesting:

    • Python uses exceptions (StopIteration)

    • JavaScript uses a {done: true} flag

    • Your approach keeps it distinct from regular error handling

Some unique aspects of your design:

Explicit Final Value:

; Your language
yield 1
yield:final 2   ; Clear that 2 is the last value

# vs Python
yield 1
return 2        # Less clear this is yielding 2 as final value

# vs JavaScript
yield 1
return 2        // Similar ambiguity to Python

Function Return Access:

; Your language
/fn: function [
    /gen: generator [
        if condition [return value]  ; returns from fn
        yield 1
    ]
    gen  ; returns generator
]

# vs Python - can't access outer function's return
def fn():
    def gen():
        if condition:
            return value  # returns from generator
        yield 1
    return gen()

Summary

Your design seems to offer more precise control over the different "ending" scenarios a generator might have. The distinction between:

  • Yielding a final value
  • Just being done
  • Returning from the containing function

...is clearer than in most other languages I've seen. The trade-off might be slightly more complexity in the conceptual model, but it gives users more expressive power when they need these distinctions.


Claude's Example of RETURN Binding Isn't Useful

; Your language
/fn: function [
    /gen: generator [
        if condition [return value]  ; returns from fn
        yield 1
    ]
    gen  ; returns generator
]

The generator is being returned to someone else to call. So the RETURN inside the generator won't ever be useful--it will always abruptly fail because the function call is over.

That isn't an indictment of the feature. You just have to be using the generator while the function is still running.
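
Python has no way to bind `return` to an enclosing function, but the situation can be roughly emulated with an exception. This is only an analogy, and `OuterReturn`, `summarize` and `leaky` are names invented for the sketch. In `summarize` the generator is consumed while the function is still running, so the "return" lands. In `leaky` the generator is handed out after the function has finished, so the same statement has nowhere to go.

```python
class OuterReturn(Exception):
    """Hypothetical: carries a value from the generator to the enclosing function."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def summarize(numbers):
    def gen():
        for n in numbers:
            if n < 0:
                raise OuterReturn(n)  # plays the role of RETURN bound to summarize
            yield n

    try:
        return ("sum", sum(gen()))    # generator is consumed while summarize runs
    except OuterReturn as r:
        return ("negative", r.value)


def leaky(numbers):
    def gen():
        for n in numbers:
            if n < 0:
                raise OuterReturn(n)  # nobody is left to receive this
            yield n

    return gen()                      # handed out after leaky has finished
```

Calling `summarize([1, -2, 3])` returns early with the negative number, whereas stepping the generator from `leaky([1, -2])` past its first value lets `OuterReturn` escape to whoever called `next`, which is the analogue of the abrupt failure.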