Forward to Multiple Return Values! (and return value forwarding?)

Users of languages that have a multiple return value feature seem to rave about it. And it's something that users from languages that don't have it will ask about, complain about, and constantly try to find ways to work around.

Even before an actual implementation of SET-BLOCK! existed, I had inklings that Rebol could do them in a way that is mindbending and original. But there were nagging technical and semantic concerns for every version of the idea I could envision.

Now I have to say that after having had the opportunity to experiment with the enfix left-quoting hack for multiple return results... and seeing what Ren-C is capable of...

...multi-return is a must-have :heart_eyes_cat:

I'm convinced we'd be wasting SET-BLOCK! terribly in the evaluator if [x y]: [1 2] were only for "setting x to 1 and y to 2". While I've been concerned in the past about consistency in naming with set [x y] [1 2]... we can either stop worrying about that or just use something else for that purpose. e.g. assign [x y] [1 2], problem solved.

In terms of other problems that seem addressed comfortably: I knew that we needed to be able to return NULL for values in multi-return scenarios. But NULL can't be put in a BLOCK!, which would be a problem even in the reduced case if we tried to return a BLOCK! matching the SET-BLOCK! in size:

>> [x y]: function-returning-10-and-20
== [10 20]
>> x
== 10
>> y
== 20

>> [x y]: function-returning-null-and-30
== [#[void] 30]  ; can't put null, so... VOID! in first slot?
>> x
== null  ; not the same thing as what the block said...seems bad, yes?
>> y
== 30

But when I disconnected the idea from the old behavior of set [x y] [10 20], the Gordian Knot was cut: the evaluative result of a SET-BLOCK! had no need to be a BLOCK! with all the values...but merely one value. This made great sense, because since a BLOCK! is always truthy, you couldn't base useful conditional behavior on a BLOCK! return anyway!

>> [x y]: function-returning-10-and-20
== 10
>> x
== 10
>> y
== 20

>> [x y]: function-returning-null-and-30
== null  ; so you could meaningfully say `if [x y]: whatever [...]`
>> x
== null
>> y
== 30

The question of whether it has to be the first value is an open one. If you could annotate the elements of the SET-BLOCK!, that might be a way of saying otherwise. And perhaps the single return doesn't have to be coupled to the multi-value return case at all (?)

An Important Feature: Return Awareness

The old-school "multiple return" method was to pass in WORD!s of variables to set, as with DO/NEXT:

 r3-alpha>> value: do/next [1 + 2 10 + 20] 'pos
 == 3

 r3-alpha>> pos
 == [10 + 20]

You can see that DO could check for the presence of the /NEXT refinement and behave differently: it knows whether it has one return value or two. Based on that knowledge, many routines could have more optimized implementations when not all of the possible return results they could give are wanted.

Previously I phrased this with the example:

 >> [x y]: sum-and-difference 20 10

Let's say that gives back 30 and 10. But if the user omitted return results, it might be possible to be more efficient:

 >> x: sum-and-difference 20 10
 >> [x]: sum-and-difference 20 10
 >> [x _]: sum-and-difference 20 10
 ; ^-- let's say all are equivalent ways of asking for the sum
 ; but not the difference.

 >> [_ y]: sum-and-difference 20 10
 ; ^-- asks for the difference but not the sum

How would it be able to tell? It seems like it would be bad to give it the actual BLOCK! it's assigning into, because it could look at the prior values...and it feels like the words in the return block should not be used to pass information as an out-of-band input.

Thus perhaps a multi-return function gets a proxy block, which has VOID!s in it where variables are, and blanks where the blanks are. It's the multi-return function's job to assign those slots, and then the evaluator takes care of mapping the values back.
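As a thought experiment, the proxy-block handoff might be modeled in Python (none of these names are actual Ren-C internals): the evaluator preloads a proxy with VOID-like placeholders where variables were named and BLANK-like markers where `_` appeared, the function fills only the requested slots, and the evaluator writes them back to variables afterward:

```python
# Hypothetical model of the proxy-block idea (illustrative only).

VOID = object()   # "slot requested, the function should fill it"
BLANK = object()  # "caller opted out with _, work can be skipped"

def eval_set_block(targets, func, *args):
    """Simulate `[x _]: func ...` -- build proxy, call, map values back."""
    # VOID where a variable was named, BLANK where `_` appeared
    proxy = [BLANK if t is None else VOID for t in targets]
    main_result = func(proxy, *args)
    assigned = {}
    for target, slot in zip(targets, proxy):
        if target is not None:  # a named variable, not a blank
            assigned[target] = None if slot is VOID else slot
    return main_result, assigned

def sum_and_difference(outs, a, b):
    """Multi-return function: fills only the slots actually requested."""
    if outs[0] is not BLANK:
        outs[0] = a + b   # the sum
    if outs[1] is not BLANK:
        outs[1] = a - b   # the difference (skippable work)
    return a + b          # the overall "main" result

# [x _]: sum-and-difference 20 10  -- asks for the sum, not the difference
result, assigned = eval_set_block(["x", None], sum_and_difference, 20, 10)
```

Passing `[None, "y"]` models `[_ y]:`, letting the function skip the sum slot entirely.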

A Huge Question: Forwarding

The premise this is operating on is that multiple return-valued functions are a fundamentally different beast, and any single value goes through assignment normally to the first slot:

>> [x y]: [1 2]
== [1 2]

>> x
== [1 2]

>> y
; null (or maybe void?)

This magical property of functions has a bit of a problem, though...

my-wrapper: func [arg1 arg2] [
     return wrapped-function arg1 arg2
]
If WRAPPED-FUNCTION has the property that it would assign multiple results to a SET-BLOCK! target, then you'd lose that multiple-return-ness. Much like the need that motivated the existence of APPEND/ONLY, you can't have it both ways...a value like a BLOCK! can represent a single multi-valued item, or multiple values, but not both at the same time.

It seems clear there'd need to be some kind of operator for unpacking and repacking a multi-valued function's results into a BLOCK!.

 >> one: multi-returns-10-and-20
 == 10  ; drops 20, not requested

 >> both: unpack multi-returns-10-and-20
 == [10 20]

 >> [x y]: repack [10 20]
 == 10

 >> x
 == 10

 >> y
 == 20
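For contrast, note that in a language with native multiple returns like Python, the un-destructured call hands you the whole pack rather than just the first value...which is exactly the behavior the single-value SET-BLOCK! result avoids, at the cost of needing explicit UNPACK/REPACK operators:

```python
def multi_returns_10_and_20():
    return 10, 20  # Python packs these into a tuple automatically

# The un-destructured call yields the entire pack...
both = multi_returns_10_and_20()

# ...while destructuring spreads it across the targets.
x, y = multi_returns_10_and_20()
```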

That doesn't help with the wrapping problem, though. You'd need some kind of forwarding return. Perhaps something like a return/pack?

It might seem nice to make forwarding the default, but this would lead to counter-intuitive situations, like these two being different:

 return some-function

 x: some-function
 return x

It feels like you'd want some kind of signal before having these act differently.

So many questions open, but still I think this has to happen

I'm sure enough that I wouldn't suggest using SET-BLOCK! in the evaluator for anything else.

If we say we have to punt on it design-wise for a while and make it an error, then that's fine. It's just time to scrap the idea of [x y]: [1 2] setting x to 1 and y to 2. That's such a waste! So keep your thinking hats on with this...


Wait...a Possible "Eureka" Moment...

Summarized in a comment I just wrote to set up the premise idea:

//==//// SET-BLOCK! //////////////////////////////////////////////////////==//
// The evaluator treats SET-BLOCK! specially as a means for implementing
// multiple return values.  The trick is that it does so by pre-loading
// arguments in the frame with variables to update, in a way that could have
// historically been achieved with passing a WORD! or PATH! to a refinement.
// So if there was a function that updates a variable you pass in by name:
//     result: updating-function/update arg1 arg2 'var
// The /UPDATE parameter is marked as being effectively a "return value", so
// that equivalent behavior can be achieved with:
//     [result var]: updating-function arg1 arg2

So all you have to do is mark a refinement as an output parameter. That's all! Then check to see if it's null or not, and assign it if it's a WORD! or PATH!...the same way you always would have. You can use it in the old style (like a DO/NEXT being passed a position to update), or you can use the SET-BLOCK! syntax and let the evaluator do the magic. The order of parameters marked like this matters--as with normal arguments.
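To make that concrete, here is a rough Python model of "refinements as output parameters" (the names and the whitespace-splitting "scanner" are purely illustrative, not how TRANSCODE actually works): an output is just an optional parameter the function checks, so it knows whether the extra result is wanted and can skip the work otherwise:

```python
# Hypothetical sketch: a mutable slot plays the role of passing a
# variable by name, as in `transcode/next "..." 'pos`.  Passing nothing
# means the output wasn't requested.

def transcode_model(text, next_out=None):
    """Crude stand-in for a scanner: splits on whitespace."""
    tokens = text.split()
    if next_out is not None:
        # /NEXT style: scan one value, write the advanced position
        next_out["value"] = " ".join(tokens[1:])
        return tokens[0]
    return tokens  # no output requested: scan everything at once

# Old style: explicitly pass the "variable" to update.
pos = {}
value = transcode_model("1 [2] <3>", next_out=pos)

# `[value pos]: transcode "..."` would then be sugar that preloads
# next_out with the variable named in the SET-BLOCK! before the call.
```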

Through the power of libRebol I have committed a prototype of the behavior within just a few hours of having the idea! Please test it out and add to the tests; there are just a few here as an example, but more from your own heads would be ideal. Note that error messages and such will be very coarse during the prototype phase, but I do want to know about crashes or bad behaviors.

There's still a lot of details, I'm sure...but this feels like a very pleasing direction to go in!

Genesis of the Idea

I was looking at TRANSCODE, which is fairly complex in terms of its parameterization. This is the basic routine that exposes the scanner and turns UTF-8 into Rebol values. (Cool sidenote: in Ren-C with UTF-8 Everywhere you can now use it on strings, while R3-Alpha could only use it on binaries...)

A typical TRANSCODE operation turns a string into a block, so conceptually:

 >> transcode "1 [2] <3>"
 == [1 [2] <3>]

 >> transcode "[1 [2] <3>]"
 == [[1 [2] <3>]]   ; always a block of values, even if only one block value

Sometimes people want to transcode just one value at a time, so there is TRANSCODE/NEXT. As with DO/NEXT, this introduces another useful output... which is where you want to write the advanced position to do further processing. Let's look at it in that old style (none of this is real code in Ren-C, so just read it, don't run it):

 >> transcode/next "1 [2] <3>" 'new-pos
 == 1

 >> new-pos
 == "[2] <3>"

There's yet another potential output variation coming from the /RELAX switch. This means that if you have gibberish, it will skip that token and return an ERROR! value:

 >> transcode/next/relax "4bad [2] <3>" 'new-pos
 == make error! [...whatever...]

 >> new-pos
 == "[2] <3>"

That's not ideal, since with some kind of ANY-CONTEXT! literal notation you might actually find an ERROR! value in a scan.

... and now, the realization ...

What if multi-returns are arguments in a frame?

e.g. what if the way you do a multi-return is to mark a parameter a "return" parameter, e.g. an output? The parameter is set either to null, or something to assign--like a refinement--in order to indicate it is requested. Then at the end of the function when it tears down the frame, those values are set.

So this way if you have a WORD! or PATH! you want to assign, you can pass it in as an argument. But if you use the multi-return convention, it will implicitly load those slots with variables out of the block on the left.

That means these two situations would appear equivalent to the insides of TRANSCODE:

>> value: transcode/next/relax "1 [2] <3>" 'next-pos 'error

>> [value next-pos error]: transcode "1 [2] <3>"

And then, these two would appear equivalent:

>> value: transcode/relax "1 [2] <3>" 'error

>> [value _ error]: transcode "1 [2] <3>"

This means that there needs to be a way of marking arguments as returns in the function spec, and their order matters (just as order of ordinary arguments matters).

This is a great duality. It gives you the option to work with named arguments if you want, but provides a shorthand...where SET-BLOCK! collaborates with the evaluator to push into those slots. A similar revolution is forthcoming in APPLY where you should be able to use named arguments even when something is one of the ordered parameters.

For that matter, there's probably nothing saying we couldn't put names on the left too. Off-the-cuff syntax:

>> [value /relax 'error]: transcode "1 [2] <3>"

This could help separate inputs from outputs, acting more as commentary and not tying things to an order.

Such is another change powered by Pure and Refined...simplifying the nature of arguments in frames down to one name per argument and one value per name lets us play with this equivalence. We can imagine function frames expanding via AUGMENT to add new return values...

Cool, huh?


Very cool. Thank you— I can see this being really useful for dialects (being coerced out of text, which is relevant for me), and almost anything which helps dialects is a good thing, in my opinion.

It's quite amazing to see the benefits to TRANSCODE right off the bat!

Last night I repeated that success with LOAD, where if you want a header you can just say:

[data header]: load %some-file.reb

And if you don't want a header, you just say data: load %some-file.reb as normal. This replaces the way the output shape changed in historical Rebol, where LOAD sometimes returned a BLOCK! with the header included and sometimes did not.

rebol2>> load {[Rebol [Title: "hi"] a b c]}
== [a b c]

rebol2>> load/header {[Rebol [Title: "hi"] a b c]}
== [make object! [
        Title: "hi"
        Date: none
        Name: none
        Version: none
        ; ...etc, etc...
        Language: none
        Type: none
        Content: none
    ] a b c]

Another improvement of setting variables directly, compared to the BLOCK! approach, is that you don't have to put BLANK! in a slot; you can really return NULL:

>> [data header]: load "<no> #header 'here"
== [<no> #header 'here]

>> data
== [<no> #header 'here]

>> header
; null

Pursuant to my recent post on making BLANK! used more narrowly, I think NULL is the better result for "thing that wasn't there"...because you have to consciously TRY it to have it ignored by operations that poke around at it.

On that note: I think we've dodged a bullet by making the "true nothing" null not cause errors on WORD! access, and switching to an UNSET!-like value for the "ornery" response. I was so led astray by seeing compose [1 (#[unset]) 2] come out as [1 2]--believing my new NULL concept was "The REAL unset"--that I didn't realize it was much better modeled as "The REAL none!"

One must wonder what would have happened if that realization had come sooner (!). But things were learned along the way, and it's all falling into place now. Just have to keep up the pace...