TRANSCODE and LOAD Alternative Multi-Return Concept

hostilefork · April 10, 2020, 8:20pm

UPDATE 2022: This is no longer a pattern supported by the multi-return machinery itself, due to problems arising with how feeding information from an output request affects composition (e.g. things like an ENCLOSE). Making multi-returns pure outputs that are atomically returned via a single "isotopic" package turns out to be much more solid.

However, it can still be accomplished using enfix functions...for those who would find it useful. You end up taking responsibility for doing the necessary SETs yourself (since the enfix consumes the SET-WORD! or SET-BLOCK! on the left). But if enough people thought it was interesting, common mechanics for that could be factored out.

(Whether this is "dialecting" is subject to debate, but I wanted to preserve the writeup here, vs. in "Show and Tell" where it might suggest this is an active feature of today's TRANSCODE.)

[...In April 2020...] I was looking at TRANSCODE, which is fairly complex in terms of its parameterization. This is the basic routine that exposes the scanner and turns UTF-8 into Rebol values. (Cool sidenote: in Ren-C with UTF-8 Everywhere you can now use it on strings, while R3-Alpha could only use it on binaries...)

A typical TRANSCODE operation turns a string into a block, so conceptually:

 >> transcode "1 [2] <3>"
 == [1 [2] <3>]

 >> transcode "[1 [2] <3>]"
 == [[1 [2] <3>]]   ; always a block of values, even if only one block value

Sometimes people want to transcode just one value at a time, so there is TRANSCODE/NEXT. As with DO/NEXT, this introduces another useful output: the advanced position, which you need in order to do further processing. Let's look at it in the old style, where you pass a variable to write that position into (none of this is real code in Ren-C, so just read it, don't run it):

 >> transcode/next "1 [2] <3>" 'new-pos
 == 1

 >> new-pos
 == "[2] <3>"

There's yet another potential output variation coming from the /RELAX switch. This means that if you have gibberish, it will skip that token and return an ERROR! value:

 >> transcode/next/relax "4bad [2] <3>" 'new-pos
 == make error! [...whatever...]

 >> new-pos
 == "[2] <3>"

That's not ideal: with some kind of ANY-CONTEXT! literal solution, you might actually find an ERROR! value in a scan, and then you couldn't tell a scanned ERROR! apart from one reporting a failure.

Then Multi-Returns Were Re-Conceived As Refinement Doppelgangers

The way you would do a multi-return is to mark a parameter as a "return" parameter, i.e. an output. The parameter is set either to null or to something to assign--like a refinement--in order to indicate it is requested. Then at the end of the function, when it tears down the frame, the requested variables are set.

So this way if you have a WORD! or PATH! you want to assign, you can pass it in as an argument. But if you use the multi-return convention, it will implicitly load those slots with variables out of the block on the left.

That means these two situations would appear equivalent to the insides of TRANSCODE:

>> value: transcode/next/relax "1 [2] <3>" 'next-pos 'error

>> [value next-pos error]: transcode "1 [2] <3>"

And then, these two would appear equivalent:

>> value: transcode/relax "1 [2] <3>" 'error

>> [value _ error]: transcode "1 [2] <3>"

This means that there needs to be a way of marking arguments as returns in the function spec, and their order matters (just as order of ordinary arguments matters).
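The duality is easier to see in a toy model. What follows is a rough sketch in Python, not Ren-C, and everything in it is made up for illustration: `variables` stands in for the binding environment, `transcode_one` is a stand-in scanner that only understands whitespace-separated tokens, `OUTPUTS` plays the role of the ordered "return" parameters in a spec, and `set_block` imitates what the evaluator would do for a SET-BLOCK! on the left.

```python
import re

# Hypothetical model: "variables" live in a dict, so a function can be handed
# the *name* of a variable to write into (like passing a quoted WORD!).
variables = {}

# Order matters: a left-hand block fills these output slots in this order.
OUTPUTS = ("next_pos", "error")

def transcode_one(text, next_pos=None, error=None):
    """Scan one whitespace-delimited token (a toy, not the real scanner).

    `next_pos` and `error` are output slots: None means "not requested",
    a string names the variable to write when the call finishes.
    """
    match = re.match(r"\s*(\S+)\s*", text)
    token, rest = (match.group(1), text[match.end():]) if match else (None, "")
    value, problem = token, None
    if token is not None and token.isdigit():
        value = int(token)
    elif token is not None and token[0].isdigit():
        # Starts like a number but isn't one: toy stand-in for gibberish.
        value, problem = None, f"bad token: {token}"
    if problem is not None and error is None:
        raise ValueError(problem)  # nobody asked for the error, so fail
    # "Frame teardown": write only the outputs that were requested.
    for slot, result in ((next_pos, rest), (error, problem)):
        if slot is not None:
            variables[slot] = result
    return value

def set_block(names, func, *args):
    """Model of `[a b c]: func ...`: the first name takes the main result,
    the rest are pushed into the output slots in order ("_" skips a slot)."""
    main, *outs = names
    requested = {slot: name for slot, name in zip(OUTPUTS, outs) if name != "_"}
    variables[main] = func(*args, **requested)
    return variables[main]

# Named-argument form...
variables["value"] = transcode_one("1 [2] <3>", next_pos="next-pos", error="error")
explicit = dict(variables)

# ...and the shorthand form look identical from inside transcode_one.
variables.clear()
set_block(["value", "next-pos", "error"], transcode_one, "1 [2] <3>")
assert variables == explicit
```

The function body never knows which form the caller used; it only sees which slots hold a name. That is the property the refinement-doppelganger design was after.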

It was a great duality, giving you the option to work with named arguments if you want, while providing a shorthand...where SET-BLOCK! collaborates with the evaluator to push into those slots.

It was quite amazing to see the benefits to TRANSCODE right off the bat!

Then I repeated the success with LOAD...

If you want a header you can just say:

[data header]: load %some-file.reb

And if you don't want a header, you just say data: load %some-file.reb as normal. This replaces the way the output changed in historical Rebol, where LOAD sometimes returned a BLOCK! with a header as the second item, and sometimes it did not.

rebol2>> load {[Rebol [Title: "hi"] a b c]}
== [a b c]

rebol2>> load/header {[Rebol [Title: "hi"] a b c]}
== [make object! [
        Title: "hi"
        Date: none
        Name: none
        Version: none
        ; ...etc, etc...
        Language: none
        Type: none
        Content: none
    ] a b c
]

Another improvement of setting variables directly, compared with the BLOCK! approach, is that you don't have to put a BLANK! in a slot; you can really return NULL:

>> [data header]: load "<no> #header 'here"
== [<no> #header 'here]

>> data
== [<no> #header 'here]

>> header
; null
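To make the contrast with the historical block-shaped result concrete, here is a small Python sketch (again a model, not Ren-C). The name `load_values` is hypothetical, it works on values that are assumed to be already scanned, and a Python dict stands in for the header block. The point is only that the header travels in its own slot, so "no header" can be a real absence instead of a placeholder mixed in with the data.

```python
def load_values(values):
    """Toy stand-in for LOAD over already-scanned values.

    Returns (data, header). `header` is None when the input does not begin
    with the word "Rebol" followed by a dict (our stand-in for a header block).
    """
    if len(values) >= 2 and values[0] == "Rebol" and isinstance(values[1], dict):
        return values[2:], values[1]
    return values, None

# With a header: it comes back separately, and data holds only the body.
data, header = load_values(["Rebol", {"Title": "hi"}, "a", "b", "c"])
assert header == {"Title": "hi"} and data == ["a", "b", "c"]

# Without a header: data has the same shape as before, and header is None.
data, header = load_values(["<no>", "#header", "'here"])
assert header is None and data == ["<no>", "#header", "'here"]
```

A caller who doesn't care about the header just ignores the second slot, and the shape of `data` never depends on whether a header was present.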

REMINDER: Not a supported feature anymore in the multi-return mechanics due to compositional problems, but still a trick you could do if you wanted to for a special purpose!

BlackATTR · April 9, 2020, 12:03am

Very cool. Thank you— I can see this being really useful for dialects (being coerced out of text, which is relevant for me), and almost anything which helps dialects is a good thing, in my opinion.