Compatibility MAP-EACH (and problems therewith)

hostilefork · August 9, 2019, 7:21pm

In trying to make a shim where MAP-EACH splices and has /ONLY, I thought the easiest way of doing it might be to redo it in terms of a COLLECT of a FOR-EACH.

This delves some into the ambition of Ren-C to raise the bar for Rebol, so it raises some questions. First, let's try a naive approach:

map-each: function [
    {https://forum.rebol.info/t/1155}
    return: [block!]
    'vars [blank! word! block!]
    data [any-series! any-path! action!]
    body [block!]
    /only
][
    collect [
        for-each :vars :data [
            keep/(only) do body
        ]
    ]
]

One reason this won't work correctly is that BODY is executed via a "link" of DO instead of being embedded into the body of the FOR-EACH. That means it won't bind to the VARS variables. And in a definitional-break-and-continue world (which I've been considering), it won't have words like CONTINUE or BREAK bound either.

We can address that by embedding the code into the FOR-EACH body. Let's say we just put it in as a GROUP!:

    collect [
        for-each :vars :data compose [
            keep/(only) (as group! body)  ; only inside path, not a compose/deep
        ]
    ]

This simple implementation supplements the body of the MAP-EACH with additional code. It does it by composing in the body code as a GROUP!, so that it will pick up any bindings the FOR-EACH would add.

Fair enough, but it has a couple of problems. One problem: what if I said map-each keep [1 2 3] [...]? :-/ Our supplemental body adds code that the user doesn't see at the callsite, so they don't know to avoid usage of words that are in that supplemental body for their loop variables. This gets worse the more supplemental code you have.
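The capture problem is not specific to Rebol; it shows up anywhere helper code and user code share one namespace. The Python sketch below is only an analogy, not Ren-C. `naive_map_each` is a hypothetical name, `exec` stands in for running the composed block, and a plain dict stands in for the binding context that FOR-EACH would create.

```python
def naive_map_each(var, data, body_src):
    """Analogy for the composed FOR-EACH body: helper code and user code share one namespace."""
    out = []
    env = {"keep": out.append}          # the supplemental word the user never sees
    for item in data:
        env[var] = item                 # the loop variable lands in the same namespace
        exec(f"keep({body_src})", env)  # supplemental code wrapped around the user's body
    return out


print(naive_map_each("x", [1, 2, 3], "x * 10"))  # [10, 20, 30]

try:
    # Naming the loop variable KEEP overwrites the helper before it can be called.
    naive_map_each("keep", [1, 2, 3], "keep * 10")
except TypeError as error:
    print("collision:", error)
```

The second call fails because the loop variable replaced the helper of the same name, which is the same hazard as writing map-each keep [1 2 3] [...] against the composed body.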

I think we need to expose something lower-level than FOR-EACH

Really it seems like what you need here is a tool that lets you set up whatever binding object a loop is going to use, gives you a chance to bind code to that object, then lets you run the iteration independent of binding. Something like:

    collect [
        [context looper]: make-loop-stuff :vars :data
        bind body context
        while [looper] [
            keep/(only) do body
        ]
    ]

The imagined MAKE-LOOP-STUFF would give you two things back: a context to bind any code into that you wanted to see the changes to variables in, and a function you could call that would update the values of those variables as long as there was more data.
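To make the shape of that idea concrete, here is a rough model in Python rather than Ren-C. Everything in it is an assumption made for illustration: `make_loop_stuff` and `map_each` are hypothetical names, a `SimpleNamespace` plays the role of the context, and passing the context to a callable plays the role of binding the body.

```python
from types import SimpleNamespace


def make_loop_stuff(names, data):
    """Return (context, looper): looper() refreshes context and reports whether data remained."""
    context = SimpleNamespace(**{name: None for name in names})
    it = iter(data)

    def looper():
        chunk = []
        for _ in names:
            try:
                chunk.append(next(it))
            except StopIteration:
                break
        if not chunk:
            return False                              # nothing left to iterate
        chunk += [None] * (len(names) - len(chunk))   # pad a short final group
        for name, value in zip(names, chunk):
            setattr(context, name, value)
        return True

    return context, looper


def map_each(names, data, body, only=False):
    """Splice list results unless only=True, loosely mirroring KEEP vs. KEEP/ONLY."""
    context, looper = make_loop_stuff(names, data)
    out = []
    while looper():
        result = body(context)        # the body sees only the loop variables
        if only or not isinstance(result, list):
            out.append(result)
        else:
            out.extend(result)
    return out


print(map_each(["x"], [1, 2, 3], lambda c: [c.x, c.x * 10]))             # [1, 10, 2, 20, 3, 30]
print(map_each(["x"], [1, 2, 3], lambda c: [c.x, c.x * 10], only=True))  # [[1, 10], [2, 20], [3, 30]]
print(map_each(["keep"], [1, 2], lambda c: c.keep * 10))                 # [10, 20]
```

Because the supplemental code lives outside the context, a loop variable named keep no longer collides with anything.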

....Just another epicycle of the binding problems...

Binding in Rebol will always be Rube-Goldberg-like, and so the question is how to maximize the fun and minimize the annoyance, while still getting decent performance. I think if people can think of FOR-EACH as a higher level "macro" which makes a lot of assumptions in order to be ergonomic, they can realize that writing their own loop is going to involve digging deeper.

Something like the pattern above could be used to implement FOR-EACH, MAP-EACH, or REMOVE-EACH...though they could retain their native optimized versions. There are still worries about mutating the bindings on passed-in code (the bind body context above), so a "good" answer would be something like body: in context body, where that was understood not to modify the original but to give a rebound "view" at a lower cost.

hostilefork · August 12, 2019, 1:34pm

    [context looper]: make-loop-stuff :vars :data
    bind body context
    while [looper] [
        keep/(only) do body
    ]

I'll point out that this would be cleaner if LOOPER could be both a function and an object. This might suggest being done with FRAME!:

    looper: make-loop-stuff :vars :data
    bind body looper
    while [do looper] [
        keep/(only) do body
    ]

But the current idea of frames is that they expire after you execute them. It is generally understood that Rebol functions can mutate their arguments, so once a function finishes it may have completely trashed its state and can't run again. You can DO a COPY of a FRAME!, but then there's no way to have frames preserve state across multiple invocations. :-/

Functions can accrue state by means of referencing some external object where the state lives. But there's not any meaning currently to binding to a function. If there were, you could write:

    looper: make-loop-stuff :vars :data
    bind body looper
    while [looper] [
        keep/(only) do body
    ]

JavaScript lets functions act as objects. Rebol would have a bit of a conundrum with that, because pathing on functions is used to specify refinements...not members. But this is an interesting case where being able to let a function expose some properties (the vars data of the internal state) would be of use.

You could also have the function have some mode where it gives back the object where its internal state lives:

    looper: make-loop-stuff :vars :data
    bind body looper/state  ; /STATE would be a refinement
    while [looper] [
        keep/(only) do body
    ]

Not quite as elegant, but could work.

Various things worth thinking about. :-/

MAKE-LOOP-STUFF could perhaps be called ITERATES:

looper: iterates :vars :data

hostilefork · August 13, 2019, 5:41pm

This works for the variables, but raises the question of meaning for BREAK and CONTINUE.

Today's BREAK and CONTINUE climb the call stack and look for a loop that's listening for them, which means the while [looper] would react to them as expected. But imagine a hypothetical variant MAP-TWICE:

>> accumulator: 0
>> map-twice x [1 2] [
       accumulator: accumulator + 1
       x + accumulator
   ]
== [2 3 5 6]

You might implement MAP-TWICE like:

collect [
    looper: make-loop-stuff :vars :data
    bind body looper/state  ; /STATE would be a refinement
    while [looper] [
        loop 2 [keep/(only) do body]
    ]
]

Now what happens when you BREAK? The BREAK would not break the entire MAP-TWICE operation as desired, but just the LOOP.

The BREAK-is-the-only-way-to-get-NULL protocol for loops helps here:

    while [looper] [
        loop 2 [keep/(only) do body] else [break]
    ]
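The difference the else [break] makes can be modeled outside Rebol. The Python sketch below is an analogy under stated assumptions: an exception named `Break` stands in for today's stack-climbing BREAK, `loop` returns None only when its body broke (mimicking the NULL protocol), and `map_twice` with its `propagate_break` flag is a hypothetical helper.

```python
class Break(Exception):
    """Stand-in for non-definitional BREAK: caught by the nearest loop on the stack."""


def loop(n, body):
    """Run body n times; return None only if the body broke."""
    try:
        for _ in range(n):
            body()
    except Break:
        return None
    return True


def map_twice(data, body, propagate_break):
    out = []
    for x in data:
        completed = loop(2, lambda: out.append(body(x)))
        if completed is None and propagate_break:
            break                     # the "else [break]" step
    return out


def body(x):
    if x == 2:
        raise Break()
    return x * 10


print(map_twice([1, 2, 3], body, propagate_break=False))  # [10, 10, 30, 30]
print(map_twice([1, 2, 3], body, propagate_break=True))   # [10, 10]
```

Without propagation the break only cancels the inner repetition for that one item; with it, the whole operation stops.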

And CONTINUE happens to work incidentally; but the fact that you can't tell from outside a loop if it continued or not is almost certainly a problem for constructs that have an unusual concept of what CONTINUE means.

I'm sure there's some CATCH-based answer that could let people rig something up to manually control the reactions, but what I'm mulling over more is the implications for definitional BREAK and CONTINUE. Since the looper is not on the stack, there's no way to jump up to it. You'd have to make the looper implicitly do the code:

looper: make-loop-stuff :vars :data
bind body looper/state  ; would include BREAK and CONTINUE
looper [
    keep/(only) do body
] else [
    ; stuff to do if something in BODY ran a BREAK
]

You could wire in some cleanup code that would happen on CONTINUE or BREAK by hooking the functions in LOOPER/STATE/BREAK and LOOPER/STATE/CONTINUE.

Beyond other reasons of "just being cool", I think that this ability to tailor the BREAK and CONTINUE in customized loop constructs is looking like a very strong argument for making them definitional; i.e. specific functions generated for each loop, which know which loop they are supposed to break or continue.
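As a loose illustration of what "definitional" buys, here is a Python model in which every loop invocation mints its own break and continue functions. This is an assumption-laden sketch, not how Ren-C implements anything: `run_loop` is a hypothetical name, and per-call exception classes are used to imitate functions that know which loop they belong to.

```python
def run_loop(data, body):
    """Call body(item, brk, cont) for each item; return None if this loop's own brk ran."""

    class ThisBreak(Exception):
        pass

    class ThisContinue(Exception):
        pass

    def brk():
        raise ThisBreak()

    def cont():
        raise ThisContinue()

    for item in data:
        try:
            body(item, brk, cont)
        except ThisContinue:
            continue
        except ThisBreak:
            return None
    return True


seen = []


def outer_body(x, outer_break, outer_continue):
    def inner_body(y, inner_break, inner_continue):
        if (x, y) == (2, "b"):
            outer_break()             # targets the outer loop from inside the inner one
        seen.append((x, y))

    run_loop(["a", "b"], inner_body)


result = run_loop([1, 2, 3], outer_body)
print(seen)    # [(1, 'a'), (1, 'b'), (2, 'a')]
print(result)  # None
```

The inner loop does not intercept the outer loop's break, because each call to `run_loop` only recognizes the exception classes it created itself.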

hostilefork · November 19, 2019, 8:24pm

I found these remarks from Gregg "interesting", in the sense that I agree with them 100%... but...

For the record, I would be fine with forall and forskip being mezzanines, as in R2. Forall is a subset of forskip and became a one-liner when it was redesigned. Before the redesign, both were dead simple. e.g. this was forall's original body:
while [not tail? get word][
    do body
    set word next get word
]
Forskip just used skip instead of next. The redesign added arg checking, resetting the series word, and returning the body's result in forskip. Then forall became: throw-on-error [forskip :word 1 body]

(...)

For me, an important aspect is leverage. The more we write in R/S, the more work it is to maintain and extend (in most, but not all cases), and the speed or features needs to be justified. In this case, you're interpreting a body no matter what, so you take a speed hit. And if handling break/continue is easier to do correctly in R/S, we should fix that for Red. We can cover a lot of control func cases ourselves, but allowing others to write them is a key feature of the language.

... but ... but ... you don't arrive at those solutions through random repetitions of what's been done already, and disregarding the body of knowledge and design concerns of the more sophisticated community members.

As cartoonist Tom Toro wrote:

[image: Tom Toro cartoon, history-repeat-tom-toro]

hostilefork · August 5, 2020, 9:51pm

hostilefork:

I'll point out that this would be cleaner if LOOPER could be both a function and an object. This might suggest being done with FRAME!:
looper: make-loop-stuff :vars :data
bind body looper
while [do looper] [
    keep/(only) do body
]
But the current idea of frames is that they expire after you execute them, since it is generally understood that Rebol functions can mutate their arguments; so once a function finishes it may have completely trashed its state so it can't run again.

Revisiting this problem made me wonder if there's another possibility...namely making BIND to an ACTION! bind to that function's binding.

This isn't necessarily too crazy. Consider this R3-Alpha behavior where you can "bind a word to another word", which is just a way of saying "have the same binding as that word":

>> obj: make object! [x: 10 y: 20]
== make object! [
    x: 10
    y: 20
]

>> x-word: bind 'x obj
== x

>> get x-word
== 10

>> y-word: bind 'y x-word
== y

>> get y-word
== 20

We're currently able to bind functions to FRAME! (which powers things like definitional return) and OBJECT! (which is the means for derived binding). So why couldn't this hypothetical make-loop-stuff be a generator-like thing that is bound to a context with the loop variables?

You'd have to write it like this, which saves the DO on looper for execution, but makes you use a GET-WORD! in the bind statement:

looper: make-loop-stuff :vars :data
bind body :looper
while [looper] [
    keep/(only) do body
]

Though you could avoid that with:

bind body looper: make-loop-stuff :vars :data
while [looper] [
    keep/(only) do body
]

The most natural return value for looper is probably the bound object it is manipulating, and then NULL when it has no more.

This doesn't answer how to deal with the case where body is a function. But the ability to use functions as bodies for things like conditional branches and loops is a Ren-C invention, which may be less useful in loops.

I'm wondering if this could go along with a notion of iterators:

 >> iter: iterates [1 2 3]
 >> iter
 == 1
 >> iter
 == 2
 >> iter
 == 3
 >> iter
 ; null

 >> iter: iterates/vars [1 2 3] [x y]
 >> iter
 == make object! [
     x: 1
     y: 2
 ]
 >> iter
 == make object! [
     x: 3
     y: '   ; e.g. null
 ]
 >> iter
 ; null

It gets a bit weird if BREAK and CONTINUE are part of the "loop stuff", however. Firstly they would show up in this object (unless they were hidden or put in some other special place). But further...if you're choosing where to run the do body, then if a BREAK happens where would it go?

But modulo BREAK and CONTINUE needing to be thought out, it seems like a potential step in a solution to the stated problem. ITERATES seems generally useful, and ITERATES/VARS seems a nice enough presentation of "MAKE-LOOP-STUFF"
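The console session for ITERATES can be approximated in Python, purely as a model of the proposed behavior. The signature here is an assumption (`names` stands in for /VARS), None stands in for NULL, and the sketch assumes the data itself contains no None values.

```python
from types import SimpleNamespace

_DONE = object()  # private sentinel marking exhausted data


def iterates(data, names=None):
    """Return a function yielding successive items, or one shared state object when names are given."""
    it = iter(data)
    if names is None:
        return lambda: next(it, None)

    state = SimpleNamespace(**dict.fromkeys(names))

    def step():
        first = next(it, _DONE)
        if first is _DONE:
            return None               # no more data
        values = [first] + [next(it, None) for _ in names[1:]]
        for name, value in zip(names, values):
            setattr(state, name, value)
        return state                  # the same object each time, with updated fields

    return step


it = iterates([1, 2, 3])
print(it(), it(), it(), it())  # 1 2 3 None

pairs = iterates([1, 2, 3], ["x", "y"])
s = pairs()
print(s.x, s.y)                # 1 2
s = pairs()
print(s.x, s.y)                # 3 None
print(pairs())                 # None
```

Returning the same state object on every call is what would let a body be bound once and then run repeatedly against fresh values.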