Syntax for Subverting Virtual Binding

hostilefork · December 29, 2020, 2:45pm

In the closing weeks of 2020, I've been committing myself to "Virtual Binding's Last Stand". The results are encouraging. While it's a bit premature to declare that "it works!"...it can bootstrap itself and run the tests without crashing. (This is with applying virtual binding to all XXX-EACH, USE, and MAKE OBJECT! cases.)

Even though what I have now is about the clunkiest form of the idea, it hasn't degraded performance noticeably. Unfortunately, the main reason it looks so acceptable is that the method inherited from R3-Alpha, which it replaces, really, really sucked.

Anyway...no matter how good virtual binding gets, it will always have more cost than referring to a fixed variable. So if you have a variable you want to reuse, we need a way to say that.

Red's "Idea": Reusing Variables is the Default (and only) option

Here's how Red rolls:

 red>> print-nums: func [] [foreach x [1 2 3] [print ["Value is" x]]]

 red>> x: 10

 red>> print-nums

 red>> x
 == 3

So DocKimbel's answer to the performance and semantic inconvenience of copying-and-rebinding the loop body is "ignore it". If you want the loop variable to be local, you have to declare it yourself:

 red>> print-nums: func [/local x] [foreach x [1 2 3] [print ["Value is" x]]]

 red>> x: 10

 red>> print-nums

 red>> x
 == 10

This is different from Rebol2 and R3-Alpha, which bear the cost of allocating a new variable, and then copying and rebinding the body to it:

rebol2>> body: [append [] <test>]
rebol2>> foreach x [1 2] body
rebol2>> body
== [append [] <test>]

red>> body: [append [] <test>]
red>> foreach x [1 2] body
red>> body
== [append [<test> <test>] <test>]

(As a point on why this is semantically slippery, notice I didn't say body: [insert body <test>] in the Rebol2 case. The reason is that since the body is copied, you don't have a name for the loop's copy of the body to see the effects on it... so that would change the original body. You have to modify an array stored inside the body to see what's going on. Just another point of complexity in making copies behind-the-scenes...)
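To make the difference concrete, here is a minimal Python sketch of the two strategies. It is only a model, not how either interpreter is implemented: the body is a nested list, `run_body` is a hypothetical stand-in for evaluating `append [] <test>`, and the names `foreach_copying` and `foreach_in_place` are made up for illustration.

```python
import copy

def run_body(body):
    # Stand-in for evaluating `append [] <test>`: the "literal" block
    # lives inside the body at index 1, so appending mutates the body.
    body[1].append(body[2])

def foreach_copying(items, body):
    # Rebol2/R3-Alpha style (as modeled here): deep copy the body once,
    # then run the private copy on every iteration.
    private = copy.deepcopy(body)
    for _ in items:
        run_body(private)
    return private

def foreach_in_place(items, body):
    # Red style (as modeled here): run the caller's body directly.
    for _ in items:
        run_body(body)
    return body

rebol2_body = ["append", [], "<test>"]
rebol2_ran = foreach_copying([1, 2], rebol2_body)

red_body = ["append", [], "<test>"]
foreach_in_place([1, 2], red_body)

print(rebol2_body)  # the caller's block is untouched
print(rebol2_ran)   # the hidden copy is what accumulated
print(red_body)     # the caller's block accumulated
```

The hidden copy is the only place the accumulation is visible in the copying case, which is the "no name for the loop's copy" problem described above.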

Red's Choice is a Bad Default (right?)... but it should be an Option

Even though virtual binding does not copy the body, it won't be without cost. (Its existence isn't really about performance primarily anyway...it's about semantics, and not corrupting block bindings on a whim.)

So you should have a way of asking to just use an existing variable if you need to.

Early on I used quoted words as a way of telling the FOR-EACH to reuse a variable:

 print-nums: func [<local> x] [
     for-each 'x [1 2 3] [print ["Value is" x]]
 ]

When using a BLOCK! spec, you could mix and match...so some fields were reused and others were part of a generated object:

 print-nums: func [<local> x] [
     for-each ['x y] [1 2 3 4] [print [x y]]
 ]
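As a rough illustration of that mix-and-match idea, here is a Python sketch. Everything in it is an assumption made for the example: `Ctx` is a hypothetical context with a parent link, ticked names are written as strings with a leading `'`, and the "look in the overlay first, then fall through" lookup is just a simple way to model it, not a description of how virtual binding is actually implemented.

```python
class Ctx:
    """A tiny variable context; `parent` is consulted on lookup misses."""
    def __init__(self, names=(), parent=None):
        self.vars = {n: None for n in names}
        self.parent = parent

    def find(self, name):
        ctx = self
        while ctx is not None:
            if name in ctx.vars:
                return ctx
            ctx = ctx.parent
        raise NameError(name)

    def get(self, name):
        return self.find(name).vars[name]

    def set(self, name, value):
        self.find(name).vars[name] = value

def for_each(spec, data, body, outer):
    # Spec entries: "y" asks for a fresh loop variable,
    # "'x" asks to reuse whatever X is already visible in `outer`.
    names = [s.lstrip("'") for s in spec]
    fresh = [s for s in spec if not s.startswith("'")]
    loop_ctx = Ctx(fresh, parent=outer)  # only un-ticked words live here
    for i in range(0, len(data), len(names)):
        for name, value in zip(names, data[i:i + len(names)]):
            loop_ctx.set(name, value)  # ticked names fall through to outer
        body(loop_ctx)

outer = Ctx(["x", "y"])
outer.set("x", 10)
outer.set("y", 20)

seen = []
for_each(["'x", "y"], [1, 2, 3, 4],
         lambda c: seen.append((c.get("x"), c.get("y"))), outer)

print(seen)            # pairs the body observed
print(outer.get("x"))  # reused variable was overwritten by the loop
print(outer.get("y"))  # fresh variable left the outer Y alone
```

Note that the un-ticked `y` costs an extra hop on every lookup in this model, which is the kind of overhead that reusing a variable avoids.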

But I've mentioned that for reasons of one's "formal" senses (or necessarily suppressing the behavior of a left-quoting WORD!) you might want to use quoted words as the argument to FOR-EACH anyway. On top of that, I don't know that we've come to a full consensus that the quote shouldn't be required...though I'm leaning heavily toward saying it shouldn't be.

So what other choices are there? GET-WORD! is already taken to mean "subvert quoting, this word holds the name of the variable(s) to use"... so for-each :x is a shorthand for for-each :(x). That leaves things like:

for-each /x [...] [...]
for-each .x [...] [...]
for-each @x [...] [...]

None of these really scream "use an existing variable" to me. And for-each/reuse wouldn't let you mix and match.

The spec dialect itself could be fancier...maybe importing things #with, and you could use commas to break it up:

for-each [#with x] [...] [...]  ; reuse X
for-each [#with x, y] [...] [...]  ; reuse X, virtual bind Y
for-each [#with x y] [...] [...]  ; both X and Y are reused

Any Other Ideas? Comments?

I'm going to assume everyone agrees that virtual binding should be the default, giving a Rebol2/R3-Alpha-like experience. If you didn't agree with that, and wanted Red's behavior as default, you might suggest things like for-each x: [...] to request a new variable. But that's flat-out confusing, and for-each 'x: [...] isn't much of an improvement.

I'm kind of bummed the distinction of for-each x vs. for-each 'x isn't an option. Something about that seemed nice and clean.

We're still shaping up exactly what the @-family does...in particular whether it's going to mean anything w.r.t. datatypes. Maybe that's not its destiny. So the best option is probably going to be for-each @x. It leans somewhat on the "as-is" meaning (use the variable as-is).

You won't expect to see this all that often, unless someone is trying to optimize something for performance.

rgchris · December 29, 2020, 5:07pm

I'm not quite clear why for-each x vs for-each 'x isn't an option—it does seem like the cleanest choice.

hostilefork · December 29, 2020, 5:27pm

Please see "Speaking With Tics" (and have an opinion on it!)

rgchris · December 29, 2020, 6:12pm

My conservative hat would say that the proposals therein aren't worth the cleaner expression up front. It's difficult to be certain due to a dearth of code written with those idioms—they may arguably be more expressive on the whole, but they come at a cost.

That said, I'm not wholly a conservative (that's just the part that is still only productive in Rebol 2), so I'm open to seeing how this would resolve.

hostilefork · December 29, 2020, 8:14pm

Helps to be more specific. It's a long post.

You think t: 'type of x is preferable on the whole to t: type of x?

I'd think that you would side on the idea that words should be the currency with as little decoration as possible.

rgchris · December 29, 2020, 9:58pm

I'd say if type of pushes the cost of clarity elsewhere, then it's more problematic than it's worth.

If for-each 'thing [...] [...] is the natural way to express this behaviour (which it seems so to me—at least compared to the alternatives), then it'd be worth revisiting type of.

I do. I see quoted words in the for-each example as a reasonable tradeoff to express the difference in behaviour moderately consistent with other uses in the language—especially if that is the exceptional case.

hostilefork · February 10, 2021, 8:53am

This question of a syntax for subverting virtual binding comes up again when doing LET, due to multiple returns.

If you want to define two all new variables to receive the results of some calculation, that can work:

let [value pos]: parse "aaaabbb" [some "a"]

But what if only one of these variables is new, and you want the other to be assigned to the existing binding? You need some way of using the position, but escaping the LET.

Quoting words remains a possibility.

let [value 'pos]: parse "aaaabbb" [some "a"]

Which would wind up acting like:

let value
[value pos]: parse "aaaabbb" [some "a"]

I've gone ahead and implemented this. The more general rule is that whatever it is you quote gets unquoted by one level...the LET assumes the quoted thing is not for its consumption, but for whatever is going to be consuming the SET-BLOCK! later. That could be a multi-return, or anything else that left-quotes SET-BLOCK!s.
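The "unquote by one level" rule can be sketched in a few lines of Python. This is only a model of the rule as described here, not the actual implementation: spec items are plain strings, a leading `'` stands for one quote level, and `let_set_block` is a hypothetical name.

```python
def let_set_block(spec):
    """Split a LET set-block spec into the new variables LET creates
    and the block left behind for whatever consumes the SET-BLOCK!."""
    new_vars = []
    remaining = []
    for item in spec:
        if item.startswith("'"):
            remaining.append(item[1:])  # drop exactly one quote level
        else:
            new_vars.append(item)       # LET makes a fresh binding for this
            remaining.append(item)
    return new_vars, remaining

new_vars, remaining = let_set_block(["value", "'pos"])
print(new_vars)   # variables LET declares
print(remaining)  # what the multi-return sees: [value pos]

# A doubly quoted item keeps one tick for the later consumer.
_, passed_on = let_set_block(["''pos"])
print(passed_on)
```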

If this sticks, it would be consistent if the FOR constructs discerned ticks on the argument to mean "reuse". If you put a tick on a GROUP! that could mean to reuse a calculated variable:

for-each (first [x]) [...] [...]   ; create new binding labeled X
for-each '(first [x]) [...] [...]   ; reuse whatever binding is on the X

There are only two seeming arguments against this:

People might think FOR-EACH 'X makes more sense because you are not "using" the X, and the callsite should reflect that. Hence FOR-EACH X should either be illegal, or pass in the word/block/whatever that X holds.
People might think FOR-EACH 'X and FOR-EACH X should mean the same thing, with people being able to have the tick at the callsite if they know the thing they are using is a left-quoting operation which might otherwise quote the FOR-EACH before the FOR-EACH can quote it.

If you happen across a left-quoting situation, you can still beat it with for-each ('x)...or for-each 'x if you were planning to reuse the binding anyway.

Anyway...the gravity is now on the side of saying a tick reuses the existing binding. At least there's a philosophy and explanation behind the choice.

hostilefork · February 16, 2021, 12:17pm

I'll point out that this is yet another defeat for the idea of using a BLOCK! as the state for single-stepping.

let [x 'y]: whatever

If you single step, the LET will create a virtual binding for X and leave Y alone. Then LET itself vanishes. What remains is:

[x y]: whatever  ; with virtual binding on x

If the evaluator is limited to having to return the state in terms of a position in the original block, it wouldn't have the option of giving this back as the state for the next step... since there is no [x y]: there. Just [x 'y]:
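Here is a small Python model of why a position can't express that state. It assumes a made-up `Frame` record and a `step_let` function that handles only this one case; neither corresponds to a real API, and the source is modeled as a flat list of tokens.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    pending: list                                  # what is left to evaluate
    bindings: dict = field(default_factory=dict)   # bindings made so far

def unquote_once(word):
    return word[1:] if word.startswith("'") else word

def step_let(source):
    # Models one evaluation step over `let [x 'y]: whatever`.
    keyword, spec, *rest = source
    if keyword != "let":
        raise ValueError("this sketch only models a LET step")
    frame = Frame(pending=[[unquote_once(w) for w in spec], *rest])
    for w in spec:
        if not w.startswith("'"):
            frame.bindings[w] = None   # new binding for un-ticked words
    return frame

source = ["let", ["x", "'y"], "whatever"]
frame = step_let(source)

# Can any "position in the original block" describe what remains?
representable = any(source[i:] == frame.pending
                    for i in range(len(source) + 1))

print(frame.pending)   # the rewritten [x y]: whatever
print(representable)   # no tail of the original matches it
```

The rewritten set-block exists only inside the frame, so a stepper that may hand back only an index into `source` has nothing valid to return.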

My feeling is that the evidence is pretty strong on the side of saying evaluator state should be more flexible than a BLOCK!. We must keep looking at how to empower code that needs to single-step (I've described ASSERT and TLS EMIT) but accept that what comes back is a FRAME! and not the previous simplistic BLOCK! of a position that the historical DO/NEXT had.