LET Binding? (A Limited Form of Virtual Binding?)

hostilefork · December 15, 2020, 7:55am

I've previously lumped LET in with the question of how we might avoid copying loop bodies every time the loop runs.

They seem like similar problems. You have a named variable (or variables) that you want to allocate space for, and then bind to code that hadn't previously been aligned with an interest in those variables.

But maybe there's value to thinking about LET as a separate problem.

Today's "Fake" LET

Consider the following situation:

foo: func [code] [
    let a: 1
    do compose [a + (spread code)]
]

>> a: 1000
>> foo [a + 20]
== 1021

In today's world, LET is a no-op. All it does is signal FUNC...while it's making a "relativized" copy of the body...to collect a in its list of locals. (That's exactly how old FUNCTION once looked for SET-WORD!s.) So the above is equivalent to saying foo: func [code <local> a]. It's considered less harmful since we use LET more narrowly than we use SET-WORD!

Though a new copy of the function body is created, it's a copy that is re-used every time the function runs. That allows frames to be resolved against the blocks in a cascading fashion as they are delved into.

So the reason you see 1021 and not 22 is that the walk of FOO's body doing the local binding for a never saw [a + 20], it just saw code. The composed block is [a + a + 20], where the first a is the local (1) and the second is the global (1000). Manually binding to the local a after composition would force both notions of a to match, giving 1 + 1 + 20:

foo: func [code] [
    let a: 1
    do bind (compose [a + (spread code)]) 'a
]

>> a: 1000
>> foo [a + 20]
== 22

That's lossy...and it's the reason you don't see many people messing with BIND; they just leave the bindings as-is. Incidental name sharing can create havoc outside of carefully controlled circumstances. You're better off saying DO CODE instead of composing material live into places where the bindings can get mucked with.

One Small New Idea For a Virtual LET

The "real" LET idea considered that LET isn't a no-op, but that it somehow augments the context as it runs...making the variable it adds visible in the flow after it:

foo: func [code] [
    ; at this point, there is no `a` defined
    let a: 1
    ; going into the DO statement, there's an ambient new `a` that
    ; is in effect now
    do compose [a + (spread code)]
]

One issue with virtual binding is the cost of consulting a side-table for what might be every word lookup...to know if it has been overridden or not.

But another issue is semantic. How would it know which a to override, and which to leave alone? If it overrides all the As, then it's like we've got that lossy BIND operation running on everything.

However...what if we don't think of LET as being fully general? For starters, let's imagine a LET that only can be used if it occurred inside of a relativized function body (as this one does). It could then say that its override only applies to other A that also originated out of the original function body.
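To make that restriction concrete, here is a rough model in JavaScript (used only as a neutral modeling language; this is not Ren-C code or its internals). The names `makeWord`, `lookup`, and the `origin` tag are made up for illustration. Each word carries a tag for the body it was relativized from, and a LET override only applies when the tags match. Note that every lookup still has to consult the side-table, which is the cost mentioned above.

```javascript
// Hypothetical model: a "word" is a spelling plus a tag saying which
// function body (if any) it was relativized from.
function makeWord(spelling, origin) {
  return { spelling, origin };
}

// Stand-in for the user context, where `a: 1000` was set.
const globals = new Map([["a", 1000]]);

// `overrides` is the side-table of LET variables created while running
// the body identified by `bodyId`.
function lookup(word, overrides, bodyId) {
  // The override applies only to words that came out of the same body.
  if (word.origin === bodyId && overrides.has(word.spelling)) {
    return overrides.get(word.spelling);
  }
  return globals.get(word.spelling);
}

const FOO_BODY = Symbol("foo-body");
const lets = new Map([["a", 1]]);        // effect of `let a: 1`

const bodyA = makeWord("a", FOO_BODY);   // the `a` written in FOO's body
const callerA = makeWord("a", null);     // the `a` inside the passed-in code

// Models evaluating the composed block [a + a + 20].
const sum =
  lookup(bodyA, lets, FOO_BODY) + lookup(callerA, lets, FOO_BODY) + 20;
console.log(sum);  // 1021
```

Under this model the caller's `a` keeps its original meaning, so the result matches the non-lossy 1021 rather than what a blanket BIND would give.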

It's a mechanic that would work on a somewhat limited basis. New LETs that you conjure out of thin air won't work. But then again, "pre-scanning for SET-WORD!s" didn't let you just suddenly fabricate new code once the function was running and gather set-words as local. At least here you could get an error... "your LET is too late, not relativized in a function body."

So in the above you could say do compose [let a: 10, a + (spread code)] but you couldn't put a LET in the code, compose it in at runtime, and have it work. At least it wouldn't work the same way. But I think perhaps the best idea is to have it not work at all.

Performance And Semantic Implications

At first glance this looks a lot like what would happen if you gather things locally to the frame. Though one difference is that the runtime nature of LET would scope its influence to blocks:

a: "global"

foo: func [] [
    block: [print a]
    do block  ; prints "global"
    loop 2 [
        let a: "local"
        do block  ; prints "local"
    ]
    do block  ; prints "global" again
]

That also points out a cost factor: with that LET happening each time through the loop, you are generating a new unique identity.

two-as: func [] [
    result: collect [
        repeat n 2 [
            let a: n
            keep 'a
        ]
    ]
]

>> data: two-as
== [a a]

>> reduce data
== [1 2]

Even keeping an open mind, that's pretty brutal to think about. Also, these LETs would be strangely promiscuous: willing to bind against any relativized material from the function body, despite each having its own identity.

A tidier notion might be to say that the dynamic frame would only create one instance of these LETs. It would still have the "global" and "local" distinguishing behavior.
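For comparison, JavaScript has both behaviors, and the difference is easy to see with closures. This is only an analogy (closures standing in for the kept words), not a claim about how Ren-C would implement either option. The function names are made up.

```javascript
// A block-scoped `let` inside a loop body makes a fresh binding on each
// pass, like the "new identity per iteration" LET.
function perIteration() {
  const getters = [];
  for (let n = 1; n <= 2; n++) {
    let a = n;
    getters.push(() => a);
  }
  return getters.map((g) => g());
}

// One variable for the whole call, like "one instance per frame".
function oneInstance() {
  const getters = [];
  var a;
  for (let n = 1; n <= 2; n++) {
    a = n;
    getters.push(() => a);
  }
  return getters.map((g) => g());
}

console.log(perIteration());  // [ 1, 2 ]
console.log(oneInstance());   // [ 2, 2 ]
```

The first result corresponds to the [1 2] outcome above; with a single instance, both kept references would see the last value assigned.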

Anyway, just a little more thinking about the possibilities. There might be something here to thinking that LET runtime augments frames in a scoped way, but if the LET creates new variable identities on every loop iteration it seems to cause more problems than it would solve.

hostilefork · February 2, 2021, 7:42pm

I might have an idea for making this work at least a little better. We could attach the LETs in an additive fashion to the frame of the block they are being used in.

Blocks have a place to put this information because while developing stackless, I had to throw out a previous efficiency trick. The trick was that when a BLOCK! was running, there wouldn't be a dynamically referrable frame to talk about.

So that would mean if you ran:

do [((do [breakpoint]))]

You could refer to a FRAME! for the outer DO, and a FRAME! for the inner DO. But the BLOCK!s and GROUP!s...even though they had corresponding "stack levels"...did not have an object (a FRAME!) that user code could use to refer to them.

Hence, for better or worse, these frames started paying more cost, for...heap nodes. So these nodes are available to serve as the place where LET variables live.

A BLOCK! might choose to cache the number of LETs it saw previously, to make a better guess for how much space to allocate when it sees them next time. There's not a lot of bits to go around for this (with filename and line number already chewing up bits). But since arrays get allocated in multiples of 2 in the memory pool, you could cover the most common cases with just a few bits.

Virtual Visibility Problems

This would imagine that if a block had any LETs in it, that you could use the same virtual bind element. In fact, the BLOCK!/GROUP! frame node itself could be the virtual bind "patch"...it would just need to be able to store the links for the chain somewhere. Solvable problem.

What's harder would be discerning which LET bindings were in effect, and which weren't, if you had LETs in the middle of blocks. e.g.:

 a: 10
 b: 20
 (let a: 1000, code: [a + b], do code, let b: 2000, do code)

Should that print 1020 twice...or 1020 and then 3000?

If you want it to be 1020 twice, there'd need to be something in the binding broadcast to code to say the B in [a + b] was captured between those LETs. This would mean the link in the virtual binding chain would have to "capture the moment", e.g. by having a high-water mark on how many fields had been added to the block. That's basically a new identity, and you have to pay for that identity somehow.

Without that high-water mark, you'll have later LETs coming in and influencing extant bindings. So the 1020 and 3000 situation.
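A small JavaScript model of the two options, assuming a block's LET variables live in one growing list (the `Patch` class, `capture`, and `resolve` are hypothetical names, not anything from the actual implementation):

```javascript
// Hypothetical model: one patch per block, holding its LETs in order.
class Patch {
  constructor() {
    this.fields = [];  // entries of the form { name, value }
  }
  addLet(name, value) {
    this.fields.push({ name, value });
  }
  // Capture a binding. With a mark, remember how many fields exist now.
  capture(withMark) {
    return { patch: this, mark: withMark ? this.fields.length : Infinity };
  }
}

const globals = { a: 10, b: 20 };

function resolve(name, binding) {
  const fields = binding.patch.fields;
  // Only fields below the high-water mark are visible to this binding.
  const limit = Math.min(binding.mark, fields.length);
  for (let i = limit - 1; i >= 0; i--) {
    if (fields[i].name === name) return fields[i].value;
  }
  return globals[name];
}

function run(withMark) {
  const patch = new Patch();
  patch.addLet("a", 1000);                 // let a: 1000
  const code = patch.capture(withMark);    // code: [a + b]
  const first = resolve("a", code) + resolve("b", code);
  patch.addLet("b", 2000);                 // let b: 2000
  const second = resolve("a", code) + resolve("b", code);
  return [first, second];
}

console.log(run(true));   // [ 1020, 1020 ]
console.log(run(false));  // [ 1020, 3000 ]
```

The mark is one extra number per captured binding, which is the "new identity" cost in miniature.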

But...How Bad Would That Be, Though?

JavaScript's LET does the same thing:

> var x = 10
> var y = 20
> function foo() {
    let x = 1000
    let f = function() { return x + y; }
    let y = 2000
    return f()
}

> foo()
<- 3000

The variables aren't actually set until they're reached, but their "binding" comes from their scope.
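The "not set until reached" part can be seen by calling the function before the inner `let y` line has run. The name already resolves to the inner variable by scope, so the call does not fall back to the outer `y`; it raises a ReferenceError instead. The function names here are made up for the demonstration.

```javascript
var y = 20;

function callTooEarly() {
  const f = () => y;   // refers to the inner y, by scope
  let result;
  try {
    result = f();      // inner y exists but is not initialized yet
  } catch (err) {
    result = err.name;
  }
  let y = 2000;
  return result;
}

function callAfter() {
  const f = () => y;
  let y = 2000;
  return f();
}

console.log(callTooEarly());  // ReferenceError
console.log(callAfter());     // 2000
```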

This burns us on caching, however. It means objects would need something like "tick on which last expansion of object was done" and then virtually bound things would need "tick of last cache".

Anyway...I just wanted to mention the possibility of piggybacking onto an already-existing node. Could make things slightly more in the realm of practicality.