LET there be VARs (?)


#1

I have been increasingly thinking of being able to accomplish this:

>> x: 10 (use [x] [x: 20 print [x]]) print [x]
20
10

With code that looks like this:

>> x: 10 (var x: 20 print [x]) print [x]
20
10

How would this mystical VAR work, you ask? Well you can do it poorly today by making VAR a variadic which takes all the ensuing code in its feed, turns it into a block, binds that block, and evaluates it. It’s pretty ugly, though…inefficient, plus you’ll have VAR on the stack while the rest of the code is running.

We could do better, though, if VAR did its binding operation on the variadic feed itself…and the evaluator picked up the hint. e.g.:

 var: function [name [set-word!] args [any-value! <variadic>]] [
     evaluate/set args name // this is just an upgraded do/next
     bind args name
     return get name
 ]

So all the VAR did was the assignment, and then it gets off the stack. But the evaluator could notice the binding request on the feed. Now it knows that as it goes along, it needs to weave in this extra binding information. That weaving is essentially the problem virtual binding is supposed to address.
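To make the "weaving" idea concrete, here is a minimal sketch in Python rather than Ren-C. It is only a toy model under assumptions: the `Env` class, the `run` function, and the tuple-based instruction format are all invented for illustration and are not how the evaluator or VARARGS! feeds actually work. The point it shows is that a declaration can return immediately while the rest of the current group is evaluated against an extra layer of binding.

```python
class Env:
    """One layer of variables, chained to the layer it was woven on top of."""
    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def holder(self, name):
        """Return the nearest layer that defines `name`, or None."""
        env = self
        while env is not None:
            if name in env.vars:
                return env
            env = env.parent
        return None

    def root(self):
        env = self
        while env.parent is not None:
            env = env.parent
        return env


def run(instructions, env, out):
    """Evaluate a group; a "var" only affects what follows it in this group."""
    for op, *args in instructions:
        if op == "var":
            name, value = args
            env = Env(parent=env)          # weave a new layer into the feed
            env.vars[name] = value
        elif op == "set":
            name, value = args
            target = env.holder(name) or env.root()
            target.vars[name] = value      # assign where the name already lives
        elif op == "print":
            (name,) = args
            out.append(env.holder(name).vars[name])
        elif op == "group":
            (body,) = args
            run(body, env, out)            # layers added inside do not leak out
    return out


# Models: x: 10 (var x: 20 print [x]) print [x]
program = [
    ("set", "x", 10),
    ("group", [("var", "x", 20), ("print", "x")]),
    ("print", "x"),
]
result = run(program, Env(), [])
print(result)
```

Because `env` is rebound only inside the call that is processing the group, the extra layer disappears when the group ends, which is the scoping behavior described for VAR.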

I think this is pretty interesting. Should it be called LET, or VAR?


#2

Although I would never name a word (variable) VAR, VAR seems a bit pseudocode-like: it would probably be read as a generic “variable” rather than “variadic”.
So while I prefer VAR for its intuitiveness (it does seem more direct in meaning), I think the usage here warrants a more formal word like LET.


#3

That is the intended meaning. var x: 1 is only variadic as an implementation detail of how it manages to pass its “variable declaration” on to influence the stream of code that comes after it.

But…can a “virtual” approach actually work?

Bindings Need to be Preserved

Rebol constantly passes blocks of symbolic code from one routine to another routine to execute or inspect. Even though the implementation of the called routine may have some local variables expressed symbolically with the same name–there can’t be interference.

do-code: func [code <local> x] [
    x: 2 | do compose [-- "DO-CODE's x is" (x) | (code)]
]

my-print: func [arg <local> x] [
     x: arg | do-code [-- "MY-PRINT's x is" x]
]

If you call my-print 1, you will get:

-- "DO-CODE's x is" x: 2
-- "MY-PRINT's x is" x: 1

This is foundational. It has to be the case that inside the body of DO-CODE, both x’s preserve their intended binding–so long as DO-CODE doesn’t “mess with x” inside a block that defines x.
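For readers who think in other languages, the same guarantee can be sketched loosely with Python closures. This is an analogy only: Python captures a lexical scope, whereas Rebol words each carry their own binding. The function names simply mirror the example above.

```python
def do_code(code):
    x = 2                                  # DO-CODE's own local x
    lines = [("DO-CODE's x is", x)]
    lines.append(code())                   # the passed-in code still sees its own x
    return lines


def my_print(arg):
    x = arg                                # MY-PRINT's own local x
    return do_code(lambda: ("MY-PRINT's x is", x))


report = my_print(1)
for label, value in report:
    print(label, value)
```

Both functions use the name `x`, yet the code handed to `do_code` keeps referring to the `x` of `my_print`.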

Can a naive VAR implementation do that?

Let’s try the same thing using VAR instead of <local>:

do-code: func [code] [ // now no `<local> x`
    var x: 20 | do compose [-- "DO-CODE's x is" (x) | (code)]
]

my-print: func [arg] [ // also no `<local> x`
     var x: arg | do-code [-- "MY-PRINT's x is" x]
]

An inefficient implementation of VAR could build a block out of everything that came after it, bind it, and DO it. That would work because there’d be a clear moment in time where it ran. It would essentially do the same thing as USE, so it would do this transformation:

 var x: arg | do-code [-- "MY-PRINT's x is" x]
 => use [x] [x: arg | do-code [-- "MY-PRINT's x is" x]]

 var x: 20 | do compose [-- "DO-CODE's x is" (x) | (code)]
 => use [x] [x: 20 | do compose [-- "DO-CODE's x is" (x) | (code)]]

In today’s world, that’s quite a lot slower than using a fixed binding to a local. The USE has to deep copy and rebind the block, create a new separate context, bind the Xs into it, etc.
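The cost described above can be sketched in Python as a toy model. The `Word` class and the `use` function here are hypothetical stand-ins, not the real implementation: they only show the three steps named in the paragraph (new context, deep copy, rebind).

```python
class Word:
    """A symbol plus the context (a dict) it is bound to; a toy stand-in."""
    def __init__(self, name, context=None):
        self.name = name
        self.context = context


def use(names, block):
    """Model USE: make a context, deep copy the block, rebind matching words."""
    context = {name: None for name in names}

    def copy_and_bind(items):
        copied = []
        for item in items:
            if isinstance(item, list):
                copied.append(copy_and_bind(item))      # nested blocks are copied too
            elif isinstance(item, Word):
                bound_to = context if item.name in context else item.context
                copied.append(Word(item.name, bound_to))
            else:
                copied.append(item)
        return copied

    return context, copy_and_bind(block)


outer = {"x": 10}
body = [Word("x", outer), [Word("x", outer), Word("print", outer)]]
context, rebound = use(["x"], body)
```

Every word in the body is visited and every nested block is duplicated, even though only the `x` words end up pointing somewhere new. That per-call walk is the overhead a fixed binding to a local avoids.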

But yes, it works. Outside of being slower, you wouldn’t know a difference from an explicit <local> or gathered one… except that the X would only apply from where its VAR was declared up to the end of the block or group it’s in. And this is actually a fairly desirable property; if you want it to apply more widely, you apply it at a more outer scope.

But could a savvier VAR work efficiently?

Let’s imagine a VAR that doesn’t actually bind anything in-the-moment. Instead it puts a little bit of “magic” onto the executing frame, via the suggested trick of binding a VARARGS! feed. This would end up gluing the magic onto values as the frame went along in processing its current level.

I’ve used ~1 and backslashes to denote the augmentation of the ensuing values:

 var x: arg | do-code [-- "MY-PRINT's x is" x]
 => ~1\do-code\ ~1\[-- "MY-PRINT's x is" x]\

So now, when DO-CODE is passed the BLOCK!, that block has some extra baggage. Its own body has a VAR, that puts its magic on too. We’ll denote that with ~2:

 var x: 20 | do compose [-- "DO-CODE's x is" (x) | (code)]
 => ~2\do\ ~2\compose\ ~2\[-- "DO-CODE's x is" (x) | (code)]\

This magic is supposed to be a no-op on anything it doesn’t apply to, so ~1\do-code\ just runs DO-CODE with its existing binding, same for the DO and COMPOSE.

So our interesting question is what happens when the “magic” conflicts. Whatever ~ means, it can’t mean “override any binding of anything under me named x”. This would risk ~2 beaming its influence down into the composed code and overriding the ~1 that is intended to guarantee MY-PRINT's x is 1.
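The hazard can be shown with a small Python model. Everything in it is an assumption made for illustration: `Patch`, `Word`, `attach`, and `naive_lookup` are invented names, and the "most recently attached patch wins" rule is exactly the naive rule being warned against, not a proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    label: str        # "~1" or "~2" in the notation above
    name: str
    value: object


@dataclass(frozen=True)
class Word:
    name: str
    patches: tuple = ()


def attach(patch, item):
    """Glue a patch onto a word, or onto everything inside a block."""
    if isinstance(item, list):
        return [attach(patch, sub) for sub in item]
    if isinstance(item, Word):
        return Word(item.name, item.patches + (patch,))
    return item


def naive_lookup(word):
    """Naive rule: the most recently attached patch for this name wins."""
    for patch in reversed(word.patches):
        if patch.name == word.name:
            return patch.value
    raise NameError(word.name)


p1 = Patch("~1", "x", 1)      # MY-PRINT's var x, with arg = 1
p2 = Patch("~2", "x", 20)     # DO-CODE's var x

code = attach(p1, ["MY-PRINT's x is", Word("x")])            # block given to DO-CODE
composed = ["DO-CODE's x is", naive_lookup(attach(p2, Word("x")))] + code
running = attach(p2, composed)                               # ~2 beams down into the splice

seen = naive_lookup(running[3])
print(seen)
```

Under the naive rule the spliced-in `x` resolves to DO-CODE's value instead of the 1 that MY-PRINT intended, because `~2` was attached after `~1`. Any workable meaning for the magic has to avoid this outcome.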

I don’t offhand have the answer to this…

…and don’t have time to wrestle it today. But this is the phrasing of the problem, and it resembles other problems that mechanics have been created to solve (definitional returns and derived binding both use variations of this “magic”.)

I’m getting the feeling that if there were a good solution to it, it could well be a better answer to how to deal with locals than the “locals-gathering” that FUNCTION does. While being able to do such gathering may make sense for some constructs–and Rebol should certainly be able to do it–it has a number of huge downsides when it collects things it shouldn’t. It would be a real victory to get down to just one type of FUNCTION…with a working VAR…that’s still in line with Rebol’s model and permits the cool feats.


#4

Looking at this in practice there are some unfortunate limitations.

while [var x: find blah blah] [...]

Not only do you not want to be declaring the variable every time you do a loop iteration, the x would not be visible in the body–which is where you’d likely want to use it–because its scope would end at the end of the condition block. You’d have to write:

var x
while [x: find blah blah blah] [...]

That’s not the end of the world, and possibly worth it if everything else can be made to work efficiently.

The issue of safety does come up with locals that aren’t gathered, which is one of the problems historically with FUNC. While today’s FUNCTION may err on the side of creating too many locals for the function frame (especially when defining objects or nested functions inside the body), it does keep you from writing stray values into the global context.

This suggests that even in a world of working VARs, it may be worth keeping a distinct FUNC and FUNCTION, where FUNCTION would by default unbind every SET-WORD! in the body that wasn’t part of the arguments, <local>s, <with>s, or <static>s.

In any case, it’s still a promising idea…but this shows a couple more considerations to add into the pile.


#5

Another name might be LOCAL.

That said, I think being able to switch gathering locals on and off in FUNCTION itself is a real asset.


#6

Everybody used FUNC all of the time; FUNCTION was hardly ever used. There was a reason for that, and in Red FUNCTION was reclaimed to behave like FUNC.
I like the automatic gathering of locals very much, certainly for the many quick scripts one makes. When scripts should be used in a production environment, and being maintained, I can see code review demanding declaration.
As Ingo points out, to be able to switch this behaviour on and off there could be some tag like to be added when automatic gathering should be off, forcing the programmer to declare the variables beforehand. If it should be itself if you already declare some locals has to be considered; perhaps in a testing phase one wants to be able to add some helper words to the local context without needing to declare them, which also makes sure they are not left in the declarations when shipping.


#7

One thought that crossed my mind: for debugging it would be great to be able to inspect which words have been gathered. Is this possible?


#8

This has been part of the design of FRAME!. The idea is that with frame contexts you can reflect and work with actions in all kinds of ways, both in the debugger and as part of your creation of runtime constructs.

foo: function [return: [block!] arg] [
   local: 10
   frame: binding of 'return // idiomatic choice...could use arg, local, frame
   return words of frame
]

>> words of :foo
== [arg] // asking ACTION! gets only the interface

>> foo 20
== [arg local: frame: return:] // asking FRAME! gets locals as SET-WORD!

(Note: Be merged up before trying this…the ability had atrophied somewhat. While FRAME! has to be in working order for pretty much any of the system to run, frame reflection can get all kinds of confused. Thanks to this post, there is now one test–which is infinitely better than zero. But more would be great if anyone experiments and wants to put those experiments in as tests of what-should-work.)
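As a rough point of comparison only, CPython draws a similar line between a function's interface and a running call's frame. The sketch below uses the standard `inspect` module. It is an analogy to the `words of :foo` versus `words of frame` distinction above, and says nothing about how FRAME! is implemented.

```python
import inspect


def foo(arg):
    local = 10
    frame = inspect.currentframe()         # the frame of this running call
    return set(frame.f_locals)             # every name bound in that frame


interface = list(inspect.signature(foo).parameters)   # only the interface
in_frame = foo(20)                                     # arguments plus locals

print(interface)
print(sorted(in_frame))
```

Asking the function object yields just `arg`, while asking the live frame also reveals `local` and `frame`.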

This could be a weakness for VAR, though…

If you can virtually add locals to the binding stream, how would that be made visible? Would the frame appear to be augmented? Would that frame have a different identity than a non-augmented frame?

I hadn’t thought about this aspect of the question, so good that you brought it up! Will have to take it under consideration.