R: a very Rebol-like language

bradrn · January 6, 2024, 7:49am

Since my first post above, I’ve been contemplating cases such as this:

> add_quasi <- function(arg1, arg2) quo({{ arg1 }} + {{ arg2 }})
> test <- function(arg) {
+     x <- 10
+     add_quasi({{ arg }}, x)
+ }
> x <- 5
> eval_tidy(test(x))
[1] 15

Here, add_quasi creates a quosure from both its arguments (using rlang’s quasiquotation syntax). test then passes its argument into add_quasi, alongside a local variable x. Finally, we call test with the global variable x. This gives a quosure containing the code x + x, and evaluating it computes 5 + 10, i.e. 15.

Now, in Rebol this wouldn’t be too surprising, since each WORD! has its own binding. In R, however, it is: quosures associate one environment with a whole syntax tree, so we might expect it to evaluate to either 5 + 5 or 10 + 10. How does this work?

The answer becomes obvious if we print the quosure itself:

> result <- test(x)
> result
<quosure>
expr: ^(^x) + (^x)
env:  0x55dc115b55e8

This is a nested quosure! And of course, each one carries around its own environment:

> quo_get_expr(result)[[2]]
<quosure>
expr: ^x
env:  global
> quo_get_expr(result)[[3]]
<quosure>
expr: ^x
env:  0x55dc11710650

So the first x here is global, while the second is looked up in a local environment. (And the top-level quosure has a different environment yet again, in which + is looked up.)

If we choose, we can also collapse it into one bare expression, discarding the inner environments, so that it is all evaluated in the same scope (here the global one, giving 5 + 5):

> squashed <- quo_squash(result)
> squashed
x + x
> eval_tidy(squashed)
[1] 10

In fact, we should really ask rlang to warn us about such a destructive operation:

> quo_squash(result, warn=TRUE)
x + x
Warning messages:
1: Collapsing inner quosure 
2: Collapsing inner quosure
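
The two behaviours above (per-quosure environments, and squashing) can be modelled in a few lines of Python. This is only a toy sketch, not how rlang is implemented: the names Quo, eval_tidy and quo_squash are stand-ins borrowed from rlang, environments are plain dicts, and + is hard-coded instead of being looked up in the outer quosure's environment.

```python
from dataclasses import dataclass


@dataclass
class Quo:
    """Toy quosure: an expression paired with the environment to evaluate it in."""
    expr: object  # a symbol (str), a number, a Quo, or a tuple ("+", lhs, rhs)
    env: dict


def eval_tidy(node, env):
    """Evaluate node in env; a nested Quo switches to its own environment."""
    if isinstance(node, Quo):
        return eval_tidy(node.expr, node.env)
    if isinstance(node, str):
        return env[node]
    if isinstance(node, tuple):
        op, lhs, rhs = node
        if op != "+":
            raise ValueError("this sketch only knows about +")
        return eval_tidy(lhs, env) + eval_tidy(rhs, env)
    return node  # literal value


def quo_squash(node):
    """Strip every Quo wrapper, throwing the environments away."""
    if isinstance(node, Quo):
        return quo_squash(node.expr)
    if isinstance(node, tuple):
        return (node[0],) + tuple(quo_squash(arg) for arg in node[1:])
    return node


global_env = {"x": 5}      # x <- 5 at top level
test_env = {"x": 10}       # x <- 10 inside test()
add_quasi_env = {}         # environment of the outer quosure

# Shape of the quosure printed as ^(^x) + (^x)
result = Quo(("+", Quo("x", global_env), Quo("x", test_env)), add_quasi_env)

nested_value = eval_tidy(result, global_env)       # each x uses its own env
squashed = quo_squash(result)                      # bare x + x
squashed_value = eval_tidy(squashed, global_env)   # both x are now global

print(nested_value, squashed, squashed_value)
```

In this model the nested quosure evaluates to 15, while the squashed expression evaluates to 10, matching the transcripts.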

Actually evaluating nested quosures in R is slightly non-obvious, since quosures aren’t part of the base language. Insofar as I can tell, rlang implements eval_tidy using a bit of syntactic trickery: "quosure" is defined as a subclass of "formula", a built-in variety of quoted code written with a tilde. eval_tidy then rebinds the tilde operator so that it evaluates each quosure in its own environment.

What would such a system look like in the context of Rebol? I think it would be a system where bindings can be associated with more than one type: not just WORD!, but also BLOCK! (and presumably also GROUP!, etc.). A WORD! would be evaluated in the context of its own binding if it has one; otherwise it would get looked up in the environment of the first containing BLOCK!/GROUP! which has a binding.
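
To make the lookup rule concrete, here is a rough Python sketch of it. Everything in it is hypothetical: Word and Block are stand-ins for WORD! and BLOCK!/GROUP!, environments are plain dicts, and I assume lookup stops at the first bound container instead of continuing outwards if the name is missing there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    name: str
    binding: Optional[dict] = None  # a word may or may not carry a binding


@dataclass
class Block:
    items: list
    binding: Optional[dict] = None  # so may a block (or group)


def lookup(word, enclosing):
    """Resolve word given its enclosing blocks, ordered outermost to innermost."""
    if word.binding is not None:
        return word.binding[word.name]          # the word's own binding wins
    for block in reversed(enclosing):           # otherwise walk outwards
        if block.binding is not None:
            return block.binding[word.name]     # first bound container decides
    raise NameError(f"{word.name} has no binding")


def resolve(node, enclosing=()):
    """Return the value of every word in node, in source order."""
    if isinstance(node, Word):
        return [lookup(node, enclosing)]
    if isinstance(node, Block):
        values = []
        for item in node.items:
            values.extend(resolve(item, enclosing + (node,)))
        return values
    return [node]  # literal


outer = {"x": 1, "y": 2}
inner = {"x": 100}
special = {"y": -1}

code = Block(
    [
        Word("x"),                                   # unbound: uses outer
        Block([Word("x"),                            # unbound: uses inner
               Word("y", binding=special)],          # bound: uses special
              binding=inner),
        Block([Word("y")]),                          # unbound block: falls through to outer
    ],
    binding=outer,
)

values = resolve(code)
print(values)
```

The four words resolve to 1, 100, -1 and 2 respectively, so the same name can mean different things depending on which binding is nearest.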

Considering such a system more seriously, I believe it would cope quite easily with code like that mentioned in Rebol And Scopes: Well, Why Not?:

 global: 10
 x: <not an integer>

 wrapper: func [string] [
     return do compose [interpolate (string)]
 ]

 foo: func [x] [
     let local: 20
     return wrapper {The sum is $(x + local)}
 ]

 foo 30

Starting with foo, it would bind its body block to a new environment. Then, foo 30 would cause the block to be evaluated — the words within it don’t have their own bindings yet, so everything executes within foo’s environment. foo proceeds to create local within its environment, and then it makes a string which is bound to its environment (since strings would also require bindings in order for string interpolation to work). Then, it calls wrapper, which similarly has bound its body to its own environment. As with foo, every word in its body would be looked up in wrapper’s environment — except for the variables in string, because string already has its own binding.
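
The walkthrough above can be imitated in Python, as a sketch only. BoundString and this interpolate are hypothetical stand-ins (the latter handles nothing but a single `a + b` sum), and I am assuming that each function's environment inherits from the global one, which is modelled here with ChainMap.

```python
import re
from collections import ChainMap
from dataclasses import dataclass


@dataclass
class BoundString:
    """A string that remembers the environment it was created in."""
    text: str
    binding: ChainMap


def interpolate(s):
    """Expand $(a + b) using the string's own binding, not the caller's."""
    def expand(match):
        lhs, rhs = (name.strip() for name in match.group(1).split("+"))
        return str(s.binding[lhs] + s.binding[rhs])
    return re.sub(r"\$\(([^)]*)\)", expand, s.text)


global_env = ChainMap({"global": 10, "x": "<not an integer>"})


def wrapper(string):
    # wrapper's body is bound to wrapper's own environment, where x would
    # resolve to the global "<not an integer>" ...
    env = global_env.new_child({"string": string})
    # ... but the string carries foo's binding, so that is what gets used.
    return interpolate(env["string"])


def foo(x):
    env = global_env.new_child({"x": x})   # foo's environment
    env["local"] = 20                      # let local: 20
    return wrapper(BoundString("The sum is $(x + local)", env))


result = foo(30)
print(result)
```

Here foo 30 yields "The sum is 50": x and local come from foo's environment even though the interpolation actually runs inside wrapper.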

It also copes with this case:

Because you can still rebind whatever you want within a block. So a dialect can create its own environment, populate it with whichever words it wants, and then rebind words within a block to refer to that environment. And it all just works, because the rebound words will use their own bound environment, and the other words will use their parent block’s environment.
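
As a sketch of what such dialect rebinding might look like, here is another small Python model. Again the names are made up for illustration: rebind and value_of are hypothetical helpers, environments are dicts, and value_of only handles words sitting directly inside one bound block.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    name: str
    binding: Optional[dict] = None


@dataclass
class Block:
    items: list
    binding: Optional[dict] = None


def rebind(block, env):
    """Point every word named in env at env (in place); leave other words alone."""
    for item in block.items:
        if isinstance(item, Word) and item.name in env:
            item.binding = env
        elif isinstance(item, Block):
            rebind(item, env)


def value_of(word, block):
    """Use the word's own binding if it has one, else the parent block's."""
    env = word.binding if word.binding is not None else block.binding
    return env[word.name]


user_env = {"keep": "user's keep", "count": 3}
body = Block([Word("keep"), Word("count")], binding=user_env)

# The dialect makes its own environment and rebinds only the words it defines.
dialect_env = {"keep": "dialect keyword"}
rebind(body, dialect_env)

values = [value_of(word, body) for word in body.items]
print(values)
```

After rebinding, keep refers to the dialect's definition while count still comes from the block's environment.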

So, it seems to me like this system could just possibly work. But I’m admittedly bad at reasoning about this stuff. Are there any edge cases I’ve missed? Is there some obvious reason why this wouldn’t work?