Operational semantics of REBOL/Red/Ren-C?

I have thus far been unable to find a formal or informal document describing the operational semantics of REBOL/Red/Ren-C. Of course, one could point to the implementation of e.g. REBOL3 or Ren-C, but to my taste there are too many implementation details for the source code to satisfy this concern.

1 Like

There isn't really a middle-ground document that is somewhere between reading the evaluator source and following tutorials or examples. There probably should be, where it's a list of each datatype and what it "does".

But there is sort of a fundamental challenge in characterizing a language whose whole purpose is to bend, and be "the most freeform computing language ever invented". If someone comes and asks you "how does a SET-WORD! act", the answer is going to depend on where that SET-WORD! is.

As a simple example of the challenge, consider how you might want to tell people that a SET-WORD! evaluates one expression unit to its right, and then stores that into the word--evaluating the overall expression to the stored result. That's nice for having a way to do multiple-assignment that is clean-looking, and not some kind of special syntax:

x: y: z: 10 + 20

; acts the same as...

x: (y: (z: (10 + 20)))

This might give people comfort that after such an operation, they can state some "truth" like x = y = z.

But functions can look back one unit left, and mess with that. Here is a simplified version of DEFAULT (null-reactive only). ENFIXED means the function is marked as getting its first argument from the left. The tick mark on the 'LEFT indicates it's asking the callsite to suppress evaluation of that element:

default: enfixed func [
    'left [set-word!]
    right [block!]
][
    (get left) else [set left do right]
]

You get an interesting construct as a result...not built-in, and something that users could casually come up with:

>> x: null

>> x: default [1 + 2]
== 3

>> x
== 3

>> x: default [100 + 200]
== 3

>> x
== 3

But what if the operator returned something different than what it set LEFT to?

default: enfixed func [
    'left [set-word!]
    right [block!]
][
    (get left) else [set left do right]
    return <boo!>
]

Now there's no enforcement that x = y = z.

>> x: y: z: default [1 + 2]
== <boo!>

>> x
== <boo!>

>> y
== <boo!>

>> z
== 3

(Being able to do the left-quote of SET-WORD! is a Ren-C-ism...R3-Alpha couldn't quote left, only evaluate.)

Further with Ren-C-isms, variadics could swoop up all the SET-WORD!s and do something with them:

swoop: func ['items [<variadic> any-value!]] [
    while [set-word? first items] [  ; only one unit of lookahead (FIRST) allowed
         print ["Taking!" take items]
    ]
]

>> swoop x: y: z: 10 + 20
Taking! x:
Taking! y:
Taking! z:
== 30

So this sort of captures the greatness and terribleness of it. Quoting, dialects, and variadics pull the rug out from anyone hoping for solid ground. Should you tell people about the SET-WORD! rule first, or the quoting rule first? Who is the audience for the document, and how can it achieve the goal of being less obfuscating than the evaluator code it's trying to characterize?

There are of course some rules that keep the system running. Ren-C's interpreter design enforces things more strictly than R3-Alpha: the evaluator is limited to a sliding window in which it can only look one unit back and one unit forward (it doesn't use random seeks in the array it's evaluating, R3-Alpha does and hopes it doesn't mean multiple DO/NEXTs in a row act different than a monolithic DO to the end of the array). Leftward quotes win over rightward ones, unless the leftward construct has no rightward arguments (so you can say help -> and it doesn't think you're trying to make a lambda function with HELP as its single named argument). Etc.

But whatever characterization one put together about the rules would not look much like a lambda calculus specification.

I understand that it would look different than the lambda calculus and in my opinion it would not have to be formal to be useful (at least not initially). Your response already contains a wealth of information I had not previously seen, like the one item look-back etc.

For C we have an operational semantics based on a simplification of a computer with heap, stack, CPU, registers, instruction pointer, etc. For pure functional languages we can use graph reduction or e.g. the G-machine. It seems to me that this in currently somewhat hidden or implicit and making it more explicit seems helpful...

2 Likes

I'm definitely in favor of seeing something to this effect happen! I have my hands full with what I'm doing, but would can help how I can; there's a lot written here and I can kind of "index" it to answer most questions (often the answer is "here's the point where that issue reached a blocking point".)

To me, binding is the gaping hole in Rebol semantics. It's left out of the user guides...which consist of a lot of "what" with almost no "why". You learn how to mutate an unusual data structure, and are given some turnkey functions for internet and graphics that worked well cross-platform. It was a good zero-installation power tool for its time. Especially useful is PARSE--I don't know if you've messed with it much but it's quite a good example of a dialect in action.

Regarding binding: One of the problems that motivated me to start hacking on what became Ren-C was the issue of "definitional returns". I describe it in this post. It might be another illuminating example, for understanding the difference between Rebol2/Red and Ren-C.

More binding info:

I'd be happy to see it all replaced by better ideas, but these are the best I have so far for making the user experience that Rebol tried to give have a more legitimate basis.