How "Quoting Levels" are Implemented (Isotope, Normal, Quasi, Quoted)

hostilefork · January 4, 2024, 6:16pm

The "game" of Rebol is played with cells that are the size of four platform pointers. So on a 32-bit platform a cell is 16 bytes in size, and on a 64-bit platform they are 32 bytes in size.

I've illustrated Ren-C's spin on this "game" previously:

The bits and bytes in the header are arranged in a platform-independent way. Regardless of the endianness of the machine, the bits in the header will be in the same order. The first byte is chosen with a pattern that specifically will never occur as a leading byte in a UTF-8 sequence...allowing an arbitrary pointer to be discerned as pointing to a cell or to the beginning of a UTF-8 string.

The "payload" is specifically aligned to a 64-bit boundary on both 32-bit and 64-bit platforms. This is important if it contains something like a double precision floating point number. It is also a union, which means that if it has constituent fields, they must be read from exactly the same union definition which was used to assign them. The "extra" is separate, meaning it is decoupled from the payload and can be assigned and read on its own terms (e.g. BLOCK! and WORD! could have a "binding" in extra that is read and written in common, without invalidating their payloads).

The HEART_BYTE encodes what we would think of as the underlying datatype, and cues the interpretation of the contents of the cell. For instance the byte corresponding to a BLOCK! tells us that the payload consists of a pointer to an array of more cells as well as an index into the block.

Enter the QUOTE_BYTE

Bits in the header are scarce. And at one time, quoting was implemented by only two bits... for quoting levels of 0, 1, 2, or 3. Higher quoting levels were achieved by changing the HEART_BYTE to indicate QUOTED!, and then the payload was changed to point to a single-element array that held the quoted cell, and an integer of the quoting level up beyond millions. It was tricky to do, but it worked.

Eventually, the complex mechanics behind flipping to a different payload for higher levels of quoting was scrapped, and an entire byte in the header was sacrificed for the quote level. This permitted from 0-255 levels of quoting, and I decided that was more than enough.

When isotopes were originally introduced, there was a flag taken to say something was an isotope. However, I realized that something should not be quoted and be an isotope at the same time. Hence the isotopic state could be thought of as a special value of the QUOTE_BYTE.

Initially I chose 255 for isotopes, leaving 0-254 as the ordinary quoting levels. But the theory of isotopes evolved to where not only were there isotopes, but there needed to be a form of quoting that would produce isotopes under evaluation...so-called quasiforms. And it began to make sense to think of the isotopic state as being obviously "less" than other quoting levels, so it became 0.

What it worked out to was:

Quoting byte of 0 is an isotope: #define ISOTOPE_0 0
Quoting byte of 1 is plain not-quoted: #define UNQUOTED_1 1
Quoting byte of 2 is a quasiform: #define QUASI_2 2
Quoting byte of 3 is single quoted plain form: #define ONEQUOTE_3 3

A quoting byte of 4 is a single-quoted quasiform. e.g. there's no such thing as a quasi-quoted, just a quoted-quasi: '~foo~ is legal but ~'foo~ is not.

So the interpretation of the QUOTE_BYTE proceeds like that.

Quoting byte of 5 is a double quoted plain form
Quoting byte of 6 is a double quoted quasiform
Quoting byte of 7 is a triple quoted plain form
Quoting byte of 8 is a triple quoted quasiform

etc.

So although mechanically the byte holds a value of 0 for isotopes, it still conceptually is what we'd think of as "negative one".

Evaluator drops one level of quoting, with the base case that quasiforms produce isotopes and the normal form does whatever its evaluator rule is (WORD! looks up, etc.)

The QUOTE operator won't work on isotopes and the UNQUOTE operator won't work on quasiforms. Instead you have to use the META and UNMETA operations, which handle those exceptions but just act like QUOTE and UNQUOTE otherwise.

hostilefork · January 5, 2024, 5:33am

Sanity check. Isotopes are weird, and creating them (or tolerating them) when you didn't mean to is good to avoid.

Unstable isotopes are particularly weird...for instance isotopic BLOCK! cannot be stored in variables. In normal assignments it decays to its first element, while multi-assignments and other particular constructs interpret it as multiple values:

>> quasiform: '~[1 2]~
== ~[1 2]~

>> x: unmeta quasiform
== 1

>> [x y]: unmeta quasiform
== 1

>> y
== 2

This is how multi-return functions are implemented.

What's nice about UNQUOTE vs UNMETA is that if you are really not suspecting you're dealing with a quasiform that should become an isotope, you can avoid the potential confusion caused.

And QUOTE has a similar convenience of not tolerating an isotope when you didn't think there'd be one.

A narrower operator of QUASI and UNQUASI exists to make it clear when you're only adding or removing a quasi state. So you can QUASI only plain non-quoted things, and UNQUASI only non-quoted quasiforms. Just helps readers get their bearings on what's going on if that's all that's happening.

hostilefork · January 5, 2024, 7:29am

A post was split to a new topic: Why Have Both BLOCK! and GROUP!