QUOTED! arrives (formerly known as "lit bit")

#1

I'm pleased to say that the day has come that you can use apostrophe (sigh, apostrophe) to escape any value...not just WORD! and PATH!. And since I said any value, that means you can also quote quoted values...as deeply as you like!

>> '(1 + 2)
== (1 + 2)

>> ''(1 + 2)
== '(1 + 2)

>> quoted? first [''{double quoted text!}]
== #[true]

>> quotes of first ['''''''''''''''''<whoa!>]
== 17

It opens up a whole new box of parts for dialecting, and makes your everyday code shorter and clearer (when used properly). But don't be fooled by the fact that you can use a stupid number of quotes if you need to: this isn't a frivolous thing, and being able to truly escape any value--hence including any quoted value--is integral to the feature.

The code that most directly benefits from having a clean/fast way to put an escaping bit on any value is language bindings, such as the C and JavaScript API clients. You may not realize how important this escaping is, because average script code generally assembles things that fetch out of WORD!s...where there's no risk of "double-evaluation". Users of the libRebol API aren't so lucky. You wind up putting QUOTE calls everywhere--slowing things down, junking it up, and fundamentally changing the types and shape of what you're working with. If you write sophisticated enough Rebol routines you've almost certainly run up against this problem too--but it's an issue on nearly every call into the API.
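The "double-evaluation" risk can be pictured with a toy model. This is a Python sketch, not libRebol or Ren-C code; `Word`, `Quoted`, and `eval_slot` are hypothetical names made up for illustration. It assumes an evaluator that looks up bare words and strips one quote level from quoted items.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    name: str

@dataclass(frozen=True)
class Quoted:
    inner: object

def eval_slot(item, variables):
    """Evaluate one spliced item the way a toy evaluator might."""
    if isinstance(item, Quoted):
        return item.inner            # one quote level is consumed, nothing else happens
    if isinstance(item, Word):
        return variables[item.name]  # a bare word gets looked up
    return item                      # inert values pass through

variables = {"x": 10}
held = Word("x")                     # a value the host program already has in hand

unprotected = eval_slot(held, variables)        # looked up: the "double evaluation"
protected = eval_slot(Quoted(held), variables)  # stays the word itself
```

In this model the host only wanted to pass the word along, and the quote is what keeps it from being looked up a second time.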

The quoting implementation is optimized to the point of being nearly free for depths less than 4, using something I call "in-situ escaping". Because of how I've made the C++ build check it at compile time, it significantly reduces the risks of such a tricky performance hack. Another very neat design point allows these new "QUOTED!s" to efficiently participate in binding...if their "contained" item is bindable.
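One way to picture why shallow depths could be nearly free is a bit-packing scheme. The Python sketch below is a model under stated assumptions, not the actual Ren-C cell layout: it assumes a one-byte kind field whose low 6 bits name one of up to 64 base types and whose top 2 bits hold a quote depth of 0 to 3, with deeper quoting spilling into a separate counter. `Cell`, `KIND_INTEGER`, and the constants are hypothetical.

```python
from dataclasses import dataclass

NUM_BASE_KINDS = 64      # assumption: base types fit in 6 bits
MAX_IN_SITU_DEPTH = 3    # assumption: 2 spare bits hold depths 0..3
KIND_INTEGER = 5         # hypothetical type id

@dataclass
class Cell:
    kind_byte: int        # base kind + 64 * in-situ depth
    payload: object
    extra_depth: int = 0  # quote levels that did not fit in the byte

def quote(cell):
    """Add one quote level, in the byte when possible."""
    depth = cell.kind_byte // NUM_BASE_KINDS
    if cell.extra_depth == 0 and depth < MAX_IN_SITU_DEPTH:
        return Cell(cell.kind_byte + NUM_BASE_KINDS, cell.payload)
    return Cell(cell.kind_byte, cell.payload, cell.extra_depth + 1)

def quotes_of(cell):
    return cell.kind_byte // NUM_BASE_KINDS + cell.extra_depth

def base_kind(cell):
    return cell.kind_byte % NUM_BASE_KINDS
```

Under these assumptions, quoting up to three times only changes one byte of the cell, and only the fourth level needs anything extra.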

Be warned, this is a radical change!

I'm sure you'll love it when all is said and done. But it changes the typeclass membership of LIT-WORD! and LIT-PATH!. They are no longer ANY-WORD! or ANY-PATH!, but instances of a new fully generalized quoted type. This will cause some speedbumps.

Long term, you wouldn't expect to see the terms "LIT-WORD!" and "LIT-PATH!". Instead you would use 'word! or 'path!, or something that fits into a more general scheme. But I've made a good-faith effort to provide a smattering of compatibility behaviors, wedged in to keep things about as compatible as could be expected. More will be needed, so let me know.

For one thing, LIT-WORD! and LIT-PATH! are temporarily PARSE keywords. If you have a parse rule that previously worked with ANY-WORD! or ANY-PATH! or something of the sort, and you need to keep it working, PARSE recognizes these two explicitly. So put them in an alternates list:

 parse ['foo 'foo/bar] [[any-word! | lit-word!] [any-path! | lit-path!]]

In addition to being defined as parse keywords for that purpose, they are variables holding the quoted datatypes that TYPE OF would return:

 >> lit-word! = type of first ['x]
 == #[true]

LIT-WORD? and LIT-PATH? are defined as well, along with TO-LIT-WORD and TO-LIT-PATH. Hopefully time and experience will give us ideas for how to do all these things better.

Generic type testing in action specs for quoted types isn't implemented. R3-Alpha only had 64 bits for datatypes, and there's an infinity of potential quotes you might be interested in. So there's just the one type for now--QUOTED!. You have to check what's in the quoted container after the call. (It's kind of how you can't ask to get "just a block of INTEGER!" today, only say BLOCK! and then check it.)

But for compatibility there is a trick. If you put 'word! or 'path! in your type spec (or LIT-WORD! or LIT-PATH! which evaluate to that), just those two can still be type checked by the system. For all other types, you have to just take a QUOTED! value, and examine it yourself in the function body. This will be improved eventually, as type checking would have to grow up someday from the 64-bit limit...so quoting will be one of the things that factors into that design.
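The 64-bit limit can be pictured with a small Python model (not Ren-C code). It assumes a parameter's accepted types are stored as one bit per datatype, so there is no room to say "INTEGER! at quote depth 2": the spec can only admit a generic QUOTED! bit, and the function body has to inspect the contents. The bit positions and function names are hypothetical.

```python
# Hypothetical bit positions; the real datatype numbering is not shown in the post.
TYPE_BITS = {"integer!": 0, "block!": 1, "word!": 2, "quoted!": 3}

def make_typeset(*names):
    """Build a bitmask with one bit per accepted datatype."""
    mask = 0
    for name in names:
        mask |= 1 << TYPE_BITS[name]
    return mask

def typeset_accepts(mask, base_type, depth):
    # One bit per type: any depth above zero collapses to "quoted!".
    key = "quoted!" if depth > 0 else base_type
    return bool(mask & (1 << TYPE_BITS[key]))

def body_check(base_type, depth, want_type, want_depth):
    """The manual check a function body would do after the call."""
    return base_type == want_type and depth == want_depth

spec = make_typeset("integer!", "quoted!")
```

Here `typeset_accepts(spec, "block!", 2)` passes because the mask cannot tell a doubly quoted BLOCK! from a doubly quoted INTEGER!, which is why the second check inside the body is needed.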

Since you could do things like GET on a LIT-WORD!, or APPEND to a LIT-PATH!, I've tried to set up some mechanisms for the cases I thought of. I even threw in some new weirder ones, like letting you add directly to a quoted integer and get a quoted integer at the same level back:

 >> add lit '''''1 2
 == '''''3

I didn't see a good general rule for this. It seems FIND on a quoted BLOCK! should return a position in the quoted block that is still quoted. But SELECTing or PICKing a value out of the block should ignore the container's quoting. It just seems like it has to be done on a case-by-case basis, for the semantics that make sense for the operation.
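The case-by-case choices described above can be laid side by side in a toy Python model (not Ren-C code; `Q`, `add`, `find`, and `pick` are hypothetical stand-ins). It assumes a quoted value is just a depth plus a payload, and each operation decides for itself whether the depth carries through.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Q:
    depth: int      # number of quote levels
    value: object   # the underlying value

def add(a, n):
    """ADD keeps the quote level of its first argument."""
    return Q(a.depth, a.value + n)

def find(series, item):
    """FIND returns a position that is still quoted like the series."""
    items = series.value
    if item not in items:
        return None
    return Q(series.depth, items[items.index(item):])

def pick(series, index):
    """PICK ignores the container's quoting (1-based index)."""
    return series.value[index - 1]
```

For example, `add(Q(5, 1), 2)` gives `Q(5, 3)`, mirroring the `'''''1` plus 2 console example, while `pick` on a quoted block hands back the plain element.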

Basic Mechanics

When you ask the value for its type, the base type will itself be quoted with the level of depth of the value:

>> type of first [''(1 + 2)]
== ''group!

To get the number of quote levels, use QUOTES OF. To get rid of any quoting present on any value, use DEQUOTE.

>> quotes of first ['''{triply quoted string}]
== 3

>> dequote first ['''<some-tag>]
== <some-tag>

All that happens with multiply quoted types is that each time the evaluator sees it, it will peel off one quote level:

 >> ''(1 + 2)
 == '(1 + 2)

 >> '(1 + 2)
 == (1 + 2)

 >> (1 + 2)
 == 3

This means inert types which are singly quoted get evaluated and lose the distinction from the plain inert type. So if you have a function that takes an evaluated argument (e.g. foo: func [x] [...]) you can't provide special behavior for foo '[block] that is different from foo [block]. The only way a called function will see the bit is if it quotes the argument, or if it's inside a dialect block (like a PARSE rule).

You can, however, get special behavior for foo ''[block], as it will receive a singly quoted block as an argument. And of course, it's now more practical to escape GROUP!s, so it might be worth it to start defining distinct behavior when groups are used since they'll be so easy to pass! (I have some ideas about this.)
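The peeling rule and its consequence for function arguments can be modeled in a few lines of Python. This is a sketch of the behavior described above, not the Ren-C evaluator; `Value`, `evaluate`, and `foo` are hypothetical, and a group is simplified to a tuple of integers that gets summed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    kind: str       # "block!", "group!", "integer!"
    data: object
    depth: int = 0  # quote levels

def evaluate(v):
    """One evaluator step: peel a quote, run a group, or pass through."""
    if v.depth > 0:
        return Value(v.kind, v.data, v.depth - 1)
    if v.kind == "group!":
        return Value("integer!", sum(v.data))  # toy stand-in for (1 + 2)
    return v                                   # inert values evaluate to themselves

def foo(arg_expr):
    """A function taking a normally evaluated argument."""
    x = evaluate(arg_expr)
    return "quoted block" if x.depth > 0 else "plain block"

plain = Value("block!", ("a",))
single = Value("block!", ("a",), 1)
double = Value("block!", ("a",), 2)
```

In this model `foo(plain)` and `foo(single)` are indistinguishable, while `foo(double)` sees one remaining quote level, which matches the foo '[block] versus foo ''[block] distinction.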

Name Switcheroo: LIT <=> QUOTE

I explained in another post that 1 is a literal integer, while '1 is a quoted integer. So who knew: we've been using the terms LIT-WORD! and LIT-PATH! wrong all this time!

So in the new world, when you write something like literal x that should give you back x... a "literal word".

In the transitional period, I haven't redefined QUOTE yet. This should help you find existing places where you used it, and either change to LIT or just use apostrophe, as you can now do it for any type.

When QUOTE does make a comeback, it will add a quoting level to whatever it gets as an argument, with that argument being evaluated normally:

 >> x: 1
 == 1

 >> quote x
 == '1

 >> lit x
 == x

But until it comes back, this functionality is given by a function called UNEVAL. Don't get attached to that name...it'll just be renamed when the time comes.

Having a shorter way of getting literals without generalized quoting makes me feel better about apostrophe. if word = 'isn't [...] was an example that made me unhappy, and if word = lit isn't [...] is tighter than if word = quote isn't [...]. So in addition to being more terminologically correct, it looks better.

QUOTE in PARSE is a synonym for LIT for the time being

Due to some annoying bootstrap issues, we can't update QUOTE in parse right away. But start migrating to LIT. (The full word LITERAL is also a keyword synonym...but is this necessary?)

You can use the rule ((uneval x)) for what will ultimately be expressible as quote x when that time comes.

MATCH added to PARSE for quoted types.

There are some troubles we'll have to work out whenever something wants to use DATATYPE! behavior to mean "match value of instance of datatype" vs. "match datatype itself". Using quoted datatypes to represent quoted instances runs afoul of the fact that you might want to do either thing.

My gut tells me we don't want to invest in a DATATYPE!-specific field which encodes its quotedness, distinct from quoting itself. That just seems too complex for its own good.

As an example of a simpler way of thinking about it, I've expanded MATCH to work with quoted types:

>> match [''integer!] (first [''2])
== ''2

Then, MATCH is also available as a PARSE keyword.

>> did parse [1 '2 ''3] [some [match [integer! 'integer! ''integer!]] end]
== #[true]
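The idea of matching on type plus quotedness can be sketched in Python. This is a model, not the Ren-C MATCH implementation; it assumes items are (depth, value) pairs and that an accepted "quoted type" is a (depth, type name) pair. All names here are hypothetical.

```python
def type_of(item):
    """Return the (quote depth, base type name) pair for an item."""
    depth, value = item
    return (depth, type(value).__name__)

def match(accepted, item):
    """Return the item if its quoted type is accepted, else None."""
    return item if type_of(item) in accepted else None

def parse_some_match(items, accepted):
    """Toy version of [some [match [...]] end]: one or more items, all matching."""
    return len(items) > 0 and all(match(accepted, i) is not None for i in items)

ints_up_to_two_quotes = {(0, "int"), (1, "int"), (2, "int")}
```

With these definitions, `match({(2, "int")}, (2, 2))` returns the item unchanged, and `parse_some_match([(0, 1), (1, 2), (2, 3)], ints_up_to_two_quotes)` is true, paralleling the two console examples.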

You can read more about MATCH here.

Hopefully this will be enough to get started using generalized quoting, and we'll learn as we go.

#2

Generic quoting has proven to be pervasive and powerful, and a particularly strong ally in the API for controlling the evaluation of slots. So I thought I would record some "quotes" on it from the brain trust over at Red (to make it harder for them to change their minds, and to cement Ren-C's edge in this matter):

The triggering remark was that they were discussing some extremely convoluted and non-generic ways of doing literal branches. So @draegtun offered up Ren-C's solution:

Using QUOTE-ing offers consistency, so no need for variations of IF, EITHER, CASE, etc

>> case [false '[a] true '[b] false '[c]]
== [b]

Of course, he should know better than to think they can grasp interesting solutions! They're much too busy reliving their glory days and self-congratulating on the pure "genius" of Rebol2 ALSO (which is both trivial and confuses every piece of code it's used in, so no wonder they love it so much. ELIDE and invisibles trounce it.)

From Gabriele, on quoting:

"blocks are already un-evaluated by default. The non-quoted version would be paren!. I don't find that kind of "quoting" to be a sane solution, sorry."

The issue is when you are putting something in a slot and you don't know what it's going to be. Maybe it's a WORD!, maybe it's a PATH!, and maybe it's a BLOCK!. Or what if you have a PARSE rule, and you want to match a BLOCK! literally instead of treating something as a rule? And @draegtun has shown you yet another use.

Yet somehow, coming up with alternate names like either* or refinements that have to be put on every control construct is more "sane"? Nope, sorry.

From the legendary Gregg:

Quoting, to prevent evaluation, should be the rare exception, same as lit args. We have blocks, we are unevaluated by default, and funcs that evaluate them because that's the most useful thing to do for those functions.

We don't want to be more Lispy. Lit-word syntax is specific to words, not other values.

What we need to look at are real-world use cases, where you want to prevent evaluation, and compare those.

He brings up looking at real-world use cases. But not only are these people too prideful to study the vast corpus of analysis here in the first place (e.g. how to do splicing in the API), but here they are looking at a real world case and proposing wild workarounds. All we see is a tautology in saying "Lit-word syntax is specific to words, not other values". Why? Because...Rebol2. That's the quality of thought you're looking at here.

And Nenad of course must join the chorus:

Exactly. Another function for selective evaluation is clearly a simpler option than resorting to new esoteric lexical forms.

Funny to think that a generalized quoting facility is an "esoteric" form. It seems to be pretty regular to me.

(Of course when you've coded yourself into a corner and CAN'T change the system to add new features, it serves your interests to say any deviation from exactly what you've written is bad. You just don't have the chops to do better!)

So there you have it folks. Generic quoting (and soft-quoted branches) are a Ren-C exclusive, powering the libRebol API.

#3

My inclination with reflection is that types return the same no matter the quotedness:

>> type of 1
== integer!

>> type of '1
== integer!

And that the quotedness should be determined separately:

>> quotes of 1
== _ (or null, or whatever the flow blanking mechanism is)

>> quotes of '1
== 1

>> quotes of ''1
== 2 ; etc.

In Parse, you would need to match the exact number of quotes:

parse [1 '1 ''1 '''1] [
    '1
    quoted '1
    quoted quoted '1
    quoted quoted quoted '1
]

The return of quoted datatypes seems problematic.

Unrelated: is that a quoted number I see on the last example here?

#4

We can discuss that, but... I'd strongly advise against it. Firstly I'll point out that it does create incompatibility with history:

rebol2>> type? 'x
== word!

rebol2>> type? quote 'x
== lit-word!

rebol2>> (type? 'x) = (type? quote 'x)
== false

And that's even with a pretty lax equality operator:

rebol2>> equal? 'x quote 'x
== true

Of course incompatibility is not something I worry about too much (unless there's no way to implement a compatible behavior!)

But my technical concern is that if you conflate a quoted type with its underlying type, then you would be winding up by default with dialects that inadvertently treat quoted things and non-quoted things identically. This just seems a bit chaotic to me, and it also will create problems down the road when one decides to start using quotes creatively in dialects.

Imagine that people using your dialect realize you haven't checked both the type and the quotedness. So as a user, they don't bother stripping off the quotes when they are generating code (since an INTEGER! will accept a quoted INTEGER! anyway, why should they?) But then later, imagine you do decide you would like to distinguish the meaning of quoting to have it mean something distinct.

Added to all of this, the design has been very particular for performance so that the type-plus-quotedness check is efficient. It would be something of a setback if not done that way.

I think a big thing to resolve here is the nature of DATATYPE!. Given the desire for extensibility, the difficulty of coming up with a representation for them, and how they've been historically conflated with WORD!s...it makes me wonder if they should be WORD!s (PATH!s, TUPLE!s, BLOCK!s...?) How many routines are polymorphic, in the sense of taking something that may be a datatype, or may be a WORD! ?

This has been sort of a pattern internally to Ren-C; to take a lot of various things that were their own enumerations (e.g. action IDs) and go ahead and just use WORD!s for those IDs.