QUOTED! arrives (formerly known as "lit bit")



I'm pleased to say that the day has come that you can use apostrophe (sigh, apostrophe) to escape any value...not just WORD! and PATH!. And since I said any value, that means you can also quote quoted values...as deeply as you like!

>> '(1 + 2)
== (1 + 2)

>> ''(1 + 2)
== '(1 + 2)

>> quoted? first [''{double quoted text!}]
== #[true]

>> quotes of first ['''''''''''''''''<whoa!>]
== 17

It opens up a whole new box of parts for dialecting, and makes your everyday code shorter and clearer (when used properly). But don't be fooled by the fact that you can use a stupid number of quotes if you need to: this isn't a frivolous thing, and being able to truly escape any value is integral to the feature.

The code that most directly benefits from having a clean/fast way to put an escaping bit on any value is language bindings, such as the C and JavaScript API clients. You may not realize how important this escaping is, because average script code generally assembles things that fetch out of WORD!s...where there's no risk of "double-evaluation". Users of the libRebol API aren't so lucky. You wind up putting QUOTE calls everywhere--slowing things down, junking it up, and fundamentally changing the types and shape of what you're working with. If you write sophisticated enough Rebol routines you've almost certainly run up against this problem too--but it's an issue on nearly every call into the API.

The quoting implementation is optimized to the point of being nearly free for depths less than 4, using something I call "in-situ escaping". Because I've made the C++ build check it at compile time, the risks of such a tricky performance hack are significantly reduced. Another very neat design point allows these new "QUOTED!s" to efficiently participate in binding...if their "contained" item is bindable.

Be warned, this is a radical change!

I'm sure you'll love it when all is said and done. But it changes the typeclass membership of LIT-WORD! and LIT-PATH!. They are no longer ANY-WORD! or ANY-PATH!, but instances of a new fully generalized quoted type. This will cause some speedbumps.

Long term, you wouldn't expect to see the terms "LIT-WORD!" and "LIT-PATH!". Instead you would use 'word! or 'path!, or something that fits into a more general scheme. But I've made a good faith effort to wedge in a smattering of compatible behaviors, to keep things about as compatible as could be expected. More will be needed, so let me know.

For one thing, LIT-WORD! and LIT-PATH! are temporarily PARSE keywords. If you have a parse rule that previously worked with ANY-WORD! or ANY-PATH! or something of the sort, and you need to keep it working, PARSE recognizes these two explicitly. So put them in an alternates list:

 parse ['foo 'foo/bar] [[any-word! | lit-word!] [any-path! | lit-path!]]

In addition to being defined as parse keywords for that purpose, they are variables holding the quoted datatypes that TYPE OF would give you back:

 >> lit-word! = type of first ['x]
 == #[true]

LIT-WORD? and LIT-PATH? are defined as well, along with TO-LIT-WORD and TO-LIT-PATH. Hopefully time and experience will give us ideas for how to do all these things better.

Generic type testing in action specs for quoted types isn't implemented. R3-Alpha only had 64 bits for datatypes, and there's an infinity of potential quotes you might be interested in. So there's just the one type for now--QUOTED!. You have to check what's in the quoted container after the call. (It's kind of like how you can't ask to get "just a block of INTEGER!" today; you can only say BLOCK! and then check it.)

But for compatibility there is a trick. If you put 'word! or 'path! in your type spec (or LIT-WORD! or LIT-PATH!, which evaluate to that), just those two can still be type checked by the system. For all other types, you have to just take a QUOTED! value, and examine it yourself in the function body. This will be improved eventually, as type checking would have to grow up someday from the 64-bit limit...so quoting will be one of the things that factors into that design.

Since you could do things like GET on a LIT-WORD!, or APPEND to a LIT-PATH!, I've tried to set up some mechanisms for the cases I thought of. I even threw in some new weirder ones, like letting you add directly to a quoted integer and get a quoted integer at the same level back:

 >> add lit '''''1 2
 == '''''3

I didn't see a good general rule for this. It seems FIND on a quoted BLOCK! should return a position in the quoted block that is still quoted. But SELECTing or PICKing a value out of the block should ignore the container's quoting. It just seems like it has to be done on a case-by-case basis, for the semantics that make sense for the operation.
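The case-by-case choices described above can be pictured with a toy model. This is a Python sketch, not Ren-C code and not how the interpreter implements anything: the `Quoted` class and the `add`, `find`, and `pick` functions are hypothetical stand-ins, and the FIND and PICK behavior only mirrors what the post says "seems" right, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quoted:
    """Hypothetical stand-in for a QUOTED! value: a payload plus a quote depth."""
    depth: int
    payload: object


def add(left, right):
    """ADD that hands back a result at the same quote level as its left operand."""
    if isinstance(left, Quoted):
        return Quoted(left.depth, left.payload + right)
    return left + right


def find(series, item):
    """FIND yields a position (modeled as the tail of the list from the match).

    If the series was quoted, the position is re-wrapped at the same level.
    """
    if isinstance(series, Quoted):
        inner = find(series.payload, item)
        return None if inner is None else Quoted(series.depth, inner)
    return series[series.index(item):] if item in series else None


def pick(series, index):
    """PICK extracts an element and ignores the container's quoting (1-based)."""
    if isinstance(series, Quoted):
        series = series.payload
    return series[index - 1]


print(add(Quoted(5, 1), 2))               # models: add lit '''''1 2
print(find(Quoted(1, [10, 20, 30]), 20))  # position stays quoted
print(pick(Quoted(1, [10, 20, 30]), 2))   # element comes out unquoted
```

The point of the sketch is only that each operation decides for itself whether the container's quote level carries over to the result.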

Basic Mechanics

When you ask the value for its type, the base type will itself be quoted with the level of depth of the value:

>> type of first [''(1 + 2)]
== ''group!

To get the number of quote levels, use QUOTES OF. To get rid of any quoting present on any value, use DEQUOTE.

>> quotes of first ['''{triply quoted string}]
== 3

>> dequote first ['''<some-tag>]
== <some-tag>

All that happens with multiply quoted types is that each time the evaluator sees it, it will peel off one quote level:

 >> ''(1 + 2)
 == '(1 + 2)

 >> '(1 + 2)
 == (1 + 2)

 >> (1 + 2)
 == 3

This means inert types which are singly quoted get evaluated and lose the distinction from the plain inert type. You can't thus provide special behavior for if condition '[block] that is different from if condition [block]. The only way a called function will see the bit is if it quotes the argument, or if it's inside a dialect block (like a PARSE rule).

You can, however, get special behavior for if condition ''[block], as it will receive a singly quoted block as an argument. And of course, it's now more practical to escape GROUP!s, so it might be worth it to start defining distinct behavior when groups are used since they'll be so easy to pass! (I have some ideas about this.)
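To make the "peel off one level" rule concrete, here is a toy Python model. It is a sketch under stated assumptions rather than anything from the interpreter: `Quoted` and `eval_step` are hypothetical names, a Python list stands in for an inert BLOCK!, and the evaluation of unquoted GROUP!s and WORD!s is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quoted:
    """Hypothetical stand-in for a QUOTED! value: a payload plus a quote depth."""
    depth: int
    payload: object


def eval_step(value):
    """Model one evaluator visit: peel exactly one quote level.

    Unquoted inert values (Python lists play the role of blocks) pass through
    unchanged. GROUP! and WORD! evaluation is not modeled here.
    """
    if isinstance(value, Quoted):
        if value.depth == 1:
            return value.payload  # last quote removed, plain value remains
        return Quoted(value.depth - 1, value.payload)
    return value


block = ["block"]
plain = eval_step(block)               # models: if condition [block]
single = eval_step(Quoted(1, block))   # models: if condition '[block]
double = eval_step(Quoted(2, block))   # models: if condition ''[block]

print(plain == single)  # the callee receives the same thing in both cases
print(double)           # one quote level survives for the callee to inspect
```

In this model a normal (evaluated) argument that was written with one quote is indistinguishable from the plain block by the time the function sees it, while a doubly quoted one still arrives with a single level attached.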

Name Switcheroo: LIT <=> QUOTE

I explained in another post that 1 is a literal integer, while '1 is a quoted integer. So who knew: we've been using the terms LIT-WORD! and LIT-PATH! wrong all this time!

So in the new world, when you write something like literal x that should give you back x... a "literal word".

In the transitional period, I haven't redefined QUOTE yet. This should help you find existing places where you used it, and either change to LIT or just use apostrophe, as you can now do it for any type.

When QUOTE does make a comeback, it will add a quoting level to whatever it gets as an argument, with that argument being evaluated normally:

 >> x: 1
 == 1

 >> quote x
 == '1

 >> lit x
 == x

But until it comes back, this functionality is given by a function called UNEVAL. Don't get attached to that name...it'll just be renamed when the time comes.
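The difference between the two operations can be sketched in a toy Python model. Everything here is an illustrative assumption, not Ren-C code: `Word`, `Quoted`, `variables`, and the Python functions `lit` and `uneval` are hypothetical stand-ins for the behavior shown in the console session above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Hypothetical stand-in for a WORD! as it appears in source."""
    name: str


@dataclass(frozen=True)
class Quoted:
    """Hypothetical stand-in for a QUOTED! value: a payload plus a quote depth."""
    depth: int
    payload: object


variables = {"x": 1}  # models: x: 1


def evaluate(value):
    """Words look up their variable; everything else is inert in this toy."""
    if isinstance(value, Word):
        return variables[value.name]
    return value


def lit(source):
    """LIT: hand back the argument exactly as written, with no evaluation."""
    return source


def uneval(source):
    """UNEVAL (the future QUOTE): evaluate normally, then add one quote level."""
    result = evaluate(source)
    if isinstance(result, Quoted):
        return Quoted(result.depth + 1, result.payload)
    return Quoted(1, result)


print(lit(Word("x")))     # the word itself, like `lit x` giving x
print(uneval(Word("x")))  # the variable's value plus a quote, like '1
```

So LIT is about suppressing evaluation of what was written, while UNEVAL is about wrapping the result of an ordinary evaluation in one more quote level.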

Having a shorter way of getting literals without generalized quoting makes me feel better about apostrophe. if word = 'isn't [...] was an example that made me unhappy, and if word = lit isn't [...] is tighter than if word = quote isn't [...]. So in addition to being more terminologically correct, it looks better.

QUOTE in PARSE is a synonym for LIT for the time being

Due to some annoying bootstrap issues, we can't update QUOTE in parse right away. But start migrating to LIT. (The full word LITERAL is also a keyword synonym...but is this necessary?)

You can use the rule ((uneval x)) for what will ultimately be expressible as quote x when that time comes.

MATCH added to PARSE for quoted types

There are some troubles we'll have to work out around whether DATATYPE! behavior should mean "match a value that is an instance of the datatype" vs. "match the datatype itself". Using quoted datatypes to represent quoted instances runs afoul of the fact that you might want to do either thing.

My gut tells me we don't want to invest in a DATATYPE!-specific field which encodes its quotedness, distinct from quoting itself. That just seems too complex for its own good.

As an example of a simpler way of thinking about it, I've expanded MATCH to work with quoted types:

>> match [''integer!] (first [''2])
== ''2

Then, MATCH is also available as a PARSE keyword.

>> did parse [1 '2 ''3] [some [match [integer! 'integer! ''integer!]] end]
== #[true]
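One way to picture what MATCH is doing with quoted types is to treat each accepted type as a (quote depth, base type) pair. The following is a toy Python model under that assumption, not Ren-C code: `Quoted`, `quoted_type`, and this `match` function are hypothetical, and returning `None` on failure is just this sketch's convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quoted:
    """Hypothetical stand-in for a QUOTED! value: a payload plus a quote depth."""
    depth: int
    payload: object


def quoted_type(value):
    """Return (quote depth, base type), loosely mirroring TYPE OF on a quoted value."""
    if isinstance(value, Quoted):
        return (value.depth, type(value.payload))
    return (0, type(value))


def match(accepted, value):
    """Give the value back if its (depth, type) pair is accepted, else None."""
    return value if quoted_type(value) in accepted else None


accepted = [(0, int), (1, int), (2, int)]  # models: [integer! 'integer! ''integer!]
data = [1, Quoted(1, 2), Quoted(2, 3)]     # models: [1 '2 ''3]

print(match([(2, int)], Quoted(2, 2)))     # models: match [''integer!] (first [''2])
print(all(match(accepted, item) is not None for item in data))
```

The quote depth is part of what gets compared, so a doubly quoted integer does not satisfy a test written for a singly quoted one.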

You can read more about MATCH here.

Hopefully this will be enough to get started using generalized quoting, and we'll learn as we go.

