"Escaping" vs. "Literal" vs. "..."


#1

Allowing any value to be escaped an arbitrary number of times came up a couple of years ago. At the time it was called “lit bit”…the idea that any value could carry the literal bit. LIT-INTEGER!, LIT-BLOCK!, etc.

This took for granted the idea that calling these “literals” was a good idea in the first place. But in Rebol, what’s a “literal” anyway? In the following code:

 data: [1 foo [?]]
 bar: third data

Isn’t the 1 a “literal integer”? Isn’t foo a “literal word”? Aren’t [1 foo [?]] and [?] examples of “literal blocks”?

They are certainly acting that way in the sense most programming languages would refer to “literals”. Says the all-knowing Wikipedia:

In contrast to literals, variables or constants are symbols that can take on one of a class of fixed values, the constant being constrained not to change. Literals are often used to initialize variables, for example, in the following, 1 is an integer literal and the three letter string in “cat” is a string literal:

int a = 1;
string s = "cat";

Should we call backslashes escaped values?

I was thinking about a reflector to tell you how many backslashes something had, and I kind of liked escapes of.

 >> escapes of quote \\\"cat"
 == 3 // no evaluation due to quote

 >> escapes of \\\"cat"
 == 2 ; evaluation peels one escape off

 >> escapes of "cat"
 == 0 -- yes, I'm demonstrating comment alternatives

If we did so, then the operation might be called “escape” instead of “lit” or “literal”:

 >> escape escape "cat"
 == \\"cat"

 >> escape/depth "cat" 5
 == \\\\\"cat"
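To make the counting concrete outside of Rebol, here is a minimal Python sketch of the idea. It is only a model, not Ren-C code: the `Quoted` class and the `escape` / `escapes_of` functions are hypothetical names chosen to mirror the proposal, and it assumes an escaped value is nothing more than a payload plus a level count.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Quoted:
    """Hypothetical model: a payload carrying `depth` escape levels (depth >= 1)."""
    payload: Any
    depth: int


def escape(value: Any, depth: int = 1) -> Any:
    """Add `depth` escape levels, like the proposed ESCAPE and ESCAPE/DEPTH."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if depth == 0:
        return value
    if isinstance(value, Quoted):
        # Already escaped: accumulate levels instead of nesting wrappers.
        return Quoted(value.payload, value.depth + depth)
    return Quoted(value, depth)


def escapes_of(value: Any) -> int:
    """Count escape levels, like the proposed ESCAPES OF reflector."""
    return value.depth if isinstance(value, Quoted) else 0


print(escapes_of(escape(escape("cat"))))   # 2
print(escapes_of(escape("cat", depth=5)))  # 5
print(escapes_of("cat"))                   # 0
```

Storing a count rather than nesting wrappers is a choice made for this sketch; it keeps the reflector trivial and says nothing about how the real implementation stores the levels.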

That conflicts with the name for the ASCII character code. But this could be resolved as was done with NULL being changed to NUL to avoid a conflict. ESCAPE the CHAR! could just become ESC, which is an industry-standard notation.

Conflict in API shorthands, rebE() ?

In the giant universe of concerns, one is that ESCAPE and EVAL start with the same letter. This is unfortunate, since indicating you want an API value to be evaluated is done with rebE().

Previously, asking for “escaping” was done with rebUneval, a.k.a. rebU().

How do unevals of \\\"cat" and uneval "cat" sound? I think it’s kind of presumptive :crying_cat_face:. Who said you were ever going to evaluate it? It’s a property of the value, and you could use it for whatever dialected meaning you like; evaluation isn’t the only reason.

There’s nothing wrong with having UNEVAL be a synonym for ESCAPE, which helps indicate situations where that’s why you are escaping it. It’s commentary.

Calling it backslashes of treads into some kind of absurdism. It would be like calling […] a BRACK! (or (...) a PAREN!) :stuck_out_tongue:

Should we just accept the abuse of the term LITERAL! ?

I’ve aired my grievance that I think 1 is a literal integer, and \1 is explicitly not a literal integer. I have sort of a gut feeling that going against the standard terminology in languages–of “escaping”–and using another standard term wrongly, will bite us.

But LIT is a short word:

 switch type of quote \\\[1 + 2] [
     lit lit lit block! [ -- can't say \\\block!, that's a word, not a datatype
         print ["Is there any better way to say this?"]
     ]
     ...
 ]

And we can call the reflector that counts them lits of, I guess. But that feels kind of… il-LIT-erate.

I’m hammering through the mechanics, but the terminology needs help, so please chime in.


#2

My two cents: QUOTED, recalling QUOTE.


#3

That’s interesting–“the QUOTED! datatype”. :+1:

It’s subtle, but so are SPACE and SPACED. rebQ() seems a nice API shorthand for rebR(rebQuoted(…)).

>> quotes of first [\\cat]
== 2

>> quoted "cat"
== \"cat"

>> quoted/depth "cat" 5
== \\\\\"cat"

Since it’s a backslash and not a quote mark, one has to think a bit abstractly. But there are no quote marks when you quote, either:

>> quote x
== x -- no quotes

>> quote quote x
** x is undefined -- (or whatever, it quoted QUOTE)

If we somehow thought quote marks were essentially related to “quoting”, then we would be saying quote x should return “x”. Not only that, but “apostrophes” aren’t actual quote marks in the first place.

Personally, I like the subtlety. QUOTED and QUOTE are clearly different words, as SPACED and SPACE are.

But it’s still a bit verbose. Going back to my hypothetical example:

 switch type of quote \\\[1 + 2] [
     quoted quoted quoted block! [
         print ["Is there any better way to say this?"]
     ]
     ...
 ]

This really speaks to the need for a literal datatype notation of some kind:

 switch type of quote \\\[1 + 2] [
     \\\#[block!] [
         print ["Or something."]
     ]
     ...
 ]

But great suggestion…this might be the way to go!


#4

I think that when you think of what the word “literal” can mean at all in Rebol, it’s a way of looking at things. The following would actually make a lot of sense:

 >> literally (1 + 2)
 == (1 + 2)

And it seems reasonable to allow a shorthand:

 >> lit (1 + 2)
 == (1 + 2)

Today we call that operator QUOTE. But at the very least, I think QUOTE should be:

 >> quote (1 + 2)
 == \(1 + 2)

But having it not evaluate its argument isn’t very useful if lit is around…you could have said lit \(1 + 2) and it would be more clear. An actually useful QUOTE would evaluate its argument and then add an escaping level:

 >> x: quote (1 + 2)
 == \3

Then you can have the complementary operation, UNQUOTE:

 >> unquote x
 == 3

To get rid of all quotedness, no matter how deep, perhaps DEQUOTE:

 >> dequote \\\\\\\\\\%foo.txt
 == %foo.txt
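The relationship between the three operations can be sketched in Python. This is a hedged model rather than the actual design: `Quoted`, `quote`, `unquote` and `dequote` are hypothetical stand-ins. Raising an error when UNQUOTE is given an unquoted value is my assumption, since that case isn't specified above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Quoted:
    """Hypothetical model: a payload carrying `depth` quote levels (depth >= 1)."""
    payload: Any
    depth: int


def quote(value: Any) -> Quoted:
    """Add one quote level to an already-evaluated value.

    Python evaluates arguments before the call, which happens to match
    the proposed evaluating QUOTE.
    """
    if isinstance(value, Quoted):
        return Quoted(value.payload, value.depth + 1)
    return Quoted(value, 1)


def unquote(value: Any) -> Any:
    """Remove exactly one quote level."""
    if not isinstance(value, Quoted):
        # Assumption: the thread does not say what happens here.
        raise TypeError("value is not quoted")
    if value.depth == 1:
        return value.payload
    return Quoted(value.payload, value.depth - 1)


def dequote(value: Any) -> Any:
    """Remove all quote levels, however deep; unquoted values pass through."""
    return value.payload if isinstance(value, Quoted) else value


x = quote(1 + 2)
print(x)                                # Quoted(payload=3, depth=1)
print(unquote(x))                       # 3
print(dequote(Quoted("%foo.txt", 10)))  # %foo.txt
```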

Pursuant to @giuliolunati’s suggestion, we would call the data “kind” QUOTED!.

 >> kind of first [\\\\\x]
 == #[quoted!]

 >> type of first [\\\\\x]
 == \\\\\#[word!]

This has a pleasant readability to the type testing operator, where it really looks like a question:

 >> quoted? first [\\\\\x]
 == #[true]
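The split between KIND OF and TYPE OF can be modeled in a few lines of Python. Again this is a sketch under assumptions: `Quoted`, `Word`, `kind_of`, `type_of` and `is_quoted` are hypothetical names, datatypes are represented as plain strings like `"word!"`, and having both functions agree for unquoted values is my guess.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Quoted:
    """Hypothetical model: a payload carrying `depth` quote levels (depth >= 1)."""
    payload: Any
    depth: int


class Word(str):
    """Stand-in for a Rebol WORD!."""


def datatype_name(value: Any) -> str:
    """Name a datatype after the Python class, e.g. Word -> 'word!'."""
    return type(value).__name__.lower() + "!"


def kind_of(value: Any) -> str:
    """Every quoted value has the same kind, regardless of depth or payload."""
    return "quoted!" if isinstance(value, Quoted) else datatype_name(value)


def type_of(value: Any) -> Any:
    """The type keeps the quote depth and applies it to the payload's datatype."""
    if isinstance(value, Quoted):
        return Quoted(datatype_name(value.payload), value.depth)
    return datatype_name(value)


def is_quoted(value: Any) -> bool:
    return isinstance(value, Quoted)


x5 = Quoted(Word("x"), 5)
print(kind_of(x5))    # quoted!
print(type_of(x5))    # Quoted(payload='word!', depth=5)
print(is_quoted(x5))  # True
```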

I know this is reversed from how people thought of “QUOTE” vs. “LITERAL” before. But my argument seems to hold water. 1 is a literal integer, \1 is a quoted integer. foo is a literal word. \foo is a quoted word.


#5

For TEXTs, the quotation concept has another possible sense, say {x} => {{x}} and so on…
I see 2 issues:

  1. Might some “link” between {{x}} and \{x} be useful?
  2. But the quotation transformation is ambiguous: how do we choose between {{x}}, {“x”}, etc.?

#6

You mean as an alternate meaning for QUOTE/quotation? e.g.

>> quoted "x"
== {"x"}

I could see the argument that quote would be #{"} and QUOTED would put strings in textual quotes. It would be like the relationship between SPACE and SPACED.

It would certainly have uses. I don’t know how many. If you were writing code in which you were working more with strings than with meta-code, you might use this naming system and move QUOTE out of the way to somewhere else (or even just use it as LIB/QUOTE if you need to).

Might some “link” between {{x}} and \{x} be useful?

I don’t know what you’d have in mind. But one thing I’ve mentioned is that while I once wondered if literals would have different behavior for different types, e.g. stay on the value if it had no evaluator behavior:

>> do [\(1 + 2) \[1 + 2]]
== (1 + 2) \[1 + 2]

I realized this would really break important usages of it. So the only thing the evaluator can do with it is take one level off of anything. Other dialects could do alternative things with it, though.
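The two candidate rules can be contrasted with a small Python model. It is purely illustrative: `Quoted`, `Group`, `Block` and both evaluator functions are hypothetical, and it only covers what happens to quoted values, not evaluation in general.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Quoted:
    """Hypothetical model: a payload carrying `depth` quote levels (depth >= 1)."""
    payload: Any
    depth: int


@dataclass(frozen=True)
class Group:
    """Stand-in for (...), which has evaluator behavior."""
    source: str


@dataclass(frozen=True)
class Block:
    """Stand-in for [...], which is inert in the evaluator."""
    source: str


def strip_one(value: Quoted) -> Any:
    """Remove exactly one quote level."""
    if value.depth == 1:
        return value.payload
    return Quoted(value.payload, value.depth - 1)


def eval_quoted_type_dependent(value: Quoted) -> Any:
    """The rule I once wondered about: inert payloads keep their quote."""
    if isinstance(value.payload, Block):
        return value
    return strip_one(value)


def eval_quoted_uniform(value: Quoted) -> Any:
    """The rule settled on: one level comes off of anything."""
    return strip_one(value)


items = [Quoted(Group("1 + 2"), 1), Quoted(Block("1 + 2"), 1)]
print([eval_quoted_type_dependent(v) for v in items])
print([eval_quoted_uniform(v) for v in items])
```

Under the type-dependent rule the block keeps its quote level, as in the DO example above; under the uniform rule both values come out with one level fewer.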