The Thought that Won't Go Away: @ acting as LIT-WORD! acts?

#1

Today it's not legal to write @foo or @foo/bar in code. But people want it, and it's legal in Red. This brings us to the question of what the evaluator behavior of such a thing will be in Ren-C.

Something I haven't been able to shake since the beginning of seeing the usage of apostrophe for LIT-WORD! is that it is too slight for something as important as it is. There's a big difference between foo and 'foo but it's rather hard to see.

The rise of the idea of using apostrophes purposefully in names also throws a wrench in. Not only is isn't coming into wide use very soon, but I also kind of like the mathematical convention of ticking apostrophes on the tail of things to make "derived" versions of them (f, f', f'').

block: [isn't "Hello"]
switch first block [
    'isn't [print "This runs"]
]

f': specialize 'f [arg1: 10]
f'': specialize 'f' [arg2: 20]
action: get 'f''

What if the evaluator treated @ like it treats apostrophes today?

block: [isn't "Hello"]
switch first block [
    @isn't [print "This runs"]
]

f': specialize @f [arg1: 10]
f'': specialize @f' [arg2: 20]
action: get @f''

While it is "blocky", I feel there's something more "solid" about it.

@foo has to be "wordlike" anyway, for FAIL

The idea of "follow this link, to a word" being the meaning of @ is something I am eager to use with <skip>-able parameters for FAIL, to say where an error should be reported (so it indicates the callsite, instead of where the FAIL itself is).

Today:

foo: function [number] [
    if number > 1020 [fail/where "Value is too large" 'number]
]

Soon:

foo: function [number] [
    if number > 1020 [fail @number "Value is too large"]
]

This would rule out EMAIL! unification

The evaluator goes by type, so the quirky-and-not-completely-figured out unified handle! proposal would have to be scrapped. @foo would be a LIT-WORD!, @foo/bar would be a LIT-PATH!.

It would also help with things like LIT-GROUP!s

If you look at something like:

code: '(
    #foo [baz bar]
    whole bunch of stuff ...
)

That super-lightweight apostrophe is easy to miss. Compare with:

code: @(
    #foo [baz bar]
    whole bunch of stuff ...
)

It is slightly more labor to enter on most keyboards

I'm not sure how much of a concern this is, since I don't know how often people have to enter LIT-WORD!s (compared to +, or { or other shifted characters in various hard-to-reach places). They're used often, but not that often.

Leading apostrophe compatibility would be needed for redbol

I'm fairly aware there's going to need to be some kind of runtime switches to enable wider redbol compatibility. Continuing to treat leading apostrophes as LIT-WORD!s would be a part of that.

But in the near term, there'd not be any specific pressure to retake 'foo and 'foo/bar. I'm not sure exactly what they'd be used for once degrading to a normal word character:

'foo: chain [foo | function [x] [x + 1]]
''foo: specialize @'foo [arg2: 20]

But whatever they're used for, the "meatier" at symbol makes it viable, and helps differentiate from ''foo being a lit-word whose spelling is 'foo :-/

My other idea variant was FAR more radical

The other thought I had when first questioning ' was what if double quotes weren't used for strings, but for quoting arbitrary material literally:

block: [isn't {Hello}]

switch first block [
    "isn't" [print {This runs}]
]

It would look better when quoting SET-WORD!s as "foo:", and provide clearer and more visible delimiting on GROUP!s:

code: "(
    #foo [baz bar]
    whole bunch of stuff ...
)"

Because it would be part of a paired delimiter, you wouldn't have to worry about nested quoted material.

While this is sort of "interesting", it's completely incompatible. It makes you type two delimiters when a common case only required one. And from my experiments using only braced strings, to see if quotes could be freed up for something else, I found the braces were far too heavy when used for every string.
Plus, having two string styles lets you conveniently write things like a string with a single brace "{".

So I dropped that one, but have kept the @ idea consistently in mind. I'm feeling pretty serious about trying it. Any thoughts?

#2

This may be an unhelpful, half-baked reply. I usually pronounce the @ symbol in my head as "at", so I'm not in love with the idea of using it to represent a lit-word. It's not a big deal, I'd get over it. :slight_smile:
I've sometimes thought that the @ symbol could be used as a glyph for anything that can be addressed, including the location of source-data. The @ could be at the beginning of a URL, email, port, etc.

I don't have much reasoning behind this, but could the # symbol be used to denote a lit-word? It's very eye-grabbing.

#3

I brought it up just because it's an idea that I have whenever I get frustrated with ' and look at a particular case where it falls down. The recent frustration triggering the thought was having a variable d' and then quoting it and getting 'd' and thinking how bad that looked, as if it were a string or character literal. This then ties into the fears of letting apostrophe-words become common, like isn't, and how that's going to play.

But when you look at a lot of average-case code, the near-equivalence of 'foo and foo becomes a feature instead of a bug:

parse [apple banana banana] ['apple some 'banana]
parse [apple banana banana] [@apple some @banana]

Maybe so long as you make sure QUOTE is always available as an option to fall back on, it's not so bad. 'd' is terrible, but quote d' is legible. Escaping is a hard general problem, that pops up everywhere...you can write http://http.http//http.http.

Anyway, reason I bring it up is mostly that when you give some of these ideas a try you notice things that lead to thinking that creates the "real" better answer. Or at least you find the disproof that makes you settle on holding your peace.

Whatever comes of @, I am eager to see it used for "pointing at":

 func [param] [fail "An error which indicates this line"]
 func [param] [fail @param "An error which indicates the callsite"]

So it needs to have a binding.

could the # symbol be used to denote a lit-word? It’s very eye-grabbing.

I've become a big fan of using #issues in places that do not intend to look anything up, because the fact that they don't decay in the evaluator increases their usability for that purpose.

Consider something like in ZeroMQ, where a type constant is passed by WORD!. The consequence of that is that when you splice it into code, you have to take pains to unevaluate it (done there with rebUneval(), which is a fancier-shaped tool than quote for API use). But if it had been an ISSUE!, you wouldn't have to worry as it would be completely reduced.

Similarly, I feel TAG! should be picked up in a lot of places people casually use LIT-WORD!s today. It puts more diversity in the source and helps hint that you aren't dereferencing.

If @foo is in the same family as foo@bar, then it would be inert, and give another one of these categories. It's probably the right thing to do. But I had to get the thought of using it for quoting out in the open, even if just to shoot it down.

#4

@hostilefork +1 on your second post.

#5

@giuliolunati brought up an old issue that had been talked about a while ago, under the topic name "lit bit". It was the idea that all values should be able to have literal forms. So, LIT-INTEGER!, LIT-BLOCK!, whatever.

More ambitious versions of the proposal suggested any number of these bits (although it wasn't clear where in the value cell you'd be putting all these bits).

This revived some of the old complaints I've had about how much I hate apostrophe for this purpose. A LIT-STRING! doesn't look good as '"foo". And a LIT-LIT-STRING! doesn't look good as ''"foo"

So back to the symbol drawing board, with @"foo" or ^"foo" or even the instantly unpopular idea of *"foo".

But What About... the Neglected Backslash?

Backslash is currently unused. We've thrown around ideas for backslash from time to time. It was the original expression barrier concept, before | took that over.

Despite its common uses for escaping in other languages and mediums, it's never come up as the idea for being the generic literal escape. But, it actually seems pretty well-suited to it.

if word = \isn't [...]
if word = 'isn't [...]
if word = ^isn't [...]
if word = @isn't [...]

if string = \"something" [...]
if string = '"something" [...]
if string = ^"something" [...]
if string = @"something" [...]

>> compose [(1 + 2) \(1 + 2) \\(1 + 2)]
== [3 (1 + 2) \(1 + 2)]

block: [a b c]
switch block [
    \[a b c] [print "literal match!"]
    ...
]

Benefits:

  • Doubled-up usage can't be visually confused with another symbol (like how two apostrophes look like quotes, or an apostrophe and quotes can look like quotes and an apostrophe)
  • It doesn't require shift to type on most keyboards (though a little bit of a pinky-reach)
  • Known usage for escaping in other languages

(Note: The benefit of it being used for escaping in other mediums also makes it a bit of a drawback, as when you're typing in code in some text or markdown environment and use a backslash it can trip it up...because it thinks you're trying to escape. Of course you run into that problem with any code... like typing * and having it think you're trying to bold something, etc. Still, backslashes can be one of the trickiest things to get these mediums to actually display how you want.)

Caret isn't bad-looking either. But the big problem with caret is that you don't want the escaping method used for weird characters in strings to interact with commonly-used characters in code. Because code is often carried in strings (or molded to it, etc.) It's best if caret occurrences are few and far-between, but literals would be a lot of places. You've got to diversify your escaping mechanisms, otherwise you have to escape your escapes more often.

This seems promising...

So if any development on the modern form of "lit-bit" is tried, it might go with this. Anyone want to throw in any extra thoughts? Sadness over the loss of their favorite potential application of backslash they were hoping for?

I love arbitrary escaping, but backslash may not be The One
#6

Yum, sounds good! +1

#7

I've churned through and pulled off a quite clever implementation of arbitrary-escaping. It can put up to 3 levels of escaping in the cell without needing to "pop out" and make another cell. But after that, it makes a cell off to the side, and is pretty smart about sharing that cell between varying quoted levels of that same entity when it can.
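To make the storage scheme above concrete, here is a rough C++ sketch. It is not taken from the Ren-C sources: `Cell`, `Payload`, `quote`, `unquote` and `make_word` are hypothetical names, and the real cell layout is certainly more compact. It only illustrates the idea of keeping up to 3 escape levels in the cell itself, and spilling to a shared side allocation beyond that.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical model; none of these names come from the Ren-C codebase.
struct Payload {
    std::string spelling;  // the unescaped value, e.g. the word `x`
};

struct Cell {
    static constexpr unsigned kMaxInline = 3;  // depths 0..3 fit in the cell

    unsigned inline_depth = 0;             // meaningful when side is null
    Payload payload;                       // meaningful when side is null
    std::shared_ptr<const Payload> side;   // value stored "off to the side"
    unsigned side_depth = 0;               // meaningful when side is set

    unsigned depth() const { return side ? side_depth : inline_depth; }
    const Payload& value() const { return side ? *side : payload; }
};

Cell make_word(const std::string& s) {
    Cell c;
    c.payload.spelling = s;
    return c;
}

// Add one escape level.
Cell quote(const Cell& in) {
    Cell out = in;
    if (out.side) {               // already spilled: reuse the same side value
        ++out.side_depth;
        return out;
    }
    if (out.inline_depth < Cell::kMaxInline) {
        ++out.inline_depth;       // still fits in the cell
        return out;
    }
    // Fourth level: move the value to a side allocation.
    out.side = std::make_shared<const Payload>(out.payload);
    out.side_depth = out.inline_depth + 1;
    out.payload = Payload{};
    out.inline_depth = 0;
    return out;
}

// Remove one escape level.
Cell unquote(const Cell& in) {
    assert(in.depth() > 0);
    Cell out = in;
    if (!out.side) {
        --out.inline_depth;
        return out;
    }
    if (out.side_depth - 1 > Cell::kMaxInline) {
        --out.side_depth;         // still too deep; keep sharing the side value
        return out;
    }
    // Fits inline again: copy the value back and drop the side reference.
    out.payload = *out.side;
    out.inline_depth = out.side_depth - 1;
    out.side.reset();
    out.side_depth = 0;
    return out;
}
```

In this model, quoting a word three times never allocates. The fourth level creates the side value, and a fifth level on that same cell points at the same side value instead of making another copy.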

It may sound trivial to "put a couple of bits in a cell indicating the escaping". It was NOT trivial :japanese_ogre: for many reasons.


One major angle of non-triviality is having a situation where the same bit pattern for a whole cell must be seen through multiple lenses. Most high-level code wants it to be a LITERAL! and a distinct type...it doesn't want to ask what type \\x is and get back REB_WORD and treat it the same as a word. But other code wants to be able to operate on the cell in place without the overhead of copying it into a new cell where the type bits have been adjusted to what it expects. Think of molding code that gets the cell, sees it's an escaped word, but then wants to pass it to the word molding routine...which needs to understand it's a word and what kind it is, vs. thinking it's a REB_LITERAL and erroring.

Being able to enforce those views safely is a testament to the awesome powers that C++ validation has given the C codebase. The type system can have subclasses viewing the same bits that say "you can't ask me what my type is in a way where I might answer LITERAL!; you have to use a different accessor that calculates what kind of literalized alias I am". It's the power to enforce layers and protocols at compile-time, with no added run-time cost for the service. Without it I could NOT pull off these features so quickly and trust in their accuracy.
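As an illustration of what "multiple lenses on the same bits" can look like, here is a minimal C++ sketch. It is an assumption-laden toy, not the actual Ren-C type hierarchy: `RawCell`, `Value`, `Unescaped`, `type_of` and `underlying_kind` are all hypothetical names. The high-level view answers LITERAL for anything escaped. The in-place view deletes that question, so asking it fails to compile, and offers a different accessor instead.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind : std::uint8_t { Word, Integer, Literal };

// One set of bits, shared by both views below (hypothetical layout).
struct RawCell {
    Kind kind;            // kind of the underlying, unescaped value
    std::uint8_t quotes;  // number of escape levels applied
};

// High-level lens: an escaped value is its own distinct type.
class Value {
  public:
    explicit Value(RawCell& c) : cell_(c) {}
    Kind type_of() const {
        return cell_.quotes > 0 ? Kind::Literal : cell_.kind;
    }

  protected:
    RawCell& cell_;  // refers to the cell in place; nothing is copied
};

// In-place lens for code such as a molding routine, which must know
// it is looking at a word underneath the escaping.
class Unescaped : public Value {
  public:
    explicit Unescaped(RawCell& c) : Value(c) {}

    // Calling type_of() through this view is rejected at compile time.
    Kind type_of() const = delete;

    Kind underlying_kind() const { return cell_.kind; }
    unsigned quote_depth() const { return cell_.quotes; }
};
```

With a cell holding a doubly escaped word, `Value::type_of()` reports `Kind::Literal`, while `Unescaped::underlying_kind()` reports `Kind::Word`. Writing `u.type_of()` on an `Unescaped` is a compile error, so the restriction costs nothing at run time.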


There's comments in the code for those interested, but I hope to write up an essay on hostilefork.com about it and the techniques.

But I think backslash might not be the right choice. I had an extended discussion of that here, but I've moved it to a new thread, as we're now long gone from @ being discussed.

#8

Final line of the four-examples code block should be "C:\Projects\foo\baz\bar.txt", no?

#9

I've buried the hatchet, and apostrophe has finally been accepted for the purpose it was originally intended.

The acceptance is based on three premises. First, to accept apostrophe's slight appearance as a benefit. In @rgchris's words:

"I am a member of the apostrophe fan club--I think its inherent discretion is part of what makes Rebol a more elegant language."

Secondly, to get a shorter way to make a literal value in source. We now use LIT instead of QUOTE.

 if word = lit isn't [...]

It may be that even the quoted values could find an alternate rendering...like how "^"" will be picked to render as {"} by molding. So perhaps 'isn't or ''double-quoted-double-primed'' could have some options too.

Thirdly, stop using WORD! for enumerations. If you used something weightier like an ISSUE!, the fact that apostrophe is so skinny wouldn't come up.

 switch mode [
     'read [print "too skinny!"]
     #read [print "better..."]
     <read> [print "also better..."]
     %read [print "fine too."]
     @read [print "coming someday..."]
 ]

If you only use words when you need evaluation, you'll need escaping less. And you'll be doing yourself a favor in COMPOSE situations where you won't constantly have to be worrying about the word getting its quote knocked off and then evaluating in some other situation.


The full post announcing the "finalized" (cough) generic quoting mechanism is here:
