Introducing:The:New:CHAIN!:Datatype

As part of a giant overhaul of function application, we're getting rid of SET-XXX! and GET-XXX! as fundamental types. Instead, we'll have a new sequence that's like PATH! and TUPLE!... except that it uses the colon character as its interstitial delimiter.

these:<will be>:1020:"called CHAIN!"

:alien:

:three: is a magic number.

It gives us a nice symmetry of having three ANY-LIST! types and three ANY-SEQUENCE! types:

  • [BLOCK!] (GROUP!) {FENCE!}

  • /PATH!/ .TUPLE!. :CHAIN!:

CHAIN! is Higher Precedence Than TUPLE!, Lower Than PATH!

Pleasingly, the hierarchy of the interstitials is in accordance with their "heft"... / is heftier than : which is heftier than .

 foo/bar.baz:mumble:frotz

...that would be a 2-element PATH! with a 3-element CHAIN! in the second position that has a 2-element TUPLE! in its first position.
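
To make that nesting concrete, here is a rough Python sketch (not Ren-C's actual scanner, just a model of the "heft" rule). The names `DELIMITERS` and `parse_sequence` are made up for illustration, and it assumes plain words only: no strings, groups, or blanks.

```python
# Hypothetical sketch: split on the heftiest delimiter first, so PATH!
# ends up outermost, CHAIN! in the middle, and TUPLE! innermost.
DELIMITERS = [("/", "PATH!"), (":", "CHAIN!"), (".", "TUPLE!")]

def parse_sequence(text, level=0):
    """Return a nested (type, [items]) structure, or the bare word."""
    if level == len(DELIMITERS):
        return text  # no delimiters left: treat as a plain word
    delimiter, type_name = DELIMITERS[level]
    parts = text.split(delimiter)
    if len(parts) == 1:
        # This delimiter is absent, so try the next (lighter) one.
        return parse_sequence(text, level + 1)
    return (type_name, [parse_sequence(p, level + 1) for p in parts])

result = parse_sequence("foo/bar.baz:mumble:frotz")
print(result)
```

The result is a PATH! of two items whose second item is a CHAIN! of three, with a TUPLE! of two in the chain's first slot, matching the description above.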

:: and : Will No Longer be SIGIL!

This means sequences of all colons will be WORD!s, like [/ // /// etc] and [. .. ... etc].

It was perhaps a bit suspicious that colons could appear at the beginning of a word or at the end of a word. That makes the colon seem more like an interstitial sequencing character than a SIGIL!.

This means that you can put SIGILs on chains, including the trivial chains @:word or $(group): which previously needed two cells, e.g. (@ :word) or ($ '(set-group):).

  • Looks tighter, evaluates faster, and you don't have to quote evaluative things

  • You can communicate more information with the single value if it can carry the sigil along

Notationally Competes with TIME!

We have to decide what 10:20 is.

When a parallel problem happened with 1.2 needing to be either DECIMAL! or TUPLE!, the decision was to make it a DECIMAL!. So maybe CHAIN! reserves cases where numbers are at the head for other types?

But it's a shame, because it seems this is made for dialecting on a much more interesting level:

for x 1:10 [print x]

for x 1:(end) [print x]

for x (start):10 [print x]

for x 0:10/2 [print x]   ; /2 could mean bump by 2

I wonder if TIME! could be moved out of the way somehow. Maybe the weird forms that interfere with generality of sequences could be the syntaxes like #[10:00] or @time/<10:00> ... is TIME! really used so often and so critically that it can't be delimited out of band? If contextually you know a spot should have time, could it just be <10:00> and you add the @time/ to it for processing?

We can debate it. The easiest compatibility rule would be just to know that there are no 2-element chains that start with numbers.
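
As a rough illustration of that compatibility rule only, here is a Python sketch. The function name `is_legal_chain` is hypothetical, and the split on colons is naive (it ignores colons inside strings or tags). The rule as stated says nothing about longer forms like 10:20:30, so the sketch leaves them alone.

```python
def is_legal_chain(token):
    """Apply only the rule from the post: no 2-element chain may start
    with a number. Anything else containing ':' is allowed here."""
    parts = token.split(":")  # naive: ignores colons inside strings/tags
    if len(parts) < 2:
        return False  # no colon, so not a chain at all
    if len(parts) == 2 and parts[0].isdigit():
        return False  # left free for TIME!-like forms such as 10:20
    return True

for token in ["10:20", "foo:20", "1:(end)", "(start):10"]:
    print(token, is_legal_chain(token))
```

Under this rule 1:10 and 1:(end) from the FOR examples are ruled out along with 10:20, which is exactly the tradeoff being lamented above.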

URL! is NOT a Conflict

You might imagine http://example.com is a 3-element PATH!:

  • First slot is a CHAIN! of [http _]

  • Second slot is a BLANK!

  • Third slot is a TUPLE! of [example com]

But I think it's too valuable to be able to express URL!s easily. So this seals the deal that you can't put BLANK!s internal to ANY-SEQUENCE!. So there's no foo//bar paths or foo...bar tuples or foo:::::bar chains. With // being illegal in path, it's free to be a URL!.

(We were already saying that / and // and ... etc. are WORD!s, so the bias against doubling up these interstitial delimiters in sequences has been building up over time, and this just finalizes that.)

Yet strangely enough, http:/example.com with just one slash is a valid PATH! of 2 elements, with a CHAIN! in the first slot and a TUPLE! in the second. Maybe this was destiny all along, and why URL!s have the two slashes. :monkey_face: (I actually do believe the two slashes were chosen to make URL!s "stand out" visually, so it may not be a complete coincidence that it works out well for us here.)
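
Here is a small Python sketch of how those rules separate URL! from PATH!. It is only a model of what is described above: `classify` is a made-up name, and treating any `://` as the mark of a URL! is my assumption, not a statement about the real scanner.

```python
DELIMITERS = "/:."
NAMES = ("PATH!", "CHAIN!", "TUPLE!")  # same order as DELIMITERS

def classify(text):
    # Runs of a single delimiter (/ // ... ::) are plain WORD!s.
    if text and text[0] in DELIMITERS and set(text) == {text[0]}:
        return "WORD!"
    # Assumption: "://" is what marks a URL!.
    if "://" in text:
        return "URL!"
    # No blanks allowed inside a sequence, so doubled delimiters are errors.
    for d in DELIMITERS:
        if d * 2 in text:
            raise ValueError(f"blank inside sequence: {text!r}")
    # The heftiest delimiter present decides the outermost type.
    for d, name in zip(DELIMITERS, NAMES):
        if d in text:
            return name
    return "WORD!"

print(classify("http://example.com"))
print(classify("http:/example.com"))
```

With two slashes the token is taken as a URL!; with one slash it falls through to being an ordinary PATH!.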

But... no more URN!

The subset of legal URN!s that we can represent would be CHAIN!.

How many URN!s would be ruled out? Interestingly the first two examples I found off stackoverflow of URN!s would be valid chains:

URN (not URL): urn:oasis:names:specification:docbook:dtd:xml:4.1.2
URN (not URL): tel:+1-816-555-1212 (disputed, see comments)

But I'm not really concerned if it can't represent all of them. The world doesn't revolve around URNs anyway. (Maybe they were supposed to be important at one point, but I haven't noticed them ever being so.)

We have things like @urn/"tel:1-816-555-1212" if we need them, as inert forms.

Cuts Down On Total Number Of Fundamental Types

What's added:

  • a:b ... CHAIN!
  • @a:b ... THE-CHAIN!
  • $a:b ... VAR-CHAIN! (should things like this be called e.g. BIND-CHAIN! ?)
  • ^a:b ... META-CHAIN!
  • &a:b ... TYPE-CHAIN!

(I do think that perhaps calling e.g. $XXX a BIND-WORD! is better than VAR-WORD!, because VAR is an abbreviation and it doesn't look as classy.)

Compare that to what's removed:

  • :[a b] ... GET-BLOCK!
  • [a b]: ... SET-BLOCK!
  • :(a b) ... GET-GROUP!
  • (a b): ... SET-GROUP!
  • :{a b} ... GET-FENCE!
  • {a b}: ... SET-FENCE!
  • :a ... GET-WORD!
  • a: ... SET-WORD!
  • :a/b ... GET-PATH!
  • a/b: ... SET-PATH!
  • :a.b ... GET-TUPLE!
  • a.b: ... SET-TUPLE!

And despite sacrificing those types, due to past labors there's no loss in storage efficiency.

Fewer basic types, but a lot more fun parts.

Resolves Questions That Have Plagued Rebol For Decades

There was always this question about "why can't you put SET-WORD! or GET-WORD! in a PATH!", and people argued about it.

But now the answer is that they're not fundamental parts, and you're just dealing with the laws of interstitial delimiters. You can't put a path "in" a path, but you can merge them. So now it's just that you can't put a chain in a chain, only join them.

So this means a test like SET-PATH? doesn't make any sense: chains always live underneath paths! But there is effectively a SET-TUPLE (!), because with .a: you have the tuple under the chain.

Hence we're resurrecting things like a/:b, but with a whole different logic behind it. You no longer run the risk of putting a GET-WORD! in the first slot of a GET-PATH!, because there are no "GET-PATH!s".

This finally makes perfect sense.

This Means We Won't Have Undecorated IPV6 Literals

Red is pursuing the idea of IPV6 literals, but this won't work with that. We'll need another way.

Since the @ sigil makes things evaluate to themselves, it's going to be the likely basis of extensible literals:

>> @ipv6:<FE80:CD00:0:CDE:1257:0:211E:729C>
== @ipv6:<FE80:CD00:0:CDE:1257:0:211E:729C>

:FOO: Actually Looks Kind of Cool

When it comes to dialecting, .FOO. doesn't seem like something you'd use as a token. It's more like a notation that would just sort of come up if your rules happened to intersect ("starting with a dot does X, ending with a dot does Y, starting and ending with a dot means you want both X and Y").

/FOO/ is a bit better. I can imagine it labeling something, while not using up BLOCK! or GROUP! or FENCE! to get the look:

[
   /FIRST/ this could be (some code)
   You could do [something here]
   /SECOND/ more stuff.
   {your fence here}
]

But :FOO: seems like something you might use. As does :[foo]:, -:[foo]:-, etc. Neat things to have in the parts box.

Hah, I always knew these were a bad idea! I’m not surprised to see a more flexible and powerful system arise from unifying them. (Much more flexible than my own suggestions were, even.)

Glad it aligns with your intuition! Hopefully now that there are fewer of these, you won't be as eager to murder the remaining ones. :dagger: :slight_smile:

In my early days of involvement I tried to convince people there were no SET-PATH! or GET-PATH!... only PATH! with SET-WORD!s in the last position and GET-WORD!s in the first position. It was all about trying to get away from the ambiguities, which still exist in Red:

red>> p1: to path! [a b c:]
== a/b/c:

red>> p2: first [a/b/c:]
== a/b/c:

red>> type? p1
== path!

red>> type? p2
== set-path!

(Of course you can generate all kinds of gibberish in historical Redbol with paths, which is why sequences in Ren-C are immutable, can't have fewer than two elements, and do not carry a series position, so they are always "at their head".)

red>> to path! []
== 

red>> p3: 'a/b/c
== a/b/c

red>> skip p3 3
==

>> to get-path! [:a b c:]
== ::a/b/c:

I don't recall what all the objections were to my proposal about killing off SET-PATH! and GET-PATH!. But people didn't like it. I'm sure one of the objections was that you had to "dig" to find out if a path had set-like or get-like behavior.

But with tuples and chains, you can appropriately ask :a or a.b: if they have a terminal colon or leading colon, and find out what you want to know in a way that nests right. The chain is on top of both. And having a/b: "hide" the colon under the path isn't a concern, because that's not how you do assignment.
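
A rough Python model of that "ask the top-level chain" idea follows. The helper names (`top_level_chain`, `is_set_like`, `is_get_like`) are hypothetical, blanks are modeled as `None`, and it only handles plain words.

```python
def top_level_chain(text):
    """Return the chain's slots if the outermost sequence is a CHAIN!,
    else None. Empty slots (leading/trailing colon) become None."""
    if "/" in text or ":" not in text:
        return None  # a PATH! is on top, or there is no chain at all
    return [part or None for part in text.split(":")]

def is_set_like(text):
    items = top_level_chain(text)
    return (items is not None and len(items) == 2
            and items[0] is not None and items[1] is None)

def is_get_like(text):
    items = top_level_chain(text)
    return (items is not None and len(items) == 2
            and items[0] is None and items[1] is not None)

for token in ["a.b:", ":a", "a/b:", "a:b"]:
    print(token, is_set_like(token), is_get_like(token))
```

Note that a/b: answers "no" to both questions, because the colon is hidden under the path rather than sitting on top.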

Sure, I’ll give them a reprieve. For now…

So I was a little nervous about the evaluator-activeness of a GROUP!-headed CHAIN!.

One might assume it would be for:

verb: pick [append insert] 1

(verb):dup [a b c] [d e] 10

But if you wanted that to run a function, that seems like a good job for slash!

verb: pick [append insert] 1

/(verb):dup [a b c] [d e] 10

I don't think it would come up that often, compared to how often you would want to specify ranges, and have it not assume you're trying to run a function.

Having it be inert is best for dialecting, because what's in the group may not even be "code".

my-dialected-thing (*):2

So WORD!-headed CHAIN! would be the only evaluator-active kind.

We're about to face some radical new mechanics... :melting_face:

Blank-tailed CHAIN!s will turn SET-WORD? into a type constraint...not a fundamental type. And blank-headed chains for "GET-WORD?" won't be a get-word any more, it will be a refinement. Would the 2-element CHAIN! foo: still consider itself an "ANY-WORD?" at that point...? Is :foo:bar a refinement as well?

More weirdness is afoot: "converting" a "set-word" to a word just means picking the head off a chain... and "converting" a "word" to a SET-WORD! would likely be done as compose $(word): or similar:

>> word: 'foo

old>> to set-word! word
== foo:

new>> compose $(word):
== foo:

Such things may seem quite unfamiliar right now. But just looking at that one random instance, it's actually one less character. And it looks a bit "tighter" (to me). Certainly it is more general, and learning how to do it will serve you better when you want to build more complex things. (Though people who want TO-SET-WORD can have it).

I think the answer is we will be doing less naming, and more pattern-matching. Does it matter what you call it, if you say:

foo: func [
    setter [(word!):]
][
    print "would [{word!}:] be a better choice, leaving (...) for COMPOSEs?"
]

I'm sure there are lots of useful routines to be invented. Some imagination:

>> item: 'a.b:c.d:1:(2)

>> dissect item ${chain?}:1:(2)
== a.b:c.d

>> item: 'a.b:c.d:1:{2}

>> dissect:() item $(chain?):1:{2}
== a.b:c.d

But it's definitely going to be an upheaval.

One Step At A Time... :man_walking:

First step is to keep the whole system working as closely as possible to what it is, just with new mechanics under the hood... so names like SET-WORD? will be sticking around in the near term.

Conversions like to set-word! would have to be redone as to-set-word due to the absence of the fundamental type to pass in.

After the dust settles from all that, I guess it will just be time to go little by little to see what makes for effective and understandable code.

As a Haskeller, I say: bring it on! Pattern-matching is enormously powerful, and Ren-C has the syntax to make it really ergonomic.

(Let me put it this way: in many languages, pattern-matching is already a separate dialect in all but name. What could a language with proper dialecting support do with it?)
