Taming the Pathology of PATH!


#1

PATH! has long been a thorn. Because it has been considered an ANY-SERIES! (with a position and an index), you can get into all kinds of trouble: a path can decay into something indistinguishable from a WORD!, and then into nothing at all. In Rebol2/Red:

>> p: to path! [a b]
== a/b

>> type? next p
== path!
>> next p
== b  ;-- ack, looks like a WORD!

>> type? next next p
== path!
>> next next p
==    ;-- uhhhh, nothing?

That’s a glaring problem, but there are many other reasons it makes a bad generic array. Try putting WORD! variations in it:

red>> to get-path! [a b]
== :a/b
red>> p: to path! [:a b]
== :a/b
red>> to get-path! p
== ::a/b

Worse still, put a PATH! in a PATH!. The two paths below print identically, yet have different structure:

>> left: clear make path! [x x]
>> append left 'a/b
>> append left 'c

>> right: clear make path! [x x]
>> append right 'a
>> append right 'b/c

>> left
== a/b/c
>> first left
== a/b

>> right
== a/b/c
>> first right
== a

New Paradigm: PATH! is NOT an ANY-SERIES!

I did something that swept away a big pile of these concerns: I took PATH! out of the ANY-SERIES! category and made paths immutable. Since there’s a controlled number of points that can make paths, you can set rules for them (e.g. no fewer than two elements, no paths-in-paths). And since there are no direct modifiers, a path can’t be changed to break those rules. Since a path is not an ANY-SERIES!, there is no INDEX OF for it, so you can never think of it as being anywhere but “at the head”.

It’s not as limiting as it may sound at first. You can still PICK elements out of a path by index, or use FOR-EACH on them. If you ever get to a point where you really want to rearrange and restructure a path, you can convert it to a BLOCK! or GROUP! and then back. And while making operators that remove items from paths might be a little tricky, aggregating them together is not.
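To make the creation-time rules concrete, here is a rough Python model of the idea. It is not Ren-C code and says nothing about how the interpreter implements paths; the `Path` class, its `pick` method, and the use of a Python list as a stand-in for BLOCK! are all assumptions made for illustration. It only shows the shape of the argument: validate once at creation, offer no modifiers, and restructure by converting out and back in.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    """Hypothetical immutable path: validated once, at creation."""
    items: Tuple[object, ...]

    def __post_init__(self):
        # Accept any iterable, but store an immutable tuple.
        object.__setattr__(self, "items", tuple(self.items))
        if len(self.items) < 2:
            raise ValueError("a path needs at least two elements")
        if any(isinstance(item, Path) for item in self.items):
            raise ValueError("paths cannot contain paths")

    def pick(self, index):
        """1-based lookup, in the spirit of PICK."""
        if not 1 <= index <= len(self.items):
            raise IndexError("path index out of range")
        return self.items[index - 1]

    def __iter__(self):
        # Iteration is allowed; there is just no position to advance.
        return iter(self.items)

    def __str__(self):
        return "/".join(str(item) for item in self.items)


p = Path(["a", "b", "c"])

# Restructure by round-tripping through a mutable list (standing in
# for BLOCK!); the rules are checked again on the way back in.
block = list(p)
block.reverse()
q = Path(block)
```

Because every path goes through the same constructor, a one-element path or a path-in-path can never exist, and since nothing can modify `items` afterwards, the rules stay true for the lifetime of the value.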

Surprisingly (or perhaps not?), this didn’t actually cause that much of a ripple. Basically nothing was using PATH! as a generic container anyway–because compared to GROUP! and BLOCK!, paths were really bad at being generic containers. They’re never all that long, because they ignore newline handling (embedded blocks/groups can have newlines, but at the level of the slashes in the path itself, there are no newlines).

It’s been great so far, and I think there’s no going back.

How Many Constraints Should There Be?

I mentioned length of at least 2, and no paths-in-paths. Those are pretty obvious.

But what else? We can stop ::a/b from ever existing. But historically, the following has been idiomatic and accepted as a common and correct syntax:

 a/:b: c

I’ve wondered if a/(b): c is superior enough that the path creation rules should prohibit embedded get-words. If you couldn’t put any GET-, SET- or LIT- forms inside path elements, it could stop ambiguities.

Furthermore, some types (like FILE! or URL!) have slashes in them. Should inserting them into paths be an error, or should those slashes at least be treated as segment boundaries, with the value split along them?

Why this is in “Philosophy”: The Role of PATH!s in Dialects

One thing that got me to think about this is that I’ve got a dialect which lets you define BLOCK! rules or PATH! rules:

 e: 'j/k/l
 h: [m n/o p]
 dialect [a/b/c [d e f] g/h/i]

A path means “AND these things together”. A block means “OR these things together”. And like PARSE rules, if you look up a word and get a BLOCK! or PATH!, that’s just recursed on and used as if you’d written the rule right there.
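The AND/OR/lookup behavior can be sketched in a few lines of Python. This is only a model of the semantics described above, not the dialect itself: tuples stand in for PATH!, lists for BLOCK!, strings for WORD!, and the meaning of an elemental rule (here: “this word is in a set of facts”) is an assumption, since the post doesn’t say what the elements test. The name `matches` and the `env` dictionary are likewise made up for the example.

```python
def matches(rule, facts, env):
    """Evaluate a rule against a set of facts.

    tuple -> path-like rule: every element must match (AND)
    list  -> block-like rule: at least one element must match (OR)
    str   -> a word: looked up in env if defined there, otherwise an
             elemental rule that holds when the word is in facts
    """
    if isinstance(rule, tuple):
        return all(matches(r, facts, env) for r in rule)
    if isinstance(rule, list):
        return any(matches(r, facts, env) for r in rule)
    if rule in env:
        # Recurse as if the looked-up rule had been written inline.
        return matches(env[rule], facts, env)
    return rule in facts


env = {
    "e": ("j", "k", "l"),         # e: 'j/k/l
    "h": ["m", ("n", "o"), "p"],  # h: [m n/o p]
}

# dialect [a/b/c [d e f] g/h/i]
rule = [("a", "b", "c"), ["d", "e", "f"], ("g", "h", "i")]

print(matches(rule, {"j", "k", "l"}, env))       # e expands to j/k/l
print(matches(rule, {"g", "i", "n", "o"}, env))  # h expands to [m n/o p]
```

The sketch has no protection against a word whose definition refers back to itself; a real dialect would need to decide what that means.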

Some of the elemental rules were GET-WORD!. If GET-WORD! weren’t legal in paths, that would put a constraint on this dialect regarding its elements that the block wouldn’t impose.

But…you can work around this with a block.

 dialect [[:a]/b/c [d :e f] g/h/[:i]]

That feels very…clean. Now you have a generic solution where you’re using PATH!s as a dialect component that doesn’t lose any capability BLOCK! or GROUP! had, without worrying about tapdancing around gibberish paths.

And we actually are entering an era of what are called “mirrored types”, which would allow 1-element blocks and 1-element groups that are immutable to fit entirely in a cell with no dynamic allocation or pointer to elsewhere.

Mirrored types were invented so /foo could be a PATH! and cost no more than the old word-class REFINEMENT! did. But seeing them in action suggests applying them to GROUP!s and BLOCK!s too. Those embedded blocks could cost no more than a plain GET-WORD! does today. With PATH! being immutable, making those blocks and groups immutable makes sense too. (By default on scanning, I mean… if you make a path with a length-1 mutable block under the path level, it can preserve that mutability.)

When you put all these concepts together, it feels like it ties up loose ends and ambiguities. Will people miss a/:b:…or can the likes of a/(b): and a/[b]: or :(a)/b and :[a]/b cover pretty much everything?


Mirrored Type Bytes, Explained
#2

The only thing I can think of that would make me upset about losing a/:b and having to use a/(b) is having that group get involved in COMPOSE/DEEP when I didn’t mean it to.

 compose/deep [
      .../(don't want composed): [(want composed) ...]
 ]

But we have better solutions to this today.

 compose/deep <*> [
      .../(don't want composed): [(<*> want composed) ...]
 ]

…and a shallow compose won’t see groups in paths. I think that is enough for me.

The other issue is that right now GET refuses to fetch paths if they contain any GROUP!s. We could update this rule to make it refuse to fetch paths if they contain anything that runs any ACTION!s, so any inert groups would be fair game.


#3

What is the problem you see with get-words in paths, and set-words at the end?


#4

Ambiguity. If GET-WORD!s can be put in paths, then you can’t tell if :a/b/c is an ordinary PATH! with a GET-WORD! at the beginning, or a GET-PATH! with an ordinary WORD! at the beginning.

Same for a/b/c:… is that an ordinary PATH! with a SET-WORD! at the end, or a SET-PATH! with an ordinary WORD! at the end?
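A small Python model shows the collision. The representation is invented for the example (a path is a `(kind, words)` pair and each word is a `(kind, spelling)` pair), and `mold_word` / `mold_path` are hypothetical helpers that only mimic how the values would print.

```python
def mold_word(word):
    """Render a (kind, spelling) pair as it would print."""
    kind, spelling = word
    return {
        "word": spelling,
        "get-word": ":" + spelling,
        "set-word": spelling + ":",
    }[kind]


def mold_path(path):
    """Render a (kind, words) pair as it would print."""
    kind, words = path
    body = "/".join(mold_word(w) for w in words)
    return {
        "path": body,
        "get-path": ":" + body,
        "set-path": body + ":",
    }[kind]


# Ordinary PATH! with a GET-WORD! at the head...
plain_with_get_head = ("path", [("get-word", "a"), ("word", "b"), ("word", "c")])
# ...versus a GET-PATH! with an ordinary WORD! at the head.
get_path_plain_head = ("get-path", [("word", "a"), ("word", "b"), ("word", "c")])

# Ordinary PATH! with a SET-WORD! at the tail...
plain_with_set_tail = ("path", [("word", "a"), ("word", "b"), ("set-word", "c")])
# ...versus a SET-PATH! with an ordinary WORD! at the tail.
set_path_plain_tail = ("set-path", [("word", "a"), ("word", "b"), ("word", "c")])

print(mold_path(plain_with_get_head), mold_path(get_path_plain_head))
print(mold_path(plain_with_set_tail), mold_path(set_path_plain_tail))
```

Each pair prints the same text from two different structures, so anything that loads the printed form back has to guess which one was meant.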

Every now and again it has been wondered if this suggests that there shouldn’t be a SET-PATH! and GET-PATH!, but that those should simply be ordinary PATH!s with SET-WORD!s at the tail and GET-WORD!s at the head. This breaks down when you want a/b/(c + d): because you’d need a SET-GROUP!, or a/b/1: because you’d need a SET-INTEGER!, etc. for all types. It also breaks down because it inhibits the cheap/easy transformation of these path types into each other by flipping one byte without affecting the shared path array itself.
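The “flip one byte” point can be pictured with a small Python sketch. The actual cell layout isn’t described here, so `PathCell`, its `kind` field, and `as_kind` are purely illustrative assumptions; the only thing the sketch demonstrates is that the path kind lives beside the element array rather than inside it.

```python
from typing import NamedTuple, Tuple


class PathCell(NamedTuple):
    """Hypothetical cell: a kind tag plus a shared, immutable array."""
    kind: str                  # "path", "get-path" or "set-path"
    array: Tuple[object, ...]


def as_kind(cell, kind):
    """Return a cell of another kind that reuses the same array."""
    return cell._replace(kind=kind)


plain = PathCell("path", ("a", "b", "c"))
setter = as_kind(plain, "set-path")

# The elements were not copied or rewritten; only the tag differs.
print(setter.kind, setter.array is plain.array)
```

If the SET-ness instead lived in the last element (a SET-WORD! at the tail), the same conversion would need a new array with a different final element, and it would have no answer at all for tails like a GROUP! or an INTEGER!.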

States that don’t seem ambiguous, like ::a/b/c, are still quite ugly…and can actually still be ambiguous. For example, is that a three-element GET-PATH! with a GET-WORD! :a at the head, or a two-element GET-PATH! with a GET-PATH! :a/b at the head? I also think things like a/b/:c: are awful-looking, and don’t have good bones for the language.

But the good news is that I think immutable paths, checked for these properties at creation time, are an answer to all of these issues…and I may be able to implement them quite efficiently.