What is ~/~ : A PATH! or a QUASI-WORD?

hostilefork · September 22, 2024, 1:15pm

I've mentioned that there are not going to be quasiform or antiform sequence types.

We want tildes in paths, e.g. ~/home/whatever.txt
- Paths in particular...but they could be useful in tuples (~.foo) or chains ([a]:~)
- Quasiforms might be nice for some purposes too (~null~/a/b/c), (foo:~[a]~)
The known very important usage of tilde in paths, plus all the as-yet-unknown and untapped other uses, far outweigh any advantage I can think of for antiform/quasiform sequences.

So that's settled. But there's a bit of an edge case with the WORD!s like / or . or :

And there, I'm going to have to say that ~/~ is a quasiword.

I want all words to have quasiforms (though not antiforms... only system-blessed "keywords" will be allowed to become antiforms). And so I don't want to exempt any words from that.
I don't think excluding ~/~ as a path form has any tremendous consequences.
Allowing it to be a quasiword fits in with ~///~ and other such words being quasiwords, and those aren't valid paths.

So we open up a wider field of parts, keep all words having quasiforms, and lose one weird path representation.

Though it's not "lost"--if you're writing dialect code that has some logic for what a PATH! of two tilde-trash would mean, you can specially recognize the quasiword. You'll only run into representational trouble if the same dialect also needs a meaning for every possible quasiword, and the meaning it needs for ~/~ deviates from what the path interpretation would be.

(I'd say I'd bet money that's never going to happen, except I've seen exactly that kind of weird thing happen often enough to know better. So instead I'll say "it might happen someday, but I don't care.")

Anyway, trying to shore this up a bit, here are some sample results and errors:

>> to path! [~ a]
== ~/a

>> to path! [~ ~]
** Script Error: Sequence would conflate with WORD! form: ~/~

>> to path! [a _]
== a/

>> to path! [_ _]
** Script Error: Sequence would conflate with WORD! form: /

>> to path! [_ a _]
== /a/

>> to path! [a _ b]
** Script Error: BLANK! only legal at head and tail of sequence
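
To make the rules behind those results concrete, here is a rough model of them in Python (not Ren-C, and not how the interpreter implements it). The names `to_path`, `SequenceError`, `BLANK` and `TILDE` are made up for the illustration, elements are plain strings, and the handling of lists shorter than two elements is an assumption the examples above don't cover.

```python
BLANK = "_"   # stands in for BLANK! in this model
TILDE = "~"   # stands in for a lone tilde element


class SequenceError(ValueError):
    """Hypothetical error type for this sketch."""


def to_path(elements):
    """Render a list of element strings as a path, or raise SequenceError."""
    if len(elements) < 2:
        # Not covered by the examples in the post; rejected here by assumption.
        raise SequenceError("this sketch only models two or more elements")

    # Blanks may only appear at the head or tail of the sequence.
    if BLANK in elements[1:-1]:
        raise SequenceError("BLANK! only legal at head and tail of sequence")

    # Blanks render as nothing, so [_ a _] becomes /a/.
    rendered = "/".join("" if e == BLANK else e for e in elements)

    # [_ _] would render as the WORD! /, and [~ ~] as the quasiword ~/~,
    # so both are refused rather than producing an ambiguous form.
    if all(e == BLANK for e in elements) or elements == [TILDE, TILDE]:
        raise SequenceError(
            "Sequence would conflate with WORD! form: " + rendered
        )
    return rendered
```

Calling `to_path(["~", "a"])` gives `"~/a"`, while `to_path(["~", "~"])` raises with the same message as the console session above.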

hostilefork · September 23, 2024, 11:58am

I should mention that dialect-wise, if you're doing quasiform annotations to mean something for variable references (for example), and you get to the point where you need tuples, then instead of ~var~ you say ~(obj.field)~.

For the sake of simpler processing in your dialect, you might just always mandate the group, e.g. ~(var)~. But that's up to you based on what look you are going for and the other tradeoffs in your dialect.

I prefer this "reified escaping" to "lexical escaping".

Lexical escaping would be something like this:

>> tuple: to tuple! [~ a]
== ~.a

>> quasituple: quasi tuple
== ~\~.a\~  ; or some other arbitrary escaped-rendering method in the system

>> unquasi quasituple
== ~.a

Rethinking that with reified escaping:

>> tuple: to tuple! [~ a]
== ~.a

>> to group! reduce [tuple]  ; a.k.a. ENVELOP/GROUP a.k.a. ENGROUP
== (~.a)

>> quasigroup: quasi engroup tuple
== ~(~.a)~

>> first unquasi quasigroup
== ~.a
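
The same idea can be sketched outside the language. The following is a toy Python model, not Ren-C: the classes `Word`, `Seq`, `Group`, `Quasi` and the helper `mold` are hypothetical names for the illustration, and only the behavior shown in the session above is modeled. The point is that `quasi` refuses sequences outright, so the only way to get a quasiform "around" a tuple is to put the tuple inside a real container first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    spelling: str


@dataclass(frozen=True)
class Seq:
    parts: tuple  # e.g. ("~", "a") models the tuple ~.a


@dataclass(frozen=True)
class Group:
    items: tuple


@dataclass(frozen=True)
class Quasi:
    inner: object  # a Word or Group in this model, never a Seq


def engroup(value):
    """Wrap a single value in a group: ~.a becomes (~.a)."""
    return Group((value,))


def quasi(value):
    """Make a quasiform; sequences have none, so they are rejected."""
    if isinstance(value, Seq):
        raise TypeError("no quasiform for sequences; engroup it first")
    return Quasi(value)


def unquasi(value):
    """Strip the quasiform, returning the plain value inside."""
    return value.inner


def mold(value):
    """Render a value roughly the way the console would display it."""
    if isinstance(value, Word):
        return value.spelling
    if isinstance(value, Seq):
        return ".".join(value.parts)
    if isinstance(value, Group):
        return "(" + " ".join(mold(item) for item in value.items) + ")"
    return "~" + mold(value.inner) + "~"
```

With this model, `mold(quasi(engroup(Seq(("~", "a")))))` renders as `~(~.a)~`, and taking the first item of the unquasi'd group gives back the original tuple, with no escaping syntax needed anywhere.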

This resembles other places where I've taken this bias.

It represents a shift in my thinking from a decade ago when I advocated for lexical escaping: I thought the system needed to do things like permit words with spaces:

r3-alpha>> word: to word! "word with spaces"
== #[word! "word with spaces"]

r3-alpha>> type? word
== #[datatype! word!]

r3-alpha>> setword: to set-word! word
== #[set-word! "word with spaces"]

r3-alpha>> make object! compose reduce [setword 10]
== ... oh what a tangled web we weave ...

With hindsight and a greater vocabulary of containers, I no longer believe this is a good idea. (Datatypes are another area that has undergone the shift, with word! defined as &[word] being a TYPE-BLOCK!)

In the grand "freedom to vs. freedom from" balance of things, it's better to err on the side of solidity. The overall systemic Quality-with-a-capital-Q becomes higher by tightening the fundamental bricks...and using reified escaping.

>> type of '[spaced words in block]:
== &[set-block]  ; ...note: will be &[chain]