Should DATATYPE! be ANY-WORD! (such as @integer)?

This historical thread covers the decision to migrate DATATYPE! toward becoming a TYPE-BLOCK!, i.e. an actual structural form in the language.

Ultimately the representation chosen was to use &:

 >> type of 1020
 == &[integer]

The discussion is kept for reference, but see notes about the migration of DATATYPE! to this new idea.


Datatypes have problems. One is that there's no clear representation for them literally, so they tend to just look like WORD!s:

rebol2>> word? first [integer!]
== true

rebol2>> type? 10
== integer!

rebol2>> word? type? 10
== false

As a band-aid over this conflation with words, Ren-C has been using construction syntax in the console...but it's quite ugly:

>> type of 10
== #[datatype! integer!]

We also face the question of extensibility of types... and it's hard to think of a better extensibility mechanism to use than the existing symbol engine that is behind ANY-WORD!.

But making them plain WORD! is not very desirable. We are used to writing things like make object! [...]. And we don't want MAKE to have to quote its argument. Hence, OBJECT! would have to evaluate to something.

We could say (object!: 'object!) but that seems like it would get into some ugly situations.

What about the new @WORDs?

Let's say @integer could act as a datatype, as well as an ANY-WORD!.

It has the advantage of being inert in the evaluator:

if @integer = type of 1 [print "Sanity prevails..."]

Redbol compatibility could still work:

none!: @blank

So the main question would be, how often is it that you have a slot that might want an @word -or- a datatype? How frequent is that need for polymorphism?

When I look at the landscape of problems, it seems to me that an evaluator-inert type that has an intrinsic extensibility like this would be pretty much ideal for denoting a datatype. No construction syntax, just good ol' reliable behavior.

So would this mean:

parse [1 @integer 3] [some @integer]

...would be successful?

Does this preclude any other potential usage for symbols in Parse?

Unfortunately, yes, and there's already competition for this form in PARSE.

PARSE is probably not the only example of a dialect where having a distinct otherwise-unused notion of DATATYPE! could come in handy, where the slot could be either symbol or datatype.

No, because @integer, interpreted as a datatype for purposes of the dialect, would only be a parse instruction to match an integer.

However:

parse [1 @integer 3] [@integer '@integer @integer]

Would reach the end of the series (truthy).

We would presume that integer! would be a WORD! that evaluated to @integer. So you could still say:

parse [1 @integer 3] [integer! @integer integer!]

It would work by the same general theory that any WORD! which is used in a PARSE rule that's not a keyword is fetched to be the associated rule.
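
The matching rules described above can be modeled outside the language. This is a minimal Python sketch, not Ren-C code and not how PARSE is implemented. The names SymWord, Literal, Word, KINDS and the one-rule-per-item matching are all assumptions made for illustration: an @word in a rule means "match this datatype", a quoted @word means "match this exact value", and a plain word is fetched to find the rule it stands for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymWord:
    """Stands in for an inert @word value such as @integer."""
    name: str


@dataclass(frozen=True)
class Literal:
    """Stands in for a quoted rule such as '@integer (match this exact value)."""
    value: object


@dataclass(frozen=True)
class Word:
    """Stands in for a plain WORD! in a rule, fetched from an environment."""
    name: str


# Hypothetical mapping from type names to Python types, for illustration only.
KINDS = {"integer": int, "text": str}


def matches(rule, item, env):
    if isinstance(rule, Word):      # non-keyword WORD!: fetch the associated rule
        return matches(env[rule.name], item, env)
    if isinstance(rule, Literal):   # quoted rule: compare against the value itself
        return item == rule.value
    if isinstance(rule, SymWord):   # @word in a rule: read as "match this datatype"
        return type(item) is KINDS[rule.name]
    raise TypeError(f"unsupported rule: {rule!r}")


def parse(data, rules, env=None):
    """True if each rule matches one item, in order, and the input is used up."""
    env = env or {}
    return len(data) == len(rules) and all(
        matches(rule, item, env) for rule, item in zip(rules, data)
    )


INTEGER = SymWord("integer")
env = {"integer!": INTEGER}         # integer! evaluates to @integer
data = [1, INTEGER, 3]

print(parse(data, [INTEGER, INTEGER, INTEGER]))            # False
print(parse(data, [INTEGER, Literal(INTEGER), INTEGER]))   # True
print(parse(data, [Word("integer!"), Literal(INTEGER), Word("integer!")], env))  # True
```

The first call fails because the middle item is the @integer value itself rather than an integer. Quoting the middle rule makes it a literal match, which is why the second and third calls reach the end of the input.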

Should DATATYPE! be killed off (in favor of @integer, @word, etc.)?

Something to think about with this question is regarding a more OBJECT!-like notion of what a datatype is, where that object might have various descriptive properties (like the numeric limits of a type, or precision, or other characteristics).

If "types" are ANY-CONTEXT!, then they can do this. But if they are simple @word!-s, then you'd have to go through a lookup (at minimum) to get an object that matches the descriptor.

This may point us to a notion of a difference between the kind of a value, and the type of a value. The kind would be something more mechanical regarding the cell structure (like to say that something is an OBJECT!). Whereas the "type" could be richer.

>> book!: make object! [author: title: _]

>> item1: make book! [author: "Aldous Huxley" title: "Brave New World"]

>> item2: make book! [author: "George Orwell" title: "1984"]

>> kind of item1
== @object!

>> object! = kind of item1
== #[true]

>> type of item1
== make object! [author: _ title: _]

>> (type of item1) = (type of item2)
== #[true]

Hence it may be that these wordlike things are only suitable for the kinds. But this is very speculative. We're talking about features and concepts that were never designed (!), yet people would probably like to see fleshed out somehow.
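
To make the kind/type split concrete, here is a small Python model of it. It is only a sketch under stated assumptions: Obj, make, kind_of and type_of are hypothetical names, and "the type of an object is the prototype it was made from" is one possible reading of the transcript above, not a settled design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    """Toy stand-in for an OBJECT!: fields plus a link to its prototype."""
    fields: dict
    proto: Optional["Obj"] = None


def make(proto, **overrides):
    """Rough analogue of `make book! [...]`: copy the fields, then override."""
    return Obj({**proto.fields, **overrides}, proto)


def kind_of(value):
    """Mechanical answer about the cell: every Obj is simply an 'object'."""
    return "object" if isinstance(value, Obj) else type(value).__name__


def type_of(value):
    """Richer answer (an assumption of this sketch): the prototype, if any."""
    if isinstance(value, Obj) and value.proto is not None:
        return value.proto
    return kind_of(value)


book = Obj({"author": None, "title": None})
item1 = make(book, author="Aldous Huxley", title="Brave New World")
item2 = make(book, author="George Orwell", title="1984")

print(kind_of(item1))                       # object
print(type_of(item1) is book)               # True
print(type_of(item1) is type_of(item2))     # True
```

The kind can be answered with a simple name, which is where a wordlike value fits. The type carries descriptive properties of its own, which a bare name cannot hold without a lookup.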

While demoing some of the quoting in the conference, it was clear that dealing with that is going to be a bit of a puzzle for some minds--and so we should do everything we can to make it more obvious.

But... using quoted types to reflect quoting on values has some counter-intuitive properties in evaluative contexts, e.g. with the now evaluative SWITCH...

 switch (type of first ['''foo]) [
     ''''@word [
         print "...needing four quotes here strikes me as non-obvious."
     ]
     the '''@word [
         print "...and needing THE undermines datatype as inert @WORD!"
     ]
 ]

This led me to a concept I outlined in chat that we make use of the somewhat superfluous-seeming member of the new inert @ family, the @[...] blocks...

 >> type of first ['''foo]
 == @['''word]

If we use this particular part in the box as a datatype! replacement, we get the opportunity for richer structure, e.g. @[matrix 20x20]. And if we let the quoting level be carried on the first element of the type, we also get a fair amount of leverage from that.

Of course we can still keep our shorthands, perhaps adding more:

 integer!: @[integer]
 integerQ!: @['integer]

And it frees up @(...), @word, @some/path for other applications in dialects, because only the @[relatively unloved "@-block"] would be taken (and only in cases where that dialect had reason to be dealing in datatypes).


I'm still liking this direction. There are a few implications to notice.

One is that TYPE OF is coming back with something more granular than just QUOTED! for things that have quote levels. This preserves the historical (and desirable) behavior. Consider that we likely want:

rebol2>> (type? first ['x]) = (type? first [x])
== false

rebol2>> (type? first ['x]) = (type? first ['x/y])
== false

If the type of 'x were simply @quoted, then although the first comparison would still be false, the second would be true. This suggests a different operation that would say that 'x and 'x/y were both instances of QUOTED!.

But TYPE OF returning an ANY-ARRAY! has the properties we're looking for. It's a type distinct from BLOCK! that is nonetheless irreducible, and it supports both comparison and symbolic manipulation via QUOTED and UNQUOTED.

 >> unquoted @['integer]
 == @[integer]

 >> (quoted integer!) = type of first ['10]
 == #[true]
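
The behavior shown in these transcripts can be modeled in a few lines of Python. This is a sketch, not Ren-C: TypeBlock, Quoted and NAMES are hypothetical names, and storing the quote level as an integer alongside the kind is just one way to represent "the quoting level carried on the first element of the type".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TypeBlock:
    """Models a type such as @['integer]: a kind name plus a quote level."""
    kind: str
    quotes: int = 0

    def __str__(self):
        return "@[" + "'" * self.quotes + self.kind + "]"


@dataclass(frozen=True)
class Quoted:
    """Models a value wrapped in one level of quoting."""
    inner: object


# Hypothetical names for a couple of Python types, for illustration only.
NAMES = {int: "integer", str: "text"}


def quoted(t):
    """Add one quote level to a type."""
    return replace(t, quotes=t.quotes + 1)


def unquoted(t):
    """Remove one quote level from a type."""
    if t.quotes == 0:
        raise ValueError("type has no quote level to remove")
    return replace(t, quotes=t.quotes - 1)


def type_of(value):
    """The type of a quoted value is the quoted type of what it wraps."""
    if isinstance(value, Quoted):
        return quoted(type_of(value.inner))
    return TypeBlock(NAMES.get(type(value), type(value).__name__))


integer = TypeBlock("integer")

print(unquoted(TypeBlock("integer", 1)))            # @[integer]
print(quoted(integer) == type_of(Quoted(10)))       # True
print(type_of(Quoted(Quoted(Quoted("foo")))))       # @['''text]
print(type_of(Quoted("x")) == type_of(Quoted(10)))  # False
```

Because the quote level lives inside the type, two quoted values of different kinds still compare as different types, which matches the Rebol2 comparisons above.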

Yet the fact that sometimes we might want to know something is QUOTED! and not care about the specifics raises a bigger point: that there are various depths of patterns to match in "types".

This already existed before arbitrary quoting: e.g. ["a" 10] and [<b> 20 #foo] are both BLOCK!s, but if you were expecting a two element block with a TEXT! and an INTEGER! then they are effectively not matching types. It's the precedent of LIT-WORD! and LIT-PATH! that forced having to have a solution for quoting. But people have wanted something for this as well.

As time has gone on, the applications of @word @pa/th @[bl o ck] and @(gr o up) have become wider. The "useless" @[bl o ck] datatype is no longer useless, and has many non-datatype applications.

This isn't to say we can't still use @[...] for datatypes.

The issue is that a key requirement for this datatype is that, dialect-wise, we have to be willing to sacrifice it for the meaning of "match this datatype". That's easiest to do if it has its own notation...but the whole question of this thread is whether that's avoidable, especially given the need to encode quotedness into the type.

Of course it's still a deep thought...which is why we're still using #[datatype]...but the thoughts continue.


To have a clean, non-overloaded way to describe datatypes, they would need a dedicated symbol.
Actually I've never used "!" in words myself, so I wouldn't have a problem with enforcing that only types can have "!" in their name.
Just mentioning it.

I think this turned out to be inevitable. So what I'm working with right now is ampersand, as &[xxx].

At the moment it's not a full family of ampersand-things (e.g. TYPE-XXX!). It's just a rendering of datatype to look like a block! class. But the idea would be to inch toward the whole family.

It's technically possible to remove exclamation points from legal ordinary word characters, and introduce it as a terminal sigil...like a SET-WORD!:

type-word!

That could be generalized to the other types that carry sigils:

[type block]!
(type group)!
type.tuple!
type/path!

But there seem to be a lot of reasons to be skeptical of this:

  • Having types be indirectly referenced through WORD!s--as they were historically--allows for convenient aliasing. In the Rebol2/Red emulation, we can simply say paren!: group! and it works.

  • If you're going to make something like ANY-WORD! refer to a typeset, it doesn't have any obvious separation from the way you refer to a single type. Other things like type constraints (like EVEN!) also don't have the indirection to their underlying mechanic.

  • I've tended to favor creative uses of ! in terms or dialects...like !!WARNING!!. And I like the idea of !! meaning breakpoint, or things like that.

    • Though in the case of ! and !! they could be exceptions as WORD!s, similar to how there's an exception for / not being a PATH!.

In any case, I think & is the better move.
