"Extension Types" Implementation (On Hold)

Where can I find the discussion about the new implementation of user-defined datatypes?

The new thing that has come on the scene isn't what I'd really call "user defined datatypes" as much as "extension defined datatypes". It's for C programmers to implement types like IMAGE! or GOB! with a DLL or statically linked module...without those being built in a-priori.

The feature's goal was to get past a historical property that limited Rebol to 64 built-in datatypes, which had to be named in the core interpreter and could not be changed or extended. Ren-C wanted to be much more modular...to avoid carrying the weight of things like GOB! to the JavaScript build (or a redundant IMAGE! datatype that was handled by a browser's canvas.) Then the web build could choose its own extension types, perhaps some kind of JAVASCRIPT-OBJECT! proxy or a CANVAS!, etc.

This was needed while breaking the project up into independently selectable extensions--of which there are now 31. See the README.md for a few notes:

https://github.com/metaeducation/ren-c/tree/master/extensions

(At this time, the web build uses only JavaScript, Console, and Debugger.)

Implementation Details

A "value cell" in Rebol and Red is four platform pointers in size. Of these, the first platform pointer slot is used as bits for a "header". How the other three pointers are interpreted depends on a byte in that header...which was called the VAL_TYPE() in R3-Alpha (though Ren-C calls this the "cell kind").

Of this byte, only 64 of the states are used in R3-Alpha--and I believe Red. This was chosen instead of 256 in order to limit the number of kinds that need to be handled in a TYPESET! to 64 bits...making typesets small enough to fit in the rest of the cell. (That simplicity is nice for implementation, but points to a pretty big weak spot in the type system for more sophisticated purposes...which new designs will be needed for!!)
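
To make the layout above concrete, here is a minimal C sketch of a four-slot cell whose kind byte selects the interpretation of the payload, plus a typeset stored as 64 bits inside that payload. Every name in it (Cell, cell_kind, typeset_add, the KIND_ constants) is made up for illustration; it is not the actual R3-Alpha, Red, or Ren-C definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: an illustration, not R3-Alpha's real code. */
enum { KIND_INTEGER = 1, KIND_BLOCK = 2, KIND_TYPESET = 3, KIND_LIMIT = 64 };

typedef struct {
    uintptr_t header;    /* first slot: header bits, low byte is the cell kind */
    uintptr_t slots[3];  /* other three slots: meaning depends on the kind */
} Cell;

unsigned cell_kind(const Cell *c) {
    return (unsigned)(c->header & 0xFF);
}

/* A typeset needs one bit per kind. With 64 kinds that is 64 bits, which
   fits in the payload slots even when pointers are only 32 bits wide. */
uint64_t typeset_bits(const Cell *ts) {
    uint64_t bits;
    memcpy(&bits, ts->slots, sizeof bits);
    return bits;
}

void init_typeset(Cell *ts, uint64_t bits) {
    ts->header = KIND_TYPESET;
    memset(ts->slots, 0, sizeof ts->slots);
    memcpy(ts->slots, &bits, sizeof bits);
}

void typeset_add(Cell *ts, unsigned kind) {
    assert(kind < KIND_LIMIT);
    init_typeset(ts, typeset_bits(ts) | ((uint64_t)1 << kind));
}

bool typeset_has(const Cell *ts, unsigned kind) {
    assert(kind < KIND_LIMIT);
    return ((typeset_bits(ts) >> kind) & 1) != 0;
}
```

With 256 kinds the same bitset would need 256 bits, which no longer fits in three pointer-sized slots; that is the trade-off the paragraph above describes.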

Ren-C's "extension types" is a primordial implementation of a strategy to reserve one cell kind to mean "this cell gives up one of its three non-header platform-pointer-units to be a pointer to its type". That allows an arbitrary number of these to be added. They can't pack quite as much data into their cells as the built-in types, since they only have two pointers instead of three to work with. But given that you can always point to some allocated data (and usually need to), it's not a big problem.
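
As a rough sketch of that strategy in C: one reserved kind marks a cell as "custom", and one of its three payload slots becomes a pointer identifying the type. The names here (KIND_CUSTOM, CustomType, init_custom, is_custom_of) are assumptions for illustration only, not Ren-C's actual identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: a sketch of the idea, not Ren-C's actual definitions. */
enum { KIND_INTEGER = 1, KIND_CUSTOM = 63 };  /* one kind shared by all extension types */

/* Lives in the extension (DLL or statically linked module). */
typedef struct { const char *name; } CustomType;

typedef struct {
    uintptr_t header;                /* low byte: cell kind */
    union {
        uintptr_t builtin[3];        /* built-in types keep all three slots */
        struct {
            const CustomType *type;  /* slot given up to identify the type */
            uintptr_t data[2];       /* only two slots remain for payload */
        } custom;
    } payload;
} Cell;

/* Typical use: point at allocated data, since two slots rarely hold it all. */
void init_custom(Cell *c, const CustomType *type, void *allocation) {
    c->header = KIND_CUSTOM;
    c->payload.custom.type = type;
    c->payload.custom.data[0] = (uintptr_t)allocation;
    c->payload.custom.data[1] = 0;
}

/* The kind byte alone cannot tell extension types apart; the pointer can. */
bool is_custom_of(const Cell *c, const CustomType *type) {
    return (c->header & 0xFF) == KIND_CUSTOM
        && c->payload.custom.type == type;
}
```

Because identity is a pointer comparison rather than a number in a fixed range, any number of extension types can coexist without touching the core's kind enumeration.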

What the type pointer points at is an array of implementation functions for molding, generic dispatch ("action!"), comparison, etc. This array is made to be very similar in layout to the table for the built-in types like BLOCK!. So the performance hit doesn't affect the built-in types; the custom type has a built-in type dispatcher in the master table that does a little translation jump to use the table referenced in the cell instead.
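
The "translation jump" can be sketched in C roughly as follows. This shows only a mold hook, and all names (TypeHooks, master_table, mold_custom, image_hooks) are hypothetical stand-ins rather than Ren-C's real tables.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names: an illustration of the dispatch shape only. */
typedef struct Cell Cell;
typedef const char *(*MoldFn)(const Cell *);

/* Same layout for built-in and extension types (a real table would also
   carry generic dispatch, comparison, etc.). */
typedef struct { MoldFn mold; } TypeHooks;

enum { KIND_INTEGER = 1, KIND_CUSTOM = 2, KIND_COUNT = 3 };

struct Cell {
    uint8_t kind;
    const TypeHooks *custom_hooks;  /* meaningful only when kind == KIND_CUSTOM */
};

const char *mold_integer(const Cell *c) { (void)c; return "integer"; }

/* The built-in entry for the custom kind forwards to the cell's own table. */
const char *mold_custom(const Cell *c) { return c->custom_hooks->mold(c); }

const TypeHooks master_table[KIND_COUNT] = {
    [KIND_INTEGER] = { mold_integer },
    [KIND_CUSTOM]  = { mold_custom },
};

/* Built-in kinds pay one table lookup; only custom cells pay a second hop. */
const char *mold(const Cell *c) { return master_table[c->kind].mold(c); }

/* What an extension would provide. */
const char *mold_image(const Cell *c) { (void)c; return "image"; }
const TypeHooks image_hooks = { mold_image };
```

The point of the design is that the lookup path for built-in kinds is unchanged by the existence of extension types.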

Open Questions

Mechanically there are still several things to be addressed. One is that since there's only one cell kind for these custom datatypes--and all typesets are still 64 bits--if you say in your function's type spec that you take an IMAGE! it will also take a VECTOR!, and a STRUCT!... all extension types look like the same type to a TYPESET!. (I'm actually a bit skeptical of TYPESET! as a datatype and thinking what it does should come from maybe a generic interface that could be a function...so your type checks are actually generalizable, to where you could have a function that takes ANY-EVEN! number for instance...)
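
A small C sketch of both halves of that paragraph: a 64-bit typeset check that cannot tell one extension type from another, and a function-based check that can (including an ANY-EVEN!-style test). The Value struct and every function name are invented for illustration; nothing here is an existing Ren-C interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: a sketch of the problem and one possible direction. */
enum { KIND_INTEGER = 1, KIND_CUSTOM = 2 };

typedef struct { const char *name; } CustomType;

typedef struct {
    unsigned kind;
    const CustomType *type;  /* set only for KIND_CUSTOM */
    int64_t integer;         /* set only for KIND_INTEGER */
} Value;

/* Bitset check: sees only the kind, so every extension type looks alike. */
bool typeset_accepts(uint64_t typeset, const Value *v) {
    return ((typeset >> v->kind) & 1) != 0;
}

/* Function-based check: any predicate over the value can act as a "type". */
typedef bool (*TypeCheck)(const Value *v, const void *context);

bool check_custom_type(const Value *v, const void *context) {
    return v->kind == KIND_CUSTOM && v->type == (const CustomType *)context;
}

bool check_any_even(const Value *v, const void *context) {
    (void)context;
    return v->kind == KIND_INTEGER && v->integer % 2 == 0;
}
```

A parameter declared as taking IMAGE! via the bitset would accept a VECTOR! value here, while check_custom_type with the image type as context would reject it.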

Also, exactly how datatypes will participate in a naming ecology is not known. Right now the theory is that they register via a URL!. That is to say that type of foo could come back as something including http://example.com/types/matrix. While that's a bit drawn out, one idea that came up in error IDs was that there might be a form of comparison function that lets you get as specific as you want about that... e.g.

>> /matrix ~ http://example.com/types/matrix
== #[true]

>> /types/matrix ~ http://example.com/types/matrix
== #[true]
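
One way such a comparison could behave is a suffix match on path segments. The C sketch below is only a guess at the semantics suggested by the two examples; path_matches_url is a hypothetical helper, not an existing function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when `path` (which must start with '/') is a
   trailing run of whole segments of `url`. Because the path begins with
   '/', a plain suffix comparison already lines up on a segment boundary. */
bool path_matches_url(const char *path, const char *url) {
    size_t path_len = strlen(path);
    size_t url_len = strlen(url);
    if (path_len == 0 || path[0] != '/' || path_len > url_len)
        return false;
    return strcmp(url + (url_len - path_len), path) == 0;
}
```

Under this reading, /matrix and /types/matrix both match the URL, while a partial segment such as /trix does not.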

I'll also mention the idea that type of would give back the new @[block] syntax. That would give datatypes a distinct literal form, the lack of which has always been a problem. It may also be that kind of and type of are distinct, with the former giving back Rebol-structured word-based data and the latter giving back some OBJECT!-like thing called a TYPE! which can answer meta questions about the type.

>> kind of 10
== @[integer]

>> type of 10
== make type! [
     name: "Integer"
     description: "A whole number, either positive or negative"
     minimum: ...
     maximum: ...
     (etc.)
]

So plenty still to worry about. But the first-tier goal has been achieved: being able to build variants of Rebol without GOB! or IMAGE! or VECTOR! or STRUCT! (or any mention of them in the built-in type table), while still keeping all those features working.

I have not logged in since last year, though I have been reading about the progress here without logging in. That is why I had not seen your reply. I'll read it this week, thanks.

Note that extension types were removed--at least temporarily--because there were more fundamental things in the type system which needed to be resolved.

However, a similar approach is likely to be re-added once things settle down: one that likewise reserves one of the 4 platform-pointer-sized slots for type information.

Though the "unlimited" number of types from extension types is on hold, the basic limit of 64 fundamental types has been lifted to 256.

It might sound like a small thing to implement, but rigging it up to work and not perform terribly involved some thinking.

You can see some of these new types and SIGIL! here:

REIFY and DEGRADE: a Narrower META and UNMETA

You can do as you wish with them in dialects, but the evaluator meaning is specific...and I definitely do want to put together a presentation explaining why they are all necessary and what they do.

Last year I didn't get all that much done (more has happened in the past two months than all of last year). But there were a few things:

2023, Another Year 💨 A Few Things That Happened
