Happy 2023, and 2022 General Status & Reflection

hostilefork · October 16, 2022, 4:56am

Happy New Year!

Note: This was originally an October 2022 status report, but I didn't do anything in November or December, so I'm really just going to add a bit to it...

Having hit up against some pretty deep questions about redesigning datatypes in October, I've been taking a break. It's good to clear one's head sometimes. I packed up and moved, and there was a lot to do for that.

I've been poking around some other languages and tools... tinkering with things like the Rust compiler sources... looking at Stable Diffusion and what sorts of trends are on the rise in open-source AI. Reading a lot of articles and watching YouTube videos.

Hopefully some of the datatype-related ideas will settle in my head and I'll be able to start making progress on those. But I thought I'd review some of what's gone on over the last few months.

It's certainly worth pointing out that the system is using "stackless" processing... a significant change that, from a practical standpoint, has let us get rid of costly workarounds needed for browser interop in the web build.
Changes were made to permit executing in WASI runtimes, which enables a new cross-platform target as well as use in "serverless" cloud computing scenarios.
Integrating a rich editor and tabbed interface into the web REPL is a big cool thing.

I could go on about every little thing (like updating the crypto, or that web tests now run on cypress.io) but there are posts and git commits for that.

Nearly every other major change worth discussing involves big changes surrounding isotopes.

Generalized Isotopes Have Changed (Almost) Everything

From the beginning, Ren-C's goal has been to attack the fundamental weaknesses in how Rebol works as a language.

Integral to attacks on several major problems has been the idea of "generalized isotopes". If anyone hasn't had a chance to read the thread describing the rationale, I suggest doing so now (or re-reading it):

"A Justification of Generalized Isotopes"

This development kind of touches everything in the system:

It strikes at the heart of the /ONLY issue. When you see it used with something like REPLACE, you can really have that "aha" moment... splicing intent should never have been carried by a refinement:

>> replace/all [[a b] a b a b] [a b] [c d e]
== [[c d e] a b a b] 

>> replace/all [[a b] a b a b] spread [a b] [c d e]
== [[a b] [c d e] [c d e]]

>> replace/all [[a b] a b a b] [a b] spread [c d e]
== [c d e a b a b]

>> replace/all [[a b] a b a b] spread [a b] spread [c d e]
== [[a b] c d e c d e]
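
For readers who don't have a Ren-C build handy, the four cases above can be modeled in a few lines of Python. This is only a sketch of the idea that splicing intent travels with the value rather than with a refinement. The `Splice` class and the `spread`, `as_items`, and `replace_all` helpers are hypothetical names made up for this illustration, and Python strings stand in for words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Splice:
    """Hypothetical stand-in for what SPREAD produces: items to be spliced."""
    items: tuple


def spread(block):
    # Mark a block's contents as "splice these", instead of "use the block as-is".
    return Splice(tuple(block))


def as_items(value):
    # A Splice contributes its items; any other value counts as one item, as-is.
    return list(value.items) if isinstance(value, Splice) else [value]


def replace_all(series, pattern, replacement):
    # Replace every occurrence of the pattern items with the replacement items.
    pat = as_items(pattern)
    rep = as_items(replacement)
    out, i = [], 0
    while i < len(series):
        if pat and series[i:i + len(pat)] == pat:
            out.extend(rep)
            i += len(pat)
        else:
            out.append(series[i])
            i += 1
    return out


data = [["a", "b"], "a", "b", "a", "b"]

replace_all(data, ["a", "b"], ["c", "d", "e"])
# [['c', 'd', 'e'], 'a', 'b', 'a', 'b']
replace_all(data, spread(["a", "b"]), spread(["c", "d", "e"]))
# [['a', 'b'], 'c', 'd', 'e', 'c', 'd', 'e']
```

The caller decides "as-is" versus "splice" separately for the pattern and for the replacement, which is why no single refinement could express all four combinations.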

"Definitional Errors" have arisen as a crucial combination of solutions: one of the earliest Ren-C mechanics (definitional returns) mixes with a new idea (error isotopes), to give an actual viable answer for error handling in the system:

>> attempt [print "Attempting to read file" read %nonexistent-file.txt]
Attempting to read file
; null

>> attempt [print "Attempting but made typos" rread %nonexistent-file.txt]
Attempting but made typos
** Script Error: rread word is attached to a context, but unassigned
** Near: [rread ** %nonexistent-file.txt]

Moving away from a dedicated "LOGIC!" datatype to ~true~ and ~false~ isotopes solves a "top-of-the-page" representational issue... giving an in-between state that can be held in a variable, but must undergo some reification process before being carried in an array:
```
>> 1 = 2
== ~false~  ; isotope

>> append [a b c] 1 = 2
** Error: Arrays cannot hold ~false~ isotopes, must REIFY or META

>> append [a b c] meta 1 = 2
== [a b c ~false~]

>> append [a b c] reify 1 = 2
== [a b c false]
```
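
The "in-between state" can be approximated in Python as a value that variables may hold but arrays refuse. This is a loose model and not how Ren-C implements it: the `Isotope` and `Quasi` classes and the `equal`, `append`, `meta`, and `reify` functions are hypothetical names for this sketch, and non-isotopes simply pass through `meta` and `reify` unchanged, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isotope:
    """Model of an isotope like ~false~: fine in a variable, illegal in an array."""
    name: str


@dataclass(frozen=True)
class Quasi:
    """Model of the non-isotopic ~false~ form; arrays are allowed to hold it."""
    name: str


def equal(a, b):
    # Comparison yields an isotope, not an ordinary value.
    return Isotope("true" if a == b else "false")


def append(array, value):
    # Arrays reject isotopes outright, mirroring the error in the transcript.
    if isinstance(value, Isotope):
        raise TypeError(
            "Arrays cannot hold ~%s~ isotopes, must REIFY or META" % value.name
        )
    array.append(value)
    return array


def meta(value):
    # Step an isotope down to its quasi form (simplified: others pass through).
    return Quasi(value.name) if isinstance(value, Isotope) else value


def reify(value):
    # Turn an isotope into a plain word, modeled here as a string.
    return value.name if isinstance(value, Isotope) else value


result = equal(1, 2)                   # held in a variable without complaint
append(["a", "b", "c"], reify(result))
# ['a', 'b', 'c', 'false']
```

The point of the model is that the caller has to choose a representation explicitly before the value can live in an array.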
Treatment of BLOCK! isotopes as "packs" gives answers to how to pipe and transform multi-return expressions... and has also been leveraged as a way to tell the difference between "true null" (e.g. should trigger an else) and "packaged null" (a null packed into a block that is "a positive result that just happens to be null").
It has reshaped "voidness" as something which only vanishes in interstitial slots, and can be manipulated conveniently in its isotopic form. Being able to pass voids as arguments enables fully functional compositions, like the infamous FOR-BOTH case of writing a loop wrapper that preserves the loop behavior invariants.
Isotopes also cover things like telling the difference between passing an ACTION! you want to run and an ACTION! you want to look for literally (by its identity).

In some sense, this "discovery" has pushed a big reset button on the project... because it affects so many things. Nearly everything has to be revisited. But the biggest issue is with the type system.

Isotopes Further Stress Weaknesses Of DATATYPE! / TYPESET!

Rebol and all of its clones deal with datatypes in the exact same simplistic way: the system is limited to 64 datatypes, with a TYPESET! being a 64-bit number with one bit for each type.
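
To see where the ceiling comes from, here is a small Python model of the one-bit-per-type scheme. It is a sketch under assumptions, not the actual C implementation: `TypeRegistry` and its methods are hypothetical names, and the order in which types receive bits is arbitrary.

```python
MAX_TYPES = 64  # one bit per datatype in a 64-bit word


class TypeRegistry:
    """Hypothetical registry handing out one bit per datatype name."""

    def __init__(self):
        self.bit_of = {}

    def register(self, name):
        # Each new datatype claims the next free bit; after 64 there are none.
        if name in self.bit_of:
            return self.bit_of[name]
        if len(self.bit_of) >= MAX_TYPES:
            raise OverflowError("no bits left for " + name)
        bit = 1 << len(self.bit_of)
        self.bit_of[name] = bit
        return bit

    def typeset(self, *names):
        # A typeset is just the OR of the member types' bits.
        bits = 0
        for name in names:
            bits |= self.bit_of[name]
        return bits

    def accepts(self, typeset, name):
        # Membership is a single AND, which is what makes the scheme so cheap.
        return bool(typeset & self.bit_of[name])


reg = TypeRegistry()
for name in ["integer!", "block!", "word!"]:
    reg.register(name)

ts = reg.typeset("integer!", "block!")
reg.accepts(ts, "integer!")   # True
reg.accepts(ts, "word!")      # False
```

Membership tests cost a single AND, but the 65th datatype has nowhere to go, and a bit can only say "this kind", never anything finer such as dimensions.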

Perhaps there are those out there who think this is fine. They may see it as like the engineer and the toaster story... where making a toaster with very few settings is virtuous, and it's unnecessary complexity to have any more complicated type system.

I've laid out some of my beliefs...such as that "kinds" of values (in a coarse sense) are at the very least based on WORD!, so you can add new things like vector and matrix and image...none of which I believe should be mandatory to build into the core...so lighter builds can be made.

Beyond that I have suggested that getting a full "type" description could be a more descriptive structure, telling you not only that something is a matrix but also giving you its dimensions.

The more these kinds of thoughts seem true, the more TYPESET! seems like a bogus kind of thing... and type checking functions might make more sense. Why not something like:

foo: func [bar [integer! series? even?]] [...]
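
As a rough illustration of what checking by function could look like, here is a Python model. The post does not say how the listed checks would combine, so this sketch assumes they are alternatives (a value passes if any one accepts it), the way typeset membership works. The predicates and `BAR_CHECKS` are hypothetical Python stand-ins, not Ren-C definitions.

```python
def is_integer(v):
    # Python's bool is a subclass of int, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def is_series(v):
    # Lists and strings stand in for series types in this model.
    return isinstance(v, (list, str))


def is_even(v):
    # Accepts any real number evenly divisible by 2, including 4.0.
    return (
        isinstance(v, (int, float))
        and not isinstance(v, bool)
        and v % 2 == 0
    )


# Stand-in for the parameter spec [integer! series? even?]
BAR_CHECKS = [is_integer, is_series, is_even]


def typecheck(value, checks):
    # Assumption: checks are alternatives, so any single match is enough.
    return any(check(value) for check in checks)


def foo(bar):
    if not typecheck(bar, BAR_CHECKS):
        raise TypeError("bar failed every check: %r" % (bar,))
    return bar


foo(3)        # passes via is_integer
foo("abc")    # passes via is_series
foo(4.0)      # passes via is_even
# foo(3.5) would raise TypeError
```

Unlike a bitmask, each check runs arbitrary code, which is where the performance and after-the-fact validity questions come from.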

Performance aside, trading typesets for type checking functions has some dangerous implications... for instance, what if a function gets specialized with a value that passes, but then either the function or something about the value changes so that it would no longer pass? As it happens, today's specialized values are written in the place where type information for a parameter would usually be, so you can't typecheck after the fact.

With so many aspects of isotopes working out, it's unfortunate that the terrible type system is getting in the way. A dissatisfying answer would be just to say that all isotopes report that they are ISOTOPE! and you have to perform operations on them to find out the actual type... but this is what you historically have to do with items in BLOCK! (you can't typecheck for "block that contains one integer!" vs. "block that contains two strings").

Anyway: this is a longstanding problem area, and isotopes are now making it more imperative to solve.

Still...There Are A Lot Of New And Interesting Things

The biggest takeaway here is that isotopes have added new dimensions that are letting Ren-C bring really advanced capabilities within reach of very novice users.

There are many examples, but some of the most amazing to me are things like hooking the default UPARSE combinators--which have multiple return values--by being able to turn those return values into blocks and transform them.

I'm still "finding the cool" in new behaviors, and that's motivating. But maybe with the basics sorted out, it's getting closer to time to start ripping out some unnecessary parts, so that a committed subset can be pushed out to people on the web.

iArnold · October 16, 2022, 10:08am

When we think about this, it is the programmer who decides that the things an end-user selects get put inside a block for processing, so it is also up to the programmer to check the validity of the items inside a block! (type! and value).
That is thus okay.
The problem is that even if the rest of the software development world thinks that fewer than 64 datatypes is sufficient, we the Redbol world occupants really want to expand beyond that limit. What you describe is the (non)solution of having ISOTOPE! call out "hey I am different, but you need to find out for yourself how different exactly". Which is surely not what we really want.
Hence we need a way to expand beyond our 64 datatypes limit.
When there is no room anymore, which is the case, I propose to follow some kind of ASCII to UTF extension trick. This will limit the number of primary datatypes (with a 0 in the first position) even more, but other types would then make use of a follow-up byte to mark the definitive datatype.
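
One possible reading of that proposal, sketched in Python: type ids whose first bit is 0 fit in a single byte, and a leading 1 bit signals that a follow-up byte completes the id. The byte widths, the limits, and the `encode_type` and `decode_type` names are all assumptions made for illustration. The sketch also says nothing about how a TYPESET! would then represent sets of such ids.

```python
PRIMARY_LIMIT = 128          # ids 0..127: high bit 0, one byte
EXTENDED_LIMIT = 128 * 256   # 7 bits in the first byte + 8 bits in the second


def encode_type(type_id):
    # Primary types stay one byte; the rest set the high bit and add a byte.
    if not 0 <= type_id < PRIMARY_LIMIT + EXTENDED_LIMIT:
        raise ValueError("type id out of range: %d" % type_id)
    if type_id < PRIMARY_LIMIT:
        return bytes([type_id])
    ext = type_id - PRIMARY_LIMIT
    return bytes([0x80 | (ext >> 8), ext & 0xFF])


def decode_type(data):
    # Returns (type_id, number_of_bytes_consumed).
    first = data[0]
    if first & 0x80 == 0:
        return first, 1
    ext = ((first & 0x7F) << 8) | data[1]
    return PRIMARY_LIMIT + ext, 2


encode_type(5)      # b'\x05'      (primary, one byte)
encode_type(128)    # b'\x80\x00'  (first extended type, two bytes)
```

As with UTF-8, the common cases stay compact while the id space grows, at the cost of variable-width decoding.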

Take care, hope you have found a great place to stay!

hostilefork · January 5, 2023, 7:43pm

I mention up top that I have been busy for all of November and December, so I didn't do anything worth adding to the October status post in terms of code progress.

But putting some space between yourself and the code can offer other insights.

Definitely this: I want to put the system into people's hands.

We've reached a point where for once, we know how APPEND works. That is no small feat... and the solution was a decade in the making.

I believe we now know how true and false work, with ~true~ and ~false~ being their meta state that evaluates to an isotopic state which can't be put into blocks.

There are so many neat solutions to old questions... like what is any [] or all [], and having coherent answers to basic questions is something to stand on the shoulders of for the next tough question. It all interlocks.

I'm hard-pressed to say exactly what's going to happen with binding, but I'm wondering if there's any way to make a system that's useful even if we don't know how that will work.

But I'm going to keep going at it, and start posting again here now that I'm getting more settled. Again: Happy New Year!