2023, Another Year 💨 A Few Things That Happened

Happy New Year!

This was a (relatively) slow year for development on Ren-C. There were a lot of life reasons. But one technical reason is that there was a lot of drudge work to go through all the various ramifications that generalized isotopes had brought into the system. There were hundreds of broken tests, each representing a question about the design of some feature.

Unlike forging ahead with the revelation of isotopes themselves, cleaning up the carnage wasn't glamorous or fun. And of course, the world doesn't stand still--Emscripten makes changes, Cypress tests make changes, GitHub Actions makes changes. Whenever they do, you have to move along with them or things won't work. Syncing that up (just to get back to where you were before they changed it) is no fun either.

Looking through the commit log, I'll spare you the boring details and try to point out a few interesting things that distinguish what's known now from what was known in 2022.

Neatest Demo: Visual UPARSE Debugger

This shows the potential of where things are going, and hints at some of the overall strategy to aim Ren-C at interactive debugging.

Biggest Single Item: Actions as Isotopic FRAME!s

Shortly after conceiving of generalized isotopes, it occurred to me that there could be an isotopic form of functions that would run from a WORD!-dispatch... while a plain form would not.

This was the last piece of the puzzle to accomplish the invariant:

block1: ...  ; any possible block!
block2: copy []

for-each item block1 [  ; items in block guaranteed not to be isotopic
   append block2 item  ; ITEM can't be isotope, so won't invoke an action
]

assert [block1 = block2]  ; b.c. ITEM couldn't be splice, unset variable, etc.

But having two forms of functions was a conceptual quirk. This rippled into a naming quirk of what to call these two distinct things. "ACTION" vs. "ACTIVATION" was a first draft, with questions about whether "FUNCTION" vs. "ACTION" might be better (dredging up the old problem of the dual use of FUNCTION as a verb to create functions vs. the noun of the type).

To ultimately resolve the naming issue--as well as pare down the number of moving parts--I had an idea. If you made a FRAME! isotopic, then the behavior when referenced through a WORD! would be to make a copy of itself and invoke it. Then we just refer to the isotopic form as an "action".

There are implementation challenges with this that I'm still chipping at. Actions had evolved to be the "same size" as frames, but slots where frames put variables were used to hold parameter definition information for actions (quoted vs. normal, accepted types, etc.) Merging the two is raising some complex and interesting issues, but not the kinds of things that resolve overnight.

Other Naming Finalizations (Hopefully?)

What "VOID" means has been changed so many times that if someone saw that with no context... I'd be empathetic to them thinking Ren-C is nuts. But naming was tied up with fighting over the unique status of "what can't be put into blocks" in the pre-isotopic era... until it was realized there were a lot of these things.

(For newcomers like @bradrn, please see for instance Shades of Distinction in Non-Valued Intents)

I've tried to retcon the forum posts to be correct (sorry if my constant edits make posts show up as "latest" in the feed, that's just how this software works). But the names feel stable to me now. The last piece of the puzzle was being willing to call isotopic voids "TRASH", which seems to be AI approved.

Typechecking Via Predicates (sped up with Intrinsics)

Isotopes raised a big question about type checking, in terms of "What is the TYPE OF an isotope", and how do you say that a function accepts or doesn't accept them as arguments?

I couldn't come up with any bright ideas besides saying that all isotopes are of the same fundamental ISOTOPE! type, and you test for distinctions via predicate functions.

What's neat about this is that you could start writing things in your type constraints like foo: func [x [even?]]. Yet this meant type checking would slow to a crawl.
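To make the cost concrete, here is a minimal C++ sketch of predicate-based checking. It is not Ren-C's actual implementation: `Value`, `ParamSpec`, `typecheck` and the predicate functions are hypothetical names made up for illustration. The point is only that every constraint becomes a function call made on each argument, which is where the slowdown comes from.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified value cell (not the real Ren-C layout).
struct Value {
    long n;
    bool is_integer;
};

// A type constraint is just a function answering yes or no for a value.
typedef bool (*Predicate)(const Value&);

bool is_integer(const Value& v) { return v.is_integer; }
bool is_even(const Value& v) { return v.is_integer && v.n % 2 == 0; }

// Stand-in for a parameter's spec block, e.g. the [even?] in x [even?].
struct ParamSpec {
    std::vector<Predicate> accepts;
};

// An argument passes if any one predicate accepts it. Each test is a
// function call, so checks run on every invocation add up quickly.
bool typecheck(const ParamSpec& spec, const Value& arg) {
    for (Predicate p : spec.accepts) {
        assert(p != nullptr);
        if (p(arg))
            return true;
    }
    return false;
}
```

With a spec holding only `is_even`, an integer 4 passes and an integer 3 is rejected, mirroring what `x [even?]` expresses.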

Optimizations for this are still evolving. But one big one was the introduction of intrinsics. These are natives which the evaluator can subvert building a FRAME! for, calculating one argument and asking for one result from a C function. Lots of functions fit this model... typecheckers like INTEGER? or simple functions like NOT.
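The shape of that optimization can be sketched in C++ roughly as follows. This is an illustration under assumptions, not the real evaluator: `Cell`, `Frame`, `Action`, `call_with_one_arg` and the `frames_built` counter are all hypothetical names. An action that has an intrinsic entry point gets its single argument handed straight to a C-style function, while everything else pays for building a frame first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified cell and frame types (not the real Ren-C ones).
enum Kind { KIND_INTEGER = 1, KIND_LOGIC = 2 };
struct Cell { int kind; long payload; };
struct Frame { std::vector<Cell> args; };  // argument storage must be built

typedef Cell (*NativeFn)(Frame&);          // general path: needs a frame
typedef Cell (*IntrinsicFn)(const Cell&);  // fast path: one arg, one result

struct Action {
    NativeFn native;
    IntrinsicFn intrinsic;  // null when the action can't run frameless
};

// INTEGER?-style check written once as an intrinsic...
Cell integer_q_intrinsic(const Cell& arg) {
    return Cell{KIND_LOGIC, arg.kind == KIND_INTEGER ? 1 : 0};
}

// ...and reused by the frame-based native entry point.
Cell integer_q_native(Frame& f) { return integer_q_intrinsic(f.args[0]); }

static int frames_built = 0;  // counts how often the slow path was taken

Cell call_with_one_arg(const Action& a, const Cell& arg) {
    if (a.intrinsic != nullptr)
        return a.intrinsic(arg);  // no frame is built at all
    assert(a.native != nullptr);
    Frame f;
    f.args.push_back(arg);
    ++frames_built;
    return a.native(f);
}
```

Calling the same check through an `Action` with and without the intrinsic pointer gives the same answer, but only the second case bumps `frames_built`.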

COMMA! Evaluates to Isotopic Barrier State

I've called COMMA! "the language world's weirdest comma mechanic". But with the rise of empty parameter packs as "vanishing", it seemed that the evaluator behavior of commas could be the same as something like a COMMENT or an ELIDE.

Yet in practice, I realized that fusing barriers with invisible intent was not optimal. Instead, reserving another isotopic state--the isotopic comma--was a better solution.

ODBC, Cryptography, Locale Use External API Only

The long bet in the libRebol API is to use strings as currency, in variadic functions that could splice in Rebol values. Large things have been built on this API, like the Web console. But to further prove that you can really do everything without needing to pick apart cells, this year I pushed several extensions to use this API exclusively...with no direct linkage to the internals.

ODBC contains really interesting examples of this coding stylization, which are worth a look:

/extensions/odbc/mod-odbc.c (line 549)

QUERY Sync'd And In Continuous Integration

The QUERY dialect presented 4 years ago is for letting you query the filesystem or web pages as a kind of replacement for GREP/SED/AWK kind of stuff, e.g. instead of writing:

ls | awk -F . '{print $NF}' | sort | uniq -c | awk '{print $2,$1}'

You could say:

SELECT #fileext count (*) FROM files GROUP BY extension

While @BlackATTR had it working at one point, there were many changes...and some of those changes were inspired by seeing how much worse the query code got when imposing some experimental changes of the time. I've brought it up to date now, and so it should be in the corpus of codebases that are kept working from here on out.

Behind the Scenes: Naming And Code Organization

I've been slowly pushing on the internals to read better. Particularly important is revisiting outdated comments and adjusting misleading things to reflect the current understandings.

Internal naming is moving to where instead of clumsy names like REBVAL and REBBLK we have things like Value and Array for types. All-caps is being reserved for anything that's a macro with "weird" properties that an ordinary function couldn't have, e.g. things that can act as the left side of an assignment. It's slowly getting cleaned up to have a consistent and readable look.

Various places where debug checks were creating invasive syntax have been cleaned up with better use of C++. One example was having to use mutable_Foo(foo) as the left-hand side of an assignment, with Foo(foo) on the right-hand side. This is done with container classes that defer the decision of whether they are the left-hand or right-hand side until the time of usage.
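For anyone who hasn't seen the trick, a minimal C++ sketch of such a deferring container follows. It is a generic proxy-object pattern, not a copy of what is in the Ren-C sources: `Series`, `LenHolder`, `Len` and the `frozen` flag are hypothetical. The accessor returns a small object, and only the way that object is used decides whether a read (conversion operator) or a checked write (assignment operator) happens.

```cpp
#include <cassert>

// Hypothetical structure whose field we want to guard in debug builds.
struct Series {
    int len;
    bool frozen;
};

// Small holder returned by the accessor. It doesn't know yet whether it
// will be read from or written to; the operator that gets used decides.
class LenHolder {
    Series* s_;

  public:
    explicit LenHolder(Series* s) : s_(s) {}

    // Right-hand side usage: plain read.
    operator int() const { return s_->len; }

    // Left-hand side usage: write, with debug checks attached.
    void operator=(int new_len) {
        assert(!s_->frozen && "write to frozen series");
        assert(new_len >= 0);
        s_->len = new_len;
    }

    // Len(a) = Len(b) should copy the length, not rebind the holder.
    void operator=(const LenHolder& other) { *this = static_cast<int>(other); }
};

// One accessor serves both sides of an assignment.
inline LenHolder Len(Series* s) { return LenHolder(s); }
```

With this, `Len(&a) = 7;` and `int n = Len(&a);` both go through the same accessor, so no separate mutable_ variant is needed.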

Something that's been interesting to work on is figuring out how to hook in debug checks for casting operations. I've invented a cast(type, value) macro that in C just acts as (type)(value) but can do more interesting things in C++. It has turned out to be quite the wild mechanic, and you can get a sense of how wild it is by looking at %sys-debug-casts.hpp. In any case, the neat thing is how transparent this is to the C build: the code reads obviously as just doing the cast, but you get extra checks.
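A very reduced sketch of the general idea, which makes no attempt to reproduce what %sys-debug-casts.hpp really does: `CastHelper` and `checked_cast_impl` are hypothetical names, and the round-trip check on integer casts is just one example of a check one could attach. In C the macro is the plain cast; in C++ it routes through a template that can be specialized per type.

```cpp
#ifdef __cplusplus
    #include <cassert>
    #include <type_traits>

    // Default: behave exactly like the C-style cast.
    template <typename To, typename From, typename Enable = void>
    struct CastHelper {
        static To convert(From v) { return (To)(v); }
    };

    // Example check (an assumption, chosen for illustration): when casting
    // between integer types, the result must cast back to the original.
    template <typename To, typename From>
    struct CastHelper<
        To, From,
        typename std::enable_if<
            std::is_integral<To>::value && std::is_integral<From>::value
        >::type
    > {
        static To convert(From v) {
            To result = (To)(v);
            assert((From)(result) == v && "cast did not round-trip");
            return result;
        }
    };

    // Function template so From is deduced from the argument.
    template <typename To, typename From>
    To checked_cast_impl(From v) {
        return CastHelper<To, From>::convert(v);
    }

    #define cast(T, v) checked_cast_impl<T>(v)
#else
    #define cast(T, v) ((T)(v))  /* C build: just the cast, no checks */
#endif
```

Call sites look the same in both builds, e.g. `cast(unsigned char, 200)` or `cast(int*, some_void_pointer)`. Only the C++ debug build would trip an assert on something like `cast(unsigned char, 300)`.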

Onward to 2024... What's In The Plan?

The more things get pinned down, the more performance is starting to be a meaningful sticking point in the design. So I've been looking at how fundamental choices impact performance, and seeing where the requirements can be bent to avoid situations which only afford expensive handling.

I still consider UPARSE to be the best showcase of dialecting, and how you can accomplish it in usermode. But in practice, it is painfully slow on anything large. I've been trying to look at how to optimize it in ways that minimize how much of it needs to be written as native code... looking for features that provide cross-cutting benefits that would apply to more scenarios than just UPARSE.

Binding needs to get sorted out and simplified. What's in the code right now is a real mess of experimentation, and it needs to be rewritten. Even if there's no great answer--then I'll have to just pick a passable answer--so that this can get out the door.

What I'm hoping is to have things done well enough that a good comprehensive paper and YouTube video could be made. That way it might save others the trouble of re-inventing the same mistakes that all the Rebol clones seem to keep making. There's some interesting ground covered here, and I want to make sure it's all documented.


Happy New Year to you too!

Just curious, perhaps I've missed where you defined it, but what do you mean by "this can get out the door"? A completed language research project you're comfortable having exhausted the ideas and interest within (what I've come to believe is now the case), or a stable artifact that an end-user like myself can have comfort in being a superset of and something that can finally replace Rebol 2 (what I'd originally hoped from the project)?

Just wondering what your end target looks like really. Is the comprehensive paper and YouTube video the final outcome of the project?

Unfortunately it's hard for me to make any specific promises about "fitness for purpose" of an end product.

How I feel about pushing things through has a lot to do with how things are going. As I mentioned, the last year was not all that fun for me, and progress slowed down as a result. It's easy to spend days or weeks just looking at other things.

If something interesting breaks through--a binding revelation, or something else big--it can be an impetus to keep going. But there's a pretty big list of challenges...and whenever I have to do large-scale revisiting of the codebase I am reminded how much junk is in there. If it's all challenges and no fun, then that may mean leaving things for the AIs to finish.

It's telling that Red sought explicitly to be an open-source Rebol2 replacement and has taken a decade-and-some and not gotten there--despite not really challenging any of the design decisions. What little I've read of their PORT! model suggests a lockstep repetition of bad ideas... and they still can't seem to get that done.

So far, Ren-C has IMO delivered more, while making fewer promises. It may continue to do so.

But the best way to guide development to make sure it does what you need it to do is to have codebases represented in the list of things that are checked by GitHub Actions.
