"System Object" vs. "Sys Context"

hostilefork · August 27, 2021, 12:07am

Let's say I asked you to explain the difference between the system context (sys) and the system object (system)--and why they are two different things.

I'm betting you'd probably not be able to explain it. Because I get them mixed up all the time.

TL;DR: Having something called SYSTEM that is distinct from SYS isn't doing anyone any cognitive favors, and it should be changed. I propose SYS and SYSTEM be synonyms for the system object, and what is currently known as SYS become SYS.UTIL.

sys System Context: Usermode Helpers for the Core

There's a lot of support code behind things like booting, or LOAD or DO, or registration of codecs or whatever... that the average user is probably not going to want to call on a day-to-day basis. An example is make-scheme.

These routines don't necessarily compete for names with things in LIB. But putting them in SYS helps to call out that most users probably don't need to be concerned with them.

Pretty much every function in LIB is supposed to be something you could imagine a user being interested in calling in their scripts.
SYS functions are things that would generally only be called from extensions, or internally as a support function for natives or LIB functions.

system System Object: Global Variables and Template Objects

The system object is specified by a file called %sysobj.r.

As most people know, this is where you go to look for global state like the arguments that were passed to the interpreter when it ran...or the list of modules that have been loaded. But it also contains a large number of empty objects.

The make prep process creates a header file of constants for the integer indices of the fields in these objects. That makes it fast and easy for the core C code to directly address those fields, instead of having to do a linear search for them by key name.

Using these template objects has the effect that no matter how you create a derived object, it will get the fields named in them in that particular order. So the precalculated indexes will always be able to find the fields where they are looking... as long as that object was used as the base object.

But these template objects aren't just about interfacing quickly between C and the structures. Using common base objects helps save on allocations of lists of the keys...as the objects will share that list (as long as the object isn't expanded or created with more keys than specified in the base).
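To make the two benefits concrete, here is a minimal Python model of the idea. It is only a sketch: the field names, the `Obj` class, and `make_from` are made up for illustration and are not how Ren-C actually stores objects. It shows constants generated from a template's field order, and a derived object sharing the template's key list until it is expanded.

```python
# Hypothetical model of template objects; not Ren-C's actual implementation.

# Field order as one template might declare it (names are illustrative).
STANDARD_PORT_KEYS = ("spec", "scheme", "actor", "state", "data")

# What a build step could generate: one integer constant per field.
INDEX = {name: i for i, name in enumerate(STANDARD_PORT_KEYS)}
PORT_SPEC = INDEX["spec"]
PORT_STATE = INDEX["state"]


class Obj:
    """An object is a key list plus a parallel list of values."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values


def make_from(base, **overrides):
    """Derive an object; share the base's key list unless new keys are added."""
    extra = [k for k in overrides if k not in base.keys]
    keys = base.keys if not extra else base.keys + tuple(extra)
    values = list(base.values) + [None] * len(extra)
    for k, v in overrides.items():
        values[keys.index(k)] = v
    return Obj(keys, values)


template = Obj(STANDARD_PORT_KEYS, [None] * len(STANDARD_PORT_KEYS))
port = make_from(template, state="open")
expanded = make_from(template, state="open", extra_field=1)

# Direct indexed access through the precalculated constant, no search by name.
print(port.values[PORT_STATE])
# Only the unexpanded object shares the template's key list.
print(port.keys is template.keys, expanded.keys is template.keys)
```

Note that the expanded object still has the template's fields at the same positions, so the constants keep working; it just pays for its own key list.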

(I actually think it is a mistake to be doing objects this way, and that we should adopt JavaScript's model of "hidden classes" to get a more systemic optimization.)

Proposal: SYS => SYS.UTIL, SYSTEM <=> SYS

Having to type sys.util when you want something like sys.util.make-scheme isn't that oppressive.

UTIL is abbreviated, but so is SYS. We could make UTIL just an alias for UTILITIES, so the verbose-minded could write system.utilities.make-scheme if it pleased them.
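As a rough picture of the proposed layout, here is a Python sketch in which the short and long names are simply two references to the same thing. Everything here is a stand-in: `make_scheme` is a placeholder, the `args` value is invented, and `sys_` is spelled with an underscore only to avoid clashing with Python's own `sys` module.

```python
from types import SimpleNamespace


def make_scheme(name):
    """Placeholder standing in for the real helper."""
    return f"scheme:{name}"


# The helpers that used to live directly in SYS.
util = SimpleNamespace(make_scheme=make_scheme)

# The system object, with UTIL and UTILITIES naming the same helper namespace.
system = SimpleNamespace(
    util=util,
    utilities=util,
    script=SimpleNamespace(args=["--example"]),  # invented sample value
)

# SYS is just another name for SYSTEM.
sys_ = system

print(sys_.util.make_scheme("file"))
print(system.utilities.make_scheme is sys_.util.make_scheme)
```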

hostilefork · July 1, 2022, 8:02pm

Clearly I got around to doing this as quickly as possible...

...but better late than never. I doubt most people will notice (when is the last time you called SYS.MAKE-SCHEME?) But a few places were affected.

For most people, the big difference would be that you can now say sys.script.args instead of system.script.args.

What's in SYS.UTIL, anyway?

It's kind of a grab bag of weird things at the moment, all of which can probably use more thought.

But one key reason for its existence is that there's a bootstrap phase of "Everything you need in order for LOAD of a MODULE to work"... and then everything after that.

So SYS.UTIL is where various service routines pertinent to loading live, like codecs and IMPORT mechanics. If you're going to load from a file, you need the FILE! scheme, so things like MAKE-SCHEME live there too.

Once the SYS.UTIL code has been processed by bootstrap, it makes some things available in LIB. So if you say sys.util.load you're getting the same function as lib.load, which you typically invoke simply as LOAD.

Here's some of those foundations that get exported to LIB:

MODULE
LOAD-VALUE
LOAD
DECODE
ENCODE
DECODE-URL

Then there's other miscellany...

SYS.UTIL.MAKE-SCHEME

The MAKE-SCHEME function is used to register some callbacks that know how to deal with things like read file://. You say the scheme name is "file" and then you give some handlers for what to do when that gets a handful of possible verbs, like READ or DELETE etc.
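The shape of that registration can be sketched in a few lines of Python. This is not the real MAKE-SCHEME interface: the `SCHEMES` table, the `dispatch` function, and the in-memory `FAKE_FILES` store are all hypothetical, chosen so the example runs without touching a real file system.

```python
# Hypothetical registry; names and signatures are illustrative, not Ren-C's API.
SCHEMES = {}


def make_scheme(name, handlers):
    """Register verb handlers (e.g. "read", "delete") under a scheme name."""
    SCHEMES[name] = dict(handlers)


def dispatch(verb, url):
    """Pick the scheme from the URL prefix and call its handler for the verb."""
    scheme, _, rest = url.partition("://")
    handlers = SCHEMES.get(scheme)
    if handlers is None:
        raise LookupError(f"no scheme registered for {scheme!r}")
    if verb not in handlers:
        raise LookupError(f"scheme {scheme!r} does not handle {verb!r}")
    return handlers[verb](rest)


# An in-memory stand-in for a file system, so the sketch has no side effects.
FAKE_FILES = {"notes.txt": "hello"}

make_scheme("file", {
    "read": lambda path: FAKE_FILES[path],
    "delete": lambda path: FAKE_FILES.pop(path),
})

print(dispatch("read", "file://notes.txt"))
```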

On the surface this is cool, but in practice there are a lot of unanswered questions that I won't go into here.

SYS.UTIL.ADJUST-URL-FOR-RAW

In the Web Repl, we had the idea of letting you say do on a short GitHub URL for a file...which would give back decorated HTML if you were to READ it. This kept you from having to hunt down the raw link for a file; you could just use the same URL that would show in your browser.

(Notably we don't want READ to do that jump to the raw URL. It's specific to DO, because we ostensibly know you can't mean to "run the HTML".)

Once we had this feature it was so useful we wondered "why can't the desktop builds do that too?" So the logic for translating URLs on GitHub and GitLab in this way was moved into sys.util.adjust-url-for-raw, because it's needed by the low-level DO commands.
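For a sense of what such a translation involves, here is a Python sketch. It is not the actual adjust-url-for-raw logic; it assumes the common public URL layouts (GitHub's `/blob/` pages mapping to `raw.githubusercontent.com`, GitLab's `blob` segment swapped for `raw`) and ignores edge cases such as a branch or directory that is itself named "blob".

```python
from urllib.parse import urlsplit, urlunsplit


def adjust_url_for_raw(url):
    """Map a browser-facing file URL to its raw counterpart, else return it as-is."""
    parts = urlsplit(url)
    segs = parts.path.split("/")  # leading "/" gives an empty first segment

    # GitHub: /owner/repo/blob/branch/path -> raw host, same path minus "blob"
    if parts.netloc == "github.com" and len(segs) > 4 and segs[3] == "blob":
        path = "/".join(segs[:3] + segs[4:])
        return urlunsplit((parts.scheme, "raw.githubusercontent.com", path, "", ""))

    # GitLab: same host, swap the first "blob" segment for "raw"
    if parts.netloc == "gitlab.com" and "blob" in segs:
        segs[segs.index("blob")] = "raw"
        return urlunsplit((parts.scheme, parts.netloc, "/".join(segs), "", ""))

    # Anything else is left untouched.
    return url


print(adjust_url_for_raw("https://github.com/owner/repo/blob/main/dir/file.r"))
```

A DO-like caller would apply this before fetching, while a READ-like caller would fetch the URL exactly as given.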

SYS.UTIL.UPARSE

We're in a delicate situation with UPARSE. It is the best-tested Redbol parse ever, and the dialect is too powerful to not use. But it's also slow, gives terrible error messages, and is still under development.

So it's kind of risky to use it in low-level bootstrap. Not so much for being slow, but because debugging the error messages prior to boot can be very painful.

However, some non-bootstrap uses are permitted. So once bootstrap is done, the UPARSE module pokes a copy of its action into sys.util, so that "non-essentials" like decode-url can use it.

SYS.UTIL.DO*

This is a service routine called by the DO native when handling for the type is easier to write in usermode than it is as a bunch of C code.

So if you DO something like a BLOCK!, that just runs the evaluator inside the interpreter as C...nothing special. But if you DO a FILE! then there's a lot of machinery that needs to come into play, and so the DO native will delegate to this sys.util.do* function.
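The split between "handled in the core" and "delegated to a helper" looks roughly like the following Python sketch. The types are stand-ins I picked for illustration: a list of callables plays the role of a BLOCK!, a `PurePosixPath` plays the role of a FILE!, and `do_star` is a placeholder rather than the real sys.util.do*.

```python
from pathlib import PurePosixPath


def do_star(path):
    """Stand-in for the usermode helper; a real one would locate, load, and run the file."""
    return f"ran {path} via usermode helper"


def do(value):
    """Stand-in for the DO native: simple cases inline, complex ones delegated."""
    if isinstance(value, list):
        # BLOCK!-like input: evaluate each step directly, return the last result.
        result = None
        for step in value:
            result = step()
        return result
    if isinstance(value, PurePosixPath):
        # FILE!-like input: too much machinery for the core, hand it off.
        return do_star(value)
    raise TypeError(f"don't know how to DO a {type(value).__name__}")


print(do([lambda: 1, lambda: 1 + 1]))
print(do(PurePosixPath("script.r")))
```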

SYS.UTIL.IMPORT*

This is the workhorse for much of the shared code that runs underneath DO and IMPORT. Really, DO is just a skin of this function that reverses its return values: IMPORT* returns the imported module as its main result, and the evaluated result of the body as its second result. DO swaps that.

(Much of this is still under review for design. But the key is that the mechanics that should be shared are shared.)

A big difference between this IMPORT* and IMPORT is that you tell it where to import to...instead of it just assuming it should be your "current context".

(If you're wondering how IMPORT knows what the "current context" is... it's just a trick, like with definitional RETURN. Each module declares its own IMPORT function, that's a specialization of IMPORT* with the "where" as being itself.)
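The relationship between IMPORT*, DO, and the per-module IMPORT can be modeled in Python with `functools.partial`. This is a sketch under heavy assumptions: the `Module` class, the `imported` list, and the idea of passing the script body as a callable are all invented for the example, and the real functions take different arguments.

```python
from functools import partial


class Module:
    def __init__(self, name):
        self.name = name
        self.imported = []  # modules this one has imported
        # Each module gets its own IMPORT: IMPORT* with "where" preset to itself.
        self.import_ = partial(import_star, self)


def import_star(where, name, body):
    """Hypothetical IMPORT*: returns (module, result of evaluating the body)."""
    module = Module(name)
    result = body(module)
    where.imported.append(module)
    return module, result


def do(where, name, body):
    """DO as a thin skin over IMPORT*: same work, return values swapped."""
    module, result = import_star(where, name, body)
    return result, module


main = Module("main")

# The caller never says where to import to; the specialization already knows.
helper, value = main.import_("helper", lambda m: 1 + 2)
print(helper.name, value)

result, script = do(main, "script", lambda m: "done")
print(result, script.name)
```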

SYS.UTIL.EXPORT*

This is the counterpart to IMPORT* which pokes a definition (or definitions) into the exports list of the current module. Again there is a local EXPORT function for every module, specialized so that the module to export from is itself.

SYS.UTIL.MAKE-PORT*

This is the code that runs when you call MAKE PORT! on something. I have better things to do with my day than be a broken record talking about how bad/underdesigned all the port stuff was. Anyway, this is still relatively unchanged since R3-Alpha...for better or worse.

SYS.UTIL.REGISTER-CODEC*

This is used only by the BMP, PNG, JPEG, and GIF extensions.

Codecs were simplified in Ren-C to not have a specialized C-style interface, but to talk in terms of BINARY! values and use natives. That part was a good decision, but other than that, there's not been any real development on codecs. Exactly how they are supposed to be different from streaming ports is not clear.

(I've lamented that we have a lot of stream-capable code now...from zlib and mbedTLS...but no real way to expose it.)

SYS.UTIL.*PARSE-URL

This is the ruleset used by DECODE-URL. It's really only outside the function to save on an indentation level, which is kind of lame.

SYS.UTIL.SCRIPT-PRE-LOAD-HOOK

This is something that gets called every time a script gets loaded. The default prints out the fact that it did...which Graham wanted disabled in the ReplPad.

It might be nice to put these somewhere like sys.hooks ... but the key reason for this living inside sys.util is so that SYS.UTIL.LOAD can see it at that phase of bootstrap.
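A replaceable hook of this kind can be sketched in Python as a function slot that the loader looks up at call time. The names below (`util`, `script_pre_load_hook`, `load`) mirror the ones in this post but the code is only a model; what the real hook receives as arguments is an assumption.

```python
from types import SimpleNamespace

util = SimpleNamespace()


def default_pre_load_hook(source):
    """Default behavior: announce that a script is being loaded."""
    print(f"Loading: {source}")


util.script_pre_load_hook = default_pre_load_hook


def load(source):
    """Stand-in loader. It reads the hook from util on every call, so swapping
    the hook later changes behavior without touching the loader."""
    util.script_pre_load_hook(source)
    return f"<loaded {source}>"


load("first.r")  # announces the load

# A host like a REPL pad could silence the message by replacing the hook.
util.script_pre_load_hook = lambda source: None
load("second.r")  # loads quietly
```

Keeping the hook in the same namespace as the loader is what lets the loader see it during the early phase, which matches the reason given above for it living in sys.util.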