The Naming of NULL vs. The "Meaningless" Value

rgchris · December 10, 2020, 3:45am

Note: When this thread was initiated circa 2020, there was a VOID! datatype that was essentially equivalent to a Rebol2 UNSET! under a different name. It could be put in blocks, but variables holding the value would generate errors on WORD! access. NULL was the only "non-valued" state that could not be put in blocks...a situation that was revolutionized with the rise of generalized isotopes.

the difference between null and void is that null is a non-existent or empty value or set of values while void is an empty space — WikiDiff

Looking at the definitions in the link above, it feels at first read that our NULL and VOID are backwards, that NULL is the tangible placeholder that can appear in a block and VOID is the vacuum.

I think nailing down the definitions of NULL and VOID outside of their current application or behaviour (even simply within English) would be instructive. I still feel unsettled by the way both work, in a way I can't put my finger on, even if I agree from other posts that there is a need for both.

Either way, I feel do [] should vaporize:

reduce [do []] => []

hostilefork · December 10, 2020, 9:21am

My alternative conception was that stamping something with VOID on it is pretty well understood as a way of saying "alert, this thing is tainted, you might keep it for your records of what happened but don't try using it".

But a voided thing would still be a thing, unlike NULL as a complete absence of value.

Maybe the best way to attack the problem is to play out the scenarios in your head:

Q: "What is the value held by variables to indicate they haven't been assigned?"
A: "They contain the state known as <YOUR NAME HERE THAT'S NOT VOID>."

Q: "What does the PRINT function return?  Does it return the string it printed?"
A: "No, that costs too much in the average case that doesn't use the generated
   output.  It just returns a <YOUR NAME HERE THAT'S NOT VOID>...the same
   thing that unset variables hold, so that you'll notice getting an error if you
   try to use the value.  Unless it outputs nothing (hence no newline), in which
   case it returns a NULL...so you can react to that situation with ELSE or THEN
   or whatever."

While VOID might be rethought, NULL is non-negotiable

The parity with C's NULL (and all the languages that also embrace that meaning) is a critical element, and ticks all the boxes:

NULL is a 0 pointer value
It acts conditionally false
It's unable to lead you to data to operate on (e.g. you don't have a "thing" you can set a NEW-LINE on, there's no thing).
If a function returns NULL you are not stuck with a handle to an object that you have to free.
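
To make those points concrete on the C side, here is a minimal sketch in plain C. It does not use libRebol at all; `copy_item_or_null` and `has_third` are hypothetical helpers invented for illustration. It only shows that a NULL result tests as false, points to nothing you could operate on, and leaves the caller with nothing to free.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of any real API): returns a heap-allocated
   copy of the item at `index`, or NULL if the array is too short. */
char *copy_item_or_null(const char *const *items, size_t count, size_t index) {
    if (index >= count)
        return NULL;  /* nothing was allocated, so nothing needs freeing */
    size_t len = strlen(items[index]) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, items[index], len);
    return copy;
}

/* Returns 1 if a third item existed and could be copied, 0 otherwise. */
int has_third(const char *const *items, size_t count) {
    char *third = copy_item_or_null(items, count, 2);
    if (!third)       /* NULL acts conditionally false */
        return 0;     /* there is no "thing" here, and no handle to release */
    free(third);      /* only a non-NULL result carries an obligation */
    return 1;
}
```

The asymmetry is the point: the non-NULL path has to clean up, the NULL path does not.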

Could VOID Be The Result of Invisible Functions?

I could see the case that "VOID" wouldn't refer to a datatype at all, but the return behavior of functions like COMMENT and ELIDE. This interpretation would be really nothing... not a NULL, not an ornery value, but full-on vaporization. You might still phrase that as "The comment function returns void", or "has a void return", or something of this sort...as in C, when in fact returns don't even come into play.

That would line up to having parity with C as far as the interface of "void" goes:

void rebElide(...);  // this is the interface for rebElide, it "returns" void
REBVAL *rebValue(...);  // this is when you want a result back to C

rebElide("print {rebElide means you avoid an API handle to a bad-word!}");

rebValue("print {if you wrote this, you'd leak the returned handle...}");

So we'd be saying inside the language, ELIDE and COMMENT are spec'd to [return: <void>]

If shifting the meaning of VOID like this helps with your interpretation as "less than nothing" (as opposed to "a thing that has been marked invalid") then I'm game for that change. There seems to be a strong enough basis for it.

...But we'd Still Need a Name for "UNSET!"

Taking void for the phenomenon of vaporization would solve one naming problem, but we're still stuck finding a name for "what unset variables hold and that things like PRINT return."

I've explained at length why I don't like UNSET! as that name... because variables are unset, not values. POISON! or TRASH! or other things are awkward and I think maybe too pejorative...it indicates that there's some kind of problem, when there's only a problem if you try to use it in certain ways.

2023 UPDATE: The name TRASH was ultimately chosen, softened somewhat by the fact that it's not used that often... and the ~ notation can suffice.

rgchris · December 10, 2020, 1:55pm

I'd still lean toward my own sense of what I'd consider (or feel) my primary definitions of NULL and VOID—where NULL is the product of nothing and VOID is the realm of emptiness—a place with no beginning or end, or indeed no definition (void as in voided checks is not the first meaning that comes to mind).

I think this is why the idea of VOID as replacing UNSET was key—there's an infinite array of words out there and until they are defined, they are just points in an endless vacuous universe. Language can be fluid and subjective, this is where my mind went.

If I were to expand on that, I'd say that NULL is negative nothingness and BLANK is positive nothingness where we must affirm across the wire that nothingness is the intent.

If I were to indulge a little further and incorporate bad words, then VOID becomes more like outer space and the various defined isotopic words are stars within distant constellations.

I get that NULL tracks to other language uses; perhaps my having no background in those languages is why I find it jarring (it's not inconceivable that they all use the word incorrectly). I'm used to NONE as being control flow currency, so the idea of NULL now having that role isn't that hard to grasp, just the other qualities that don't line up.

You're right in that I need to grapple with code; I need to review the test suite for the R3C branch where other changes aren't as big in comparison to NULL/BLANK/VOID. I should likely stop commenting on this issue until I've done that.

hostilefork · July 27, 2023, 2:36am

2023 UPDATE: Significant changes to the system ushered in by isotopes mean that true vaporization is actually accomplished through "empty isotopic blocks", which are called NIHIL. VOID expressions are the product of failed conditionals, and they often vanish but can be stored in variables.

Explaining that design and why the parts are all necessary is beyond the scope of this historical post. See "Invisibility Reviewed Through Modern Eyes" for the rationale... and I think it holds up!

Where things stand now hopefully meets your expectations. Voids have no representation, so there's nothing to display in the console. do [] gives back void, and void vanishes in REDUCE and acts as a no-op for an APPEND. It is also the result of failed conditionals, and vaporizes in COMPOSE/etc.

>> void

>> reduce [1 + 2 do [] 10 + 20]
== [3 30]

>> append [a b c] void
== [a b c]

>> if false ['b]

>> compose [a (if false ['b]) c]
== [a c]

Because it has no representation, the quoted form of void is simply a lone apostrophe. And its quasi form is just a lone tilde. (See the post about three single-character intents for deeper discussion.)

>> quote void
== '

>> '

>> append [a b c] '
== [a b c]

>> quasi void
== ~

Then there's the isotopic form...I wound up deciding that calling this TRASH was best, and it is the contents of an unset variable:

>> ~
== ~  ; isotope (a "trash")

>> x: ~  ; will unset the variable
== ~  ; isotope

NULL is an isotopic form of the WORD! "null". In the API this is represented as the 0 pointer and does not require having its handle released, so it is like C's NULL. It is used as an "ornery nothing"...but unlike TRASH it doesn't indicate an unset variable, so it can be fetched by normal WORD! access. The system accomplishes elegant error locality using the VOID-in-NULL-out protocol in many places, which hinges on the MAYBE function that converts NULL to void.

>> third [d e]
== ~null~  ; isotope

>> append [a b c] third [d e]
** Error: Cannot put ~null~ isotopes in blocks

>> maybe third [d e]

>> append [a b c] maybe third [d e]
== [a b c]
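
For readers who think more easily in C, the VOID-in-NULL-out idea shown in that console session can be modeled roughly as below. This is a toy model under loud assumptions: the `State` enum, the `Cell` struct, and the `model_*` functions are all hypothetical names made up here, and they say nothing about how Ren-C implements any of this internally.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not Ren-C internals. */
typedef enum { STATE_VOID, STATE_NULL, STATE_INT } State;

typedef struct {
    State state;
    int value;  /* meaningful only when state == STATE_INT */
} Cell;

/* Model of THIRD: gives NULL when there is no third item. */
Cell model_third(const int *items, size_t count) {
    Cell c = { STATE_NULL, 0 };
    if (count >= 3) {
        c.state = STATE_INT;
        c.value = items[2];
    }
    return c;
}

/* Model of MAYBE: NULL becomes VOID, anything else passes through. */
Cell model_maybe(Cell c) {
    if (c.state == STATE_NULL)
        c.state = STATE_VOID;
    return c;
}

/* Model of APPEND: VOID is a no-op, NULL is an error.
   Returns 0 on success and -1 on error. */
int model_append(int *items, size_t *count, size_t capacity, Cell c) {
    if (c.state == STATE_VOID)
        return 0;   /* intentional nothingness: series left unchanged */
    if (c.state == STATE_NULL)
        return -1;  /* ornery nothing: refuse rather than guess */
    if (*count >= capacity)
        return -1;  /* no room left in this fixed-size toy series */
    items[(*count)++] = c.value;
    return 0;
}
```

In this model the caller opts in to "nothing is fine" by wrapping the lookup in `model_maybe`; without it, the missing item surfaces as an error right at the append.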

Then NIHIL is an empty isotopic block...which is a parameter pack with no values in it. This unstable isotopic form can't be stored in variables or API handles, and can only be handled in its meta form. (Again, beyond the scope of this thread...so read the post on modern invisibility to understand why the shade of distinction from VOID is justified.)

Something that might seem counter-intuitive is that VOID feels kind of "further out" than NULL... given it has absolutely no representation. Yet it's tolerated as an input in more places to indicate "intentional nothingness". It's the real deal of nothing.

But then isotopic VOID comes in to be TRASH...the "bad check". A variable holding it is considered to not be set, and it trips up access via WORD!.

NULL sits somewhere between. It's a state you put things in when you want to be able to see if it's good or not (falsey)...yet most non-conditionals trip up on it when you try to use it as an argument. Easy to test for via WORD!; ornery to use most places.

All three of these states can be held in variables or API handles. And then pure invisibility is built upon a weirder mechanic of NIHIL, which can only be handled by ^META-aware code. You don't need to know how it works to use it (the implementations of COMMENT and ELIDE are trivial in both the main language and UPARSE combinators). But the mechanics required to implement them are there.

I'm happy to address questions about this, and I have answers for a whole slew of complex things. UPARSE has been the crown jewel of showing off how usermode code can achieve rigorous factoring and wrapping, and there are lots of other examples as well.