Should UNSET? 'VAR-NAME check for VOID!, not NULL?


Due to some upcoming changes, accessing NULL variables through a plain WORD! will not cause an error. Hence we're shifting to a world where most VOID!s you'll encounter in the wild relate to variables that are conceptually "not set". Those voids will cause the necessary errors on typos, not-yet-assigned local variables, etc.

So here we might ask ourselves: What is an unset variable, if not a variable that errors when you try to access it via a WORD! ?

As a talking point on why UNSET? tests for nullness today, consider this behavior with nulls:

ren-c-upcoming>> for-each [a b] [1 2 3] [print mold compose [(a) (b)]]
[1 2]
[3]

That code assumes the new world where you don't have to use :b to defuse the error of accessing the nulled b on the second iteration.

For comparison, Rebol2/R3-Alpha/Red wouldn't have errored either...but that's because they would use NONE!:

rebol2>> foreach [a b] [1 2 3] [print mold compose [(a) (b)]]
[1 2]
[3 #[none]]  ; actually renders as [3 none]

That's misleading, because it's the same result you would have gotten if a NONE! had literally been in the block:

>> foreach [a b] [1 2 3 #[none]] [print mold compose [(a) (b)]]
[1 2]
[3 #[none]]

Rebol2/R3-Alpha/Red's only other choice for the variable would be #[unset], which just misleads in another direction.

Ren-C clearly has the edge here, so no complaints on the implementation. The reasoning is sound, and NULL is the right name for the state. But do we have to say a null variable is "unset"?

e.g. could we just be content saying the variable is "nulled"?

It takes a bit of linguistic trickery to say:

A variable can be "set to null" (or "nulled"). Rebol considers such variables to be technically "set" (e.g. returning true from set? 'var-name)...yet still they hold no value. By contrast, a variable is "unset" when it holds a value of type VOID!.

This means Rebol's definition of setness or unsetness does not have to do with a variable holding a value (such as one that can be put in a block). It is a pragmatic point about whether errors are triggered when the variable is accessed via a plain WORD! or PATH!, or whether the fetched value errors when tested for truth/falsehood or assigned via a SET-WORD!.

Thus we speak of nulled variables (holding no value) and unset variables (holding a VOID! value).
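To make that proposed split concrete, here's a toy model in Python (not Ren-C). Everything in it is a stand-in I made up for the sketch: `NULL`, `VOID`, `is_set`, `is_unset`, `fetch_word` and `get_any` aren't real APIs, using Python's `None` for null is just a convenience, and the error text is a placeholder. It assumes the hypothetical rules above: plain WORD! access tolerates null but errors on void, and "unset" means "holds a void".

```python
class Void:
    """Stand-in for a VOID! value: a real value that could live in a block."""

    def __repr__(self):
        return "#[void]"


VOID = Void()
NULL = None  # stand-in for null: "no value", can't be put in a block


def is_set(ctx, name):
    """Hypothetical SET? 'name: true unless the variable holds a void."""
    return not isinstance(ctx[name], Void)


def is_unset(ctx, name):
    """Hypothetical UNSET? 'name: true only when the variable holds a void."""
    return isinstance(ctx[name], Void)


def fetch_word(ctx, name):
    """Plain WORD! access: null passes through quietly, void raises."""
    value = ctx[name]
    if isinstance(value, Void):
        raise LookupError(f"{name} is unset")  # placeholder wording
    return value


def get_any(ctx, name):
    """Like a GET-WORD! (:name): never errors, hands back whatever is there."""
    return ctx[name]


ctx = {"a": 3, "b": NULL, "asdf": VOID}

print(is_set(ctx, "b"), fetch_word(ctx, "b"))  # True None
print(is_unset(ctx, "asdf"))                   # True
```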

Does that sound convincing? :-/ It seems tenuous to me...

...but IF voids were UNSET?, why not call the datatype UNSET! ?

I'd still feel firmly that variables are unset, not values.

I like set? var and unset? var as the tests to see if a variable is set through its name. But if UNSET! returned as the name of the data type itself, then by Rebol conventions you really couldn't have UNSET? be anything other than a direct test of a value for that type... e.g. unset? :var-name. Seems like a step back to me.

I'm hypothesizing a world where "unsetness" is the condition of a variable holding a void. But it's not so sensible to say that a void value inertly living in a block only takes on an unset condition when that void makes it into a variable.

VOID! seems short and punchy. One letter less than UNSET!, and a number of letters less than something like UNDEFINED! (JavaScript's name choice for a similar idea).

VOID! being the return type of functions that have no sensible return value lines up nicely with C.

I don't know what I think, here.

I kind of feel like null owns unsetness, not void. "unset variables hold no value" seems a complete sentence. So unset? var as a shorthand for null? get var seems appropriate to me. I think that will be a lot more common than wanting to know if a variable-you-have-the-name-of holds void.

But that could just be nulled?. And unset var or unset 'word are helpful shorthands for when set var void won't work, and you'd have to say set/any var void. You don't get as much of a leg up for null, because word: null now works, and set var null works.

Another practical matter is what you want the error message on void values to be. Rebol2's would be outright misleading in Ren-C:

rebol2>> asdf
** Script Error: asdf has no value

It does have a value (only NULL variables have no value, i.e. are not holding an ANY-VALUE! type). But what are the other options, here?

  • "asdf is void"
  • "asdf is VOID!"
  • "asdf holds a void value"
  • "asdf is not set"
  • "asdf is unset"
  • "asdf is undefined"

I feel like "asdf is VOID!" is the most uncontroversial thing. But it may confuse people who make typos and wonder how it is that this mystery variable holds a void. If you could say it's "unset" or "not set" and keep a separation between nullness and unsetness, it might hold together. And you'd have more compatibility with the historical UNSET function.

It's definitely something to think about. But I mostly wanted to lay out my understanding of why, even if VOID! is creeping back to being almost identical to UNSET!, the name should not go back! I much prefer saying that an unset variable is one holding a #[void] value.


Here's a piece of code from HELP that I just looked at, which is worth considering:

    for-each [key val] system/contexts/user [
        if set? 'val [
            append libuser reduce [key :val]

For this code, if #[void]s are unset and NULL variables are to be considered "set", then you're not going to get things added verbatim. That's because a BLOCK! can't hold a null, so REDUCE's only design options here are to force the value to a BLANK! or a VOID! (or error).
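Here's a rough Python sketch of those three options for REDUCE (again a toy model, not Ren-C). The `reduce_block` function, its `on_null` parameter and the string stand-ins for BLANK! and VOID! are all hypothetical names for illustration. Whichever policy you pick, a null that passed the `set?` check can't land in the block as-is.

```python
BLANK = "_"        # stand-in for a BLANK! value
VOID = "#[void]"   # stand-in for a VOID! value
NULL = None        # stand-in for null, which a block cannot hold


def reduce_block(values, on_null="error"):
    """Toy REDUCE: collect already-evaluated values into a block (a list).

    A null can't be stored, so it is either swapped for a substitute
    ("blank" or "void") or rejected ("error"), depending on `on_null`.
    """
    substitutes = {"blank": BLANK, "void": VOID}
    if on_null != "error" and on_null not in substitutes:
        raise ValueError(f"unknown policy: {on_null}")

    block = []
    for value in values:
        if value is NULL:
            if on_null == "error":
                raise ValueError("null cannot be put in a block")
            value = substitutes[on_null]  # no longer the verbatim value
        block.append(value)
    return block


print(reduce_block(["key", NULL], on_null="blank"))  # ['key', '_']
print(reduce_block(["key", NULL], on_null="void"))   # ['key', '#[void]']
```

With nulls counted as "unset" instead, the `set?` guard would skip them before REDUCE ever ran, which is the benefit the current behavior has been giving.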

This shows some of the real core benefit we've been getting from making NULL the "unset", not VOID!. :-/

I'm really pretty torn about this. It may be in the scheme of things that <opt> refinements...a special case where you say you want NULL and not BLANK! when a refinement is not supplied...is actually the best compromise. (Which is what I did to start with.)

What I'm going to do is put these branches aside for a bit and work on something else. People can feel free to throw in their observations.