Why (or why not) have UNSET! in Rebol-like Languages


#1

(UPDATE: I’ve edited this post to reflect the terminology change to NULL. But also, I found a post on Carl’s blog titled “UNSET! is not first-class”. It doesn’t particularly add anything new, but shows signs of a desire to differentiate the state…which seems to naturally culminate in Ren-C’s choice to make it impossible to store it in BLOCK!s.)

Question asked of me was:

Could you explain in layman’s terms why unset! is there in Rebol-type languages?
To my brain, something like none makes more sense to use for ‘no thing’. I get that unset removes the reference between word and value(s), but unset pops up for “no return value”, even though it’s a return value, and ‘none’-ish things.

UNSET!'s existence ostensibly comes from wanting a distinction between a variable that is not set (and should cause an error if accessed) and one that is set. Rebol would be a less safe language if such a state were not available. You’d just go typing if my-varable = 1 [...] when you meant to say my-variable, and there’d be no indication of a problem.

Note that the reason you don’t get a “no binding” error for a misspelling comes from the somewhat liberal choice to bind words dynamically in the user context just because they appear. It’s the reason you can write:

>> print-foo: does [print foo] ;-- gets bound here
>> print-foo
** error (it's in bound state but not set)
>> foo: 10
>> print-foo
10

The foo in print foo gets a binding to the user context, even though there was no foo defined there beforehand. If that weren’t the case and it expanded on seeing set-words only, there’d be more safety against typos…which might make it somewhat less critical to have a “defined-but-not-set” state.

Yet I’d still argue that typos aside, it’s valuable to have knowledge of when you’ve used something before you’ve explicitly assigned it. So this “bound yet triggers an error in casual usage” state is generally useful.
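As a rough model of that “bound but not set” state, here is a minimal sketch in Python rather than Rebol, since the exact behavior differs between Rebol2, R3-Alpha, Red and Ren-C. The Context class, its bind/set/get methods and the _UNSET marker are hypothetical names made up for illustration; they don’t correspond to anything in an actual interpreter:

```python
class UnsetError(Exception):
    """Raised when a word is read before it has been assigned."""


_UNSET = object()  # internal slot marker (hypothetical); never handed out as a value


class Context:
    def __init__(self):
        self._slots = {}

    def bind(self, word):
        # Binding on sight: create the slot, but leave it unset.
        self._slots.setdefault(word, _UNSET)

    def set(self, word, value):
        self._slots[word] = value

    def is_set(self, word):
        return self._slots.get(word, _UNSET) is not _UNSET

    def get(self, word):
        if word not in self._slots:
            raise UnsetError(f"{word} has no binding")
        value = self._slots[word]
        if value is _UNSET:
            raise UnsetError(f"{word} is bound but not set")
        return value


user = Context()
user.bind("foo")  # roughly what happens when `print foo` is loaded
```

The point of the sketch is that the unset marker lives only inside the slot: callers either get a real value or an error, and can ask is_set if they care about the status.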

But I agree with you on the next point… that conveying this state of a variable as a “value” which has a “type”, and can be inserted into a block, is a bad idea.

One could imagine it not being conveyable at all except through an is-variable-set? test for those concerned with the status…and other forms of access would be errors. But one could also designate it a special transitional status that simply can’t be put in blocks… which is what Ren-C does.

So NULL is not a “value”. It is, in fact, the absence of a value. “set?” and “unset?” are questions asked of variables, e.g. SET? 'X or UNSET? 'X/Y. VALUE? is a question you can ask of an evaluation product, e.g. VALUE? :NOT-SET-THING is false. But the system protects against any instance of a “null cell” making it into the body of a block or other array. TYPE OF :NOT-SET-THING will also come back as NULL.

Now once you have this non-valued thing, it would be a bit of a shame to waste its unique status. Ren-C uses it as the outcome of a failed conditional, while forcing all successful conditionals to some value (“voidifying” it if the branch evaluates to NULL). So…

>> block: copy [a b c]
>> append block if false ['d]
== [a b c]
>> append block if true ['d]
== [a b c d]
>> append block if true []
== [a b c d #[void]]

Since you never meant to intentionally append a null to a block (as that is impossible) you have this extra dimension of flexibility.
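To make the rule concrete, here is a small Python model of the transcript above. It is only a sketch under stated assumptions: if_, append, NULL and VOID are stand-ins invented for this illustration, not Ren-C’s implementation.

```python
NULL = None       # stand-in for "no value at all" (assumption for this sketch)
VOID = "#[void]"  # stand-in for the void value a successful branch is forced to


def if_(condition, branch):
    """Return NULL when the condition fails; otherwise the branch result,
    with a NULL result turned into VOID so success never looks like failure."""
    if not condition:
        return NULL
    result = branch()
    return VOID if result is NULL else result


def append(block, value):
    """Appending NULL is a no-op, so NULL never lands in the block."""
    if value is not NULL:
        block.append(value)
    return block


block = ["a", "b", "c"]
append(block, if_(False, lambda: "d"))  # condition failed: nothing appended
append(block, if_(True, lambda: "d"))   # branch produced a value
append(block, if_(True, lambda: NULL))  # empty branch: voidified
```

Because the failed conditional is the only way to get NULL out of if_, the caller can tell “didn’t run” apart from “ran and produced nothing useful”.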

There is also “refinement revocation”, where nulls un-ask for the refinement:

>> block: copy [a b c]
>> append/dup block 'foo if false [3]
== [a b c]
>> append/dup block 'foo if true [3]
== [a b c foo foo foo]

The out-of-band nature is taken advantage of in “expert” operators, which differentiate between no value and a legitimate blank value:

>> block: copy [a _]
>> take* block
== a
>> take* block
== _
>> take* block
// null

A casual TAKE operation built on top of TAKE* will give you an error if it didn’t actually take anything, as opposed to returning NULL. Similar solutions help one put blank values in maps, and distinguish that from the absence of a value.
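The layering of a casual operation on top of an “expert” one might be modeled like this. Again this is a Python sketch with made-up names (take_star, take, NULL, BLANK), not how TAKE and TAKE* are actually written:

```python
NULL = None  # stand-in for "nothing was there" (assumption for this sketch)
BLANK = "_"  # stand-in for a legitimate blank value stored in the block


def take_star(block):
    """Expert form: return NULL when there is nothing left to take."""
    return block.pop(0) if block else NULL


def take(block):
    """Casual form built on take_star: taking from an empty block is an error."""
    value = take_star(block)
    if value is NULL:
        raise ValueError("nothing to take")
    return value


block = ["a", BLANK]
first = take_star(block)   # the stored word
second = take_star(block)  # a real blank that was in the block
third = take_star(block)   # NULL: the block was empty
```

Since NULL can never be an element of the block, the third result is unambiguous, which is exactly what a blank return value could not offer.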

To summarize: the “null” state which is helpful in denoting a transient situation of “complete absence of value” becomes much more useful when you can confidently say it’s not a value, and is never stored in blocks or other data structures. Then it can really indicate some out-of-band quality. Blanks (“nones”) are too casually used as actual placeholders to serve these purposes.


#2

Nenad Rakocevic says:

Redbol languages are based on denotational semantics, where the meaning of every expression needs to have a representation in the language itself. Every expression needs to return a value. Without unset! there would be a hole in the language, several fundamental semantic rules would be collapsing, e.g. reduce [1 print ""] => [1] (reducing 2 expressions would return 1 expression).

I suppose he hasn’t read Gödel, Escher, Bach.

The fact of the matter is that there is no such thing as a “complete” system. All we do is push complexity around in order to shape the territory to become suitable for our purposes.

Somehow, the idea that reduce [1 print ""] could be an error is, by definition, worse than having a block that holds an “unset value”.

It’s strange how uneducated minds work.


#3

This is the correct explanation, and Gödel, Escher and Bach have nothing to do with it; they all died long before Rebol was written by Carl Sassenrath. It is of course strange how uneducated minds work, when they can’t recognize a correct explanation when they see one.


#4

I don’t know what you have to gain by embracing this fallacy.

Variables are “set” or “unset”. Not values. All values should represent a “set” state.

x: 10
if set? 'x [print "The variable is set."]
set 'x 20
if set? 'x [print "Yup, the variable is still set."]
unset 'x
if unset? 'x [print "Now it's unset."]
if void? :x [print "Here's another way to do it."]
if not value? :x [print "A way of saying the same thing."]

Generic code has to process blocks of values, and it’s just not helping anyone when you can put “unset values” in blocks. Imagine enumerating this block, each item stored in a variable as you go. The variable you are using to enumerate the block runs into a problem… it has to hold the current item, yet it also might be unset. Don’t you see why this is not right?

I’ll point out that Ren-C’s ANY-VALUE! is not only clearer than ANY-TYPE! (which has the bad property of reading like ANY-DATATYPE!, e.g. DATATYPE!) but also does not include NULL. Optional parameters are “denoted” differently, but ANY-VALUE! means specifically anything that isn’t NULL. Hence ANY-VALUE? and VALUE? are synonyms.