(UPDATE: I've edited this post to reflect the terminology change to NULL. But also, I found a post on Carl's blog titled "UNSET! is not first-class". It doesn't particularly add anything new, but shows signs of a desire to differentiate the state...which seems to naturally culminate in Ren-C's choice to make it impossible to store it in BLOCK!s.)
The question asked of me was:
Could you explain in laymen terms why unset! is there in Rebol type languages, and why it's there?
To my brain, something like none makes more sense to use for 'no thing'. I get that unset removes the reference between word and value(s), but unset pops up for "no return value", even though it's a return value, and 'none'-ish things.
UNSET! ostensibly exists to distinguish a variable that is not set (and should cause an error if accessed) from one that is set. Rebol would be a more unsafe language if such a state were not available. You'd just go typing if my-varable = 1 [...] when you meant to say my-variable, and there'd be no indication of a problem.
Note that the reason you don't get a "no binding" error for a misspelling comes from the somewhat liberal choice to bind words dynamically in the user context just because they appear. It's the reason you can write:
    >> print-foo: does [print foo] ;-- gets bound here
    >> print-foo
    ** error (it's in bound state but not set)
    >> foo: 10
    >> print-foo
    10
print foo gets a binding to the user context, even though there was no foo defined there beforehand. If that weren't the case and it expanded on seeing set-words only, there'd be more safety against typos...which might make it somewhat less critical to have a "defined-but-not-set" state.
Yet I'd still argue that typos aside, it's valuable to have knowledge of when you've used something before you've explicitly assigned it. So this "bound yet triggers an error in casual usage" state is generally useful.
But I agree with you on the next point... that conveying this state of a variable as a "value" which has a "type", and can be inserted into a block, is a bad idea.
One could imagine it not being conveyable at all except through an is-variable-set? test for those concerned with the status...and other forms of access would be errors. But one could also designate it a special transitional status that simply can't be put in blocks...which is what Ren-C does.
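For readers more at home in other languages, the "bound, but an error in casual usage" state can be modeled roughly in Python. This is only an analogy and not how Rebol or Ren-C implement contexts: Context, UNSET, bind, and is_set are names invented for this sketch.

```python
UNSET = object()  # hypothetical marker: the word is bound, but has no value yet


class Context:
    """Toy model of a context whose words can be bound without being set."""

    def __init__(self):
        self._slots = {}

    def bind(self, word):
        # Binding only reserves a slot; it never overwrites an existing value.
        self._slots.setdefault(word, UNSET)

    def set(self, word, value):
        self._slots[word] = value

    def is_set(self, word):
        # The explicit status test: it answers True/False instead of raising.
        return self._slots.get(word, UNSET) is not UNSET

    def get(self, word):
        # Casual access: an unset word is an error rather than a quiet "none".
        value = self._slots.get(word, UNSET)
        if value is UNSET:
            raise NameError(f"{word} has no value")
        return value


user = Context()
user.bind("foo")            # like seeing foo inside a function body
print(user.is_set("foo"))   # False
user.set("foo", 10)
print(user.get("foo"))      # 10
```

The point of the model is that the unset marker never escapes through get: the only way to learn about it is to ask is_set.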
So NULL is not a "value". It is, in fact, the absence of a value. "set?" or "unset?" are questions asked of variables, e.g. SET? 'X or UNSET? 'X/Y. Then VALUE? is a question you can ask of an evaluation product, e.g. VALUE? :NOT-SET-THING is false. But the system protects against any instances of a "null cell" making it into the body of a block or other array. TYPE OF :NOT-SET-THING will also come back as a NULL.
Now once you have this non-valued thing, it would be a bit of a shame to waste its unique status. Ren-C uses it as the outcome of a failed conditional, while forcing all successful conditionals to some value ("voidifying" it if the branch evaluates to NULL). So...
    >> block: copy [a b c]
    >> append block if false ['d]
    == [a b c]
    >> append block if true ['d]
    == [a b c d]
    >> append block if true 
    == [a b c d #[void]]
Since you never meant to intentionally append a null to a block (as that is impossible) you have this extra dimension of flexibility.
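As a rough illustration of that flexibility, here is a Python model of the two rules described above: a conditional that yields "no value" when its branch is not taken (and never otherwise), and an append that treats "no value" as nothing to add. It is an analogy only; NULL, VOID, if_, and append are names made up for this sketch and say nothing about how Ren-C implements these operations.

```python
NULL = object()   # hypothetical stand-in for "no value at all"
VOID = "#[void]"  # hypothetical stand-in for what a NULL branch result turns into


def if_(condition, branch):
    """Return NULL when the branch is not taken; otherwise never return NULL."""
    if not condition:
        return NULL
    result = branch()
    return VOID if result is NULL else result  # "voidify" a NULL branch result


def append(block, value):
    """Append to a list, treating NULL as 'nothing to append'."""
    if value is not NULL:
        block.append(value)
    return block


block = ["a", "b", "c"]
append(block, if_(False, lambda: "d"))  # branch not taken: block unchanged
append(block, if_(True, lambda: "d"))   # branch taken: "d" is appended
append(block, if_(True, lambda: NULL))  # taken, but the NULL result becomes VOID
print(block)  # ['a', 'b', 'c', 'd', '#[void]']
```

Because a taken branch can never produce NULL in this model, a caller who sees NULL knows the condition failed, which is what makes the no-op append safe.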
There is also "refinement revocation", where nulls un-ask for the refinement:
    >> block: copy [a b c]
    >> append/dup block 'foo if false 
    [a b c]
    >> append/dup block 'foo if true 
    [a b c foo foo foo]
The out-of-band nature is taken advantage of in "expert" operators, which differentiate between no value and a legitimate blank value:
    >> block: copy [a _]
    >> take* block
    == a
    >> take* block
    == _
    >> take* block
    // null
The TAKE operation built on top of TAKE* will give you an error if it didn't actually take anything, as opposed to returning NULL. Similar solutions help one put blank values in maps, and distinguish that from the absence of a value.
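The layering of a strict operation over an "expert" one can be sketched in Python as well. Again this is an analogy under stated assumptions, not Ren-C's implementation: take_star, take, NULL, and BLANK are hypothetical names, and Python's None plays the role of a blank that is allowed to live in a list.

```python
NULL = object()  # hypothetical stand-in for "no value at all"
BLANK = None     # a legitimate placeholder value that a list may hold


def take_star(block):
    """Remove and return the first item, or NULL if the list is empty."""
    return block.pop(0) if block else NULL


def take(block):
    """Like take_star, but an empty list is an error instead of a NULL result."""
    value = take_star(block)
    if value is NULL:
        raise IndexError("nothing to take")
    return value


block = ["a", BLANK]
first = take_star(block)   # "a"
second = take_star(block)  # BLANK: a real item that happens to be a placeholder
third = take_star(block)   # NULL: the list was already empty
print(first, second is BLANK, third is NULL)  # a True True
```

If take_star signaled emptiness with BLANK instead, the second and third calls would be indistinguishable, which is the ambiguity the out-of-band NULL avoids.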
To summarize: the "null" state which is helpful in denoting a transient situation of "complete absence of value" becomes much more useful when you can confidently say it's not a value, and is never stored in blocks or other data structures. Then it can really indicate some out-of-band quality. Blanks ("nones") are too casually used as actual placeholders to serve these purposes.