NULL, BLANK!, VOID!: History Under Scrutiny

A lot of influencing factors have changed over time. Before making another major change, I want to run through the history of reasoning and look for any gaps, given modern understandings...in case some turn was taken that we would not take again given what kinds of things we know now.

To that end, I'm going to somewhat "sanitize" the history to make it easier to absorb. For example, NULL was initially called void...but for simplicity let's pretend it was just always called null. And there was no modern sense of TRY; it was called TO-VALUE. I'll try not to interrupt the flow by calling out such points inline.


Rebol historically had two "unit types": NONE! and UNSET!. Instances of both were values that could be put into blocks.

Ren-C started out with two parallel types: a simple renaming of NONE! called BLANK! (to match its updated appearance as _), and NULL (a purified form of UNSET! which could not be put in a block).

The original NULL was "ornery" like an UNSET!. It was neither true nor false (caused an error in conditionals)...and it was the same state used to poison words in the user context against misspellings and cause errors. With NULL being so mean, BLANK! was the preferred state for "soft failures"...it was still the outcome of a failed ANY or ALL, or a failed FIND.

But as the point was rigorous technical consistency, something like a failed SELECT would distinguish returning BLANK! from NULL.

>> select [a 10 b 20 c _] 'c
== _

>> select [a 10 b 20] 'c
; null

This provided an important leg to stand on for the operations that needed it (crucial to those trying to write trustworthy mezzanines). While casual users might not have cared, or might have been able to work around it, writing reliable usermode code was quite difficult without routines that were able to make this distinction. Too many things had to be expressed as natives; otherwise the "correct" usermode forms would be circuitous and overworked.
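
As a rough analogy only (this is Python, not Ren-C, and not part of the original discussion), the distinction can be modeled by letting Python's None stand in for NULL and a hypothetical BLANK sentinel stand in for BLANK!. The names select and BLANK below are illustrative assumptions, and the lookup is simplified compared to the real SELECT:

```python
# Hypothetical Python analogy: None plays the role of NULL (cannot be a
# "found" value here), BLANK plays the role of a reified BLANK! (_).
class Blank:
    """A placeholder that can be stored in a list, like BLANK!."""

    def __repr__(self):
        return "_"

    def __bool__(self):
        return False  # falsey, as BLANK! is described in the thread


BLANK = Blank()


def select(series, key):
    """Return the item following `key`, or None if `key` is not found.

    Simplified: scans every position and compares with ==.
    """
    for i in range(len(series) - 1):
        if series[i] == key:
            return series[i + 1]
    return None


found_blank = select(["a", 10, "b", 20, "c", BLANK], "c")
not_found = select(["a", 10, "b", 20], "c")

print(found_blank)  # the key was there, and its value was a blank
print(not_found)    # the key was not there at all
```

The point of the sketch is only that the caller can tell the two outcomes apart (`found_blank is BLANK` versus `not_found is None`) without any extra lookup.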

Along with this, failed conditionals were made to return NULL too. This provided a well-reasoned form of the pleasing behavior that made it into my non-negotiables list:

>> compose [<a> (if false [<b>]) <c>]
== [<a> <c>]

There was no ambiguity as there was in Rebol2 (or as there continues to be today in Red). They don't do my "non-negotiable" behavior, and have no way to put an UNSET! value into a block with COMPOSE...even though it's a legitimate desire, and you will get one with REDUCE:

red>> data: compose/only [(none) (do []) 1020]
== [none 1020]

red>> data: reduce [(none) (do []) 1020]
== [none unset 1020]

red>> compose [<a> (if false [<b>]) <c>]
== [<a> none <c>]  ; as with Rebol2, R3-Alpha, etc.
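
The difference between a vanishing non-value and a reified placeholder can be sketched in Python, purely as an analogy. Here None stands in for NULL, the string "_" stands in for a blank, and zero-argument callables stand in for the parenthesized groups; the compose function below is a hypothetical model that ignores splicing and everything else the real COMPOSE does:

```python
# Hypothetical model: callables play the role of (groups) in COMPOSE.
BLANK = "_"  # stand-in for a reified blank; an assumption of this sketch


def compose(template):
    """Evaluate callable slots; drop None results, keep everything else."""
    out = []
    for item in template:
        if callable(item):
            value = item()
            if value is None:
                continue  # a "null" result vanishes from the output
            out.append(value)
        else:
            out.append(item)
    return out


flag = False

# The failed conditional yields None, so the slot disappears.
print(compose(["<a>", lambda: "<b>" if flag else None, "<c>"]))

# A reified placeholder is still a value, so it stays in the result.
print(compose(["<a>", lambda: BLANK, "<c>"]))
```

Because only the non-value vanishes, there is still a way to ask for a literal placeholder in the output, which is the ambiguity the Red transcript above runs into.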

NULL's Prickliness Runs Up Against a Growing Popularity

As mentioned: by not being able to be put in blocks, NULL became the clear "right" mechanical answer to things like a failed SELECT. But it caused some friction:

>> if x: select [a 10 b 20] 'c [print "found C"]
** Error: cannot assign x with null (use SET/ANY)

Even if you had been able to assign null variables with plain SET-WORD! in those days, it would not be conditionally true nor false. You'd have to write something like if value? x: select ... or if x: try select ... or if try x: select ...

The rise of the coolness of ELSE also made it tempting to use NULL in more and more places. Those sites where BLANK! had seemed "good enough" since they didn't technically need to distinguish "absence of value" were not working with ELSE: ANY, ALL, FIND. Attempts to reason about why or how ELSE could respond to BLANK! in these cases fell apart--and not for lack of trying. This gave way to the idea of a universal protocol of "soft failure" being the returning of NULL.

NULL was seeming less like the pure form of UNSET!, but more like the pure form of NONE!. Its role in the API as actually translating to C's NULL (pointer value 0) became a critical design point.

The writing seemed to be on the wall that this non-valued state had to become conditionally false. Otherwise it would break every piece of historical code like:

if any [...] [...]
if find [...] ... [...]
if all [...] [...]

Code would start developing tics like:

if try any [...] [...]
if value? find [...] ... [...]
if ? all [...] [...]  ; one proposed synonym for VALUE?, still a "tic"

NULL Becomes Falsey, VOID! Becomes the New Meanie

It felt like Ren-C was offering rigor for those who wanted it. You now knew when something really didn't select an item out of a block. All functions had a way of returning something that really couldn't mean any value you'd want to append anywhere.

But with a popular NULL that required GET-WORD! access to read it, the illusion of greater safety was starting to slip. :my-mispeled-variable was NULL too.

A path of reasoning led to the argument of resurrecting an ANY-VALUE! type which was not NULL that would be prickly. That's been called VOID!. It's more or less like an UNSET!, but terminologically makes more sense. I put it thusly:

"I like VOID because it's the return value of functions that aren't supposed to have any usable result. But more accurately, think of a paper check that has VOID written on it. You can still hand it to people, and you can get in trouble if you try to cash it. But you can't cause a problem by trying to cash a NULL check because there's no paper and nothing written on it--it is the literal absence of a check."

VOID! became the preferred choice for when branches accidentally became null (voidification). It replaced the previous "blankification", which made it harder to notice that anything unexpected had happened, because blanks were so innocuous:

>> if false [null]
; null

>> if true [null]
== #[void]
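
For readers who find the mechanics easier to follow in another language, here is a small Python model of voidification. It is an analogy under stated assumptions, not Ren-C: None stands in for NULL, a hypothetical VOID sentinel stands in for VOID!, and if_ and else_ are illustrative names for the conditional and the ELSE reaction:

```python
# Hypothetical model of "voidification": a taken branch never yields None.
class Void:
    """A prickly value: it exists, but refuses to be tested for truth."""

    def __repr__(self):
        return "#[void]"

    def __bool__(self):
        raise TypeError("VOID is neither true nor false")


VOID = Void()


def if_(condition, branch):
    """Run `branch` (a zero-argument callable) when `condition` is truthy.

    None means "no branch ran". A branch that itself produces None is
    converted to VOID so the two cases stay distinguishable.
    """
    if not condition:
        return None
    result = branch()
    return VOID if result is None else result


def else_(left, branch):
    """Run `branch` only when `left` signals that no branch was taken."""
    return branch() if left is None else left


print(if_(False, lambda: None))  # None: the branch did not run
print(if_(True, lambda: None))   # #[void]: the branch ran, result distorted
print(else_(if_(True, lambda: None), lambda: "else ran"))
```

In this model the ELSE stand-in does not fire for the taken branch, even though that branch produced nothing useful, and that property is what the distortion buys.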

Where This Stands Today

If the latest proposed change is adopted, reading NULLs out of a plain WORD! does not cause an error. VOID! keeps more or less the same things error-worthy that were considered error-worthy in Rebol2. And we come to a point where the "soft failures" of NULL are nearly as quiet as the "soft failure" of NONE! was historically.

"All we get" for Ren-C's trouble is more rigor for those who want it.

But that's a pretty pessimistic way of looking at it, because the rigor was the original motivation: to be able to write short, clear usermode routines where you could be confident they were as correct as you could make a native. And there are actually a lot of ways in which nulls raise errors when passed as parameters to things that don't take NULL (but which have behaviors for BLANK!). Yet under the proposal all refinements are implicitly optional arguments, hence NULL says nevermind. I feel good if that's what you meant, but uneasy if it wasn't.

But if NULLs have been "defanged" so much that you can assign and use them, we need some short way of throwing in errors. I've talked about an inverse parallel to ENSURE called NON which could do things like NON INTEGER! or NON NULL:

>> append x non null select [a 10 b 20] 'c
** Error: NON did not expect a NULL but received one
** Near: non null ** select [a 10 ... (blah blah

I've started a separate thread to talk about a shorter name for "non null". But this is kind of my best idea so far of how to address a world where soft failures are accepted by every refinement (and some arguments). If your failure is a logic failure--e.g. something semantically really wrong happened, as opposed to a known result of the routine (FIND not being able to find something is a known result)--then don't use NULL. It's too nice now. Use VOID! or raise an error.
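
To make the ENSURE/NON pairing concrete, here is a hedged Python sketch. It is not the Ren-C implementation: None stands in for NULL, Python classes stand in for datatypes, and the function names, signatures, and error wording are assumptions made for illustration:

```python
# Hypothetical pass-through checks modeled on the ENSURE / NON idea.
def ensure(type_, value):
    """Return `value` unchanged if it is an instance of `type_`, else raise."""
    if not isinstance(value, type_):
        raise TypeError(
            f"ENSURE expected {type_.__name__} but received {value!r}"
        )
    return value


def non(unwanted, value):
    """Return `value` unchanged unless it matches `unwanted`.

    `unwanted` is either None (meaning "reject the null stand-in")
    or a type (meaning "reject instances of that type").
    """
    if unwanted is None:
        rejected = value is None
    else:
        rejected = isinstance(value, unwanted)
    if rejected:
        label = "NULL" if unwanted is None else unwanted.__name__
        raise ValueError(f"NON did not expect a {label} but received one")
    return value


lookup = {"a": 10, "b": 20}

items = []
items.append(non(None, lookup.get("a")))  # 10 passes straight through
print(items)

try:
    items.append(non(None, lookup.get("c")))  # missing key gives None
except ValueError as error:
    print(error)
```

The design point is that both checks return their argument, so they can be dropped into the middle of an expression at the exact spot where a quiet soft failure would be a bug.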

"Conclusion"

I'll probably run with the null change a while before pushing it on others, to see what other thoughts it provokes. But really--any comments help!


As many objections to null stem from the foo: select [] 'word convention, what if foo: null was shorthand for voiding a word? That way you still get some meanie behaviour when you subsequently try to use foo.

Creative ideas always welcome. But bending mechanics in such a way has pretty serious costs. I tried what I thought was a cool zany idea involving blanks turning into nulls...and quickly realized that any benefits it had were eclipsed by pulling the rug out from under people.

While it may seem that writing a REPL is "easy", it's actually pretty complex because it deals in the full bandwidth of the language. And you realize that if you can't say:

 value: do usercode
 print-value :value

Then it's a pretty big challenge to your system. I feel it's a slippery slope if you can't get an accurate reading out of that, and it would sacrifice a lot of the "must...be...accurate" that NULL brings to the table.

I'm still not settled in my understanding of this concept. The reasoning appears to hold until I try it in practice.

If I can try to distill it another way, it's as if BLANK! has split into two, with a measure of VOIDing a positive branch that returns NULL for whatever reason (e.g. if true [null]). The purpose of the latter is to enable ELSE/THEN.

I recall one other reason for a NULL/BLANK split not mentioned was using BLANK! as MAP! values without the associated key disappearing.

I'd be curious if we could further enumerate the ways in which the NULL/BLANK distinction offers more rigour as on the other hand, the finessing of these values does introduce additional complexity.

I should say that as of now, I don't have a favourable opinion of the ELSE/THEN idiom (on a stylistic/comprehension basis) and don't think they're worth the extra complexity alone; I'm looking to understand other circumstances where the distinction is critical. It may be too that I'm lazy and don't want to explicitly do the NULL <-> BLANK conversions where BLANK was implied before, e.g. value: any [first case | second case] ; or blank. That is to say, I'm willing to check my own long-hewn biases if that is indeed a bad practice.

It'd be conceivable in a VOID/BLANK-only world to imagine select [] 'c returning VOID as an indicator of no-match, similarly my-map/non-existent-key. In a world where val: void does not bomb, then val: my-map/non-existent-key wouldn't in itself be a showstopper.

I accept being skeptical of voidification, which seems to me the crux of your complaint. I'm willing to look at concrete challenges and imaginative mechanics which may make this less painful.

However...

"I'd be curious if we could further enumerate the ways in which the NULL/BLANK distinction offers more rigour as on the other hand, the finessing of these values does introduce additional complexity."

NULL is simply non-negotiable. In contrast to my eagerness to delve into possible better ideas for voidification... I'm not particularly enthusiastic at this moment to sink more time into writing up an extended defense of NULL, and feel that at some point you have to take my word for something that I've written about for years.

But... :man_shrugging: It started back with feeling I must be able to say compose [<a> (if false [<b>]) <c>] and have it unambiguously give back [<a> <c>] without concern that there is some meaningful variant I'm missing which I might have wanted to do. The failed control structure must return a non-thing, otherwise this is in question. We've discussed SELECT needing an unambiguous signal for "not there" vs "was there and was nothing". Maps not having a value, etc. etc.

NULL is a requirement at an API level, and a semantic level, and countless features will not work practically without it. Its existence is simply not possible to revisit. If you like features in Ren-C that are cool, then whether you know it or not, you really like NULL, a lot.

OTOH... we could question whether BLANK! itself is truly necessary. Does there really need to be a falsey placeholder type? Could people use the tag <nothing> or an empty string, etc.? But if such a beast does exist--and I think you'd want it to--there are interesting options for how it can be used as a sort of "reified twin" to NULL. Right now the only real way in which the system treats it specially (beyond its falsey status) is the convention of letting it signal opting out of arguments, and getting a nice form of error locality in chains.

But back to the real issue...

Your valid complaint here is voidification, and whether it's worth the cost of distorting the output of control constructs for a feature you dislike.

Firstly, understand this feature goes beyond enabling ELSE and THEN. It's an answer to the question of "can someone from the outside of a control construct unambiguously know if it took a branch or not?" and "can someone from the outside of a control construct unambiguously know if it did a BREAK or not?"

Being able to glean this knowledge from the outside is important even for those who do not care to use ELSE and THEN. Imagine you are writing a wrapper for a loop construct out of two other loops, and you want that construct to be able to implement BREAK. With the NULL being a signal that unambiguously means BREAK, you can do so.
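
A Python model of that wrapper scenario follows, offered as an analogy rather than as Ren-C code. The names Break, for_each, and for_both are hypothetical; None stands in for the NULL that unambiguously means BREAK, and the choice to return VOID from a loop whose body never runs is an arbitrary assumption of this sketch:

```python
# Hypothetical model: a loop returns None if and only if it was broken out of.
class Break(Exception):
    """Raised by a loop body to request an early exit (stand-in for BREAK)."""


VOID = "#[void]"  # simple sentinel for a "voidified" result


def for_each(items, body):
    """Call `body(item)` for each item; None signals that a break occurred."""
    result = VOID  # assumption of this sketch for "body never ran"
    for item in items:
        try:
            result = body(item)
        except Break:
            return None
        if result is None:
            result = VOID  # keep None reserved for the break signal
    return result


def for_both(first, second, body):
    """A loop built out of two loops that still honors a break in either."""
    result = for_each(first, body)
    if result is None:
        return None  # the first loop broke, so skip the second entirely
    return for_each(second, body)


seen = []


def visit(item):
    if item == 3:
        raise Break()
    seen.append(item)
    return item


print(for_both([1, 2, 3], [4, 5], visit))  # the break in the first loop propagates
print(seen)
```

The wrapper never needs to know how the inner loops implement breaking. It only relies on the one reserved result, which is why the body's own None has to be distorted into something else.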

We could make NULL branches preserve falseyness but not nullness via "blankification" instead of "voidification". But I think voidification gave you more of an alert that you hit a distorted case and have to use another method.

As we're talking about falsey situations, note that falsey cases have always faced "distortion":

rebol2>> all [true true false]
== none

You lost the false, though not the falsey-ness. So blankification has that going for it.

I'd really like to look at concrete examples--as you know that I do not like it when there's not a "good" answer to a complaint. I'm really thinking about this idea of being able to get undistorted branches out:

 >> case [true [null]]
 ; void

 >> case [true :[null]]
 ; null

It seems like a little bit to waste for GET-BLOCK!, which I had hoped to have mean REDUCE. But that is only a shorthand, and not actually that helpful...since it would only work if the block you wanted to reduce was literal. This may be its higher calling.

Although the existence of such an escape would prevent wrapping (e.g. the wrapping of control constructs or loops to mirror out whether branches were taken or whether it hit a BREAK). Maybe this is acceptable in the sense of "not everything is going to work easily"...wrappers that want to support GET-BLOCK! branches will need to be more clever. Or maybe you simply understand that if you use a GET-BLOCK!, and a null is given, you are effectively un-signaling that the branch was taken...even though it was. But that's the meaning you understand you intend.

But overall: My own instincts are that it's probably on balance done mostly right, and that by merely converting code you're only seeing the needed workarounds as burdensome without appreciating the benefits.

I guess this is a big problem for me when it comes to it: handling blanks becomes burdensome, and having lost their utility, they seem only to perform a placeholder role (a proxy for NULL within data structures, if you will) that has to be managed.

I need to give your response better consideration, but this part still jumps out at me

 >> flag: false

 >> print ["To me, this is" (if not flag ["not"]) "negotiable."]
 To me, this is not negotiable.

 >> compose [This is also (if not flag ['not]) negotiable.]
 == [This is also not negotiable.]

When you go down the road of BLANK!/("NONE!") trying to serve two masters...as both thing-and-not-thing...people will complain if there's no way to literally compose blanks into stuff.

When we talk about complexity that needs to be managed, you're going to have to manage it somewhere. The NULL/BLANK! distinction provides leverage in the implementation and puts the responsibility on you to know which you mean, at the right time. Otherwise you're just pushing the complexity into less obvious places.

It's not necessarily intrinsic that BLANK! needs to be falsey, if your complaint is about not preserving the value precisely...

>> any [null first [_]]
== _

>> any [null second [_]]
; null

But historical Rebol has a LOGIC! type, and a false, whose intent would be lost.

>> any [null first [#[false]]]
; null

That's just something you're used to. But it makes it seem that inhibiting BLANK!'s use in compatibility layers is not worth it...you could have just used <null> if you wanted something truthy and placeholdery.

We have to be looking at more than just one annoyance. The number of creative solutions afforded is high, and one has to appreciate the benefits in a holistic sense to be willing to think it's worth it to rethink how old code is written. But the new way for that old code might be even better. Can't know unless I'm reading it too.

I will point out this debate rages on even as we speak--today, in Red Gitter.

Because they lack invisibles, they face an uphill battle with trying to make return results that opt out. So an already poor idea like making UNSET! truthy leads to questions of why things wouldn't return UNSET! so they could just opt out of an ANY, while they would end an ALL. Generic ELIDE becomes a perfect tool for this once you get comfortable with it.

Because they lack a NULL state, they only have the option of a reified UNSET!...which due to its reified nature, may be seen when enumerating blocks. Writing mechanically correct code is nigh impossible.

Etc.


I get that; that's why I'm asking. I have THEN/ELSE, the vanishing NULL (which I get, and I appreciate that going about it from the legacy BLANK-only approach might get messy), some other items (SELECT BLANK vs. nothing), and the fidelity of CASE/SWITCH/IF/etc. However, I see the BLANK/NULL split as messy too, and I'm trying to justify all the ways it is necessary, because I'm having a difficult time balancing the pros and cons of each and wondering if one set of inconveniences does indeed outweigh the other. I'm as sincere as I can consciously be in approaching this with an open mind, and apologies if this has indeed been enumerated elsewhere.

I've read this thread over and I feel I get the basic concepts. It just starts to hit me when I start to actually use them.


A post was split to a new topic: How to Subvert Voidification?