Shades of Distinction In Non-Valued Intents

UPDATE 2022: Rewritten completely to explain the modern answers for the topic that inbound links to this thread wanted to talk about.


Rebol2/R3-Alpha/Red Have Two Kinds of Nothing (both reified)

Historical Redbol gives you two main choices for "nothingness"...#[none] and #[unset]... both of which can be found either in variables, or as values in blocks:

rebol2>> block: reduce [none print "print returns unset"]
print returns unset
== [none unset]  ; misleadingly renders as WORD!s

rebol2>> type? first block
== none!

rebol2>> type? second block
== unset!

Using #[none] has the advantage of being "friendly" on access via word, allowing you to write things like:

rebol2>> var: none

rebol2>> either var [print "do something with var"] [print "do something else"]
do something else

But when var contained an #[unset], you'd get an error instead:

rebol2>> unset 'var

rebol2>> either var [print "do something with var"] [print "do something else"]
** Script Error: var has no value

So instead of using var directly, you had to do something more circuitous and pass the word var into a special test routine (morally equivalent to today's set? 'var).

Hence #[none] was reached for frequently out of convenience. Yet this convenience came with a cost: it was very easy to accidentally append one to a block, even if its non-valued intent should have conveyed you might not have wanted to add anything at all.

But it's hard to say: sometimes you did want to add #[none] to a block, to serve as a placeholder.

Also, enumerating a block which contained #[unset] values was problematic: if you used something like a FOR-EACH, it would appear that the variable you were enumerating with was itself not set.

Early Ren-C Made Reified BLANK! and non-Valued NULL

One thing that bugged me was that there was no "pretty" representation for a non-valued state in a block... and that #[none] often thus displayed itself as the word none (seen in the example at the top of the post).

So the BLANK! datatype took the single underscore _.

>> second [a _]
== _

>> if blank? _ [print "yep, it's a blank"]
yep, it's a blank

>> if not _ [print "blank is also falsey"]
blank is also falsey

And critically, one of the first things I tried to do was rethink the #[unset] state into something that you'd never find in a block, and called it NULL (as well as made it correspond to C/Javascript null in the API):

>> second [a _]
== _

>> third [a _]
; null

Since NULL couldn't be found in a block, it wasn't ambiguous when you got NULL back from a block operation as to whether there was a "null in that position".

But it's still just two things:

  • blank! - A nothing you can put in a block

    • it was logically false
    • it was friendly via word access (no need for GET-WORD!)
  • null - A nothing you couldn't put in a block

    • it was also logically false
    • it was unfriendly via word access (need GET-WORD! for :VAR, or SET? 'VAR)

This put you in a difficult situation for your choices of emptiness when you were dealing with something like:

append block value  ; what nothing state should you use for value?

If you wanted to avoid accidentally appending blanks to arrays, you kind of wanted NULL so you'd get an error. But once you used NULL, you could not write the convenient if value [...] control structure.

Later Ren-C added a separate "ornery" non-Value State

A third state was added to be neither logically true nor false, and that would trigger an error on accessing a variable with it. (I'll whitewash history a bit and say this state was always called "NIHIL", and also always could not be put in blocks.)

This was the new state of unset variables:

>> unset 'x

>> x
** Error: X is an unset variable

>> get/any 'x
; nihil

>> if get/any 'x [print "Ornery!"]
** Error: nihil is neither logically true nor false

So NULL now represented a middle ground. It was something that was easy to test for being nothing (using IF) but that was impossible to accidentally put into a block.

This gave you three behaviors:

[1]  >> nihil-value
     ** Error: NIHIL-VALUE variable is unset

[2]  >> null-value
     ; null

     >> append [a b] null-value
     ** Error: APPEND does not allow adding NULL to blocks

[3]  >> blank-value
     == _

     >> append [a b] blank-value
     == [a b _]
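
One way to picture these three behaviors is a small model written in Python rather than Ren-C. This is only a sketch: the names `Nothing`, `fetch`, and `append` are hypothetical stand-ins invented for illustration, and the model covers just the access, truth-test, and APPEND rules listed above.

```python
# Hypothetical Python model of the three "nothing" states described above.
# All names here are illustrative; none of this is Ren-C source code.

class Nothing:
    """A named sentinel standing in for one kind of nothing."""

    def __init__(self, name, truth_test_ok):
        self.name = name
        self.truth_test_ok = truth_test_ok

    def __bool__(self):
        # NULL and BLANK! are falsey; NIHIL refuses to be tested at all.
        if not self.truth_test_ok:
            raise TypeError(f"{self.name} is neither logically true nor false")
        return False

    def __repr__(self):
        return self.name


NIHIL = Nothing("nihil", truth_test_ok=False)  # state of unset variables
NULL = Nothing("null", truth_test_ok=True)     # falsey, never in a block
BLANK = Nothing("_", truth_test_ok=True)       # falsey, allowed in a block


def fetch(variables, word):
    """Plain word access: errors on an unset (nihil) variable."""
    value = variables.get(word, NIHIL)
    if value is NIHIL:
        raise NameError(f"{word.upper()} is an unset variable")
    return value


def append(block, value):
    """Only reified values (such as BLANK!) may be added to a block."""
    if value is NULL or value is NIHIL:
        raise ValueError(
            f"APPEND does not allow adding {value.name.upper()} to blocks"
        )
    block.append(value)
    return block
```

In this model, NULL sits in the middle: `fetch` hands it back without complaint and it tests as false, yet `append` still refuses it.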

WORD! Isotopes Brought Infinite Non-Valued Choices

Eventually the NULL state became the isotopic status of the WORD! null, so a ~null~ isotope.

It joined ~true~ and ~false~ as being isotopes you could test for truthiness and falseyness. But if you were okay with getting an error on conditional testing, any other word could be used:

  config: ~initialize-system-not-called~

  initialize-system: func [
      {Let's say this function reads the config file}
  ][
      ...
      config: [...]
  ]

This usually causes a nice labeled message anytime someone tries to use CONFIG.

Going this route would create a pain point for anyone who expected to check whether a config had been initialized by testing if config [...]. So you have to consider whether that's what you want.

On the other hand, you should use the null isotope to initialize a variable that some later code may or may not assign, when you want to be able to test whether it did.

 directory: ~null~

 for-each [key val] config [
     if key = 'directory [
         if directory [
             fail ["Directory was already set by config:" directory]
         ]
         directory: val
     ]
 ]

VOID Provided a Clean "Opt-Out" Option

An unfortunate sacrifice that had been made in the design was that the "non-valued" status of NULL was chosen to raise attention to an error condition, rather than be an opportunity to opt-out of an APPEND:

>> append [a b] null-value
** Error: This is the choice that we went with

>> append [a b] null-value
== [a b]  ; would have been another possibility, but too accident prone

Some "strange" things were tried...such as making it so that appending a BLANK! was a no-op, and if you wanted to append a literal blank you had to append a quoted blank:

 >> append [a b] _
 == [a b]  ; hmmm.

 >> append [a b] quote _
 == [a b _]  ; hmmm.

(It wasn't that strange considering appending a BLOCK! would append its contents, and a quoted block was being tried as the way of specifying /ONLY. This line of thinking ultimately led to the designs for the isotopes that solve things like splicing intent, so it wasn't all for naught!)

When invisibles were first introduced, you couldn't "pass a void" to a function, because whenever the evaluator processed an argument which vanished, it would keep re-triggering until it found one that didn't.

>> append [a b] comment "hi" 1 + 2
== [a b 3]  ; old idea was that comment vanished

This seemed justified because there wasn't anything you could pass. Void was the absence of value, not something that a variable could take on.

But with a change in perspective, void became a state that variables were allowed to hold. Not only that, but it could be quoted too:

>> comment "hi"
; void

>> quote comment "hi"
== '

The re-triggering was removed, and I realized that void was the perfect choice for opting out of operations:

>> append [a b] comment "hi"
== [a b]

>> append comment "hi" [a b c]
== ~null~  ; isotope

As you see above, an operation can return null when it doesn't have a better answer to give back in the case of a no-op. This gives good error locality, since the null won't trigger another opting out unless you explicitly convert the null to a void with MAYBE.

>> append (append comment "hi" [a b c]) [d e f]
** Error: APPEND doesn't accept ~NULL~ isotope for the series argument

>> maybe null
; void

>> append (maybe append comment "hi" [a b c]) [d e f]
== ~null~  ; isotope

Void variables were deemed to be something it would be undesirable to have be "too friendly". So they were slated to be mean on WORD!-access, and instead MAYBE would be used with null variables.

This gives a (seemingly) complete picture

[1]  >> nihil-value
     ** Error: NIHIL-VALUE variable is unset

     >> append [a b] get/any 'nihil-value
     ** Error: APPEND does not allow adding ~ isotopes to blocks
      
[2]  >> void-value
     ** Error: VOID-VALUE is void (use GET-WORD! or GET/ANY)

     >> append [a b] :void-value
     == [a b]

[3]  >> null-value
     == ~null~  ; isotope

     >> append [a b] null-value
     ** Error: APPEND does not allow adding NULL to blocks

[3a] >> append [a b] maybe null-value
     == [a b]

[4]  >> blank-value
     == _

     >> append [a b] blank-value
     == [a b _]
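
The same picture can be restated as a rough Python model (not Ren-C). Everything named here is an assumption made for illustration: `State`, `fetch`, `get_any` (standing in for GET/ANY or a GET-WORD!), `maybe`, and `append` are hypothetical, and the model only encodes the rules in the summary above.

```python
# Hypothetical model of the four states and the APPEND / MAYBE rules above.
from enum import Enum


class State(Enum):
    NIHIL = "~"       # unset variable; errors on plain access and on APPEND
    VOID = "void"     # errors on plain access; APPEND treats it as a no-op
    NULL = "~null~"   # friendly on access; APPEND rejects it
    BLANK = "_"       # friendly on access; APPEND adds it as-is


def fetch(variables, word):
    """Plain WORD! access: nihil and void are both 'mean'."""
    value = variables[word]
    if value in (State.NIHIL, State.VOID):
        raise NameError(f"{word} is {value.name.lower()} (use get_any)")
    return value


def get_any(variables, word):
    """Stand-in for GET/ANY or a GET-WORD!: never errors."""
    return variables[word]


def maybe(value):
    """Convert null to void so the caller can opt out; pass others through."""
    return State.VOID if value is State.NULL else value


def append(block, value):
    if value is State.VOID:
        return block  # opt out: nothing is added
    if value in (State.NIHIL, State.NULL):
        raise ValueError(f"APPEND does not allow adding {value.name}")
    block.append(value)
    return block
```

Case [3a] is then just `append(block, maybe(fetch(variables, "null-value")))`: the null is accepted on access, MAYBE turns it into a void, and the append becomes a no-op.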

I've brought the above discussion up to date, and I think it paints a pretty clear picture of why things are the way they are.

But I realized while thinking about what the right default for ARRAY would be, that there are effectively three single-character reified non-values:

[1]  >> array 3
     == [_ _ _]  ; BLANK!s

[2]  >> array 3
     == [~ ~ ~]  ; Quasi-VOID!s

[3]  >> array 3
     == [' ' ']  ; Quoted-VOID!s

This Offers Us Some Nuance Even If A State Must Be Reified!

When it comes to the direct behavior of APPEND to a block, all these states have to work the same. In the era of isotopes, all reified values are appended as-is... it cannot (and should not) be any more complex:

>> append [a b] second [c _]
== [a b _]

>> append [a b] second [c ~]
== [a b ~]

>> append [a b] second [c ']
== [a b ']

>> append [a b] second [c [d e]]
== [a b [d e]]

But when we throw in an extra operation, we can imagine a difference. For instance, we could make BLANK! semantically equivalent to an empty array for the purposes of things like SPREAD or EMPTY?

>> spread second [c []]
== ~[]~  ; isotope

>> spread second [c _]
== ~[]~  ; isotope

>> append [a b] spread second [c _]
== [a b]

>> empty? second [c _]
== ~true~  ; isotope

...and then we'd say that if you tried to do such things with a quasi-void or a quoted-void, it would be an error:

>> spread second [c ~]
** Error: SPREAD does not accept QUASI-void arguments

>> empty? second [c ']
** Error: EMPTY? does not accept QUOTED-void arguments
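
As a sketch of the distinction being proposed (in Python, not Ren-C), the model below treats BLANK! as an empty array for SPREAD and EMPTY? while rejecting the other two reified states. The names `Splice`, `spread`, `is_empty`, and `append` are hypothetical, and plain strings stand in for the three single-character states.

```python
# Hypothetical model: BLANK! acts like an empty array for SPREAD and EMPTY?,
# while quasi-void and quoted-void are rejected by both.
from dataclasses import dataclass

BLANK = "_"        # assumption: plain strings stand in for the reified states
QUASI_VOID = "~"
QUOTED_VOID = "'"


@dataclass(frozen=True)
class Splice:
    """Stand-in for the isotope that SPREAD produces."""
    items: tuple


def spread(value):
    if value == BLANK:
        return Splice(())            # blank spreads like an empty array
    if isinstance(value, list):
        return Splice(tuple(value))
    raise TypeError(f"SPREAD does not accept {value!r}")


def is_empty(value):
    if value == BLANK:
        return True                  # blank counts as empty
    if isinstance(value, list):
        return len(value) == 0
    raise TypeError(f"EMPTY? does not accept {value!r}")


def append(block, value):
    if isinstance(value, Splice):
        block.extend(value.items)    # splices contribute their contents
    else:
        block.append(value)          # every reified value is added as-is
    return block
```

Note that `append` itself never inspects which reified state it was given; only the extra operation (`spread` or `is_empty`) distinguishes them, which is the nuance the examples above are pointing at.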

I think this suggests that ~ makes a better choice for the default value of ARRAY elements! We can't default to an isotope like the one representing unset variables, but it's the closest thing.

Ultimately it came to seem that having only the isotopes ~null~ and ~false~ be falsey was more valuable than having BLANK! be falsey. Simply being able to assume that anything you can find in an array is truthy offered more leverage. So blanks are now truthy, BUT they're empty.

What About Opting Out Of As-Is Appends, etc?

I mentioned that all items that can be found in a block have to act mechanically identically when it comes to TAKE-ing and APPEND-ing them. But what would XXX be if you wanted the following?

>> append [a b] xxx second [c [d e]]
== [a b [d e]]

>> append [a b] xxx second [c _]
== [a b _]

>> append [a b] xxx second [c ~]
** Error: Cannot append NIHIL (~ isotope) to a block

>> append [a b] xxx second [c ']
== [a b]

I suggest this operator be called DECAY. It would turn all quasiforms into their corresponding isotopes. Turning a QUOTED! void into a void would be a special case: that is the only quoted form it would change.

>> decay first [~asdf~]
== ~asdf~  ; isotope

>> decay first ['foo]
== 'foo

>> decay first [123]
== 123

>> decay first [']
; void

The reverse of this operator would be REIFY.
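
The DECAY rules described above can be sketched in Python (not Ren-C) as follows. The classes `Quasi`, `Isotope`, and `Quoted`, the use of `None` for void, and the small `append` helper are all assumptions made for illustration; REIFY is left out because only its name is given here.

```python
# Hypothetical model of the proposed DECAY operator.
from dataclasses import dataclass


@dataclass(frozen=True)
class Quasi:
    """Reified quasiform such as ~asdf~ (label "" models the bare ~)."""
    label: str


@dataclass(frozen=True)
class Isotope:
    """Non-reified counterpart of a quasiform."""
    label: str


@dataclass(frozen=True)
class Quoted:
    """Quoted value such as 'foo; Quoted(VOID) models the lone '."""
    inner: object


VOID = None  # assumption: Python's None stands in for void


def decay(value):
    if isinstance(value, Quasi):
        return Isotope(value.label)  # every quasiform becomes its isotope
    if isinstance(value, Quoted) and value.inner is VOID:
        return VOID                  # the one quoted form that decays
    return value                     # everything else passes through as-is


def append(block, value):
    if value is VOID:
        return block                 # void opts out of the append
    if isinstance(value, Isotope):
        raise ValueError("Cannot append an isotope to a block")
    block.append(value)
    return block
```

With this model, `append(block, decay(item))` reproduces the four XXX cases: a block and a blank are appended as-is, a quasi-void raises an error, and a quoted void leaves the block unchanged.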

What About FOR-EACH Variations?

I think another neat spin on how these can be treated differently is how FOR-EACH responds.

>> for-each x (second [c []]) [
       print "Loop never runs"
   ]
; void

>> for-each x (second [c _]) [
       print "Loop never runs"
   ]
; void

>> for-each x (second [c ']) [
       print "Loop never runs"
   ]
== ~null~  ; isotope (like a void in, null out... or if a BREAK was hit)

>> for-each x (second [c ~]) [
       print "Loop never runs"
   ]
** Error: FOR-EACH does not accept QUASI-VOID as its data argument

This is a bit more speculative, but I like the general idea that a quoted void could let you have a kind of nothing that gave you the "opt out" ability in places where it could... and quasi void could give you an error, while blank acts like an empty series. This seems to offer some nice invariants that reduce overall code you have to write handling edge cases.
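
To make the speculative FOR-EACH behavior concrete, here is a minimal Python model (not Ren-C). The `Marker` enum and the `for_each` function are hypothetical, BREAK is not modeled, and the loop result is assumed to be the last body result.

```python
# Hypothetical model of how FOR-EACH could respond to each kind of nothing.
from enum import Enum


class Marker(Enum):
    BLANK = "_"         # reified; acts like an empty series
    QUOTED_VOID = "'"   # reified; lets the loop opt out
    QUASI_VOID = "~"    # reified; rejected as data
    VOID = "void"       # result of a loop whose body never ran
    NULL = "~null~"     # result of opting out ("void in, null out")


def for_each(data, body):
    if data is Marker.QUASI_VOID:
        raise TypeError(
            "FOR-EACH does not accept QUASI-VOID as its data argument"
        )
    if data is Marker.QUOTED_VOID:
        return Marker.NULL   # void in, null out
    if data is Marker.BLANK:
        data = []            # blank behaves like an empty series
    result = Marker.VOID     # a loop that never runs yields void
    for item in data:
        result = body(item)  # otherwise the last body result is returned
    return result
```

The caller never has to special-case the three reified states before looping: the empty-series, opt-out, and error outcomes all fall out of the one call.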


I hope that this all plugs together, @rgchris. Can you review this thread and tell me if I've finally gotten it all to gel for you?
