BLANK! 2022: Revisiting The Datatype

Long, long ago there was a datatype called NONE. In historical Redbol, it had the bad habit of looking like a WORD!:

rebol2>> 'none
== none

rebol2>> none
== none  ; same in R3-Alpha and Red

But it wasn't a word:

rebol2>> type? 'none
== word!

rebol2>> type? none
== none!

It was a distinct type, which also happened to be falsey (while WORD!s are truthy):

rebol2>> if 'none [print "Truthy word!"]
Truthy word!

rebol2>> if none [print "Falsey none!"]
== none

And as we can see, NONE!s served the purpose of signaling "soft failures": branches that didn't run, or FINDs that didn't find, or SELECTs that didn't select... etc.

rebol2>> find "abcd" "z"
== none

rebol2>> select [a 10 b 20] 'c
== none

Ren-C Divided NONE!'s Roles Across NULL, VOID, and BLANK!

  • NULL - an "isotopic" state of WORD! that couldn't be put in BLOCK!s. Anywhere that NONE! would be used to signal a soft failure operation--like FIND or SELECT--would use ~null~.

    >> null
    == ~null~  ; isotope
    
    >> find "abcd" "z"
    == ~null~  ; isotope
    
    >> select [a 10 b 20] 'c
    == ~null~  ; isotope
    
    >> append [a b c] null
    ** Error: APPEND doesn't allow ~null~ isotope
    
  • VOID - the result of things that are effectively "no ops". Unlike nulls, they will vanish in-between expressions, and when functions like APPEND get them as an argument they are treated as no-ops:

    >> void
    ; void
    
    >> if null [print "Doesn't print as NULL is falsey"]
    ; void
    
    >> 1 + 2 if null [print "Voids disappear..."]
    == 3
    
    >> append [a b c] void
    == [a b c]
    

    (At one time void was also the state of unset variables, but that is now the isotopic state of void...currently called "nihil")

  • BLANK! was represented by a lone underscore ( _ ) and could be put into blocks:

    >> append [a b c] _
    == [a b c _]
    

    It retained the choice to be falsey:

    >> if _ [print "Won't print because blanks are falsey"]
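To make the three-way split concrete, here is a rough model of it in Python rather than Ren-C, so it can be run without any particular build. The names `NULL`, `VOID`, `BLANK` and this `append` are stand-ins invented for the sketch; only the three behaviors (error on null, no-op on void, store a blank) come from the examples above.

```python
class Blank:
    """Reified placeholder (rendered as _); it can live inside a block."""
    def __repr__(self):
        return "_"

BLANK = Blank()      # stand-in for BLANK!
NULL = None          # stand-in for ~null~: a soft-failure signal
VOID = object()      # stand-in for void: "no-op, opt out"

def append(block, value):
    """Model of APPEND as described above (not Ren-C's implementation)."""
    if value is NULL:
        raise TypeError("APPEND doesn't allow ~null~ isotope")
    if value is VOID:
        return block            # void argument: treated as a no-op
    block.append(value)         # anything reified, BLANK included, is stored
    return block

block = ["a", "b", "c"]
append(block, VOID)
print(block)   # ['a', 'b', 'c']
append(block, BLANK)
print(block)   # ['a', 'b', 'c', _]
```

The point of the sketch is only that null and void are handled by the function itself, while blank is ordinary data.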
    

Question One: Could BLANK! Just Be A WORD! ?

You might wonder if you could just say:

>> _: '_
== _

This would give you BLANK! as a WORD! that had the behavior of reducing to itself.

>> reduce [_ 1 + 2 _]
== [_ 3 _]

That could be just a default, and you could redefine it to anything you wanted. Generally speaking, people do like being able to define words as operators... and _ has historically been a WORD! (Ren-C allows you to use underscores internally to words, so it feels a little bad to take away one word).

But outside of being hardcoded as falsey, what makes BLANK! fairly "built in" is that in the path mechanics, it fills in the empty slots:

>> to path! [_ a]
== /a

>> as block! 'a//b//c
== [a _ b _ c]
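As a rough illustration of "blank fills in the empty slots", here is a small Python model of the two conversions above. The functions `path_to_block` and `block_to_path` are hypothetical helpers for this sketch, paths are modeled as plain strings, and BLANK! is modeled as the string `"_"` (which a real system would keep distinct from a word spelled that way).

```python
BLANK = "_"  # stand-in for the BLANK! value (an assumption of this sketch)

def path_to_block(path):
    """Model of `as block! 'a//b//c`: empty slots become BLANK."""
    return [part if part else BLANK for part in path.split("/")]

def block_to_path(block):
    """Model of `to path! [_ a]`: BLANK renders as an empty slot."""
    return "/".join("" if item == BLANK else item for item in block)

print(path_to_block("a//b//c"))   # ['a', '_', 'b', '_', 'c']
print(block_to_path(["_", "a"]))  # /a
```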

Alternately, we could accomplish a "reified nothing" with a quoted null:

>> to path! [' a]
== /a

>> as block! 'a//b//c
== [a ' b ' c]

But there's other places the blank is used, such as to opt-out of multi-returns.

>> [_ pos]: transcode "abc def"
; void

>> pos
== " def"

So freeing it up to be an arbitrary variable feels kind of wrong: if dialects like multi-return take it for their own purposes, you'd be unable to set it as a variable there.

This may be an argument for using something like a TAG! instead, so you're not worrying about overlapping with user variables:

[<_> pos]: transcode "abc def"

Similar arguments have led me to contemplate the dangers of using things like [a b ...]: in case someone has assigned a meaning to the ellipsis. That might be a good reason to keep ... as a TUPLE! instead of a WORD! exception, because no one could assign it.

I'm pretty sure we should keep _ reserved as a BLANK! datatype, not a WORD!. People can still give it arbitrary meanings in dialects, they just can't assign values to it as a variable... and they can't do that with # either or <a> so I can live with it. Taking it away from the word pool does more good than harm.

Question Two: Does BLANK! Still Need To Be Falsey?

My feeling is that having blank be falsey doesn't have all that much benefit. NULL does a better job of it, and what falseyness really does is mess with blank's usefulness as a placeholder:

>> append [a b c] all [1 < 2, 3 < 4, _]
== [a b c _]  ; would seem nice, but gives error today since ALL is NULL

Thinking of BLANK! as being "null-like" in terms of non-valuedness is generally a hassle. It makes you wonder about whether something like DEFAULT should think of it as being assigned or not:

>> item: _

>> item: default [1 + 2]
== ???

In practice, I prefer the truly non-valued NULL being the only case that DEFAULT overwrites. This is because NULL is far more useful than BLANK! when it comes to representing something that you think of as "not being assigned"... as you'll get errors when you try to use it in places (e.g. in APPEND). Trying to use BLANK! to represent nothingness invariably leads to stray appearances in blocks (Shixin wrote a lot of code to try to filter them out in Rebmake, prior to it being switched to NULLs).
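The preference stated above can be sketched as a tiny Python model. This is not how DEFAULT is implemented; `default`, `NULL` and `BLANK` are hypothetical stand-ins, and the only rule encoded is "overwrite when null, leave any reified value (blank included) alone".

```python
NULL = None    # stand-in for the non-valued NULL
BLANK = "_"    # stand-in for BLANK!, an ordinary reified value

def default(current, compute):
    """Run `compute` only if the variable is NULL; BLANK counts as assigned."""
    return compute() if current is NULL else current

print(default(NULL, lambda: 1 + 2))    # 3
print(default(BLANK, lambda: 1 + 2))   # _
```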

Also, the asymmetry between BLANK! and NULL was part of a scheme to try and solve what Redbols called "NONE! propagation":

>> second null
** Error: SECOND doesn't take NULL

>> try null
== _

>> second try null
== null

We still want this general concept, but the new idea is that it's VOID which opts out cleanly from these operations, and MAYBE is the operator that produces them.
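Here is a hedged Python model of that newer idea: a function errors on null, but a void argument opts out and gives null back. `maybe` and `second` below are sketches named after the Ren-C operations, not their implementations, and `NULL`/`VOID` are stand-ins invented for the example.

```python
NULL = None       # stand-in for ~null~
VOID = object()   # stand-in for void

def maybe(value):
    """Model of MAYBE: turn a NULL into a VOID, pass anything else through."""
    return VOID if value is NULL else value

def second(series):
    """Model of SECOND under the void-opts-out convention."""
    if series is NULL:
        raise TypeError("SECOND doesn't take NULL")
    if series is VOID:
        return NULL             # void input opts out; the result is NULL
    return series[1]

print(second(["a", "b", "c"]))   # b
print(second(maybe(NULL)))       # None
```

The caller has to write `maybe` explicitly, so a null never slips through an operation by accident.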

This makes more sense, and I think it bolsters the argument that BLANK! is less of a falsey-NULL relative...but more of a placeholder value. I've said "blanks are to blocks what space is to strings". And space is truthy:

>> if second "a b" [print "Space is truthy"]
Space is truthy

>> if second [a _ b] [print "So why shouldn't blank be truthy?"]
???

So I Suggest The Removal of BLANK! From Being Falsey. This creates some incompatibility in Redbol (which has been using NONE! as a blank substitute). But it's something that can be worked around.

Hm.

We lose something here, which is the visual pleasingness of blanks:

settings: make object! [
    alpha: "something"
    beta: _
    gamma: _
    delta: "something else"
    zeta: "yet another thing"
]

If you have to go to the WORD! for NULL... it's not worse than Rebol2/Red's NONE, but...

settings: make object! [
    alpha: "something"
    beta: null
    gamma: null
    delta: "something else"
    zeta: "yet another thing"
]

...but...I don't care for that as much.

You can use a quoted null, written as a single apostrophe, but that's really slight and feels incomplete (beyond the already kind-of-incomplete feeling that quoted values in general might give people):

settings: make object! [
    alpha: "something"
    beta: '
    gamma: '
    delta: "something else"
    zeta: "yet another thing"
]

Your code editor could stylize that to make it more apparent. And maybe we could say that's what you should do if you're the type to not want to type out NULL.

Question Three: Should BLANK! Evaluate To NULL ?

We could say that you can't QUOTE a NULL, only ^META it to a BLANK!.

>> quote void
** Error: You can't quote voids.

>> quote null
** Error: You can't quote nulls.

>> meta void
== ~

>> meta null
== _

This would mean BLANK! would evaluate to NULL:

>> _
; null

It would look better in object renderings.

>> make object! [x: "abc" y: null z: "def"]
== make object! [
     x: "abc"
     y: _
     z: "def"
]

People looking for placeholders that don't reduce would have to use something else... but we have another candidate with the # which might be even better. It acts as the NUL character:

>> append #{FFFF} #
== #{FFFF00}

It wouldn't be the first time # was thought of as a "none" representation, R3-Alpha did that...but then didn't use it as the rendering:

r3-alpha>> #   
== none

I've always thought the single apostrophe was a bit slight. So I like that aspect.

Giving BLANK! an evaluator behavior to produce nulls messes with its applications in things like UNSPACED and DELIMIT, which would be unfortunate.

That's not to say they couldn't be dialects and subvert the evaluator...but if people got used to using them for making nulls everywhere then they might be surprised to find another behavior.

It could suggest switching to # for it.

>> unspaced ["A" # # "B"]
== "A  B"

The problem being that in the pattern of meaning, the # is codepoint 0 (because it has no string material). You can't put that in strings, so right now that's an error.

There's periods... empty 2-element tuples:

>> unspaced ["A" . . "B"]
== "A  B"

But I imagine that's something we'd want to make a WORD! exception, like slash, so that people can define it as some kind of operator.

You could quote the blanks:

>> unspaced ["A" '_ '_ "B"]
== "A  B"

That's... not what I had in mind, but you have to delimit everything else to use it literally:

>> unspaced ['A '_ '_ 'B]
== "A  B"

The greater good of the system for visual appeal of NULL may mean making sacrifices on this BLANK! for spaces issue.

Note: It ALMOST gets us to the FOR-BOTH harmony...

If BLANK! stayed falsey, then looking back at this:

for-both: lambda ['var blk1 blk2 body] [
    unmeta all [
        meta for-each (var) blk1 body
        meta for-each (var) blk2 body
    ]
]

It's just one little teeny bit off... it produces a QUASI!-null for void, which does not vanish.

Before, I said the WORD! form of META was distinct because it did not meta voids, but passed them through:

>> meta void
; void

>> ^ void
== ~

>> meta* void
== ~

So ^ was actually equivalent to META*. It's a bit of a bend.

I worked up a test for this and it looks nice, reduces noise a bit when looking at lists of things where many of them are null:

 >> make object! [x: null]
 == make object! [x: ~null~]

 vs.

 >> make object! [x: _]
 == make object! [x: _]  ; special rendering rule molds meta-null as _ not ~null~

 >> make object! [x: null]
 == make object! [x: _]

Yet it means that we'll have to use # as a non-reducing placeholder, or find some other solution. And I'd have to keep thinking about making my space (character) dreams come true.

Because I've only just tried this, I haven't fully absorbed what the downsides might be. I'll look at it some more.

There's a lot to ponder here. I think on the one hand it's important to explore all of the possibilities, on the other it seems to be getting awfully convoluted and lacking a comprehensive narrative.

I'm not up to speed with much of what has changed in this realm for some time, so I apologise if this glosses over some since-settled items, though judging by this post, there's much still unsettled.

For me (using the family name) Rebol's first obligation is to represent data, both in language and in the way the language is interpreted in memory. Specifically, BLANK! and its underscore literal is a huge win (this is from me, the ultra-conservative sceptic) in representing positive nothingness: that a thing exists but lacks assignation, as in [name: "Thing" link: _]. Despite that positivity, I do think that as it represents the known absence of a value in data, it should be falsey in the general flow, since data should primarily determine that flow.

What it becomes in a dialect or within the general flow as distinct from NULL is of lesser importance as I see it. If NULL is the evaluator's ultimate representation of nothingness, then there should be a way to access that in internal dialects, such as SET-BLOCK! or PATH! and the like or it is not really fulfilling its role.

I have this sense that the BLANK-NULL-VOID-ERROR story has too many actors with overlapping roles. I don't have anything tangible to back that up with at this time.

Well, that's something.

Hence you agree with "Taking underscore away from the word pool does more good than harm." This seems to have reached a consensus.

But also...it is now allowed to enclose weird words in vertical bars as in Lisp, so you can have a WORD! whose spelling is underscore:

>> w: |_|
== |_|

>> type of w
== #[datatype! word!]

>> to text! w
== "_"

(An important design deviation from Lisp is that leading and trailing spaces must be escaped... which biases | and || and ||| etc. to being WORD!s of length 1, 2, and 3 respectively...and prohibits absolutely empty WORD!s, though we might have some escaping for that if it turns out to be some critical compatibility point with JSON or something.)

I'm trying to make a general engine... so it will be possible to do Redbol compatibility, and if you want different rules you should be able to have them. But the core as I see it is the "default" distribution which should be based on what is reasonably determined the "best" and most coherent.

The way I see it, the substrate simply isn't set up to do this effectively:

>> if select [alpha: true beta: false] 'beta [
       print "This will print, because FALSE is a WORD!"
   ]
This will print, because FALSE is a WORD!

I suggest that whatever peace is made with the above could be made with BLANK! more easily if it evaluated to NULL (and presumably would give NULL back from GET as well?), e.g.

>> x: _
; null

>> get '_
; null

>> if get select [alpha: true beta: _] 'beta [
       print "GET of BLANK! could return null"
   ]
; void

(Note: This turns out to be complicated in a general sense, but I think it's on the right track. Consider GET as a placeholder for "the test that makes sense in your usage context".)

Then you have coverage for FALSE and anything you assign false to. And I think the isotopic ~true~ and ~false~ story looks like it may have the right stuff. (Note that GET does not do evaluations by default...GROUP!s or groups in tuples...so you need to ask it to, hence this is somewhat safe.)

So what a lone IF comes down to is: is there an ANY-VALUE! available to answer whatever question I asked? A ~false~ isotope walks the line of being able to say "no" and a ~true~ isotope says "yes", but they sit in the netherspace of not being reified values...where all other isotopic WORD!s cause an error. This shapes the space so that tests are actually useful when you have either a full ANY-VALUE! or a NULL.

Sidenote: If you want to not error on a failed select, there's now a more elegant answer...and it depends on VOID.

>> get select [alpha: true beta: _] 'gamma
** Error: GET doesn't accept NULL for its VAR argument (use MAYBE if intended)

>> maybe select [alpha: true beta: _] 'gamma
; void

>> get maybe select [alpha: true beta: _] 'gamma
; null

It is a work in progress...but...I believe there's plenty of evidence that things are pointing toward a solid outcome.

The proof comes from the code: the contrast between what the approaches without these distinctions can't do (and how catastrophically they regularly fall down) vs. what the approaches with them can do cleanly and correctly.

UPARSE is a giant piece of evidence, but I think there's quite a lot more.

Note: Some of the original motivators for the idea have been resolved, making it harder to argue for. I reworked this thread to try and keep it relevant, but when it started the meta state of null was a single apostrophe:

>> make object! [x: null]
== make object! [x: ']

That looks pretty slight for something I felt was as "important" as null. New thinking is ~null~ is conveyed by an isotopic word. This gives us a convenient and relatively readable meta-representation for null isotopes... as the quasiform of null.

>> make object! [x: null]
== make object! [
    x: ~null~
]

I tried this after feeling success from using isotopic WORD! forms for signaling ~true~ and ~false~. Being isotopic, none of these signaling words can be put in blocks directly... but must be transformed first. (That's a new thing for a "logic!" representation, but not a new thing for nulls.)

Having them be word!s in their cell guts may seem odd vs. having a "distinct type" with a lexical form. But I think this squares the circle with how words were often used in lieu of a lexical type in historical Redbol. It's generalized and something that we could even imagine seeing internationalized. It all comes back to words.

That enigmatic and slight single apostrophe is now taken as the meta form of void:

>> '
; void

>> 1 + 2 '
== 3

While ~null~ is definitely more verbose than using _ to null out variables, I'm not entirely sure that nulling out variables needs, in the scheme of things, to be any more succinct than setting a variable to false. The uninitialized state of ~ is still succinct:

obj: make object! [
    foo: ~
    baz: null
    bar: ~
]

Every experiment leads to a new thought...and trying blank led to seeing nulls as an isotopic state of blank, which led to the thought of them being the isotopic state of the actual WORD! "null". So in that sense it was a success.

Having a distinct BLANK! type which isn't a WORD! still seems like it offers us value... in being usable as a placeholder for where a variable would be, and a dialect part that can't be reassigned (so it can act as meaning "space" for instance, or an empty slot in block data). So I think it's going to go back to serving that role.

I reverted the blank-evaluates-to-null change and examined all the callsites. I definitely have some mixed feelings.

Turning _ back into a WORD! (vs. a separate BLANK! type) is the only thing that I might see as a reasonable alternative: it would be set to null by default, but settable to anything else in contexts where you wanted that. You could then take it for granted that (x: _) would set X to null just as much as you would that (x: null) would.

It's tempting to try picking something else to represent nothingness. What if you could opt-out of a FOR-EACH variable using a lone apostrophe?

>> for-each ' [1 2 3] [print "no variable"]
no variable
no variable
no variable

It's more slight than for-each _ [1 2 3], but not awful. And since voids are used to opt out in slots, it actually dovetails nicely with allowing the evaluated forms to opt out via void:

for-each (if false ['x]) [1 2 3] [print "no variable"]

That seems to make perfect sense. If your expression produces no iterative variables to bind in the body, assume that you didn't need them. The alternative of (if false ['x] else [_]) or even (if false 'x else '_) seems like busywork.

This even would mean you could use () as a less slight alternative, if you didn't like the ' syntax:

for-each () [1 2 3] [print "no variable"]  ; actually pretty nice!

The premise may hold for multi-returns also:

>> [a ' c]: pack [1 2 3]
== 1

>> c
== 3

>> [a () c]: pack [4 5 6]
== 4

>> c
== 6

I'm not sure it looks worse than [a _ c]: pack [...], and would open up:

>> [a _ c]: pack [8 9 10]
== 8

>> _
== 9

>> c
== 10

One potential qualm: a quoted word vs. a regular word is seen as a different instruction by both FOR-EACH and multi-return. So from a type system perspective it's like you're conflating some meanings of what QUOTED! is supposed to signal when you use an apostrophe vs. a distinct BLANK! type. e.g. there's no "unquoted void literal", only a quoted one.

In the case of how the mechanics of LET works, this is actually a problem, because the apostrophes escape things it's not supposed to consider part of the LET :-/

 let [a 'b]: multi-returner ...
 =>
 let a, [a b]: multi-returner ...  ; just drops the quote level from quoted things

 let [a ' c]: multi-returner ...
 =>
 let [a c], [a ??? c]: multi-returner ...  ; can't drop quote level from lone quote

Yes, it could say that there's an exception for lone quotes and they stay as is, but it sort of points to the general unease and "weird exceptions" you have to make when the quoted state of void is used to represent something that is only quoted at all because it's "probably not meaningful otherwise". Just seems to lead to snakey rules that the more-visible blank doesn't require.

Or maybe we just say () is what you use, I don't know.

let [a () c]: multi-returner ...

Thinking further... today BLANK! is the default used by things like ARRAY but that could have different choices too:

>> array 3
== [' ' ']  ; quoted voids

>> array 3
== [~ ~ ~]  ; isotopic voids

>> array 3
== [# # #]  ; empty tokens

Of those choices, I'd probably say I like the isotopic voids, just because of how ornery they become when evaluated...and orneriness seems like a good characteristic for when you didn't specify an /INITIAL value to be used.

But just because we would return _ to the WORD! pool... would making it defined to evaluate to a null isotope be a good thing? People might want it for other purposes (consider things like the underscore.js library, where it's the name of a utility module that tries to disappear).

>> _: import %my-utility-lib.r

>> _.sum [1020 304]
== 1324

If you're going to open _ up for potentially interesting purposes, but then turn around and say everyone assumes it evaluates to null, then I'm not clear that the ability to redefine it is as much a benefit as a potential nuisance.

Beyond the above arguments, having a special type to serve as BLANK! in PATH!s is kind of a killer case. Reserving it as an inert dialecting part, used as a "spacer" as well as potentially meaning just "space", makes the most sense to me.
