BLANK! 2022: Revisiting The Datatype

In historical Redbol, a NONE! value had the bad habit of looking like a WORD!:

rebol2>> 'none
== none

rebol2>> none
== none  ; same in R3-Alpha and Red

But it wasn't a word:

rebol2>> type? 'none
== word!

rebol2>> type? none
== none!

It was a distinct type, which also happened to be falsey (while WORD!s are truthy):

rebol2>> if 'none [print "Truthy word!"]
Truthy word!

rebol2>> if none [print "Falsey none!"]
== none

As we can see, NONE! served to signal "soft failures": branches that didn't run, or FINDs that didn't find, or SELECTs that didn't select... etc.

rebol2>> find "abcd" "z"
== none

rebol2>> select [a 10 b 20] 'c
== none

Ren-C Divided NONE!'s Roles Across NULL, VOID, and BLANK!

  • NULL - an "isotopic" state of WORD! that couldn't be put in BLOCK!s. Anywhere NONE! had been used to signal a soft failure (as with FIND or SELECT), Ren-C would use ~null~ instead.

    >> null
    == ~null~  ; isotope
    
    >> find "abcd" "z"
    == ~null~  ; isotope
    
    >> select [a 10 b 20] 'c
    == ~null~  ; isotope
    
    >> append [a b c] null
    ** Error: APPEND doesn't allow ~null~ isotope
    
  • VOID - the result of things that are effectively "no ops". Some contexts choose to make them vanish, and when functions like APPEND get them as an argument they are treated as no-ops:

    >> void  ; void results don't show anything in the console
    
    >> if null [print "Doesn't print as NULL is falsey"]
    
    >> 1 + 2 if null [print "Voids disappear..."]
    == 3
    
    >> append [a b c] void
    == [a b c]
    

    (At one time void was also the state of unset variables, but that is now isotopic void...which has reclaimed the name "none")

  • BLANK! was represented by a lone underscore ( _ ) and could be put into blocks:

    >> append [a b c] _
    == [a b c _]
    

    It retained the choice to be falsey:

    >> if _ [print "Won't print because blanks are falsey"]
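
For readers without a Ren-C build handy, here is a rough Python model of that three-way split. It is only a sketch of the behavior described above, not how Ren-C implements it: the names NULL, VOID, BLANK, append and find are hypothetical stand-ins, and truthiness is not modeled.

```python
class _Marker:
    """Tiny helper for unique sentinel values (hypothetical, illustration only)."""
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return self.label


NULL = _Marker("~null~")   # soft failure; may not be stored in a block
VOID = _Marker("void")     # no-op; vanishes when appended
BLANK = _Marker("_")       # reified placeholder; allowed in blocks


def append(block, value):
    """Model of APPEND: reject NULL, ignore VOID, store anything else."""
    if value is NULL:
        raise TypeError("APPEND doesn't allow ~null~ isotope")
    if value is VOID:
        return block
    block.append(value)
    return block


def find(series, item):
    """Model of FIND on a string: the tail from the match, or NULL if absent."""
    index = series.find(item)
    return NULL if index < 0 else series[index:]
```

The point of the model is that only BLANK survives a trip into a block; NULL is stopped at the door and VOID leaves no trace.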
    

Question One: Should BLANK! Just Be A WORD! ?

Ren-C allows underscores inside words, so it feels a little bad to take the lone underscore out of the pool of words.

Outside of historically being hardcoded as falsey, what makes BLANK! fairly "built in" is that in the path mechanics, it fills in the empty slots:

>> to path! [_ a]
== /a

>> as block! 'a//b//c
== [a _ b _ c]

Alternately, we could accomplish a "reified nothing" with a quoted void:

>> to path! [' a]
== /a

>> as block! 'a//b//c
== [a ' b ' c]

But there are other places where blank is used, such as opting out of multi-returns:

>> [_ pos]: transcode "abc def"

>> pos
== " def"

Once again, quoted void could be used here...though it looks a bit slight to my eyes, I can also see the point of view that it's better:

>> [' pos]: transcode "abc def"

Broadening the question, we could ask whether blank is really the best choice for opting out of things like loop variables in FOR-EACH. Since VOID is used to opt out, might a lone apostrophe be more coherent?

>> for-each ' [1 2 3] [print "no variable"]
no variable
no variable
no variable

It's more visually slight than for-each _ [1 2 3], but not awful. And since voids are used to opt out of slots, it would dovetail nicely with allowing the evaluated forms to opt out via void:

for-each (if false ['x]) [1 2 3] [print "no variable"]

If your expression produces no iteration variables to bind in the body, the loop can assume that you didn't need them. The alternative of (if false ['x] else [_]) or even (if false 'x else '_) seems like busywork.

This would even mean you could use () as a less slight alternative, if you didn't like the ' syntax:

for-each () [1 2 3] [print "no variable"]  ; actually pretty nice!

The premise may hold for multi-returns also:

>> [a ' c]: pack [1 2 3]
== 1

>> c
== 3

>> [a () c]: pack [4 5 6]
== 4

>> c
== 6

I'm not sure it looks worse than [a _ c]: pack [...], and it would open up using _ as an ordinary variable:

>> [a _ c]: pack [8 9 10]
== 8

>> _
== 9

>> c
== 10

So can lone apostrophe paper over some of the system-level needs for reified nothingness... at least in enough cases that it's a good tradeoff to give underscore back to wordspace?

Question Two: Does BLANK! Still Need To Be Falsey?

My feeling is that having blank be falsey doesn't have all that much benefit. NULL does a better job of it, and all that falseyness really does is undermine blank's usefulness as a placeholder:

>> append [a b c] all [1 < 2, 3 < 4, _]
== [a b c _]  ; would seem nice, but gives error today since ALL returns NULL

Thinking of BLANK! as being "null-like" in terms of non-valuedness is generally a hassle. It makes you wonder whether something like DEFAULT should think of it as being assigned or not:

>> item: _

>> item: default [1 + 2]
== ???

In practice, I prefer that the truly non-valued NULL be the only state DEFAULT overwrites. NULL is far more useful than BLANK! for representing something you think of as "not being assigned", since you'll get errors when you try to use it in most places (e.g. in APPEND). Trying to use BLANK! to represent nothingness invariably leads to stray blanks appearing in blocks. (Shixin wrote a lot of code to try to filter them out in Rebmake, prior to it being switched to NULLs.)
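
To pin down the rule I'm arguing for, here is a minimal Python sketch of "DEFAULT only overwrites NULL". It is a model under stated assumptions, not Ren-C's implementation: Python's None stands in for NULL, the string "_" is a hypothetical stand-in for BLANK!, and the dictionary plays the role of the variable context.

```python
NULL = None    # assumption: None models Ren-C's NULL
BLANK = "_"    # hypothetical stand-in for BLANK!


def default(variables, name, compute):
    """Assign compute() to `name` only if it is currently NULL or missing.

    Anything else, including BLANK, counts as "already assigned" and is kept.
    """
    if variables.get(name, NULL) is NULL:
        variables[name] = compute()
    return variables[name]


context = {"item": BLANK, "other": NULL}
default(context, "item", lambda: 1 + 2)    # leaves the blank in place
default(context, "other", lambda: 1 + 2)   # replaces the null with 3
```

Under this rule a blank is a value like any other, which is what you want from a placeholder.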

This makes more sense, and I think it bolsters the argument that BLANK! is less of a falsey-NULL relative...but more of a placeholder value. I've said "blanks are to blocks what space is to strings". And space is truthy:

>> if second "a b" [print "Space is truthy"]
Space is truthy

>> if second [a _ b] [print "So why shouldn't blank be truthy?"]
???

So Either Way, I Suggest The Removal of BLANK! From Being Falsey. This creates some incompatibility in Redbol (which has been using NONE! as a blank substitute). But it's something that can be worked around.


There's a lot to ponder here. I think on the one hand it's important to explore all of the possibilities, on the other it seems to be getting awfully convoluted and lacking a comprehensive narrative.

I'm not up to speed with much of what has changed in this realm for some time, so I apologise if this glosses over some since settled items, though judging by this post, there's much still unsettled.

For me, Rebol's (using the family name) first obligation is to represent data, both in the language and in the way the language is interpreted in memory. Specifically, BLANK! and its underscore literal are a huge win (this from me, the ultra-conservative sceptic) in representing positive nothingness: that a thing exists but lacks assignation, as in [name: "Thing" link: _]. Despite that positivity, I do think that since it represents the known absence of a value in data, it should be falsey in the general flow, because data should primarily determine that flow.

What it becomes in a dialect or within the general flow as distinct from NULL is of lesser importance as I see it. If NULL is the evaluator's ultimate representation of nothingness, then there should be a way to access that in internal dialects, such as SET-BLOCK! or PATH! and the like or it is not really fulfilling its role.

I have this sense that the BLANK-NULL-VOID-ERROR story has too many actors with overlapping roles. I don't have anything tangible to back that up with at this time.


Well, that's something. :slight_smile:

Hence you are on the side of "Taking underscore away from the word pool does more good than harm." I guess that means you're less comfortable with lone apostrophe serving the purpose.

But also...it is now allowed to enclose weird words in vertical bars as in Lisp, so you can have a WORD! whose spelling is underscore:

>> w: |_|
== |_|

>> type of w
== #[datatype! word!]

>> to text! w
== "_"

(An important design deviation from Lisp is that leading and trailing spaces must be escaped... which biases | and || and ||| etc. to being WORD!s of length 1, 2, and 3 respectively...and prohibits absolutely empty WORD!s, though we might have some escaping for that if it turns out to be some critical compatibility point with JSON or something.)

I'm trying to make a general engine... so it will be possible to do Redbol compatibility, and if you want different rules you should be able to have them. But the core as I see it is the "default" distribution, which should be based on what is reasonably determined to be the "best" and most coherent.

The way I see it, the substrate simply isn't set up to do this effectively:

>> if select [alpha: true beta: false] 'beta [
       print "This will print, because FALSE is a WORD!"
   ]
This will print, because FALSE is a WORD!

I've wondered if whatever peace is made with the above could be made with BLANK! more easily if it evaluated to NULL (and presumably would give NULL back from GET as well?), e.g.

>> x: _
; null

>> get '_
; null

>> if get select [alpha: true beta: _] 'beta [
       print "GET of BLANK! could return null"
   ]
; void

(Note: This turns out to be complicated in a general sense, but I think it's on the right track. Consider GET as a placeholder for "the test that makes sense in your usage context".)

Then you have coverage for FALSE and anything you assign false to. And I think the isotopic ~true~ and ~false~ story looks like it may have the right stuff. (Note that GET does not do evaluations by default...GROUP!s or groups in tuples...so you need to ask it to, hence this is somewhat safe.)

So what a lone IF comes down to is: is an ANY-VALUE! available to answer whatever question I asked? A ~false~ isotope walks the line of being able to say "no" and a ~true~ isotope says "yes", but they sit in the netherspace of not being reified values...where all other isotopic WORD!s cause an error. That shapes the space so that tests are actually useful when you have either a full ANY-VALUE! or a NULL.

Sidenote: If you want to not error on a failed select, there's now a more elegant answer...and it depends on VOID.

>> get select [alpha: true beta: _] 'gamma
** Error: GET doesn't accept NULL for its VAR argument (use MAYBE if intended)

>> maybe select [alpha: true beta: _] 'gamma
; void

>> get maybe select [alpha: true beta: _] 'gamma
; null

It is a work in progress...but...I believe there's plenty of evidence that things are pointing toward a solid outcome.

The proof comes from the code: the contrast between what approaches without these distinctions can't do (and how catastrophically they regularly fall down) vs. what approaches with them can do cleanly and correctly.

UPARSE is a giant piece of evidence, but I think there's quite a lot more.

One potential qualm: a quoted word vs. a regular word is seen as a different instruction by both FOR-EACH and multi-return. So from a type system perspective it's like you're conflating some meanings of what QUOTED! is supposed to signal when you use an apostrophe vs. a distinct BLANK! type. e.g. there's no "unquoted void literal", only a quoted one.

In the case of how the mechanics of LET works, this is actually a problem, because the apostrophes escape things it's not supposed to consider part of the LET :-/

 let [a 'b]: multi-returner ...
 =>
 let a, [a b]: multi-returner ...  ; just drops the quote level from quoted things

 let [a ' c]: multi-returner ...
 =>
 let [a c], [a ??? c]: multi-returner ...  ; can't drop quote level from lone quote

Yes, it could say that there's an exception for lone quotes and they stay as is, but it sort of points to the general unease and "weird exceptions" you have to make when the quoted state of void is used to represent something that is only quoted at all because it's "probably not meaningful otherwise". Just seems to lead to snakey rules that the more-visible blank doesn't require.
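
Here is the rewrite rule above as a small Python sketch, just to show where the lone quote falls out of it. This is a model of the transformation as I described it, not LET's actual code: split_let is a hypothetical name, and the items of the SET-BLOCK! are represented as plain strings.

```python
def split_let(items):
    """Split SET-BLOCK! items under LET into (new_variables, rewritten_block).

    Plain words become new LET variables. Quoted words have one quote level
    dropped and are left undeclared. A lone quote has no level to drop.
    """
    declared, rewritten = [], []
    for item in items:
        if item == "'":
            # The awkward case: nothing sits underneath the quote.
            raise ValueError("can't drop quote level from lone quote")
        if item.startswith("'"):
            rewritten.append(item[1:])   # 'b -> b, not declared by the LET
        else:
            declared.append(item)        # a -> declared and kept as-is
            rewritten.append(item)
    return declared, rewritten


new_vars, block = split_let(["a", "'b"])   # mirrors: let [a 'b]: ...
```

Any fix has to special-case the `'` branch, which is exactly the kind of exception a distinct BLANK! type doesn't need.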

Or maybe we just say () is what you use, I don't know.

let [a () c]: multi-returner ...

Thinking further... today BLANK! is the default used by things like ARRAY but that could have different choices too:

>> array 3
== [' ' ']  ; quoted voids

>> array 3
== [~ ~ ~]  ; isotopic voids

>> array 3
== [# # #]  ; empty tokens

Of those choices, I'd probably say I like the isotopic voids, just because of how ornery they become when evaluated...and orneriness seems like a good characteristic for when you didn't specify an /INITIAL value to be used.

But just because we would return _ to the WORD! pool... would making it defined to evaluate to a null isotope or space (or some other default) be a good thing? People might want it for other purposes (consider things like the underscore.js library, where it's the name of a utility module that tries to disappear).

>> _: import %my-utility-lib.r

>> _.sum [1020 304]
== 1324

If you're going to open _ up for potentially interesting purposes, but then turn around and say everyone assumes it evaluates to null (or space), then I'm not clear that the ability to redefine it is as much a benefit as a potential nuisance. :angry:
