Why No-Argument Refinements Are BLACKHOLE! or NULL

In Rebol2, a refinement's value would be #[true] or #[none]:

rebol2>> show-me: func [/bar] [print mold bar]

rebol2>> show-me/bar
true  ; actually LOGIC! #[true], not the WORD! true

rebol2>> show-me
none  ; actually a NONE! value, not the WORD! none

As a newbie, I was bothered by this from the get-go... because I couldn't test refinements with AND/OR (they don't take #[none]). And I never drank the Kool-Aid that ANY/ALL were always a desirable substitute. Of course there are a lot of cool uses for them as larger-scale control structures--and it's important to know how to use them for streamlining program flow. But for testing flags? The ability to do it infix and without delimiters can read much better in many places.

So Ren-C has adopted infix AND/OR that operate conditionally on ANY-VALUE! and return LOGIC! (they also short-circuit):

ren-c>> flag1: flag2: true

ren-c>> if flag1 and flag2 [print "It allows WORD!s, this comes up often"]
It allows WORD!s, this comes up often

ren-c>> if all [flag1 flag2] [print "Not everyone loves this (e.g. not me)"]
Not everyone loves this (e.g. not me)

ren-c>> if (block? flag1) and (empty? flag1) [print "Short-circuit avoids error"]
; (nothing printed: FLAG1 isn't a block, so EMPTY? never ran... to make
; that possible, the second argument to AND/OR is quoted behind the scenes)

There were countless debates over whether refinements should just be LOGIC! (as in Red), or if a used refinement should be the WORD! of the refinement (maybe useful for chaining?). For a time, it seemed murkier when Ren-C introduced NULL...although it all tied up rather nicely in the end!

2022 UPDATE: I've cleaned up a bunch of the historical rambling of me talking to myself in this thread, to pare it down to what's useful.


Evolution Led Ren-C to Refinements As Their Own Arguments

  • Refinements taking multiple arguments were a fringe feature, used almost nowhere. Primarily the design just created an inconvenience: single-argument refinements had to come up with some random name for their argument.

  • With the existence of non-valued NULL, a single argument could represent the state of being used or not, along with the value itself.

    • The "used" state would be able to hold ANY-VALUE! that could be put in a block.

    • NULL became falsey--so just testing a refinement with IF could cover most "is the refinement used" questions... with specific tests for NULL? when there was risk of conflation with BLANK! or a LOGIC! false (see the sketch after this list)

  • It simplified the evaluator logic--removing the "refinements pick up every parameter after them" semantic.

    • This paved the way for adding more arguments to functions after-the-fact, without worrying about them getting lumped in with the arguments of the existing last refinement.
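
Here's a minimal sketch of what that buys you (a hypothetical SAY-DUP function, written against modern Ren-C):

>> say-dup: func [x /dup [integer!]] [
       either null? dup [
           print ["no dup requested for" x]
       ][
           print ["dup" x "a total of" dup "times"]
       ]
   ]

>> say-dup/dup 10 2
dup 10 a total of 2 times

>> say-dup 10
no dup requested for 10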

But What About Refinements That Don't Take Arguments?

This question lingered on longer than I would have liked it to. :zombie:

For some time it would give you back a WORD! that was the name of the refinement itself:

old-renc>> show-me: func [/bar] [print mold bar]

old-renc>> show-me/bar
bar

old-renc>> show-me
; null

I'd advocated for this idea very early on:

  • Imagine if one function took /only and wrapped another function that also took an /only

  • If the wrapper got its ONLY variable for the refinement as the WORD! only...

    • you could just say inner-function/(only)

      • If you got the refinement, that acts like inner-function/('only)

      • If you didn't get the refinement, that would act like inner-function/(null)

  • If you squint your eyes and use your imagination a little, this might seem like a useful tool for chaining.

But the feature was a victim of Ren-C's other successes. It's much easier and more efficient to wrap functions with things like ADAPT, ENCLOSE, and SPECIALIZE. The need to write this kind of "tunneling" function is rare, and refinement naming overlap is unlikely if the functions weren't somehow derived from each other.

(It also made building FRAME!s at a low level more annoying... the value you put in the frame had to be contingent on the refinement's name. And it forced path dispatch steps to permit NULLs, when they weren't allowed anywhere else. More complexity, for little benefit.)

So Wouldn't LOGIC! For Argless Refinements be... Logical?

You might think so. :robot: But actually... no.

There are a number of technical reasons. Yet an intuitive angle might be to consider how an argument-less refinement contrasts with a refinement that takes an actual LOGIC! argument.

foo: func [/mutate [logic!]] [...]

This refinement can be NULL (unspecified--e.g. the refinement was not used at all), or #[true] or #[false] when it is specified. That's three states.

But an argument-less refinement is really just "used or unused". So NULL and then a single truthy state... ideally a state that the system understands as meaning "I'm opting in".
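
To make the contrast concrete, here's a sketch with two hypothetical functions (output formats assumed):

>> three-states: func [/mutate [logic!]] [
       either null? mutate ["unspecified"] [mold mutate]
   ]

>> three-states/mutate false  ; used, with a FALSE argument
== "#[false]"

>> three-states  ; not used at all
== "unspecified"

>> two-states: func [/argless] [
       either null? argless ["unused"] [mold argless]
   ]

>> two-states/argless  ; used... ARGLESS holds the truthy #
== "#"

>> two-states  ; unused... ARGLESS is null
== "unused"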

The Modern Day: NULL or BLACKHOLE! (#)

While BLANK! has worked nicely as a falsey reified unit type, we've been lacking a truthy unit type.

But with the unification of issues and characters, the appealing truthy "kind-of-a-unit-type" of # exists. And its meaning of "please opt-in" is a good match for refinements.

  • Space-wise, it fits in a single cell with no allocation, like BLANK!
  • It's not a series, so it's not mutable, and it responds to very few operations.
  • It's one easily-visible character.
  • And again, it is truthy.

Simple To Set, Doesn't Depend on the Refinement Name

 >> f: make frame! :some-function
 
 >> f.argless-refinement: if x = y [#]  ; slot will be either null or #

First reflex might be to make a TO-BLACKHOLE or similar:

 >> f.argless-refinement: to-blackhole x = y
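
(If you really wanted such a helper, it's a trivial definition. A hypothetical sketch, not something that ships in the box:)

 to-blackhole: func [condition [logic!]] [
     either condition [#] [null]  ; truthy gives the opt-in #, else NULL
 ]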

But I actually think the if condition [#] looks pretty nice. It's unambiguous and you get the idea: "check this box, else leave it empty". Short and obvious!

Blackholes Are Fit For Purpose, and Align With Multi-Returns

Blackholes were first invented to help "opt-in" scenarios with multi returns, with patterns like:

if multireturn-var [  ; request for additional work made
    set multireturn-var do-additional-work
]

The truthy nature of #, along with being a no-op with SET, made it easy to request the additional work be done while not needing to name a variable to store the output.
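
For instance (a sketch, assuming SET passes its value through):

>> set # 1020  ; blackhole target: the value is simply discarded
== 1020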

That's no longer needed in multi-return, because multi-return now proxies variables for you automatically, so you don't need the overhead of SET.

But it's still a useful feature, and can come in handy with # produced by used refinements!

Higher Level Functions Can Add Conveniences!

If you think it would be nice to be able to specify an argless refinement with a LOGIC!, guess what... you can! (Just not when building low-level frames...)

Higher level functions like APPLY are happy to turn a LOGIC! into a NULL or # for you.

>> apply :append [[a b c] [d e] /only 1 = 1]
== [a b c [d e]]

SPECIALIZE doesn't do it at time of writing, but maybe it should...

>> apo: specialize :append [only: 2 = 2]
** Script Error: Argless Refinement /only Must be either # or NULL
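
Until it does, the if condition [#] idiom from above works fine inside a specialization (a sketch):

>> apo: specialize :append [only: if 2 = 2 [#]]

>> apo [a b c] [d e]
== [a b c [d e]]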

In any case, I just wanted to explain why an argless refinement to your function is going to be showing up as # or NULL, and that's more or less the final (?) answer.


To go into a little detail on this...

Unlike in historical Rebol (which type-checked each argument as it went), "frame fulfillment" and "type checking" are two conceptually separate passes in Ren-C.

You can ask a frame to be filled by the evaluator--even if the types are all wrong--and then do some adjustment that makes it valid before executing it.

Why is that useful? Well, just look at this REFRAMER:

intify: reframer func [f [frame!]] [
    for-each [key val] f [  ; enumerate the fulfilled (but not yet checked) frame
        if val = <one> [f.(key): 1]  ; swap any <one> TAG! for the INTEGER! 1
    ]
    do f  ; type checking happens here, when the frame actually runs
]

; ADD doesn't accept TAG!...

>> intify add <one> <one>
== 2

; but we were able to take a step to build up a frame for it...
; adjust it...
; and then run it!

When you look across the pieces of the process, it's clear some guarantee of a frame's canonization is needed. When it comes to a refinement-with-no-arg, you don't want some dispatches to the function saying NULL for no refinement use, some saying #[false], and others using a BLANK!. You definitely don't want it to be some random WORD! or OBJECT! that happened to be truthy!

But who's responsible for doing the canonization?

I definitely don't want type checking itself to modify frames. That's a disaster. By the time you get to type checking, whoever built the frame needs to have put it into canon form... and it just errors if you didn't set the frame up right.

  • But we don't want it to be a PITA to get that canon!

  • I think the if condition [#] is pretty jive.

In keeping the mechanics as minimal as possible, only one frame mutation is done: any unset isotopes (~) that may be lingering from frame creation by MAKE FRAME! are turned into NULL. Those isotopes serve to represent unspecialized parameters (distinct from specializations to NULL), and are an illegal state for an actual parameter.

If we appreciate that rule, then we can appreciate why it's good to put responsibility on the frame builder to load in a NULL vs. a #. If we instead tried to canonize argless refinements to #[false] and #[true], that would require sensitivity to the kind of parameter. Avoiding that sensitivity is much cleaner--you don't waste time or introduce complexity by needing to know whether a parameter is an argless refinement or not.

So just one rule: ~unset~ isotopes become NULL on frame invocation. Better than two or more rules...a simpler machine. Make sense? :man_teacher:
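
Concretely, that means slots you never touch in a MAKE FRAME! frame just read as NULL when it runs. A sketch, assuming APPEND's parameters are named SERIES and VALUE:

>> f: make frame! :append

>> f.series: [a b c]
>> f.value: 'd

>> do f  ; slots never assigned (like /DUP or /ONLY) now act as NULL
== [a b c d]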


(Note: Lifted this writeup out of another thread that otherwise is useless.)

Note that Ren-C had undergone hard-fought battles to confront problems like:

rebol2>> foo: func [/ref arg] [print ["ref:" mold ref "arg:" mold arg]]

rebol2>> foo
ref: none arg: none

rebol2>> foo/ref <thing>
ref: true arg: <thing>

rebol2>> foo/ref none
ref: true arg: none

rebol2>> condition: false
rebol2>> foo/ref if condition [<thing>]
ref: true arg: none

The above pattern made it hard for function authors to know if a refinement was in use. If they went by the arg alone, they could miss an inconsistent case where the caller literally meant to specify a none. The situation gets worse with multiple refinement arguments.

(Think about a case where you allow arbitrary values to be interleaved into a block with an /INTERLEAVE refinement, and no values are inserted if the refinement isn't used. NONE! is one of the legal values, since the values can be anything. If the function author heeds the refinement by name and then looks at the argument to see if there's a NONE!, it will work. But that prevents them from using the argument's none-or-not status for anything else. The situation is confusing, and you'd find functions making ad-hoc policy decisions, which may or may not allow you to pass none as a way of backing out of a refinement you used at the callsite.)

Gradually, the evaluator was made to keep the refinements' truth or falsehood in sync with the state of the arguments. Use of a NULL for all of a refinement's arguments at the callsite would make the refinement appear unused to the function, as if the caller had never specified it. Using NULL for only some of them would raise an error. And refinement arguments were never allowed to be themselves NULL... they were only nulled when the refinement was unused, and hence trying to access them would be an error.
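
In modern Ren-C terms, the problematic Rebol2 case above comes out like this (a sketch with a hypothetical DEMO function):

>> demo: func [/ref [tag!]] [
       either null? ref [print "ref unused"] [print ["ref:" mold ref]]
   ]

>> demo/ref <thing>
ref: <thing>

>> condition: false

>> demo/ref if condition [<thing>]  ; NULL at the callsite...
ref unused  ; ...so the refinement reads as unused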

This ultimately streamlined even further into the unification of the refinement and the argument itself... reducing the number of random words you'd have to come up with, shrinking call frames, and eliminating a "gearshift" in the evaluator--which opened the door to AUGMENT-ing frames with new normal arguments after a refinement argument.

But something to point out is that because these changes were incremental over time, ideas like the necessity of erroring on null accesses were also something that had to be challenged over time. I had a bit of uneasiness about things like:

rebol2>> foreach [x y] [1 2 3] [
             print ["x is" x "and y is" y]
         ]

x is 1 and y is 2
x is 3 and y is none

Something about running off the edge of the data without so much as a peep was unsettling. Doubly so because a NONE! might have actually been literally in the array. It seemed that once you had the power of NULL to distinguish, not taking advantage of it with error checking would be a waste...

But such checks have upsides and downsides. Look at R3-Alpha:

r3-alpha>> data: [#[none] #[unset!] 1020]
== [none unset! 1020]

r3-alpha>> data/1                          
== none

r3-alpha>> data/2  ; Note: prints nothing in console (and no newline)
r3-alpha>> data/3
== 1020

r3-alpha>> data/4
== none

Here's Ren-C:

>> data: [_ ~ 1020]
== [_ ~ 1020]

>> data.1
== _

>> data.2
== ~

>> data.3
== 1020

>> data.4
; null

Is this so bad, not to error on data.4 or other null cases? At least a distinct state is being signaled...that you can tell is out of band of what's in a block.

I think that when all angles are examined about how things have progressed, this makes sense, as does the unused-refinements-are-null change. It may feel a little more unsafe, but if we're using JavaScript...

js>> var data = [null]
js>> data.push(undefined)  // undefined can be stored in arrays too
js>> data.push(1020)

js>> data
(3) [null, undefined, 1020]

js>> data[0]
null

js>> data[1]
undefined

js>> data[2]
1020

js>> data[3]  // out of bounds (fourth element, indices are 0-based)
undefined

I think Ren-C is leading the way here, and that it's not worth getting bent out of shape over NULL accesses not triggering errors. The TRY mechanisms are there to pick up the slack when there are questionable uses of a NULL, to make sure it was intended.
