NULL in the libRebol API...and VOID? => NULL?

hostilefork · May 2, 2018, 1:45pm

Historical Note: This post discusses why what was called VOID at one point was ultimately chosen to be renamed as NULL. To keep the thrust of the point coherent, the terminology has been left as-is. Just know that for a time, what is known as NULL today was called VOID.

Being meticulous about getting "void" right vs. "UNSET!" has been paying off tremendously. The swampy nature of dealing with such issues in Rebol2/R3-Alpha/Red has given way to clarity...and subsequently enabled great and solid features.

Now a new bonus:

If voids are always NULL in the API, there's a huge win

Check out this libRebol pattern, taking advantage of OPT and TRY for conveniently wrangling the void/blank switcheroos:

REBVAL *var = ...;
REBVAL *obj = rebRun(
    "opt match [object! blank!] try case [",
        var, "= some/value [first foo/baz/bar]",
        "integer?", var, "and (mode = 'widget) [second mumble]",
    "]", END);

if (!obj) {
    // leverages C's natural "NULL is falsey" property
    // so testing for success requires no extra API call
    // nothing to clean up, no handle to rebRelease()
    //
    ... code for failure ...
    return;
}

// Rebol code can do heavy lifting for validation / errors
// so we can assume the value is good to go
//
... code for success ...
rebRelease(obj); // if you're done with it...

You let the embedded Rebol pick things apart to make sure the result is a type you care about or not. Then your first reaction to the result can be "did I get something in the set of answers I'm interested in processing or not", and that reaction can be decided without any API call...you just take advantage of NULL being "falsey" in C.
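For anyone who wants to poke at the "NULL is the failure signal" shape without a libRebol build handy, here's a minimal sketch in plain C. Everything in it is made up for illustration: `Handle` stands in for REBVAL, `lookup()` stands in for a rebRun() call, and `release()` stands in for rebRelease(). It is not the libRebol API, just the same control flow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an API handle (NOT the real REBVAL) */
typedef struct {
    int payload;
} Handle;

/* Hypothetical stand-in for an evaluation: returns a heap-allocated handle
 * when there is a value to hand back, and NULL when there is none.
 * (An allocation failure also comes back as NULL in this toy version.) */
Handle *lookup(const char *key)
{
    assert(key != NULL);

    if (strcmp(key, "widget") != 0)
        return NULL;  /* "no value": no handle is allocated at all */

    Handle *h = malloc(sizeof *h);
    if (h != NULL)
        h->payload = 1020;
    return h;
}

/* Hypothetical stand-in for releasing a handle */
void release(Handle *h)
{
    free(h);
}

/* The caller's first reaction needs no extra API call: just test the pointer */
int payload_or(const char *key, int fallback)
{
    Handle *h = lookup(key);
    if (!h)
        return fallback;  /* nothing to clean up on this path */

    int result = h->payload;
    release(h);  /* only the success path owns a handle */
    return result;
}
```

The point of the design is that the failure path has no handle to test, and no handle to free.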

Since people who aren't me haven't really been experimenting all that much with how the "voiding" has been working, you might not be as excited about it as I am. But that's just because you haven't tried it. OPT and TRY getting in there with all the new constructs brings a whole new level to "the game", and this idea of having an easy signal channel back to C from those constructs is really compelling.

"voids" always become NULL, why not call the test NULL?

Remember: type of (do []) is _. It's a falsey/blank value to say it has no type. There's no VOID! type, because voids have no unique identity. You can locate several different unique UNSET! cells in various arrays in R3-Alpha, mutate them to other types like INTEGER!, change them back, etc. Not so in Ren-C.

No void "type" means that "changing the name of void" is really just "changing the name of the test for void"... from VOID? to NULL?.

This seems like a win to me. Not only does it reduce the barrier to talking about the C behavior from the Rebol behavior, null can be the JavaScript representation too. The word isn't taken in Rebol to mean anything else, so why not reduce the cognitive load by using what other languages use?

Same number of letters in VOID and NULL. NULL? vs VALUE? have different first letters, which may be a plus.

I talked about this before, so why didn't I do it sooner?

Ren-C eliminated UNSET!-typed cells in ANY-ARRAY!, but the practical mechanics of voids appearing in various places the evaluator sees as "incarnated cells" have lingered. Solutions to the problems come along one piece at a time, like the just-now-reconceived definition of UNEVAL:

uneval: func [
    {Make expression that when evaluated, will produce the input cell}
    return: [group!]
        {`()` if void cell, or `(quote ...)` where ... is passed-in cell}
    cell [<opt> any-value!]
][
    either void? :cell [quote ()] [reduce quote ('quote cell)]
]

So you can do compose/only [if void? (uneval :some-unset-var) [print "this prints"]]. Necessity is the mother of invention for these kinds of things, and so far they've filled in the gaps. Things are at a technical point where we can do it.

UPDATE Jan 2019: UNEVAL--and the reasons it had to be invented--motivated Generic Quoting, which has supplanted it. Interestingly, (foo: ') uses generic quoting to assign the absence of a value to foo--and hence unsets it.

There's still going to be the tradeoff I mentioned regarding the mutability of handles. If you have a pointer to a REBVAL* now which is INTEGER!, might some operation change it to a STRING! later? Or if it's a BLOCK!, might its index be changed in place vs. making a new block?

Mutating in place saves on handle allocations. But now imagine that you've handed a cell out to the user through the API, and offer them an API for doing evaluations into that cell vs. making a new one. And that evaluation produces void. That's a transition they wouldn't be able to do under this model. This makes an API handle's "slot" more like an array cell than it is like a context variable...it can hold blanks, just not voids.

Having had time to work with matters in practice and looking at the big picture, this is a small price to pay.

hostilefork · May 2, 2018, 5:48pm

Here are a couple of counter-arguments to null over. Er, mull over.

Note: The various details and hassles involved with this have, over time, given me more empathy for the idea that it took a really long time for humanity to get zero straight. Assuming we're going to say they ever did.

Console UI suggests non-"thingness" of voids

C programmers are used to NULL being a "thing". Rebol's treatment of it as a non-thing in the prompt might make it seem "more like a void".

>> 1000 + 20
== 1020

>> ()

>> 300 + 4
== 304

But the void analogy is a false one. In C you can't say void f() {...} and then some_func(f()).

Technically as we know from optional parameters, they're much more like a NULL pointer. Despite the console UI, you are getting back a "thing"...anything you can pass as an argument is a "thing". That thing just doesn't happen to be a Rebol value.

To change this perceptually, the "thingness" (or the "signalness") of null might need to be called out in the console:

>> ()
\\\ null \\\  ;-- some pattern to convey non-LOAD-ability?

Maybe even the console would output a comment?

>> ()
;-- null

This would assist in clarity of transcripts (I frequently have to add this manually to try and show what happened). It would contaminate the reading of any printed output, but is that any worse than when an evaluative product shows up?

>> print "this might be annoying"
this might be annoying
;-- null

>> foo: does [print "but is it actually any worse than this?" 5]
>> foo
but is it actually any worse than this?
== 5

UPDATE 30-Jul-2018: There is now a distinction between "voids" in the API and nulls. Nulls provide the feedback of ;-- null in the console, which is useful feedback. Voids are not shown by the console, by design. It appears to be working reasonably well.

What the console does is of course configurable, but I'm just wondering if there would need to be a status quo to help people know that yes... something was returned, but that something was a signal, not an ANY-VALUE!

Many C implementations define NULL as 0

If NULL is defined as 0, then you get a problem with C variadics if you're assuming pointer reads. Imagine you have a variadic function that prints out C strings until it sees a NULL:

 null_terminated_printer("one", "two", "three", NULL);

The called function cannot process this in a legitimate way, because when the null_terminated_printer routine tries to fetch items from the variadic list, it will get them as a const char*. Yet if NULL is 0, and pointers are different sizes from integers on your system, this is undefined behavior and can crash...because it will try to read a pointer's worth of data out of an integer's worth of memory. You have to write:

null_terminated_printer("one", "two", "three", (const char*)NULL);

So if people get too used to just throwing around lines like REBVAL *val = NULL, they might try passing literal NULLs to rebRun() as well. That's bad. If you keep them in a pattern of writing REBVAL *val = rebVoid() then they won't be as likely to make the mistake, because they won't think of them as equivalent.
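To make the variadic hazard concrete, here's a sketch of what the receiving side of something like null_terminated_printer could look like. The post only shows the call, so the body is my guess at it. The `FILE *out` first parameter and the returned count are assumptions I added so the function has a named argument for va_start and something to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Sketch of a null-terminated variadic printer (signature is an assumption).
 * Writes each string on its own line to `out` and returns how many strings
 * were printed before the terminating null pointer.
 */
size_t null_terminated_printer(FILE *out, ...)
{
    assert(out != NULL);

    va_list va;
    va_start(va, out);

    size_t count = 0;
    for (;;) {
        /* Every fetch reads a pointer's worth of data. If the caller pushed
         * a bare integer 0 as the terminator, this read is undefined
         * behavior on platforms where int and pointers differ in size. */
        const char *s = va_arg(va, const char *);
        if (s == NULL)
            break;

        fputs(s, out);
        fputc('\n', out);
        ++count;
    }

    va_end(va);
    return count;
}
```

A call would look like `null_terminated_printer(stdout, "one", "two", "three", (const char *)NULL);`, with the cast making sure the terminator really is pushed as a pointer.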

This problem could probably be solved just by prescribing the use of nullptr, and then in C builds do #define nullptr ((void*)0), and C++98 builds use this definition. If the commitment is being made that voids really are NULLs, it's as easy to prescribe that as it is to get people calling rebVoid()...without making it look like some new object type is popping up.

https://stackoverflow.com/questions/13816385/what-are-the-advantages-of-using-nullptr

And it could be a teachable moment for the average C programmer (e.g. observe libRed's redCall(), where this problem is overlooked).
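Here's roughly what the nullptr shim could look like for C builds. This is a sketch under assumptions: the version guard is mine (C23 and C++11 already have a nullptr keyword, so the macro is only defined where it's missing), and `count_until_null()` is a hypothetical variadic consumer added just to show the shim being passed through a va_list. I've left C++98 out of this version.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Only shim nullptr where the language does not already provide it */
#if !defined(__cplusplus) \
    && (!defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L)
    #define nullptr ((void *)0)
#endif

/* The whole point: the terminator must occupy a pointer's worth of space */
static_assert(
    sizeof(nullptr) == sizeof(void *),
    "nullptr must be pointer-sized to be safe in a va_list"
);

/* Hypothetical variadic consumer: counts pointers until a null one */
size_t count_until_null(const void *first, ...)
{
    va_list va;
    va_start(va, first);

    size_t count = 0;
    for (const void *p = first; p != NULL; p = va_arg(va, void *))
        ++count;

    va_end(va);
    return count;
}
```

With the shim, a call like `count_until_null("a", "b", nullptr)` passes a real pointer as the terminator, with no cast needed at the call site.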

Still looks best to go with `NULL?`

The semantics line up. At least in C and C++, programmers don't expect to be able to insert NULL between objects 10 and 11 in a list of 20 customer objects...you're supposed to come up with a valid object to put there. You can only put NULLs in arrays of pointers--a fairly specific thing. So it makes sense that Rebol's NULL is a state a variable can be in, but not a state an array element can be in.
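In C terms, the distinction looks something like the sketch below. The `Customer` type and `count_present()` helper are hypothetical, made up only to show that "absent" is expressible in an array of pointers and not in an array of objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object type for illustration */
typedef struct {
    int id;
} Customer;

/*
 * An array of Customer objects has no way to say "nothing here": every
 * element is a Customer. An array of Customer POINTERS can hold NULL,
 * and this helper counts the slots that actually point at something.
 */
size_t count_present(Customer *const *slots, size_t n)
{
    assert(slots != NULL || n == 0);

    size_t present = 0;
    for (size_t i = 0; i < n; ++i) {
        if (slots[i] != NULL)
            ++present;
    }
    return present;
}
```

So `Customer customers[20]` is the analogue of a Rebol array (every slot is a value), while a lone `Customer *` that may be NULL is the analogue of a variable.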

Shimming in nullptr for non-C++11 users and prescribing that seems the best idea. People who don't like it don't have to use it. They just have to watch their step with (REBVAL*)NULL if passing to a va_list.

(If they're in the odd category of C programmers who worry about NULL type safety but would reject nullptr, then THEY can second guess me and make a rebVoid() and rebIsVoid()...I would guess this type of person does not exist.)

hostilefork · May 18, 2022, 3:03am