The question of what to do when REDUCE encounters NULL has been a thorn ever since its introduction. By definition you can't put NULL in a BLOCK!.
So you've seemingly got three choices:
- Vaporize the expression slot: reduce [1 null 2] => [1 2]
- Raise an error: reduce [1 null 2] => ** Error: NULLs illegal in REDUCE
- Put some value there: reduce [1 null 2] => [1 ~null~ 2] or [1 _ 2]
I'm exaggerating to call these the "only options". When you throw in refinements or pass in functions, you've got more options. I've also made the REDUCE-EACH function, which lets you get involved with the result of each expression evaluation...sky's the limit:
    collect [
        reduce-each x [1 + 2 null 10 + 20] [
            if integer? x [keep :[<int> x]]
            if null? x [keep <null>]
        ]
    ]
    == [[<int> 3] <null> [<int> 30]]
But with no parameterization I think there are only three reasonable choices: You vaporize, you error, or you put ~null~ or _ there.
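To make those three unparameterized behaviors concrete, here is a rough model of them. It's in Python rather than Ren-C, purely as an illustration: `reduce_block`, the `on_null` parameter, and the use of `None` to stand in for NULL are all made-up names for this sketch, and the input is a list of already-evaluated results rather than real expressions.

```python
NULL = None  # assumption: Python's None stands in for Ren-C's NULL


def reduce_block(results, on_null="error"):
    """Collect already-evaluated results, applying one NULL policy.

    on_null is a hypothetical knob for this sketch:
      "vaporize"    -> drop the slot entirely
      "error"       -> refuse to continue
      "placeholder" -> store the string "~null~" in the slot
    """
    if on_null not in ("vaporize", "error", "placeholder"):
        raise ValueError("unknown policy: " + repr(on_null))

    out = []
    for value in results:
        if value is not NULL:
            out.append(value)
        elif on_null == "vaporize":
            continue  # slot disappears, later items shift left
        elif on_null == "placeholder":
            out.append("~null~")  # slot count is preserved
        else:
            raise ValueError("NULLs illegal in REDUCE")
    return out


print(reduce_block([1, NULL, 2], on_null="vaporize"))     # [1, 2]
print(reduce_block([1, NULL, 2], on_null="placeholder"))  # [1, '~null~', 2]
```

Only the vaporize policy changes the positions of the items that follow the NULL, which is the property the rest of this post is concerned with.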
Several People Have Favored Vaporization
I myself have usually been on the side of erroring.
But the place that vaporization feels most convenient is when you're doing something like an append of data to a block, and you want to cut out some items.
    >> use-preface: false
    >> append data reduce [if use-preface [<preface>] 1 + 2 "Hello"]
    == [3 "Hello"]
The key to why vaporization works here is that you're dealing with a situation that has no positional expectations.
But I don't generally use REDUCE in these cases. It can't splice (which I usually want to be able to do), so I reach for COMPOSE instead.
We've Tried Vaporizing NULL and... I Don't Think I Like It
Let's look at situations like the use of GET-BLOCK! (which I believe must be a synonym for REDUCE) to do ranges in UPARSE.

    ; Set min to null so we can easily test if it has been set or not, but is
    ; still "a little ornery". Must set it before running the rule.
    ;
    min: null

    ; Set max to blank so it can opt out by default if we have no max
    ;
    max: _

    <<BUNCH OF CODE THAT MUST SET MIN AND MAY OR MAY NOT SET MAX>>

    uparse data [repeat (:[min max]) rule]
The reason I didn't say (min: ~) or (min: ~overwrite-me~) is because I wanted min to be "middlingly ornery". I wanted to be able to use it in expressions like any [min, ...] but I didn't want it to be usable as a way of expressing an opt-out intention. This is the weird zone that NULL occupies, and we're making the most of it.
Under this understanding...I'd be displeased if that turned into [repeat ([_]) rule], because I'd have liked to have been told about the issue.
Remember that it was not too long ago that the non-valued state would error on the variable fetch itself. We've made a lot of concessions to get to the point where it is falsey and can be retrieved without a problem.
So long as the answer isn't vaporization, it would have been okay. Making [repeat ([~null~ _]) rule] would have been poisonous enough to cause a problem. And raising an error would have been fine too.
A Compromise: MAYBE where MAYBE NULL vanishes?
If you really want a REDUCE to make nulls go away instead of becoming a ~null~ BAD-WORD! or raising an error, how about this?
    >> reduce [1 maybe if true [<x>] 2]
    == [1 <x> 2]

    >> reduce [1 maybe if false [<x>] 2]
    == [1 2]

    >> reduce [1 if false [<x>] 2]
    == [1 ~null~ 2]
This gives you a tool for removing things conditionally, while keeping the number of values REDUCE produces consistent with the number of expressions unless you explicitly ask otherwise.
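Here's a small model of that rule, again in Python rather than Ren-C and only as a sketch. The `VOID` sentinel, the `maybe` helper, and `reduce_block` are hypothetical names invented for the illustration, `None` stands in for NULL, and the list holds already-evaluated results. It says nothing about how MAYBE is actually implemented.

```python
NULL = None      # assumption: None stands in for Ren-C's NULL
VOID = object()  # hypothetical marker meaning "vanish on purpose"


def maybe(value):
    """Turn NULL into the vanishing marker; pass everything else through."""
    return VOID if value is NULL else value


def reduce_block(results):
    """Drop explicitly vanished slots; keep a visible marker for stray NULLs."""
    out = []
    for value in results:
        if value is VOID:
            continue              # opted out deliberately via maybe()
        if value is NULL:
            out.append("~null~")  # unintended NULL stays visible
        else:
            out.append(value)
    return out


print(reduce_block([1, maybe("<x>"), 2]))  # [1, '<x>', 2]
print(reduce_block([1, maybe(NULL), 2]))   # [1, 2]
print(reduce_block([1, NULL, 2]))          # [1, '~null~', 2]
```

The design point is that a slot disappears only when the caller wrote `maybe` at that spot, so an accidental NULL still leaves evidence in the output.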
Like I say... COMPOSE is my preferred tool for when you want splicing... not just letting you go from 1 expression to 0 values, but from 1 expression to N values.
We Can Also Make a REDUCE* Which Drops NULLs
...and I've talked about predicates and all the other possibilities. But I think vaporization just isn't the default I want. MAYBE seems a good way to get past the problem.
The middle ground of ~null~ -- even though it's not an isotope -- gives a compromise that I think is more discoverable when it goes wrong than vaporization.
It seems worth trying out.