Vaporizing Failed Conditionals In REDUCE

hostilefork · August 16, 2021, 2:13am

The question of what to do when REDUCE encounters NULL has been a thorn ever since its introduction. By definition you can't put NULL in a BLOCK!.

So you've seemingly got three choices:

Vaporize the expression slot: reduce [1 null 2] => [1 2]
Raise an error: reduce [1 null 2] => ** Error: NULLs illegal in REDUCE
Put some value there: reduce [1 null 2] => [1 ~null~ 2] or [1 _ 2]
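
To make the three options concrete, here is a small model of them in Python rather than Ren-C. It is only a sketch under stated assumptions: `reduce_block`, `NullPolicy`, and the use of Python's `None` as a stand-in for NULL are hypothetical names invented for illustration, and nothing here is how REDUCE is actually implemented.

```python
from enum import Enum


class NullPolicy(Enum):
    """Hypothetical names for the three choices listed above."""
    VAPORIZE = "vaporize"
    ERROR = "error"
    PLACEHOLDER = "placeholder"


NULL_MARK = "~null~"  # stand-in for whatever value gets put in the slot


def reduce_block(thunks, policy):
    """Evaluate each zero-argument callable in order.

    None plays the role of NULL; `policy` decides what happens to it.
    """
    out = []
    for thunk in thunks:
        value = thunk()
        if value is not None:
            out.append(value)
        elif policy is NullPolicy.VAPORIZE:
            continue  # the expression slot simply disappears
        elif policy is NullPolicy.ERROR:
            raise ValueError("NULLs illegal in REDUCE")
        else:
            out.append(NULL_MARK)  # keep the slot, mark it visibly
    return out


# Model of: reduce [1 null 2]
exprs = [lambda: 1, lambda: None, lambda: 2]
print(reduce_block(exprs, NullPolicy.VAPORIZE))     # [1, 2]
print(reduce_block(exprs, NullPolicy.PLACEHOLDER))  # [1, '~null~', 2]
```

Only the vaporizing policy changes how many items come out, which is the property the rest of this post is concerned with.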

I'm exaggerating to call these the "only options". When you throw in refinements or pass in functions, you've got more options. I've also made the REDUCE-EACH function, which lets you get involved with the result of each expression evaluation...sky's the limit:

collect [
    reduce-each x [1 + 2 null 10 + 20] [
        if integer? x [keep :[<int> x]]
        if null? x [keep <null>]
    ]
]
== [[<int> 3] <null> [<int> 30]]

But with no parameterization I think there are only three reasonable choices: You vaporize, you error, or you put ~null~ or _ there.

Several People Have Favored Vaporization

I myself have usually been on the side of erroring.

But the place that vaporization feels most convenient is when you're doing something like an append of data to a block, and you want to cut out some items.

>> use-preface: false

>> append data reduce [if use-preface [<preface>] 1 + 2 "Hello"]
== [3 "Hello"]  ; assuming DATA started out empty

The key to why vaporization works here is that you're dealing with a situation that has no positional expectations.

But I don't generally use REDUCE in these cases. It can't splice (which I usually want to be able to do).

We've Tried Vaporizing NULL and... I Don't Think I Like It

Let's look at situations like the use of GET-BLOCK! (which I believe must be a synonym for REDUCE) to do ranges in UPARSE.

; Set min to null so we can easily test if it has been set or not, but is
; still "a little ornery".  Must set it before running the rule.
;
min: null

; Set max to blank so it can opt out by default if we have no max
;
max: _

<<BUNCH OF CODE THAT MUST SET MIN AND MAY OR MAY NOT SET MAX>>

uparse data [repeat (:[min max]) rule]

The reason I didn't say (min: ~) or (min: ~overwrite-me~) is because I wanted min to be "middlingly ornery". So I wanted to use it in expressions like any [min, ...] but I didn't want it to be able to be used as an opt out intention. This is the weird zone that NULL occupies and we're making the most of it.

Under this understanding...I'd be displeased if that turned into [repeat ([_]) rule], because I'd have liked to have been told about the issue.

Remember that it was not too long ago that the non-valued state would error on the variable fetch itself. We've made a lot of concessions to get to the point where it is falsey and can be retrieved without a problem.

So long as the answer isn't vaporization, it would have been okay. Making [repeat ([~null~ _]) rule] would have been poisonous enough to cause a problem. And raising an error would have been fine too.
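
The positional hazard can be modeled in a few lines of Python. This is an illustrative sketch only: `reduce_vaporizing`, `reduce_marking`, and the string stand-ins for blank and ~null~ are hypothetical, and it makes no claim about how UPARSE's REPEAT would treat either result.

```python
BLANK = "_"           # stand-in for Ren-C's blank
NULL_MARK = "~null~"  # stand-in for the ~null~ placeholder


def reduce_vaporizing(values):
    """Model of a REDUCE that silently drops NULL (None) results."""
    return [v for v in values if v is not None]


def reduce_marking(values):
    """Model of a REDUCE that keeps the slot and marks it instead."""
    return [NULL_MARK if v is None else v for v in values]


min_count = None   # the code that was supposed to set MIN never ran
max_count = BLANK  # MAX deliberately left as "no maximum"

vaporized = reduce_vaporizing([min_count, max_count])
marked = reduce_marking([min_count, max_count])

print(vaporized)  # ['_']           MAX has slid into MIN's position
print(marked)     # ['~null~', '_'] both positions survive, bug is visible
```

With vaporization the consumer receives a well-formed one-item block and has no way to know that an item went missing. With the marker, the first position holds something the consumer can reject.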

A Compromise: MAYBE where MAYBE NULL vanishes?

If you really want a REDUCE to make nulls go away instead of becoming a ~null~ BAD-WORD! or raising an error, how about this?

>> reduce [1 maybe if true [<x>] 2]
== [1 <x> 2]

>> reduce [1 maybe if false [<x>] 2]
== [1 2]

>> reduce [1 if false [<x>] 2]
== [1 ~null~ 2]

This gives you a tool for removing things conditionally, while by default keeping one result per expression in the REDUCE.

Like I say... COMPOSE is my preferred tool for when you want splicing... not just letting you go from 1 expression to 0 values, but from 1 expression to N values.

We Can Also Make a REDUCE* Which Drops NULLs

...and I've talked about predicates and all the other possibilities. But I think vaporization just isn't the default I want. MAYBE seems a good way to get past the problem.

The middle ground of ~null~ -- even though it's not an isotope -- gives a compromise that I think is more discoverable when it goes wrong than vaporization.

It seems worth trying out.

hostilefork · June 26, 2023, 3:18am

hostilefork:

min: null
max: _

<<BUNCH OF CODE THAT MUST SET MIN AND MAY OR MAY NOT SET MAX>>

uparse data [repeat (:[min max]) rule]

The reason I didn't say (min: ~) or (min: ~overwrite-me~) is because I wanted min to be "middlingly ornery". So I wanted to use it in expressions like any [min, ...] but I didn't want it to be able to be used as an opt out intention. This is the weird zone that NULL occupies and we're making the most of it.

Under this understanding...I'd be displeased if that turned into [repeat ([_]) rule], because I'd have liked to have been told about the issue.

I feel like a bit of a broken record revisiting all these old posts from when NULL was the only non-valued state, that was trying to do double duty as both "soft failure" and "opt out".

But this is yet another solved case. Nowadays NULL doesn't vaporize in REDUCE (it's an error), but VOID does vanish. And you can convert NULL to void with MAYBE if that's what you want to do.
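
Here is a rough Python model of that split, for readers who want to see the three cases side by side. It is a sketch under assumptions: `VOID`, `maybe`, and `reduce_block` are hypothetical Python names, with `None` standing in for NULL and a sentinel object standing in for VOID. It mirrors only the behavior described in this post, not the actual implementation.

```python
class _Void:
    """Sentinel type standing in for VOID, kept distinct from None (NULL)."""

    def __repr__(self):
        return "VOID"


VOID = _Void()


def maybe(value):
    """Model of MAYBE: turn NULL into VOID, pass anything else through."""
    return VOID if value is None else value


def reduce_block(thunks):
    """Model of REDUCE: VOID results vanish, NULL results are an error."""
    out = []
    for thunk in thunks:
        value = thunk()
        if value is VOID:
            continue  # an explicit opt-out: contributes nothing
        if value is None:
            raise ValueError("NULL is illegal in REDUCE")
        out.append(value)
    return out


preface = None  # a value that may or may not have been produced

# Explicit opt-out: the NULL is converted to VOID, so the slot vanishes.
print(reduce_block([lambda: 1, lambda: maybe(preface), lambda: 2]))  # [1, 2]

# Without the conversion, the same NULL is reported instead of hidden.
try:
    reduce_block([lambda: 1, lambda: preface, lambda: 2])
except ValueError as err:
    print("error:", err)
```

The point of the design is that vanishing has to be asked for at the call site. An accidental NULL never changes the length of the result quietly.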

Ever since introducing this distinction, I haven't really seen problems with vaporizing voids...and the NULL erroring has caught many issues. The abstract potential for problems with void opting out of REDUCE is there, but I haven't seen any problems. It feels especially solid now that these behaviors are specific to constructs like REDUCE, and the evaluator doesn't vaporize VOID...just the much rarer NIHIL.