Should TRAP and CATCH return null if no fails/throws?

hostilefork · December 3, 2018, 6:04pm

R3-Alpha's TRY added a refinement /EXCEPT for passing in a block or function to act as a handler in the case of an error being raised. It struck me as a clunky attempt to parrot existing terminology. TRAP was created to be a seemingly better name than "TRY". It appeared to have more parity with CATCH, and TRAP/WITH paralleled CATCH/WITH.

I think the name is an improvement. And it paved the way for the short word TRY to fill another important role.

But both constructs share a pattern: they return a result whether or not anything was actually caught or trapped.

>> catch [10 + 20]
== 30

>> error? trap [make error! "this is *not* a trapped error"]
== #[true]

That behavior of TRAP is especially tricky, because a lot of code checks whether a result is an ERROR? and uses that to detect whether a FAIL ran...and here we see that's not actually what happened.

So if you're trying to write truly "correct" code that uses a TRAP or a CATCH, you pretty much wind up writing a TRAP/WITH or a CATCH/WITH...providing a function taking an error.

But I've been questioning the value of this mixing-up-of-return-results. If you want to get a value out of the block, why not do that by setting a variable? You're usually trying to set a variable anyway, e.g. value: trap [...], what's wrong with moving it into the code?

 trap [
     value: some-calculation-that-may-fail ...
 ] then func [e] [
     ... code to handle the error ...
 ] else [
     ... stuff to do if there was no error ...
     ... assume value is good ...
 ]

This cleanly separates out the code paths, allowing usage of null-sensitive constructs. So it means getting rid of the /WITH refinement on TRAP and CATCH, instead using normal THEN/ELSE/etc. constructs with them:

>> catch [10 + 20]
; null

>> trap [make error! "not a thrown error"]
; null
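
As a rough model of the proposed behavior (in Python rather than Ren-C, since the exact Ren-C semantics shifted between builds), the sketch below uses a hypothetical `trap` helper that reports only errors that were actually raised. An error that is merely returned as a value gives back nothing, and ordinary results leave the body by assignment.

```python
def trap(body):
    """Run body(); return the raised exception, or None if nothing was raised.

    Hypothetical Python stand-in for the proposed TRAP, not Ren-C's implementation.
    """
    try:
        body()  # the body's own result is deliberately discarded
    except Exception as error:
        return error
    return None


# An error that is only *returned* as a value is not reported as trapped.
returned_only = trap(lambda: ValueError("not a raised error"))

# An error that is actually raised comes back as the result.
raised = trap(lambda: 1 / 0)

# Values are carried out by assignment inside the body, as in the post.
state = {}
no_error = trap(lambda: state.update(value=10 + 20))
```

Under this model a non-None result always means a real failure, so a then/else style branch on the result never confuses a stored error value with a raised one.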

In the case of CATCH, you can always just throw your final result, to get it to conflate with an ordinary throw (this is inexpensive.)

TRAP can't do that (since it would only be able to return errors). Though you could piggy-back on CATCH if you really wanted to avoid variable declaration with a TRAP...just throw your result:

 catch [trap [... throw result] then e -> [e]]

I'm not opposed to the idea of code-golf-friendly constructs which could go ahead and do this squashing of results together. (CATCH-DO, TRAP-DO?) But the clean expression with only returning the caught or trapped thing--and null otherwise--seems quite appealing to me for the primitive building block.

Also note: the existence of ENTRAP

I made ENTRAP to address the problem of distinguishing errors from other values. It returns ordinary values enclosed in a block (unless the value is null, which can't be put in a block, so the entrapped result is just null):

>> entrap [null]
; null

>> entrap [10 + 20]
== [30]  ; note it's in a block!

>> error? entrap [1 / 0]
== #[true]  ; it's just a plain ERROR! value, not in a block

>> block? entrap [make error! "abc"]
== #[true]  ; it's a block that contains an error (since it wasn't FAIL'd)

I don't know how that fits into the naming and scheme of things, but mentioning it.
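
The four ENTRAP cases above can be mimicked with a small Python model. The `entrap` function here is a hypothetical stand-in (a list plays the role of the BLOCK!, `None` the role of null), not the Ren-C implementation.

```python
def entrap(body):
    """Return the raised error bare, or the body's value wrapped in a list."""
    try:
        result = body()
    except Exception as error:
        return error        # raised: the plain error, not wrapped
    if result is None:
        return None         # null can't go in a block, so it stays null
    return [result]         # ordinary value (even an error value): wrapped


r_null = entrap(lambda: None)
r_value = entrap(lambda: 10 + 20)
r_raised = entrap(lambda: 1 / 0)
r_returned = entrap(lambda: ValueError("abc"))  # returned, never raised
```

The wrapping is what lets a caller tell `r_raised` (a bare error) apart from `r_returned` (a list holding an error).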

hostilefork · December 3, 2018, 9:39pm

One issue with this is that CATCH and THROW have a "/NAME" feature, where you can label throws. Rebol2/R3-Alpha/Red all have the feature of being able to name a throw, but you cannot tell what the name is when you catch things...you lose that information.

By having CATCH take a handler function, Ren-C made that handler able to take two arguments...the value passed to the throw and the name. You didn't have to pay attention to the name, but you could give a list of names to CATCH and then check to see if any of the names you were looking for were present.

So how would this be collapsed into a single value? It could be done by saying that CATCH/NAME would implicitly assume you understand you'll be getting a BLOCK! with the name in the first cell, and the value in the second.

>> catch/name [
       throw/name 10 'alpha
   ] [alpha beta]
== [alpha 10]

(I specifically went with the value in the second slot so that it could be missing--e.g. a length 1 block--in order to convey a THROW of a NULL. I think being able to throw nulls is likely important; while it does cause some confusion regarding whether the CATCH actually caught something or not, it is probably worth it to have the ambiguity for the flexibility.)

That's not the worst thing in the world. But it is a bit clunky. The other option is that you pass the name in as a variable...which is kind of how other things have been done (historically, DO/NEXT, for example...when it wanted to return a new position as well as the evaluated value)

>> catch/name [
       throw/name 10 'alpha
   ] 'name
== 10

>> name
== alpha

But that doesn't seem to fit very well with the THEN construct:

 catch/name [...] 'name then value => [...]

So being able to wrap together the name and the value into one unit seems cleaner.
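
A minimal Python model of the "name and value in one unit" idea follows. `Throw`, `throw`, and `catch_named` are hypothetical names for illustration. A thrown null is represented by a length-1 result, as described above, and letting unlisted names propagate is an assumption of this sketch.

```python
class Throw(Exception):
    """Carries a thrown value plus an optional name."""
    def __init__(self, value, name=None):
        super().__init__(name)
        self.value = value
        self.name = name


def throw(value, name=None):
    raise Throw(value, name)


def catch_named(body, names):
    """Return [name, value], [name] for a thrown null, or None if no throw."""
    try:
        body()
    except Throw as thrown:
        if thrown.name not in names:
            raise                       # not one of ours: keep unwinding
        if thrown.value is None:
            return [thrown.name]        # length-1 result conveys a thrown null
        return [thrown.name, thrown.value]
    return None                         # nothing was thrown


caught = catch_named(lambda: throw(10, "alpha"), ["alpha", "beta"])
caught_null = catch_named(lambda: throw(None, "beta"), ["alpha", "beta"])
nothing = catch_named(lambda: 10 + 20, ["alpha", "beta"])
```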

IngoHohmann · December 4, 2018, 6:14am

I am not sure I like this. Being able to use then/else is great, but I don't particularly like having to move assignments into the code.

Var: trap [...]

makes it clear that this is an assignment which, by the way, may error, so I'm handling that.

trap [var: ...]

looks like it is mostly error handling code, and the assignment is easily overlooked.

hostilefork · December 4, 2018, 9:22am

Compare:

 if error? data: trap [
     inflate/max data uncompressed-size
 ][
     info "^- -> failed [deflate]^/"
     throw blank
 ]

With:

 trap [
     data: inflate/max data uncompressed-size
 ] then [
     info "^- -> failed [deflate]^/"
     throw blank
 ]

What makes the assignment clearer? I think the second case, in particular because it doesn't conflate data as a variable which "may hold an error, or may hold the data". If you wanted to be "clear" you'd have to call it data-or-error, which is too wordy.

It seems to me the second way puts the data: closer to what's being assigned to it, instead of separating it in an awkward and artificial way. The first case doesn't make it that obvious that inflate/max returns a value at all--maybe only TRAP does? The only hint you have that you're not just capturing an error is the misleading name "data". If you mix up the data with the error that means you're going to need a test, and if error? data: trap is a lot of noise to see through.

(And that's a simple case that doesn't mention the specific error when it probably should. You could use ATTEMPT ... ELSE for this, which would just give you a null if it was an error otherwise the value. But I worry about today's ATTEMPT because it can make typographical errors or other changes hard to find, so it seems it should be improved to at least not scuttle some common errors of words not being bound.)

More complicated examples which are trapping a section of code that doesn't just do a single assignment are even better.

What makes me feel better about it as a primitive is that it prevents mistakes in generic code like:

if error? item: trap [
    someone-elses-array: get-array-may-fail x y z
    pick someone-elses-array index
][
    ; item may be an ERROR! value that was simply stored in someone-elses-array
]

Again--I don't object to there existing some construct that conflates raised errors with plain error values, and has all the concerns which go with that. But if this behavior is built in to trap, then I start worrying and thinking that it should turn non-raised errors into voids, or otherwise help avoid these kinds of mistakes. It seems the best way is to give a solid routine to build on, then let people do what they like with that.

Being able to get rid of TRAP/WITH is clean, and you can also use ELSE to easily provide clauses for the non-erroring case. I think it's an improvement.

What about RESCUE for the combined operation?

Perhaps the conflating operation--which would have no /WITH refinement--could be called RESCUE? Terminology-wise that seems a bit more vague. TRAP and CATCH feel like being NULL makes sense if there was no fail or throw. (e.g. if trap [...] [... an error was trapped, so you definitely had an error! ...]) But rescue doesn't have that baggage.

I still would probably want to turn errors returned from the body normally into voids. If you have an error and really want to return it, you'd use a FAIL (in a parallel to what I suggested for using THROW as the last line of a CATCH if you want to return the value)

 value-or-error: rescue [
     someone-elses-array: get-array-may-fail x y z
     item: pick someone-elses-array index
     if error? :item [fail item]
     :item
 ]

But doing that filtering requires getting things into a variable anyway, so you should have just used TRAP.

So if looking at historical Rebol2-style code, you generally could turn plain TRY into RESCUE, though I'd still suggest using TRAP instead and putting your assignments inside the block.

IngoHohmann · December 4, 2018, 3:50pm

It seems that my comment wasn't completely thought through.
You have convinced me.

hostilefork · April 9, 2020, 3:43pm

 >> error: trap [1 + 1]
 ; null

Looks like we may get the best of both worlds now, with multiple returns! The TRAP and CATCH could offer a second output which is the value that "falls out" of the evaluation.

 >> trap/result [1 + 1] 'value
 ; null

 >> value
 == 2

 >> [error value]: trap [1 + 1]
 ; null

 >> error
 ; null

 >> value
 == 2

 >> [_ value]: trap [1 + 1] then [print "error!"]
 ; null

 >> value
 == 2

 >> [_ value]: trap [1 / 0] then [print "error!"]
 error!

How slick is that?? Very...

...although...

This does present a bit of an oddity, where multiple-returns are prioritizing the SET-BLOCK! as higher than the THEN. The evaluator is reading this as:

>> ([error value]: trap [1 / 0]) then [print "error!"]

That isn't the way ordinary SET-WORD!s work. e.g. it means that you might end up with error holding a different value than what the overall expression evaluates to. So these two expressions mean different things:

>> [error value]: trap [1 / 0] then [print "error!"]

>> error: [_ value]: trap [1 / 0] then [print "error!"]

It's possible for us to force those to be equivalent. Basically hold off on assigning the first block element...keeping it void until the expression on the right is completely done.

Or we could do both. The leading value in the block could be assigned the result of the first evaluation to be available during the second. For that matter we could do that for normal SET-WORD!s too...

 >> x: null

 >> x: all [1 + 1 2 + 2] then [print ["x is" x] x * 10]
 x is 4
 == 40

 >> x
 == 40

That would make it less weird to think of an aggressive assignment of the multi-return's other outputs in the first step, maybe. While still avoiding the need to ever say x: [x y]: .... So I don't think it's unresolvable, it's just a matter of looking at the usage scenarios. Like, does it go backwards for multiple assignments?

 >> x: y: all [1 + 1 2 + 2] then [print ["x is" x] print ["y is" y] x * y]
 x is 4
 y is 4
 == 16

 >> x
 == 16

 >> y
 == 16

Optimization-wise, it's better to hold off and do only a single assignment...so probably the thing to do is to declare the first spot in a SET-BLOCK! "special" and say it's only assigned at the end, while the other spots are assigned in the beginning. :-/

Here we learn the value of prototyping before optimizing! And Ren-C's prototyping ability is getting nearly magical.

hostilefork · February 18, 2022, 7:34am

The world has changed, so now not only can NULL not be put in a block but neither can isotopes.

But we have a better solution than putting things in a BLOCK!... generic quoting and ^META!

Remember that ^META turns nulls into nulls, isotopes into BAD-WORD!s, and adds a quote level to everything else (including BAD-WORD!s and things that were already QUOTED!).

So now ENTRAP can be a simple usermode function built on top of the multi-return form of TRAP, that offers you either the raised error (if applicable) or the evaluative product:

entrap: func [
    {If evaluation raises an error, return ERROR!...otherwise the ^META result}

    return: [<opt> any-value!]
    code [block! action!]
][
    let [error result]: trap code
    return any [error ^result]
]

Note that you could also do that as let [error ^result]: trap code and then return any [error result].

This handles every possibility:

>> entrap [1 / 0]
; gives you an ERROR! for division by zero

>> entrap [make error! "an error that isn't raised"]
; gives you a QUOTED! error, so you know it is just a value

>> entrap [print "Hello"]
; gives you a non-quoted BAD-WORD! for ~none~, so you know it's an isotope!

>> entrap [first [~none~]]
; gives you a singly-quoted bad-word, so you know it *wasn't* an isotope

>> entrap [first ['abc]]
; gives a doubly-QUOTED! word, so you know result was a singly-quoted WORD!

So you can just test if the result is an ERROR? and if it is, then an error was raised. And if it's not, you UNMETA it and you will get the value (or leave null alone, as UNMETA is a no-op on NULL).

Pretty elegant!
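
To make the quoting trick concrete, here is a small Python model. `Quoted`, `meta`, and `unmeta` are hypothetical stand-ins for generic quoting, ^META, and UNMETA. Isotopes are not modeled at all, so only the null, raised-error, and quoted-value cases are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quoted:
    """One added quote level around a value."""
    value: object


def meta(value):
    # Nulls stay null; everything else gains one quote level.
    return None if value is None else Quoted(value)


def unmeta(value):
    # No-op on null; otherwise remove one quote level.
    return None if value is None else value.value


def entrap(body):
    """Return a raised error as-is, otherwise the meta form of the result."""
    try:
        result = body()
    except Exception as error:
        return error
    return meta(result)


raised = entrap(lambda: 1 / 0)
returned = entrap(lambda: ValueError("an error that isn't raised"))
double = entrap(lambda: Quoted("abc"))   # result was already quoted once
null = entrap(lambda: None)
```

A bare exception can only mean "raised", because any exception that was merely returned arrives wrapped in a `Quoted`.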

(I'd thought of getting rid of ENTRAP, but it's actually very useful, especially when dealing with the "libRebol" API...since that is only really good at giving one value back in the C code...so I'm glad it doesn't have to be a separate codebase and can just piggy-back on top of TRAP.)

hostilefork · February 1, 2023, 11:01am

So there's a competing application for multi-returns, which is the ability to throw a multi-return... which is more useful:

>> [x fallthrough?]: catch [
       if false [throw pack [1 false]]
       throw pack [2 true]
   ]
== 2

>> x
== 2

>> fallthrough?
== ~true~  ; isotope

This seems like a much better use of the multi-return ability. And to my mind, that tips the scales to where you should do a THROW as the last line if you want "the block's result".

Then if you want to detect whether a result came from the end of the block or somewhere inside it, you can do as I did above and encode that as an extra parameter.
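
The "throw a pack, with the fallthrough flag as an extra parameter" pattern can be sketched in Python, with a tuple standing in for the pack. `Throw`, `catch`, and `search` are hypothetical names used only for this illustration.

```python
class Throw(Exception):
    """Carries any number of thrown values (a stand-in for a pack)."""
    def __init__(self, *values):
        super().__init__()
        self.values = values


def catch(body):
    """Return the thrown tuple, or None if the body finished without a throw."""
    try:
        body()
    except Throw as thrown:
        return thrown.values
    return None


def search(items, wanted, default):
    for item in items:
        if item == wanted:
            raise Throw(item, False)    # thrown from somewhere inside the body
    raise Throw(default, True)          # thrown as the last line: fell through


x, fallthrough = catch(lambda: search([1, 2, 3], 2, 0))
x2, fallthrough2 = catch(lambda: search([1, 2, 3], 9, 0))
no_throw = catch(lambda: 10 + 20)
```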

So... should CATCH return NULL or NONE if there's no throw?

Hmmm.

Right now functions that lack a RETURN statement return NONE. As a reminder, this is a multi-return pack of length zero, that has the special console behavior of printing nothing.

>> foo: func [] []
== ~#[action! {foo} []]~  ; isotope

>> foo

>>

If you really want NULL to come back from a CATCH, you can just say that with THROW NULL. It's more explicit.

catch [...] else [...]  ; default w/no throw is NONE, would not run ELSE

catch [..., throw null] else [...]  ; would run ELSE

Although...we could say that THROW always throws a "heavy null"...and then ELSE really would mean "didn't catch anything". That does seem kind of cool, that ELSE really would react to nothing being caught.

What About Named Throws... How to Know The Name?

Hmmm.

I'll start by saying I've personally felt a bit skeptical of named throws, compared with "definitional throws". :-/

But in any case, if you catch a named throw you might want to know its name. Though I notice that Red does not tell you:

red>> catch/name [throw/name 304 'red] [red blue]
== 304  ; no indication of if blue or red was caught

If we imagine you did want to know which name you caught, then here we have another competing interest for the multi-return ability... the /NAME of the throw.

In systems like C++ that support throwing, you can actually say what types of throws you want to receive. Then you receive an object, potentially re-throwing it.

Maybe this is another opportunity for isotopic objects! Recall that isotopic objects can't be stored in variables... they need to have a DECAY method to produce normal values (including potentially a multi-return), or other methods like ELSE and THEN which coerce them. But they could have a CATCH field which might just be some name or structure. Then there could be a CATCH/FILTER which would be able to pick over these isotopic objects and decide if the CATCH wanted it or not.

That sounds pretty interesting... there'd be no THROW/NAME, it would be more like:

[x y name]: catch/filter [
    throw isotopic make object! [
        catch: 'orange
        decay: does [pack [1 2 'orange]]
    ]
] obj -> [  ; only filter on isotopic objects, pass plain object here?
    did find [orange green] obj.catch
]

So something like that would wind up giving you x as 1 and y as 2 and name as orange, so I'm showing a way to get the name handed back to you.

This would mean isotopic objects with CATCH methods would be special. It wouldn't be all isotopic objects that got filtered... e.g. you might want to throw a PARSE result which I'm theorizing has special THEN and ELSE methods.
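
A loose Python model of the CATCH/FILTER idea follows. `Tagged` plays the part of the thrown object with a `catch` field and a `decay` method, and `catch_filter` is a hypothetical name. Re-raising a rejected throw is an assumption of this sketch; the post leaves open what should happen to payloads that the filter does not want.

```python
class Throw(Exception):
    def __init__(self, payload):
        super().__init__()
        self.payload = payload


class Tagged:
    """Payload with a `catch` label and a `decay` that yields plain values."""
    def __init__(self, catch, *values):
        self.catch = catch
        self.values = values

    def decay(self):
        return (*self.values, self.catch)


def throw(payload):
    raise Throw(payload)


def catch_filter(body, wants):
    """Catch a throw only if wants(payload) is true; otherwise keep unwinding."""
    try:
        body()
    except Throw as thrown:
        if not wants(thrown.payload):
            raise
        return thrown.payload.decay()
    return None


def wanted(obj):
    return obj.catch in ("orange", "green")


x, y, name = catch_filter(lambda: throw(Tagged("orange", 1, 2)), wanted)
```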

I like this direction more than I like baking in some particular /NAME mechanism. And I really like the concept of being able to THROW and CATCH multi-return packs. I'm going to fiddle around with this.