The Need To Rethink ERROR!

hostilefork · October 23, 2020, 7:48am

Looking around at how people deal with failure conditions, there's a pretty strong trend against exceptions--a trend which has grown stronger in the last batch of popular languages.

Exceptions operate on a dubious principle: that a trigger condition originating transitively in a deep stack of code can be meaningfully handled when passed up through intermediary stacks. There's an uncomfortable violation of abstraction: you are no longer dealing with a return result that was clearly formalized in the contract between caller-and-callee. "The function you called couldn't handle an error... but you're trying to do it without even knowing what that function called that failed...?"

Even exception advocates agree that they should not be used lightly...the name suggests it is only for "exceptional" circumstances. So things that are very reasonable to expect to occur during operation--like a filesystem API trying to open the file and it not being there--should fit into the normal return results. This is a slippery slope and subject to what your program considers "normal"...but the intent is that it's supposed to be more for things like "ran out of memory" or "network cable was unplugged during transfer".

But there are many who are against exceptions altogether. e.g. Google's C++ style guide is against them: "We do not use C++ exceptions." When the pros and cons are weighed, they think it's just not worth it.

Rebol code that uses TRAP (old TRY) or ATTEMPT frequently shows that the approach has an even greater weakness in Rebol than usual: any arbitrary typo inside the executing code can be interpreted as the wrong kind of failure. Conflating a syntax error with something like file-not-found is much too easy.
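As a rough analogy in Python (not Rebol), here is how a catch-everything helper makes a typo indistinguishable from the failure you actually expected. The names `attempt`, `attempt_missing_only` and `load_config` are made up for illustration; `attempt` only loosely mimics what ATTEMPT does.

```python
def load_config(path):
    # Deliberate typo: `pth` is undefined, so this raises NameError
    # before any file is ever opened.
    with open(pth) as f:
        return f.read()


def attempt(fn, *args):
    """Hypothetical ATTEMPT-like helper: swallow any failure, give back None."""
    try:
        return fn(*args)
    except Exception:
        return None


def attempt_missing_only(fn, *args):
    """Narrower variant: only a missing file counts as the expected failure."""
    try:
        return fn(*args)
    except FileNotFoundError:
        return None


# The typo is silently reported the same way a missing file would be.
result = attempt(load_config, "settings.txt")
```

With the narrower helper the NameError escapes instead of masquerading as "file not found", but that only works because Python gives the two failures different types to filter on.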

Another key contributor to exception unpopularity is that exceptions don't work well with asynchronous programming. Code that triggered a request can be off the stack while a handler is running. So there is nowhere to put a catch {} above the stack for the problem.
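A toy illustration of that in Python, assuming a plain list (`pending`, a made-up name) as a stand-in for an event loop's callback queue. The try block around the request has already exited by the time the handler fails.

```python
pending = []  # hypothetical stand-in for an event loop's callback queue


def request(on_done):
    # Schedule the handler to run later and return immediately.
    pending.append(on_done)


def handler():
    raise RuntimeError("failed after the requester returned")


caught_at_callsite = False
try:
    request(handler)  # nothing fails here, the handler has not run yet
except RuntimeError:
    caught_at_callsite = True

# Later, the "event loop" runs the handler with the requester off the stack.
escaped = None
for callback in pending:
    try:
        callback()
    except RuntimeError as e:
        escaped = e  # only the loop itself gets a chance to see the error
```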

Emergent Pattern: Branching Returns

Across languages you see a consistent pattern of functions formalizing the return of a branched result: either a "successful" return or an "error" return...and labeling the return value as such.

Haskell has "the Either monad"...which bundles a value with a label of "left" or "right". By convention, if the value is labeled "left" it is an error value...and if it is labeled "right" it is a successful return. There is no "umbrella" error datatype--so strings labeled "left" or other common tuples are typically used.
Haskell-inspired Rust has Result, which has a similar labeling scheme...though it specializes the purpose and gives the labels the names "Ok" and "Err". It suggests (but does not enforce) that Err-labeled values implement the std::error::Error trait...which meets the basic expectations of what an error should be able to do (e.g. print itself out, report the underlying error that caused it).
JavaScript ES6 handles asynchronous scenarios with promises: an asynchronous function doesn't deliver its result with return, but by calling either resolve() or reject(). If resolve is called, then callsites will trigger then() handling, otherwise they will trigger catch() handling. If await is used vs. then/catch, a rejected promise will resort to throwing the error (i.e. an exception).
Node.js used the "callback convention" where asynchronous functions pass errors as the first parameter to a callback, and if that is null the other arguments are assumed valid...but this is now typically converted into ES6 promises where errors produce a rejection.
Go uses multiple return values for errors, with a convention that the last value in the return sequence is the error...and when that error is non-nil, all other values are "zero values" of their type.
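To make the shared shape concrete, here is a minimal Python sketch of a labeled, branched return in the spirit of Rust's Ok/Err. The `Ok` and `Err` classes and the `parse_port` function are invented for this example and don't come from any library.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Ok:
    value: Any


@dataclass
class Err:
    error: Any


def parse_port(text):
    # Every exit from the function is explicitly labeled success or failure.
    if not text.isdigit():
        return Err(f"not a number: {text!r}")
    port = int(text)
    if port > 65535:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(result):
    # The caller has to look at the label before it can reach the payload.
    if isinstance(result, Ok):
        return f"port {result.value}"
    return f"failed: {result.error}"
```

The point is that the failure is part of the function's declared contract with its immediate caller, rather than something that tunnels up from an unknown depth.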

So if your own unsatisfying experiences with "throw/catch" and "fail/trap" solutions aren't enough to convince you...there's a pretty strong batch of added evidence.

Using The Parts In The Box Effectively

Something I've had in the back of my mind, ever since THEN could conveniently take a parameter value via lambdas, is: what if ELSE could too?

At the moment ELSE only triggers on NULL. But what might happen if THEN didn't take ERROR!s or nulls, but ELSE did? Something like:

 (make error! "how about this?") then value -> [
     print ["This would not run:" value]
 ] else error -> [
     print ["This would run:" error]
 ]

But that crude sketch shows a weak basis for doing error handling:

It operates in the "single return result" realm...so to use it, you'd have to conflate ERROR! values in with a function's ordinary return values. That would be tricky to avoid including ERROR! itself, if it can return ANY-VALUE! (e.g. you can PICK an ERROR! out of an array)
There's nothing guaranteeing you handle an error...you could just forget and leave off the ELSE (The "good" versions of the branched error approaches make sure you never accidentally ignore them, you have to consciously throw them away.)
It may be (probably is?) a bad fit to fold this into ELSE...which isn't really about error handling, but whether branches are taken.

Using SET-BLOCK! for multiple return values wedged open the door for being more effective in this space. But we can think through this some more.

One thing I notice in JavaScript and Go is that the error result is distinguished, but positioning is set by convention instead of by name. As a random sample thought, we could syntactically push errors out somehow, e.g. with a TUPLE!

; function that returns 3 values and a possible error
;
[a b c].err: some-func arg1 arg2 arg3

I didn't say it was a great idea--just pointing out a degree of freedom.

Could We Get More Mileage Out Of ERROR!?

Rebol2's ERROR! had an interesting aspect to it, as being an "ornery" value...like a VOID!...that you couldn't inspect normally. You had to DISARM it and view it as an object in order to pick apart its properties.

R3-Alpha's ERROR! was neutered and became just another flavor of OBJECT!. It carries a bit of standardized information about what line and location it originated from...but arguably this could be a useful feature for any value (internally to the system for debugging, I have functions like Touch() which will tag a value with the last place that modified it...and it comes in handy a lot).

Maybe it's all right as it is... and what we're missing is more like Rust's result. But I can't help but feel that in a universe of possible designs...that "OBJECT! that reports a different type" is weak.

Again: I don't have any great ideas right now. This is just noticing something and brainstorming. Having multiple returns in the mix is good, but I'd like to see something that's at least as good as what other languages have.

hostilefork · October 4, 2021, 10:00pm

Thoughts on CALL And Errors

In @gchiu's small pharmac script, there are a couple of uses of CALL that run ghostscript (gs):

call [
    gs -sDEVICE=pngmono -o (join root "-%02d.png") -r600 (pdfname)
]

call [
    gs -sDEVICE=eps2write -sPAPERSIZE=a4
        -o (join root "-%02d.eps") (pdfname)
]

But if either of these calls fails, the script will keep running. That is because CALL is not checking the result code...and in OS terms, any process that returns a nonzero value had an error.

I think the default behavior of CALL should be to raise an error on a nonzero result.

The best strategy seems to be that if you request the exit status explicitly, then CALL would assume you were going to handle it.

[# status]: call [...]

What makes this a little bit weird is that there doesn't seem to be any particularly great return result from CALL besides the integer return value. So there'd be a ~none~ return result by default, and this status would be the secondary result.

This doesn't exactly fit the pattern of my "distinguished error" result, because the status is an integer... not an ERROR!. Though if you don't ask for the status and CALL raises an error because of that, you could see it that way:

[#].err: call [...]
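For comparison, Python's subprocess.run already has this kind of switch: check=True raises CalledProcessError on a nonzero exit, while check=False leaves the status for you to inspect. The sketch below wraps it in a made-up `call` function with a hypothetical `want_status` flag, just to mirror the "raise unless the status was requested" idea. It is not how CALL works.

```python
import subprocess
import sys

# A child process that exits with status 1 (stand-in for a failing tool).
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]


def call(cmd, want_status=False):
    """Hypothetical CALL-like wrapper: raise on failure unless status is requested."""
    completed = subprocess.run(cmd, check=not want_status)
    return completed.returncode if want_status else None


# The caller asked for the status, so no error is raised.
status = call(failing, want_status=True)

# The caller did not ask, so the nonzero exit becomes an error.
try:
    call(failing)
    raised = False
except subprocess.CalledProcessError as e:
    raised = True
    exit_code = e.returncode
```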

Could a Function Tell If You "Use" Its Primary Return Result?

It's hard to say what "using" would mean.

For instance, imagine this scenario:

>> call [rm nonexistentfile.txt] print "nocheck"
** Error: CALL returned exitstatus 1
** Near: rm nonexistentfile.txt

So I'm proposing that would raise an error in a script. But what if you wrote:

>> call [rm nonexistentfile.txt]
== 1

The value 1 got "used" in the sense that it was reported by the console. Should that count as suppressing the error, since someone took the result?

What about:

>> elide call [rm nonexistentfile.txt]

The ELIDE consumed the result, but threw it away. It doesn't seem that should suppress errors.

Really only multiple-returns are set up to be explicitly thought of as "requested" vs. not. We could argue that explicit setting to a word is different:

status: call [rm nonexistentfile.txt]

And then CALL could be told whether there was an explicit assignment or not. But the way that multi-returns are checked is if their argument names are null or not...and the RETURN argument isn't named right now (it's just the RETURN: in the spec).

Overall I am wary of trying to do anything detection-wise with the main return, and would keep that for multi-returns...which are really just syntactic sugar over refinements. That has this weird implication for CALL not having the exitstatus as its primary return. Oh well.

Thoughts on CALL and Asynchronousness

In the model I'm pushing for asynchronousness, the earlier argument about exceptions not working well with asynchronous code isn't as applicable...because the error can be received by the callsite.

What you'd want to do to make CALL asynchronous would be to just spawn it off on a separate "goroutine"-like thing.

If you didn't care about the result, you could just do something like this:

 go [call [echo "I'm being printed by the OS"]]
 print "This could print before the echo executes."

But if you wanted to coordinate, you'd make something like a "channel" and have the goroutine transmit updates on it.

let c: make-channel
go [
    send-channel c <start>
    call [echo "I'm being printed by the OS"]
    send-channel c <finish>
    close-channel c
]
while [msg: receive-channel c] [
   print ["Got signal from channel:" msg]
]
print "Channel has been closed"

That would be one strategy. But a different strategy would be used if you wanted to communicate with the input and receive the output, in a streaming fashion.
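The channel version has a close analog in Python using a thread and queue.Queue, sketched here. Python's queue has no "close" operation, so the `CLOSED` sentinel is an assumption of this sketch, and the external command is left as a comment.

```python
import queue
import threading

CLOSED = object()  # sentinel standing in for "channel closed"


def worker(c):
    c.put("<start>")
    # ... the external command would run here ...
    c.put("<finish>")
    c.put(CLOSED)


c = queue.Queue()
threading.Thread(target=worker, args=(c,)).start()

# Block on the channel until the worker signals that it is done.
received = []
while (msg := c.get()) is not CLOSED:
    received.append(msg)
```

Because there is a single producer and the queue is FIFO, the receiver sees the messages in the order they were sent.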

hostilefork · June 9, 2022, 6:38am

hostilefork:

(make error! "how about this?") then value -> [
    print ["This would not run:" value]
] else error -> [
    print ["This would run:" error]
]

But that crude sketch shows a weak basis for doing error handling: (Note: numbering added)

(1) It operates in the "single return result" realm...so to use it, you'd have to conflate ERROR! values in with a function's ordinary return values. That would be tricky to avoid including ERROR! itself, if it can return ANY-VALUE! (e.g. you can PICK an ERROR! out of an array)

(2) It may be (probably is?) a bad fit to fold this into ELSE...which isn't really about error handling, but whether branches are taken.

(3) There's nothing guaranteeing you handle an error...you could just forget and leave off the ELSE (The "good" versions of the branched error approaches make sure you never accidentally ignore them, you have to consciously throw them away.)

The Above Was Written in 2020.

In 2022: (1) (2) (3) => ^META

We address (1) by multiplexing ERROR! return results onto the main return channel backbone with ^META conventions!!!

e.g. with ^META, we can say that if a function you are calling raises an error, then you get a QUASI!-ERROR! result.

>> x: ^(1 / 0)
== ~make error! [
    type: 'Math
    id: 'zero-divide
    ...
]~

This is distinct from what would happen if evaluation simply produced an ERROR! value that had never been raised. That would be quoted by the ^META process.

Then we could say that UNMETA of a non-quoted ERROR! will raise that error. This gives a time between the ^META and UNMETA for you to handle it, if you can do so meaningfully. If you don't, it fails.
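Here is a loose Python model of that round trip, only to show the distinction between a raised error and an inert error value. The `Raised` and `Quoted` wrappers and the `meta`/`unmeta` functions are invented names for this sketch and say nothing about how Ren-C implements it.

```python
class Raised:
    """An error captured as a value (loosely, the quasi-error result)."""

    def __init__(self, error):
        self.error = error


class Quoted:
    """Any ordinary result, wrapped so it can't be confused with Raised."""

    def __init__(self, value):
        self.value = value


def meta(thunk):
    # Run the code and capture a raised error instead of letting it propagate.
    try:
        return Quoted(thunk())
    except Exception as e:
        return Raised(e)


def unmeta(m):
    # Undo the wrapping: a captured error is raised again, a value comes back.
    if isinstance(m, Raised):
        raise m.error
    return m.value


x = meta(lambda: 1 / 0)                # raised: captured, nothing fails yet
y = meta(lambda: ZeroDivisionError())  # an inert error value: just quoted
```

Between `meta` and `unmeta` the caller can inspect `x` and decide whether it knows how to deal with the error. If it does nothing and unwraps, the error is raised as usual.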

We solve (2) by saying ELSE and THEN both reject ERROR! meta-values...

Instead, special handling is done through something like an EXCEPT clause:

if x [1 / 0] then [
    print "Skipping THEN"
] else [
    print "Skipping ELSE"
] except e -> [
    print ["Handling" mold e]
]

We address (3) by making an isotopic ERROR! that falls out of an evaluation chain without being caught cause a failure.

That means that if the above were a THEN and ELSE with no EXCEPT, the error would be offered as ^META to each clause, piped through. But then if the overall expression result wasn't targeting a ^META slot you'd get a problem.
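A small Python model of that piping, under the assumption that a result is one of three kinds (value, null, error) and that the final `unwrap` plays the role of a non-^META slot. The `Outcome` class and `evaluate` helper are hypothetical and exist only for this sketch.

```python
class Outcome:
    def __init__(self, kind, payload=None):
        self.kind = kind  # "value", "null", or "error"
        self.payload = payload

    def then(self, branch):
        # Runs only for ordinary values; nulls and errors pass through.
        if self.kind == "value":
            return Outcome("value", branch(self.payload))
        return self

    def else_(self, branch):
        # Runs only for null; values and errors pass through.
        if self.kind == "null":
            return Outcome("value", branch())
        return self

    def except_(self, handler):
        # The only clause that is allowed to consume an error.
        if self.kind == "error":
            return Outcome("value", handler(self.payload))
        return self

    def unwrap(self):
        # End of the chain: an error nobody handled becomes a failure.
        if self.kind == "error":
            raise self.payload
        return self.payload


def evaluate(thunk):
    try:
        result = thunk()
    except Exception as e:
        return Outcome("error", e)
    return Outcome("null") if result is None else Outcome("value", result)


log = []
handled = (
    evaluate(lambda: 1 / 0)
    .then(lambda v: log.append("then"))
    .else_(lambda: log.append("else"))
    .except_(lambda e: log.append("except") or "recovered")
    .unwrap()
)
```

Dropping the `except_` step from the chain leaves the error to reach `unwrap`, which raises it, so forgetting to handle the error is loud rather than silent.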

This Means (Effectively) that there are ERROR! Isotopes

The trick is that error isotopes don't decay when you try to store them in variables; they trigger failure.

In a surface-level way, this somewhat resembles Rebol2's "armed" and "disarmed" errors. But it fits into a deeper story. And with stackless, it should be performant--because we can effectively be running a "TRAP" on every stack level at near zero cost (vs. the setjmp()/longjmp() that would be required in a non-stackless model on every frame)

This has potential, and I'm very curious to try it out.