Make Your Own Safety?

At times it seems like Rebol can't decide if it's an extremely high-level language, or some kind of assembly language. It's subject to interpretation (pun possibly intended).

Certain unchecked assumptions seem like disasters ready to happen. For instance, if GET of a PATH! allows evaluation:

>> o: make object! [x: 10 y: 20]

>> value: 'o/(print "Boo!" | either condition 'x 'y | elide condition: false)
>> condition: true

>> get value
Boo!
== 10

>> get value
Boo!
== 20

The word GET does not really seem like it should have side effects. You may not think to check that value is a PATH!. You may expect two GETs in a row to return the same thing. And so on.

But if you put code as a GET-PATH! in source, it wouldn't seem so uncomfortable:

>> :o/(print "Boo!" 'x)
Boo!  ; Well, I told it to say that at this exact callsite, must have meant it...
== 10  

For this case Ren-C tried to straddle the line, by allowing it in the GET-PATH! and SET-PATH! cases of the evaluator but not in the GET and SET native operations. But this starts creating a drift to where GET of a PATH! is not the same as a GET-PATH!. That gets tangled up pretty fast: refinements here don't line up with choices there, and it feels asymmetrical and unkempt.

Is it worth the tangle? Isn't everyone just one step away from assuming that do compose [(value): ...] will work the same as if you said SET VALUE? Is there really such a difference between the "English" get 'foo and the shorthand :foo in the first place?
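To make that assumed equivalence concrete, here is a sketch using the O object from above (assuming COMPOSE substitutes the group in the SET-PATH! position):

value: 'o/x

do compose [(value): 30]  ; composes to [o/x: 30] and runs the SET-PATH!
set value 30              ; the "English" version, presumed equivalent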

Rebol is for customization; should people build their own safety?

One hates to pass the buck and say "well, the user can do it". But if you're making a system that's small enough to Put The Personal Back Into Personal Computing, maybe you don't want to second guess things like what everyone would want from SET. You might guess wrong.

Ren-C's pursuit is Power to the People: giving users the means to address the pain points that specifically peeve them, without waiting for a language implementer to do it. For example:

set: adapt 'set [
    if any-path? target [
        for-each x target [if group? x [fail "GROUP! in ANY-PATH!"]]
    ]
    ; fall through to normal SET behavior
]
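With that adaptation in effect, a session might look like this (hypothetical output):

>> set 'o/(either condition 'x 'y) 30
** Script Error: GROUP! in ANY-PATH!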

The SET as used by the mezzanine will keep on working. The goal is that this definition applies in whatever place you were doing your work.

They didn't have to redefine the function interface, or rewrite the HELP, unless they wanted to. This is the essence of what we're going for. And there are ways to make it more efficient: you could implement that check as native code against the internal API if you wanted.

So is the main value consistent behavior between GET-XXX! and GET (actually GET/ANY)? Then have a GET/HARD which doesn't evaluate groups, treating them as data (e.g. set/hard 'my-map/(1 + 2): <sum> would actually consider the GROUP! (1 + 2) to be the key). There are other reasons this ability is important besides maps, such as avoiding double-evaluation in constructs like DEFAULT, which have to both GET a left-hand path and SET it, and don't want to run groups twice.
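A sketch of how such /HARD refinements might behave on a MAP! (hypothetical, since this is only a proposed design):

>> m: make map! []

>> set/hard 'm/(1 + 2) <sum>  ; key is the GROUP! (1 + 2) itself, not 3
== <sum>

>> get/hard 'm/(1 + 2)
== <sum>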

...Maybe? But not everything should be laissez-faire...

The argument above about GET works because if you don't like it, you can change it. But what happens when the very tools by which you might change things fall down on you at a basic level?

We know how quickly things go to hell if you put functions in blocks and try to enumerate them.

>> block: reduce [:print]

>> for-each i block [if group? i [print "Found a group!"]]
Found a group!
#[void]
** Script Error: if is missing its branch argument

I don't want to belabor the implications for security here. That's another thread.

But it seems to me that if the bias on a few things were tweaked a little, it would be a big help. One thing I've written about before is the idea that a GET-WORD! in this slot would not mean soft quote, but rather "I can accept ACTION!s". If you leave it off, you can't accept them...and the FOR-EACH errors. Seeing the notation for-each :i block [...] would be a visual reminder that you're dealing with an iteration where i might be an action, so you should be sensitive to that.
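Under that proposal, the example above might play out like this (both the rule and the error text shown are hypothetical):

>> for-each i block [probe i]
** Script Error: i may hold an ACTION!, use for-each :i to allow this

>> for-each :i block [if action? :i [print "Found an action!"]]
Found an action!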

These are places that have to be looked at, and looked at soon. The web demo is going pretty well so far, even a bit ahead of schedule for Beta/One. But topics like this are weighty and monstrous, and there are so many of them that we would be looking at decades if they all were to be known-to-the-extent-we'll-know-them.

So in the end, what should SET do on PATH!s with GROUP!s?

I was actually going to bias it so that SET goes ahead and runs GROUP!s in the path. But looking at the reality of the code is giving me cold feet. As is often the case, the process of trying to reverse a change that was put in for a reason is a reminder of the motivations.

I'm going to keep thinking about it. But still, the point I raise here is a valid one. There may be a general principle that we should be very selective about where we make our safety pushes, being mindful of how difficult it would be for a user to customize the feature for themselves. The harder it would be, the more attention that issue should get.


As another example of ceding "safety" to "expressivity", enfixedness has gone back to being an intrinsic property of functions. As such, it is preserved by assignment, or by passing as a parameter.

>> plus: :+

>> 1 plus 2
== 3

>> foo: function [arity-two [action!]] [
     print "Hello"
     arity-two 1 2
]

>> foo :plus
** Script Error: arity-two does not allow text! for its value1 argument

So you're getting an arity-two function, but you don't know its parameter convention. Hence it's dangerous to invoke it without an APPLY.

Breaking enfix functions out into a separate datatype (like OP!) would sort of help. But note that in Rebol2/R3-Alpha/Red there was only the ANY-FUNCTION! class; there was no ANY-PREFIX! distinguished from OP!. You could have made ANY-PREFIX! as a typeset yourself, but we have to ask how useful this really is. When invoked through the evaluator, any aspect of the calling convention you don't know about is going to be a problem: not just using the wrong number of args, but even with the right number, a function that quotes an argument is going to behave differently from one that doesn't when you call it.
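(In R3-Alpha terms, such a typeset might have looked something like the line below. This is just a sketch; the exact membership depends on which function flavors you count as "prefix".)

any-prefix!: make typeset! [native! action! function! closure!]  ; ANY-FUNCTION! minus OP!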

So if there's a way of enforcing a type convention on function arguments, a signature's left-looking-ness seems like just another thing you'd have to check. Or you should use an APPLY that supplies the arguments on your own terms.
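For instance, FOO could be rewritten to supply the arguments on its own terms. This sketch assumes the historical Ren-C APPLY taking SET-WORD!s in a block, and that the parameters of + are named value1 and value2 (as the earlier error message suggests):

foo: function [arity-two [action!]] [
    print "Hello"
    apply :arity-two [value1: 1 value2: 2]  ; named args, no enfix surprises
]

>> foo :plus
Hello
== 3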

It's not in vain to try things to improve the safety of the language. Making enfix not carry through assignments helped develop some interesting properties of the system that are serving in other places today, like CONST. But the idea that a function you trusted could change its calling convention when passed through a simple assignment just turned out not to be worth the perceived "safety" it was granting.


I was a bit frustrated by MOLD, because it was erroring on NULL input.

I was looking at this case:

print ["RETURNS:" either return-type [mold return-type] ["(undocumented)"]]

If MOLD of NULL returned NULL, I could just say:

print ["RETURNS:" mold return-type else ["(undocumented)"]]

But this risks catching me unawares when asking to MOLD a NULL in other cases.

However, we've decided this is legal:

>> unspaced ["a" if false ["b"] "c"]
"ac"

So even though there is a general rule that functions should not silently pass on NULL, MOLD is one of those cases where it's actually useful to do so.

Tying this back to "Make Your Own Safety..."

We should make it very easy for someone to make a version of MOLD that does error on NULL input.

Even without easy type signature manipulation (coming soon?) there's already a pretty good answer for this case with ADAPT:

mold: adapt :mold [
    if null? :value [fail 'value "MOLD adapted to not take NULL"]
]

This customized mold has all the refinements and help information of the original, and by "blaming" the callsite (with the fail 'value) the error is even half-decent.
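So tripping the adapted check looks something like this (hypothetical output):

>> mold null
** Script Error: MOLD adapted to not take NULL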

The idea here of making it painless for people to twist constructs to address the problems that irritate them personally seems the best bias to take. We want these function variations to be frictionless and efficient enough that you don't feel burdened by creating a lot of them.

It's going to pretty much always be easier to add checks at higher levels than to remove them from lower ones. So I think allowing NULL inputs to MOLD is a good default.
