Pure and Refined: Simplifying Refinements to One or Zero Args

hostilefork · March 18, 2019, 9:49am

In bringing back a modernized positional APPLY as the default, there are some really cool possibilities at hand. We can do something people have wanted for a while, e.g.:

apply :append [series value /dup n + 1 /part skip series 4]

Generalized quoting gives us an interesting ability because we can actually quote refinements in such a model:

apply :append [series '/append-me /dup n + 1 /part skip series 4]

Since APPLY (without /ONLY) evaluates its arguments, the APPLY operation can tell the difference between a refinement meant literally and one not. (You might have done this with a GROUP! before, but that's noisier and also more costly in the evaluator, while this is practically free.)

That's pretty "exciting". But it got me to thinking about skippable parameters like predicates, or the label to COMPOSE with. Those things are looking like they're going to be a very big deal--paradigm shifts. So how could you specify those?

Even though they're not refinements, you could specify them by their parameter name expressed as a refinement:

block: [(1 + 2) (<*> 1 + 2)]
apply :compose [block /label <*>]

So inside of COMPOSE, label won't be a separate refinement variable...it will be the actual TAG! <*>.

This got me to wondering...

Why Do Refinements Need More Than One Value Anyway?

I'm always frustrated trying to name refinement arguments. If the function takes a /PART, why can't the variable be called PART? What's this other name for? Isn't that what NULL is for in the first place? To indicate the absence of a value?

This might make it a little harder to discern if PART could be a LOGIC! or INTEGER!. You don't have as easy a test for discerning them. But why should refinements be any easier than the entire rest of the system on this matter? And how often does this actually happen?

Functions with refinements have historically been pretty confusing, and having a refinement that takes more than one argument is extremely rare. If you really need multiple arguments to a refinement for some reason, there are blocks and paths and such.

Having the function arguments be refinements themselves has been an interesting experiment. And it's very useful for refinements that don't have any arguments, because then the arguments themselves can't serve the "present or not" status.

But it's not like it would be that hard to write something like this:

>> used: func ['refinement [path!]] [
    if not null? get refinement [
        refinement
    ]  
]

>> foo: func [/a /b] [print [used /a used /b]]

>> foo/a
/a

>> foo/a/b
/a /b

(I think @IngoHohmann made something of the sort a while ago.) In any case, my point is that I think we can live without a separate "status of the refinement" and value.

How it would look in practice

Imagine this function interpreted under the new understandings:

foo: function [
    arg1 [block!]
    /ref1
    arg2 [string!]
    /ref2 [integer!]
][...]

What this would actually be saying is that you have a /ref1 refinement whose only value is its use or disuse. This would be like any refinement without an argument today. It would be blank if not used, and for good measure we could make it hold /ref1 as its value if used (seems better than making something else up, and actually has applications for 0-arg refinements).

But then, arg2 is just another normal argument that comes after it. And ref2 is a refinement with an integer! argument--but that integer argument would arrive in the ref2 variable itself, or it would be a blank.

So what this function actually is doing would be like the following in today's world:

foo: function [
    arg1 [block!]
    arg2 [string!]
    /ref1
    /ref2
    ref2arg [integer!]
][
    ref2: ref2arg
    unset 'ref2arg
    ...
]

Already you can see that it wouldn't be that different from today. And Ren-C already has tricks up its sleeve for doing legacy emulations...the old behavior of getting multiple arguments would be emulated one way or another, without doing too much extra work. (The simplest emulation would allow the same notation for single-argument refinements, and error with more than one argument--and that is likely sufficient.)
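To make the equivalence above a bit more concrete, here is a toy model in Python (not Ren-C, and not how the evaluator is actually written). It treats a frame as a dictionary of slots and uses `None` for "refinement not used". The function names are made up for illustration; only `ref2` and `ref2arg` come from the spec above.

```python
# Toy model only: a "frame" is a dict mapping variable names to slot values.
# None stands in for "refinement not used". All function names are hypothetical.

def fulfill_old(ref2_used, ref2arg=None):
    """Historical layout: one slot for the refinement's status, one for its argument."""
    return {"ref2": ref2_used, "ref2arg": ref2arg if ref2_used else None}

def fulfill_new(ref2=None):
    """Proposed layout: the refinement's own slot carries the argument."""
    return {"ref2": ref2}

def emulate_new_from_old(old):
    """Mirrors the `ref2: ref2arg` / `unset 'ref2arg` shuffle from the spec above."""
    return {"ref2": old["ref2arg"] if old["ref2"] else None}

old = fulfill_old(True, 10)
new = fulfill_new(10)
print(len(old), len(new))                # 2 1
print(emulate_new_from_old(old) == new)  # True
```

The point of the sketch is just that the two-slot form carries no information the one-slot form can't, as long as "unused" and "used with no value" are allowed to mean the same thing.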

It would save space and speed the system up

Right now when you have a refinement with an argument, that's two frame cells to fulfill. Collapsing it to one is obviously more efficient.

But saving on storage is only part of it. There's a lot of evaluator complexity trying to keep the state and worrying about there being more than one argument...looping, checking. A ton of complexity just vanishes with this.

The "refinement revocation" methods of today are more complex than they need to be as well. You can get in dicey situations where you've revoked one argument and not another. Specialization has to cover cases where you set the refinement to false but the value to true. The fact that you can always make a parameter a block if you really want it to carry multiple arguments seems to solve a lot of problems.

You could put normal arguments after refinement arguments

I show in the example above putting an ordinary argument after a refinement argument. That may not look all that useful to you. Maybe it would have some help in putting related parameters together without worrying about whether they were optional or not...kind of letting you express things in the flow of your thought.

But there's a really compelling reason to do this mechanically for deriving functions that add new arguments:

Because of the way frames work positionally, you can't derive one function from another in a way that reorders its arguments. This means that if you derive from a function that has only normal arguments, you can add another normal argument today, because it's not implied that everything after that point belongs to a refinement. But once you've entered the "refinement zone", it's a point of no return.

This would correct that weirdness and permit extending functions with more parameters, either regular or refinement, and not run the risk of a regular argument getting picked up as an argument to a refinement it wasn't intended for.

You could write your arguments in any order in APPLY

It helps make sense of the "The refinement names the argument you are about to give" situation. But why not let you put refinements anywhere in an APPLY?

>> block: [1 2 3]
>> apply :append [/dup 2 /value <x> block]
== [1 2 3 <x> <x>]

The current refinement processing mechanics would be much easier to rationalize and simplify under this model and likely make such reimaginations possible--as well as other forms of lightweight skinning that let you reorder function arguments on a whim.
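As a rough sketch of how such an order-independent APPLY could fill a frame, here is a Python model (again, not Ren-C). It assumes a made-up convention where a string starting with `/` names the slot the next value goes into, and any unnamed values fill the remaining normal parameters left to right. It does no evaluation, and it ignores zero-argument refinements and quoted refinements; `apply_args` and the parameter list are hypothetical.

```python
def apply_args(params, items):
    """Fill a frame from an APPLY-style list of items.

    params: ordered (name, is_refinement) pairs from a hypothetical spec.
    items:  values, where a string starting with "/" names the slot that
            the *next* item fills; other items are positional.
    """
    frame = {name: None for name, _ in params}
    named = set()
    positional = []

    # First pass: consume "/name value" pairs, set everything else aside.
    it = iter(items)
    for item in it:
        if isinstance(item, str) and item.startswith("/"):
            name = item[1:]
            if name not in frame:
                raise KeyError(f"no parameter named {name}")
            try:
                frame[name] = next(it)
            except StopIteration:
                raise ValueError(f"/{name} has no value after it")
            named.add(name)
        else:
            positional.append(item)

    # Second pass: leftovers go to normal params that weren't named.
    open_slots = [n for n, is_ref in params if not is_ref and n not in named]
    if len(positional) > len(open_slots):
        raise ValueError("too many positional arguments")
    for name, value in zip(open_slots, positional):
        frame[name] = value
    return frame

append_params = [("series", False), ("value", False), ("part", True), ("dup", True)]
frame = apply_args(append_params, ["/dup", 2, "/value", "<x>", [1, 2, 3]])
print(frame["series"], frame["value"], frame["dup"], frame["part"])
```

Because every parameter is exactly one slot, naming a normal argument and naming a refinement are the same operation here, which is the simplification being argued for.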

I haven't tried writing it yet, but...

When I think of all the various parts of the system that get bent out of shape over edge cases, I have to say I think this sounds like it may well be a winner.

For a while we could disable the ability to put normal parameters after a refinement, and just raise an error if you do that. So you'd know to convert /foo bar [integer!] to just /foo [integer!]. In the future though, cases like bar would start working as being a normal parameter.

The only casualties I can think of are using blanks as refinement arguments, and being able to do partial refinement specialization inside of an APPLIQUE. So you couldn't specialize like this:

  applique :append [part: true ...]

That would assume you wanted PART to be the value true. For a partial specialization (e.g. one that says you get the behavior as if you'd written APPEND/PART at the callsite, getting a refinement as a normal arg) you'd have to say:

 applique :append/part [...]

I can think of some other mechanical complications, but nothing overwhelming off the top of my head.

IngoHohmann · March 20, 2019, 3:45pm

My gut reaction: I don't like it. It is an important part of Rebol, though I hate to have to come up with a name for a variable for a refinement.

Now I've checked the core and only found 2 functions using it. For whatever that's worth.

Things to consider when using blocks:

documentation (how many values, which types)
type checking
what's the impact of creating unnecessary blocks?

hostilefork · March 20, 2019, 4:25pm

Why worry about the impacts of something that never happens? Doesn't seem too relevant when you've "checked the core and only found 2 functions using it" and "hate to have to come up with a name for a variable for a refinement"...

Take my word for it from writing the evaluator and things like SPECIALIZE. If you want to talk about impacts on the system from a performance and memory standpoint, this is a huge benefit. The cost of the blocks you'll never make pales in comparison. Shortens function specs, saves space, simplifies broadly.

The main potential drawbacks that I see are:

Can't pass a BLANK! as a refinement argument. But as the philosophy of BLANK! and NULL equivalence has been going further, that's probably for the best. NULL was revoking refinements and it's getting pretty prescriptive that a NULL and a blank are the same if you receive them. e.g. when parse ... [... :(if false [...]) ...] splices in a "null" it acts the same as if you had a blank and ignores the rule, same for print. Semantically blanks and nulls should have the same meaning, e.g. nothing?, both should count as no value held by a variable for DEFAULT's purposes, the list goes on...
We lose a feature that was Ren-C only, which was to error when you tried to use a refinement argument that was not provided. This dials back to the historical behavior where unused refinement "arguments" are blank and not unset. But I can't say I feel being saved from such errors has shown to be a world-changing benefit. The biggest advantage it had was conceptual, by letting you not be fooled into thinking you'd been passed an explicit blank as a refinement argument when the refinement had not been used. But per the previous point, I think that ability of having blank "used" refinement arguments was likely misguided.

I'll see if while implementing it I find any gotchas, but I am anticipating this wiping out huge amounts of complexity, and losing nothing of real value...while opening the door to important features like the ability to extend functions with normal arguments. So if you had a function that took two normal arguments and a refinement, you can make an adaptation that takes another normal argument and it won't be picked up as an argument to the refinement at the end of the existing spec.

hostilefork · March 21, 2019, 7:21pm

So there's one case I just saw looking at an unrelated issue: what if you have an idea that's coupled with something that wants to be able to signify "any-value!, or nothing at all".

Consider ARRAY/INITIAL. It wants to pass in an initial value. As it so happens, having that initial value not be BLANK! isn't important, because all it does if you don't use the refinement is turn it into a blank. But imagine if it wanted to do something different in the /initial blank and no-/initial cases:

array: function [
    {Makes and initializes a block of a given size}
    size [integer! block!] "Size or block of sizes for each dimension"
    /initial "Specify an initial value for all elements"
    value "Initial value (will be called each time if a function)"
        [any-value!]
][...]

But there's a tidy solution:

Allow <opt> refinements.

By marking a refinement <opt> you are saying that you distinguish BLANK! and NULL, and this distinction is meaningful. So that's a request not to automatically treat lack-of-use of the refinement as "blanking" (which is usually done for convenience).

So instead you'd write:

array: function [
    {Makes and initializes a block of a given size}
    size [integer! block!] "Size or block of sizes for each dimension"
    /initial "Initial value for all elements, called each time if a function"
        [<opt> any-value!]
][...]

Then if the refinement isn't used (or is "used" but passed a null argument), you get NULL...not blank.

This lines right up with the current philosophy, that if you are passed a null argument to a refinement it shapes it up as if you didn't get the refinement passed at all. It's a better way of doing it, with fewer edge cases or possibilities for error!
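Here is a small Python model of that rule, with `None` standing in for NULL and a sentinel object standing in for BLANK!. The names `BLANK` and `receive` are invented for the sketch, and it only covers the "what does the body see" question, nothing about type checking.

```python
class Blank:
    """Stand-in for BLANK!; one shared instance is used below."""
    def __repr__(self):
        return "_"

BLANK = Blank()

def receive(value=None, opt=False):
    """Return what the function body would see in the refinement's variable.

    value is None when the refinement was unused (or "used" with a null).
    Without <opt>, that absence is converted to BLANK for convenience.
    With <opt>, it stays None so the body can tell blank and null apart.
    """
    if value is None and not opt:
        return BLANK
    return value

print(receive())                 # _     (ordinary refinement, unused)
print(receive(opt=True))         # None  (<opt> refinement, unused)
print(receive(BLANK, opt=True))  # _     (<opt> refinement, explicit blank)
```

So an `<opt>` version of ARRAY/INITIAL could treat "no initial value" and "initial value is blank" differently, while every other refinement keeps the convenient blanking.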

Mark-hi · March 25, 2019, 6:16pm

I am confused. Refinement parameters always come after "normal" parameters. If a function specifies two refinements (with or without refinement-arguments) and two arguments, the only reordering power you get at the call site is that the refinements can be expressed in either order. The two parameters must come, in the order their corresponding arguments are present in the spec block, before any refinement parameters, whatever order they must be in. Currently, if you wanted to add a new argument, you'd put it in the spec block before the first refinement -- which must be exactly as hard as inserting it before the <local>, which you'd have to do for your idea anyway.

hostilefork · March 25, 2019, 6:23pm

In order to reuse an implementation, the frame you build must have its parameters in the positional order that the underlying implementation expects by the time it gets to the point of running its code.

As it happens, there is no <local> preserved in the spec. A higher-level transformation just turns the locals into SET-WORD!s:

 foo: function [x <local> y z] [...]  ; spec transformed to [x y: z:]

So <local> is already collapsed to the idea of a property on the parameter itself, not "everything after some point". There's no "gear to shift"--the visitation of the local is the only moment in which it's in local mode, then it's on to the next one. So a derived function can theoretically add normal parameters after that without being confused. (I say theoretically because right now there are a bunch of asserts that derived frames have the same length as the frames they inherit from, but we would have to relax this to let ADAPT add refinements and parameters after-the-fact...but this is doable.)

Refinements, on the other hand, do effectively "shift gears" of the frame fulfillment. I'm talking about not having this gear-shift. The elimination of the shift is a very good and simplifying thing, saving memory cells, accelerating the system and removing a bunch of code. And nothing of significant value is lost.

We'll get type checking of patterns inside of BLOCK!s etc. one way or another. Right now I'm attacking the bitset-of-64-bits limit for types.

hostilefork · April 6, 2019, 6:21pm

Though Ingo didn't mention it here, on the commit itself he said:

I have to take your word about this simplifying and speeding up the evaluator.
I can't currently comment on the code quality, but I trust you there.
I obviously haven't tried this, yet, but I seem to have come to like the idea, so go for it.

You can still name the refinement variables if you like. It makes the most sense to do so when your refinement's name is also the name of an active function; given the shortage of short non-noun names available in Rebol, you usually want to recover that meaning from LIB. The previous model didn't do this...so you used up two names.

For instance, in these patches to @rgchris's curl.reb, the /AS refinement is changed to /AGENT. I don't know if the right decision would be to call the refinement /AGENT (probably is), but I wanted to show the issue of when you want to get AS the operator back how you have to do it today...which you would have had to do using such a refinement name. I think there should be something in the spec dialect which does this shuffle for you...which probably means capturing the meaning of AS however it was when the function was defined (e.g. COMPOSE-ing it into the body) vs. blindly getting it from lib.

That's all open to suggestions, but this change is in, and the Redbol layer emulates the old interface. Let me know of any questions/comments/concerns.

(Note: While I may make it "look easy" to do these kinds of changes to the system, they are -a lot- of work. But the impressive thing is the resilience of the asserts and debug mechanics that allow the changes. It's not the kind of thing R3-Alpha/Red would be prepared to do. Which indicates just how much more potential the Ren-C codebase has in it.)

hostilefork · July 8, 2020, 12:08am

I came across an old comment on Redo of an action. This concept of "redoing" an action originated from the PORT! code, where there'd be a generic definition of an interface function on the port...which would then have to be redispatched to a usermode function. So the arguments were gathered for the archetype, and then had to be moved into the right slots to line them up in the implementation...which could be at entirely different positions in the frame.

(The same code was later used in HIJACK when functions were hijacked with functions that were not derived from the same base, e.g. via ADAPT/SPECIALIZE/ENCLOSE. Thus they can have incompatible frames, so some guesswork needs to be done in a similar way.)

The comment I wrote says this, and take note of what I say about the difficulty of doing the mapping "in the face of targets that are 'adversarial' to the archetype":

// This code takes a running call frame that has been built for one action
// and then tries to map its parameters to invoke another action.  The new
// action may have different orders and names of parameters.
//
// R3-Alpha had a rather brittle implementation, that had no error checking
// and repetition of logic in Eval_Core.  Ren-C more simply builds a PATH! of
// the target function and refinements, passing args with EVAL_FLAG_EVAL_ONLY.
//
// !!! This could be done more efficiently now by pushing the refinements to
// the stack and using an APPLY-like technique.
//
// !!! This still isn't perfect and needs reworking, as it won't stand up in
// the face of targets that are "adversarial" to the archetype:
//
//     foo: func [a /b c] [...]  =>  bar: func [/b d e] [...]
//                    foo/b 1 2  =>  bar/b 1 2

This shows another angle of how having refinements with arguments inhibits a simple mapping of one named argument to another.

Of course this raises the question of whether you should have to name your arguments the same thing. If the original function takes [event] should you have to name it the same, or could you say [e] if it was a plain argument and not a refinement?

Either way, it gets a lot easier to set a policy when refinements are the arguments.
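For illustration, if every parameter is a single named slot, then a redispatch can be modeled as a copy by name. The following is a Python sketch under that assumption only; `remap` and the example frames are hypothetical, and the real mapping in HIJACK or the port code has more to worry about than this.

```python
def remap(source_frame, target_params):
    """Build a frame for the target by matching parameter names.

    Returns the new frame plus the names of any supplied source values
    that had nowhere to go, so a policy (error, ignore, rename) can be
    applied to them afterward.
    """
    target = {name: None for name in target_params}
    unmatched = []
    for name, value in source_frame.items():
        if name in target:
            target[name] = value
        elif value is not None:
            unmatched.append(name)
    return target, unmatched

# Hypothetical archetype frame and implementation parameter list.
frame, leftover = remap({"event": "click", "part": 3, "dup": None}, ["part", "event", "only"])
print(frame)     # {'part': 3, 'event': 'click', 'only': None}
print(leftover)  # []
```

With two-slot refinements, the same copy would have to know which argument slots belong to which refinement and in what order, which is where the "adversarial" cases in the comment come from.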