Beating REPEND: A New Parameter Convention?

When you do an append a reduce b, the REDUCE generates a new series... let's call it rb. Then rb is spliced into a. And then rb needs to be GC'd.

The idea behind repend a b is that you never make rb. Instead, expressions are evaluated one by one and put onto a as you go. The savings are twofold...reduced memory overhead and reduced tax on the GC by not making extra series nodes.

That might sound like a great savings, but here is a heated debate in Red about the questionable benefit of REPEND (as well as /INTO):

`repend` vs `append reduce` differences and regression · Issue #3340 · red/red · GitHub

I guess I'm halfway on DocKimbel's side there...in that if REPEND isn't showing a benefit it's probably more to do with a bug in REPEND vs. that the idea doesn't represent a savings.

But I hate the word REPEND. Things like REMOLD are double monstrous, and REFORM? Give me a break. These make a terrible impression.

More generally, I don't like the idea that every function would have to come in two flavors and create anxiety on the part of the caller as to if they're using the optimized one or not. I'd like any optimization to be more "under the hood" so the caller doesn't have to fret about it.

This got me to thinking...

A New Voodoo for GET-params!

Let's imagine that we have a new rule for params that look like GET-WORD!:

  • If the argument is a GET-XXX!, it is passed literally

  • If the argument is anything else, it is evaluated normally and the product is passed in with one quoting level added.

Here's an example definition

appender: func [
    block [block!]
    :value [any-value!]
][
   print ["Block is" mold block]
   print ["Value is" mold value]
   if get-block? value [
       append block reduce as block! value
   ] else [
       append block unquote value
   ]
]

Let's look at some concrete examples:

>> appender [1 2 3] 2 + 2
Block is [1 2 3]
Value is '4
== [1 2 3 4]

>> data: [[a b c] [d e f]
>> appender [1 2 3] second data
Block is [1 2 3]
Value is '[d e f]
== [1 2 3 d e f]

>> appender [1 2 3] :[10 + 20 100 + 200]
Block is [1 2 3]
Value is :[10 + 20 100 + 200]  ; not quoted!
== [1 2 3 30 300]

At the source level, the user doesn't really have to worry about the parameter convention. They get the same outcome as if the REDUCE had been done by the evaluator, but the APPENDER becomes complicit.

And look what happens if the GET-BLOCK! is in a variable...

>> data: the :[10 + 20 100 + 200]
>> appender [1 2 3] data
Block is [1 2 3]
Value is ':[10 + 20 100 + 20]
** Error: Cannot append evaluative items...

A ha! We could tell that this was an evaluative get-block product, and not meant to participate in our little trick. (Erroring is actually the right answer here, you would need to use only data or ^data or quote data etc. to APPEND an evaluative GET-BLOCK! under the new rules.)

This is neat, because it means users can express intention to reduce at the callsite...and it is something that you can optimize on an as-needed basis.

As One Would Expect, There Are Some Glitches...

There are some seeming semantic glitches when a function takes these and they're not the last parameter, where you might see variations along the lines of:

 >> takes-first-args-normally :[elide print "A", 1 + 2] (print "B", <x>)
 A
 B
 3
 <x> 

>> takes-first-arg-specially: :[elide print "A", 1 + 2] (print "B", <x>)
A
B
<x>
3

Basically: If you somehow relied on side effects happening in left-to-right parameter order at the callsite, then moving the REDUCE of any parameters other than the last one into the body of the operation will change that order.

This is nothing new for this line of thinking in optimization: imagine if APPEND and REPEND took their arguments in the reverse order, so that the block wasn't the last item. You couldn't just blindly substitute APPEND REDUCE for REPEND in that case, if you were dependent on argument-ordering effects...if there was an evaluation in the first parameter's reduction that was needed for the second parameter.

But the difference is that the person editing APPEND REDUCE => REPEND made a change at the callsite. If you change the parameter convention and don't touch the callsites--with the intent that they stay working and you're just adding an optimization--it starts to matter.

We have some control here, though! We can define how GET-BLOCK!s act as arguments to function calls, and we can say that they don't actually perform their REDUCE until the function executes. That leaves breathing room for people who wish to add the optimization later...knowing they won't break the expectations.

Whew, that solves that problem! Good thing it's the only one! Oh, no, wait... :face_with_head_bandage:

Not All REPEND Operations Take Literal Blocks

You see repend data [...] a lot of the time, but there's also repend block1 block2.

So append data :[...] can be finessed as an optimization for the first case, but doesn't solve the second.

To shore it up, we'd have to say that :(...) means "reduce the result of what's in the expression".

>> :(reverse [1 + 2 10 + 20])  ; -> :[20 + 10 2 + 1]
== [30 3]

This way, we could actually pass the APPEND an expression to reduce the product of. We'd need to do the evaluation at the moment we passed the parameter (I think), and then alias it as a GET-BLOCK!, so:

>> appender [1 2 3] :(reverse [1 + 2 10 + 20])
Block is [1 2 3]
Value is :[20 + 10 2 + 1]
== [1 2 3 3 30]

Where Are GET-WORD!, GET-PATH!, GET-TUPLE! in all of this?

We don't have GET-WORD! mean "reduce the product of fetching the word":

>> block: [1 + 2]

>> :block
== [1 + 2]  ; not [3]

But it seems it would be inconsistent to not put these other GET-XXX! types into the family of parameters that are captured as-is. So the above code would get this behavior:

>> appender [1 2 3] :foo
Block is [1 2 3]
Value is :foo
** Error: Cannot append evaluative items...

Instead of a REDUCE it would need a GET. But this makes a good argument for why REDUCE of a GET-WORD! should work as a word fetch, for generality... it makes routines like this easier to write correctly.

I don't think it's worth shuffling the symbols around so that :foo does a reduce and we pick something else for GET. It seems to me that :(foo) is fine enough.

But even though GET-WORD! won't run arbitrary code, you can be impacted by ordering problems, where someone might pass a :foo argument and then in the next parameter change the value of foo. Hence for consistency, we'd be saying that normal parameters would likely have to delay their get of foo until all the parameters were given...this way you could change the parameter convention without affecting callsites.

But likely the best way to go about that would be to protect the word from modification:

>> some-func :foo (foo: 20, <arg>)
** Error: FOO captured by GET-WORD! in parameter slot, can't modify
      while gathering arguments

I'm Probably Over-Worrying About It

...these protection mechanisms I mention in order to make it painless to change a parameter convention are not likely suited to being the kind of concern that applies.

But it's good to articulate what the limits of a design are...

2 Likes

Functions like this can give us pretty profound guidance on what the behavior of operations on various datatypes should be.

For instance: here we see that we want to REDUCE a GET-BLOCK! (and any GET-XXX! type), and we want to unquote quoted items. Well doesn't that make it seem like a good idea to make REDUCE of a quoted item unquote it?

>> first ['foo]
== 'foo

>> reduce first ['foo]
== foo

That makes the above shorter:

appender: func [
    block [block!]
    :value [any-value!]
][
   print ["Block is" mold block]
   print ["Value is" mold value]
   append block reduce value
]

Internally there's all kinds of shortcuts you could take with this...empty blocks being no-ops, or blocks with just one inert item not kicking into a reduce.

I think that we might want to avoid making REDUCE accept just any type, like WORD!s etc, but here we can see that if you were wiring up operations and you wanted to line it up so some input to REDUCE was a no-op you'd know how: you'd quote it.

1 Like