In R3-Alpha, an /INTO option was added to REDUCE and COMPOSE. It blended the functionality of INSERT into these routines, so as to avoid the overhead of creating an intermediate series that would just be thrown away:
>> data: copy [a b c]
>> insert data reduce [10 + 20 30 + 40]
>> data
== [30 70 a b c]

>> data: copy [a b c]
>> reduce/into [10 + 20 30 + 40] data
>> data
== [30 70 a b c]
So no new functionality is added...this is a refinement whose sole purpose is to be a lower-overhead way of doing what you could do already.
But...it's narrower. There's no /PART refinement, so you're going to get all of the reduced data inserted if you use /INTO. There's no /DUP, so you'll get exactly one copy. There's no /ONLY, so blocks will always be spliced in. And from a Ren-C perspective, there's no /LINE (which APPEND+MODIFY+INSERT now have)...so the inserted data can't carry a newline marker.
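To make that gap concrete, here's a sketch (in an R3-Alpha-style console) of the kind of thing the full INSERT family can do that /INTO has no answer for:

>> data: copy [a b c]
>> insert/only data reduce [10 + 20 30 + 40]  ; /ONLY keeps the block intact
>> data
== [[30 70] a b c]

>> data: copy [a b c]
>> insert/dup data reduce [10 + 20 30 + 40] 2  ; /DUP repeats the insertion
>> data
== [30 70 30 70 a b c]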
Plus, /INTO just has INSERT semantics, and returns the tail of the operation. You can't do a MODIFY. And if you want to optimize append data reduce [...], you'd generally have to say head reduce/into [...] tail data. Noting that each function call in the evaluator has a cost, and that path dispatch takes longer than ordinary dispatch in the first place, one might wonder just how much it's saving...?
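To show the contortion concretely (my re-creation; the HEAD is needed because /INTO hands back the position after the insertion):

>> data: copy [a b c]
>> head reduce/into [10 + 20 30 + 40] tail data
== [a b c 30 70]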
I don't want to get into a bunch of artificial examples of /INTO usage to show where it's faster or slower...but to make my point, here is some R3-Alpha timing of that:
>> data: copy [a b c]
>> delta-time [loop 1000000 [append data reduce [10 + 20 30 + 40]]]
== 0:00:00.481017

>> data: copy [a b c]
>> delta-time [loop 1000000 [head reduce/into [10 + 20 30 + 40] tail data]]
== 0:00:00.397192
I'm sure you can craft some situations where it can be shown to perform better...especially when the GC is taken into account. But what I'm trying to get at is that I think this is the wrong place to be looking for optimization.
1. It's asking users to write their code in an unnatural form, with more limited options than they'd expect from APPEND+INSERT+MODIFY. (While trying to write the above example I got mixed up and tried to write reduce/into data [stuff to reduce], because saying where to put the reduced data first feels more natural...)

2. It creates a cognitive uneasiness ("am I doing this right? should I have used a REDUCE/INTO?"), leading people to write less clear code in pursuit of a performance benefit that may or may not materialize.

3. Everyone writing an operation that generates a new series will now wonder if they have to make an /INTO version as well.

4. REDUCE's increased complexity means more documentation, more refinements to fulfill in the frame, more checking of those refinements, more cost to evaluate the PATH! when the refinement is used, etc.

5. This is all for no increase in functionality.
While all these points are a bit grim, it's actually #3 that worries me most. An even more frightening thought is if people start worrying about adding the missing /PART, /DUP, and /LINE refinements. We can see that /INTO is a precedent that suddenly creeps into the source-level consciousness of everyone writing code.
There are a lot of areas to look at for making the system as a whole run faster, without the downsides that /INTO brings to the table. There should be heavy skepticism about introducing these kinds of refinements. Maybe they can speed some things up, but with what collateral damage?
So I want /INTO to die, and to tackle optimization in more systemic ways. And it's easier to tackle systemic optimization when there's less code, following more predictable rules.
Note: Just looking at the one use of COMPOSE/INTO in R3-Alpha for a moment:
opt [#":" copy s2 digits (compose/into [port-id: (to integer! s2)] tail out)]
...it's a good data point on this being a bad direction. Compare with:
opt [#":" copy s2 digits (append out compose [port-id: (to integer! s2)])]
I think playing to Rebol's strengths means doing everything possible to make the second case perform well enough that such distortions aren't worth it in the domains where the language is applied.
(I'd also like to add that the use of CHAR! literals like #":" in PARSE rules that process ANY-STRING! is another area where we need to work, to make sure there isn't enough of a performance benefit to warrant writing that instead of plain ":". Those are the kinds of things that need to be solved at acceptable cost as well, to get source code as clear as it can be. You should only be using character literals when matching CHAR! values in BLOCK!s...)
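For illustration, here's a made-up rule (not from the R3-Alpha sources) showing the two forms; the goal is that the plain-string version be just as viable:

>> parse "port:8080" [thru #":" copy s2 to end]  ; CHAR! literal: visual noise
== true

>> parse "port:8080" [thru ":" copy s2 to end]  ; plain string: what source should look like
== true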