Implementing COLLECT + KEEP

I thought it might be interesting to compare how Ren-C does COLLECT to other implementations.

Rebol2

collect: func [
    body [block!]
    /into
    output [series!]
][
    unless output [output: make block! 16]
    do make function! [keep] copy/deep body make function! [value /only] copy/deep [
        output: either only [insert/only output :value] [insert output :value]
        :value
    ]
    either into [output] [head output]
]

MAKE FUNCTION! in Rebol2 was a low-level routine that did not make a copy of the body. If you're wondering why it's used here instead of FUNC, given that the code does a COPY/DEEP of the body anyway... I think it's just misguided inlining (to avoid the overhead of calling FUNC).

Historical DO is variadic, and if passed a function will collect further args at the callsite. Ren-C makes DO arity-1, and only runs functions with complete frames. Variadic invocation is done with the distinct RUN operation.

We see here a COLLECT/INTO feature that lets you do a COLLECT into an already existing series. These two statements would be equivalent.

collect/into [...] target
insert target collect [...]

There was a certain cabal of people who lobbied for adding /INTO operations to various functions in order to avoid the creation of intermediate series...which they believed was costly. Experience has borne out that the handling of /INTO generally made things slower. I was in the camp who never liked it, and called it "the /INTO virus". All instances of /INTO in Ren-C were dropped.

Due to the /INTO, the implementation is based on INSERT instead of APPEND, and has to update the intermediate block's insertion position as it goes.

There's no specialization in historical Rebol. So the KEEPER function being made here takes an /ONLY refinement, and then dispatches to one of two different calls to either INSERT or INSERT/ONLY.
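As a rough illustration of the two points above (INSERT-based position tracking and the /ONLY dispatch), here is a loose Python analogue. It is not a port of the Rebol2 code: the `(list, index)` pair standing in for a series position, the `only` keyword, and the callable `body` are all assumptions made for the sketch.

```python
def collect(body, into=None):
    """Loose analogue of the Rebol2 COLLECT above (a sketch, not a port).

    `into` is a hypothetical (list, index) pair standing in for a Rebol
    series position; `body` is a callable that receives `keep`.
    """
    target, pos = into if into is not None else ([], 0)

    def keep(value, only=False):
        nonlocal pos
        # Like INSERT vs. INSERT/ONLY: lists are spliced unless only=True.
        if isinstance(value, list) and not only:
            items = list(value)
        else:
            items = [value]
        target[pos:pos] = items   # insert at the current position
        pos += len(items)         # advance past what was just inserted
        return value              # KEEP passes its argument through

    body(keep)
    # With `into`, hand back the position after the insert; otherwise the list.
    return (target, pos) if into is not None else target


result = collect(lambda keep: (keep(1), keep([2, 3]), keep([4, 5], only=True)))
print(result)
```

Because the position advances after every insert, successive keeps land in order even when the target already has content after the insertion point.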

R3-Alpha

collect: func [
    body [block!]
    /into
    output [series!]
][
    unless output [output: make block! 16]
    do func [keep] body func [value [any-type!] /only] [
        output: apply :insert [output :value none none only]
        :value
    ]
    either into [output] [head output]
]

The misguided inlining is gone, and it just uses FUNC.

Rather than having two calls to INSERT based on whether /ONLY is used, it uses the messy historical APPLY operator (see APPLY II: The Revenge for contrast).

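The NONE and NONE in the APPLY fill argument slots that come before /ONLY in INSERT's parameter order: the /PART refinement and its length argument. APPEND and INSERT also have /DUP, which comes after /ONLY and is simply left off. Neither /PART nor /DUP is made available through KEEP: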

r3-alpha>> append/dup [a b c] <d> 3
== [a b c <d> <d> <d>]

r3-alpha>> append/part [a b c] [d e f g] 2
== [a b c d e]

By using specialization, Ren-C inherits all the refinements of APPEND. There's no /ONLY (since splicing is solved with isotopes), but there is a /LINE refinement (demonstrated above)... and /DUP and /PART are still around.

ren-c>> collect [keep/dup <d> 3]
== [<d> <d> <d>]

ren-c>> collect [keep/part spread [d e f g] 2]
== [d e]
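The idea of getting KEEP's options "for free" by specializing APPEND can be sketched in Python with `functools.partial`. This is only an analogy: the `append` function below is a hypothetical stand-in with `dup` and `part` keywords, it always splices lists (it does not model Ren-C's isotopes or SPREAD), and it returns the series rather than the kept value.

```python
from functools import partial


def append(series, value, dup=1, part=None):
    """Hypothetical stand-in for APPEND with /DUP and /PART options."""
    items = list(value) if isinstance(value, list) else [value]
    if part is not None:
        items = items[:part]      # /PART: take only the first N items
    series.extend(items * dup)    # /DUP: repeat the appended items N times
    return series


def collect(body):
    out = []
    keep = partial(append, out)   # "specialize" APPEND with the series fixed
    body(keep)
    return out


dup_result = collect(lambda keep: keep("<d>", dup=3))
part_result = collect(lambda keep: keep(["d", "e", "f", "g"], part=2))
print(dup_result, part_result)
```

Since `keep` is just `append` with its first argument filled in, any option added to `append` later is available through `keep` without touching `collect`.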

Red

collect: func [
    body [block!] 
    /into
    collected [series!] 
    /local keep rule pos
][
    keep: func [v /only] [append/:only collected v v] 
    unless collected [collected: make block! 16] 
    parse body rule: [
        any [
             pos: ['keep | 'collected] (pos/1: bind pos/1 'keep)
             | any-string! | binary! | into rule | skip
        ]
    ] 
    do body 
    either into [collected] [head collected]
]

Okay... well... they did that differently.

It adds another keyword, COLLECTED, to give you access to the collection result as you go. That seems to me about as likely to cause accidents from people not knowing about the feature as it is to be useful. Needing access to the buffer as you collect it means you might as well name it, at which point COLLECT is saving you almost nothing... Ren-C makes it easy enough to specialize APPEND and call it EMIT, adding whatever other features you need.

The code block that you pass in has its bindings mutated directly--and everywhere in the system that does this has bothersome implications. (pos/1: bind pos/1 'keep) uses 'keep as a shorthand for naming a context, e.g. "bind the word KEEP or COLLECTED that we found in the block to whatever context this word 'keep is looking up to". In this case, that means bind it to this function invocation.

Red has a weird ability to chain refinements that they're taking advantage of here with append/:only which saves them from doing an APPLY or branching to either APPEND or APPEND/ONLY calls. This only works if the variable name matches the refinement name. Strange feature. Ren-C can do this but via what I think is a more normal way, with append/(if only ['only]). In practice, having things like specialization, adaptation, enclosure, etc. turns out to be much higher-leverage (and faster in most cases).

Offhand I don't see how their /INTO can work correctly if they're using APPEND.

red>> data: [a b c]
== [a b c]

red>> collect/into [keep <1> keep <2> keep <3>] data
== [a b c <1> <2> <3>]

red>> data
== [a b c <1> <2> <3>]

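Okay, so, it doesn't work correctly. Compare Rebol2, which inserts at the position it was given and returns the position after the insertion: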

rebol2>> data: [a b c]
== [a b c]

rebol2>> collect/into [keep <1> keep <2> keep <3>] data
== [a b c]

rebol2>> data
== [<1> <2> <3> a b c]
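The difference between the two transcripts can be modeled with plain Python lists. This is a sketch of the observed behavior only, with hypothetical function names and an integer index standing in for a series position.

```python
def collect_into_insert(items, target, pos=0):
    """Rebol2-like: insert at `pos`, return the position after the insert."""
    for item in items:
        target.insert(pos, item)
        pos += 1
    return pos


def collect_into_append(items, target, pos=0):
    """Red-like (as observed above): append at the tail, position unchanged."""
    target.extend(items)
    return pos


a = ["a", "b", "c"]
a_pos = collect_into_insert(["<1>", "<2>", "<3>"], a)

b = ["a", "b", "c"]
b_pos = collect_into_append(["<1>", "<2>", "<3>"], b)

print(a, a[a_pos:])
print(b, b[b_pos:])
```

With the insert-based version the new items land before the existing content and the returned position points just past them; with the append-based version they land at the tail regardless of the position passed in.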