Every Thought On Array Splicing Has Been Had 🤯

...so it's time to take what's known and tie it all up. :brain:

Here goes:

BLOCK!s (and only plain BLOCK!s) Splice By Default

This is to say that BLOCK! is The universal container ("[o]"). Splicing of the contents of a non-block is done by aliasing it as a block first...which can be done efficiently without copying memory.

It was always obvious to me that PATH!s should not splice. But I didn't have any real guidance on GROUP!, or the new types like SET/GET-BLOCK!/GROUP! (and whatever @[block] or @(group) ultimately gets called). It was a toss-up between "anything that uses spaces to separate elements splices, and anything tighter--like tuples/paths--does not" vs. "the logo is [o], so anything with brackets is special". (I was pretty sure "do whatever Rebol2 did" was not a good idea.)

But I think there's a better narrative for justifying this solution. Other languages like Haskell/Rust/Elm all use parentheses for tuples...where their notion of a tuple is fixed-size, with elements that aren't necessarily of the same type. This led me to wonder whether BLOCK!'s universal containerness might make it something you don't just pick automatically...you'd pick something else when splicing wasn't what you'd typically want.

Here's a made-up example:

add-new-products: func [product-list [block!]] [
    product1: '(#WID-0304 "Super Widget" $1.99)
    product2: '(#WID-1020 "Super Widget Plus" $2.00)

    append product-list product1
    append product-list product2
    append product-list [
        (#WID-0421 "Super Widget Premium" $2.01)
        (#WID-9999 "Super Widget Ultimate" $102'003.04)
    ]
]

See how light the generic quoting made the GROUP! literals? Now this makes the choice of BLOCK! vs. GROUP! something that you can reason about better. You can take advantage of the difference so you're not fighting the behavior--but "synergizing" with it.

But also...with @[...]/etc that doesn't splice, you have more options for shifting your data's conception of itself as it is passed around.

I'll point out that since there's no @ or : in the logo, this does ultimately dovetail nicely with "what's in the logo is special"...it's not just any bracketed thing, it's plain BLOCK!. :slight_smile:

"Modal Parameters" are the Answer for Anti-Splicers

I've been trying to embrace @rgchris's concept that the language bias should stick to making it so that "common" code does not lean too strongly on symbols to convey meaning. While you can retrain yourself to comprehend pretty much any symbol soup, letting your mindset drift to that "new normal" isn't good for communicating code to others. (And it's probably not good for your own ability to see clearly, either...even if you -think- you understand what you're doing.)

This has to be balanced against many other design factors; to which I'm sensitive because I actually understand what it takes to make things work at a mechanical bits-and-bytes level. So this pushes back and forth.

Modal parameters give an easy mechanism to library authors, or others who want a rigorous way to append values "as is" without typing /only at every callsite.

>> append [a b c] @[d e]
== [a b c [d e]]

>> item: @[d e]
>> append [a b c] item
== [a b c @[d e]]  ; modality comes from parameter, not fetched value

>> append [a b c] '@[d e]
== [a b c @[d e]]  ; quoting suppresses modality, then evaluates away

>> do compose [append [a b c] '(third [<d> #e @[f g]])]
== [a b c @[f g]]  ; quoting in COMPOSE is there to help in cases like this

It's not something that has to be brought up in early tutorials. But I like it. And it's a generic mechanism that people can use when they want a parameter to indicate the mode of a refinement...so there's generic uses for it.

Further changes to APPEND and such will be abandoned

I experimented with making APPEND only accept blocks, and then maybe blocks and strings, and other kinds of tweaks. They weren't worth it.

What's changing the game here is making BLOCK! the only type that splices by default.

It used to be that when you had a moment of doing an append of some parameter--that wasn't a block before but suddenly is now--you groaned and said "why'd I forget the /ONLY" or "why does this darn thing behave so randomly".

Now a new thing you can ask in many of these moments is: "Why was this value a plain block if I didn't want it to splice??" It should feel less random, when you have more alternatives. I'm going to look at my array choices in this new light, and maybe need /ONLY less often as a result.

Modal parameters are good to have to point people to who aren't on board with splice-by-default. And I'm willing to accept the burden of using a modal parameter to enter an @ symbol when I want it. It saves significant evaluation time over APPEND/ONLY, and doesn't require a series node allocation to hold the APPEND and ONLY words.

BLOCK! Conversions Needed

Because the only way to get splicing is now to have a block, it raises the question of how to get blocks.

AS BLOCK! is cheap; it doesn't allocate any memory, it just aliases the series as a different cell class. So it's a good choice if you know what you have is an array.

>> group: '(d e f)
>> append [a b c] as block! group
== [a b c d e f]

If you don't know whether you have an array value or not, this is a little harder. But we can actually turn items into a one-cell BLOCK! without allocating any memory. This is a new trick which was called mirroring, though the mechanism has changed enough that it needs a new name. The new block is read-only, but that is okay for the purposes of this append, since it's gone after the splice.

The name for the operation I had in mind was BLOCKIFY (though it's not using the mirroring mechanism at the moment, only the PATH! trick is, which proves it does work).

>> value: '(d e f)
>> append [a b c] blockify value  ; [d e f]
== [a b c d e f]

>> value: 1020
>> append [a b c] blockify value  ; [1020]
== [a b c 1020]
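To make the intended behavior concrete, here is a toy model in Python (not Ren-C code). The `Block` and `Group` classes and the `blockify` function are stand-ins invented for this sketch, and it assumes BLOCKIFY behaves as in the examples above: blocks pass through, other arrays are viewed as blocks, and anything else gets wrapped. The real mechanism would avoid the copy and the allocation; this model does not try to.

```python
class Block(list):
    """Stand-in for BLOCK!, the only type that splices."""


class Group(list):
    """Stand-in for GROUP!, an array type that does not splice."""


def blockify(value):
    # Already a block: hand it back untouched.
    if isinstance(value, Block):
        return value
    # Another array type: present its items as a block.
    # (AS BLOCK! would alias the storage; this model simply copies.)
    if isinstance(value, Group):
        return Block(value)
    # Anything else: wrap it in a one-element block.
    return Block([value])


# Splicing the blockified value adds its items one by one.
from_group = Block(["a", "b", "c"])
from_group.extend(blockify(Group(["d", "e", "f"])))

from_integer = Block(["a", "b", "c"])
from_integer.extend(blockify(1020))
```

Here `from_group` ends up with six items and `from_integer` with four, matching the two console examples.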

But that's a weird-looking word, and I hate having to have lists of things like this (groupify? set-groupify?). The concept from the other day of FORCE might be interesting:

append [a b c] force block! value

That's at least generic, and it kind of conveys "I want a block, if it's a block then great, if not then change it". But that makes it sound like it's changing the input value to be a block, vs. wrapping it.

But maybe AS can wing it, and say that if you give it a non-block it can just do the wrapping in a block anyway and give you something read-only? :-/ I mean, the user doesn't know that every single-valued item doesn't secretly live in a 1-element array in the implementation...it might. Hmmm.

Anyway, this name looks like the only missing piece. What do you call something that doesn't do any memory allocation but just makes a light 1-element wrapper that can live in a cell, to keep us from having to add more crazy refinements?

It's not a super high priority, as most cases are known to be AS (in the form known today).

Speak Up or Hold Your Peace

Like I say, I think I've probably had all the thoughts. There aren't any more to be had. If you want to prove me wrong, post it here...but do it soon.

It is a kind of CAST that you want to do here. You could call this FORGE. Other suggestions: FABRICATE (too long), HATCH, CRAFT (I like that), FEIGN, FIDDLE.

"Do as R2 did" is a good starting point. Many things have been looked at with a refreshing mindset. A lot of things simply grew into what they are, though, and the result is not always consistent or well thought out. IMO everything is open for improvement with the proper argumentation.

I think I like where this is leading, and being able to do without the /ONLY to append a block! as a block! to another block! is pretty cool. I have had many occasions where I forgot the /ONLY, and having the @ notation to signal the /ONLY purpose is fine for me.

Another name for BLOCKIFY could perhaps be BLOCKSET, ASBLOCK, TOBLOCK.

In some sense, all of life is "storytelling" in one form or another. So a lot of this deep consideration is really about "having a story".

When people show up and ask why something is the way it is, it's important to be able to tell that story and have it be something that they can put in their mind and be comfortable with. I think due diligence has been done here, and no one can say alternatives weren't given a shot.

I'll point out that it's not just about the endgame. While it may seem like nothing is accomplished by spinning one's wheels trying to change things like this back and forth over years, that overlooks all the enhancements that came in that process.

I want to avoid "TO" usages in contexts like this because I'm pretty sure all "TO" operations need to make a new series.

>> input: '(a b c)  ; note: remember that Rebol2 INPUT is now `ASK TEXT!`

>> output: to block! input
== [a b c]

>> append input [d e f]
== (a b c d e f)

>> output
== [a b c]  ; not changed when original is changed

What we're looking for here would be more like what AS does; it's just a question of how much sense it makes to be able to say as block! 1
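The difference between the two can be sketched with a toy Python model (not Ren-C code). The `Array` class and the `to_kind` / `as_kind` helpers are hypothetical names made up for this illustration; the only assumption is the one stated above, that TO makes a new series while AS shares the existing one.

```python
class Array:
    """A typed view ("block" or "group") over a Python list used as storage."""

    def __init__(self, kind, items):
        self.kind = kind
        self.items = items


def to_kind(kind, array):
    # Model of TO: always builds a new series by copying the items.
    return Array(kind, list(array.items))


def as_kind(kind, array):
    # Model of AS: a new view with a different kind over the same storage.
    return Array(kind, array.items)


source = Array("group", ["a", "b", "c"])
copied = to_kind("block", source)
aliased = as_kind("block", source)

# Changing the original afterwards shows up only through the alias.
source.items.extend(["d", "e", "f"])
```

After the `extend`, `copied.items` still holds three items, while `aliased.items` sees all six because it is the same list object as `source.items`.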

I've been thinking about the techniques for AS BLOCK! <foo> giving back [<foo>] as an immutable block that lives in a single cell...and it's a little bit tricky in the implementation because the cell has no room for an index. So if you say something like foo-next: next as block! <foo> then there has to be some magic.

I think the magic is that it uses a single bit in the cell to indicate whether it's at the head or the tail, and if you try to step outside of that boundary then you wind up triggering an allocation of a read-only 1-element array that gets put into an actual cell. This would mean you'd get a lot of those with:

pos: as block! <foo>
pos2: skip pos 2
pos3: skip pos 3
pos4: skip pos 4
...

Following this all through in the code, I'm pleased with the very clear definition...that has a philosophical basis now:

append block :value 
; ^-- appends 0 items if value is null, 1 item if not a block, or N if block

It's rigid, it's clear. You have an easy test of block? :value for knowing if you're dealing with the universal container. If you want to append it as-is, use append/only or @value.
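As a sanity check on that three-way rule, here is a toy Python model (not Ren-C code). `Block`, `Group`, `append` and `append_only` are names invented for the sketch, with Python's `None` standing in for null.

```python
class Block(list):
    """Stand-in for BLOCK!, the universal container."""


class Group(list):
    """Stand-in for GROUP!; an array, but not a plain block."""


def append(target, value):
    if value is None:
        return target              # null: 0 items added
    if isinstance(value, Block):
        target.extend(value)       # plain block: N items added
    else:
        target.append(value)       # anything else: 1 item added
    return target


def append_only(target, value):
    # Model of append/only (or the @value form): always exactly 1 item.
    target.append(value)
    return target


nothing = append(Block(["a"]), None)
spliced = append(Block(["a"]), Block(["b", "c"]))
grouped = append(Block(["a"]), Group(["b", "c"]))
as_is = append_only(Block(["a"]), Block(["b", "c"]))
```

The single `isinstance(value, Block)` check plays the role of `block? :value`: it is the only thing that decides between splicing and adding one item.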

Clarity Disappears If You Use TO BLOCK!

Let's put aside AS for a moment. As pleasing as the above is, you're quickly into the danger zone once you write:

either block? :value [
    append block value  ; add itemwise
][
    append block to block! :value
]

The spectrum of meanings for TO BLOCK! value is currently so variant that this is nigh-unknowable if you don't know what type value is.

But one thing is clear...it doesn't make a whole lot of sense to have your TO BLOCK! of a value just wrap that value in a block. You can do that with reduce [value]...it's a poor use for TO (and if you want blocks left as is, you can use blockify value).

I'd go so far as to say that it's so bad that no data type should define its TO BLOCK! conversion as simply putting the item in a block. That's just a way of saying "I can't meaningfully be expressed as a BLOCK!". If that's the case, the conversion should be an error.

rebol2>> to block! 12-Dec-2012
== [12-Dec-2012]  ; I think we can now say that this is "clearly bad"

There are likely some other good rules for what TO BLOCK! would do. e.g. FOR-EACH on your item should give back individual things that match. So this looks inconsistent to me:

rebol2>> collect [foreach ch "ab cd" [keep ch]]
== [#"a" #"b" #" " #"c" #"d"]

rebol2>> to block! "ab cd"
== [ab cd]  ; i.e. it was transcoded

That TO BLOCK! is much more obvious if you just say transcode "ab cd". So I think the FOREACH result should be the TO BLOCK! result. (This aligns with how I've been talking about how TO BLOCK! of a string giving back individual CHAR!s would be handy if you are dealing with some kind of string algorithm for which fixed-size codepoint manipulation would be useful.)

My proposal for AS is an efficient alternative to TO

The goal of AS is to produce something that doesn't necessarily have its own identity (if it can avoid it) but otherwise should act the same as TO.

This means you could do some kind of surgery to build a BLOCK! you don't need for anything else, and then as path! it. Even though PATH! is no longer in the ANY-ARRAY! typeclass (not always guaranteed to have an array under the hood), the system would take advantage of it (since it can in this case) and the path would co-opt the series internally.

But...

If AS is just a cheaper, compatible form of TO, then the same objection applies: as I've suggested, falling back on wrapping the value in a block when you don't know what else to do is not cool.

Point is that we have to use a lot of imagination to know what the heck someone means if they write:

append block as block! some-completely-random-value

The value might be a DATE! or an INTEGER! or a GROUP! or a PATH! or who-knows-what. I'm not sure where in the code anyone should be writing this and expecting it to "just work".

So I think the example is just flawed by design. You shouldn't see code like this. Instead you should see things that look more like:

append block switch type of value [
    path! [to block! value]  ; don't try to use internal sharing optimization 
    group! [as block! value]  ; do use internal sharing
    date! [compose [the-date: (value) {adds 3 items}]]
] else [
    value  ; append as-is
]

This gets rid of having to have a cheap way to put an item in a block, just to conform weird values to the "Promise me you're making a block" protocol. I think that is just a red herring.

I'm really liking how this is laid out with the single unambiguous point of control on splicing. BLOCK!-or-not. Again...its simplicity makes it something you can leverage vs. fight against!

It's fine for me!
(some more chars)