Archival Brainstorm: Could APPEND, FIND, etc only accept blocks?

hostilefork · July 22, 2019, 5:28am

(Update 2022: This circa 2019 meandering thread started with a simple question, that led to a lot of experiments that may have confused matters more than clarified them. So read at your peril. The current state of the art proposal involves "isotopic groups" and is a completely new direction.)

The following Redbol-ism, which defaults to breaking paths apart unless you say /ONLY, long seemed unnatural to me:

rebol2>> append [a b c] first [d/e/f g/h/i]
== [a b c d e f]

rebol2>> append/only [a b c] first [d/e/f g/h/i]
== [a b c d/e/f]

This raises the question of why blocks should default to that as well, which confuses many a newcomer...and which I was opposed to when I first saw it.

rebol2>> append [a b c] first [[d e f] g/h/i]
== [a b c d e f]

Ultimately, using Rebol in practice made me feel splicing was the natural default for many block operations. But something long remained uncomfortable about this pattern, which applies to other routines with this kind of /ONLY (for instance, FIND): it's hard to tell at the callsite what's going to happen when you are talking about data indirectly:

 rebol2>> item: [a b c]

 ;... then much later, callsite might suddenly stop working generically
 ;... when you suddenly switch item to a block from something else:

 rebol2>> find reduce [1 + 2 item 3 + 4] item
 == none
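The callsite ambiguity above can be sketched outside of Rebol. This is a rough Python model, not Rebol code: lists stand in for BLOCK!, the `only` flag stands in for /ONLY, and the names `rebol2_append` and `rebol2_find` are made up for illustration. Real FIND returns a series position rather than an index.

```python
def rebol2_append(series, value, only=False):
    """Model of Rebol2 APPEND on a block: list values splice unless only=True."""
    if isinstance(value, list) and not only:
        series.extend(value)   # splice the items in
    else:
        series.append(value)   # add as one single item
    return series


def rebol2_find(series, value, only=False):
    """Model of Rebol2 FIND: a list value is matched as a subsequence
    unless only=True. Returns the index of the match, or None."""
    pattern = value if isinstance(value, list) and not only else [value]
    for i in range(len(series) - len(pattern) + 1):
        if series[i:i + len(pattern)] == pattern:
            return i
    return None


item = ["a", "b", "c"]
data = [3, item, 7]  # stands in for: reduce [1 + 2 item 3 + 4]

print(rebol2_find(data, item))             # None (looked for a, b, c in sequence)
print(rebol2_find(data, item, only=True))  # 1
print(rebol2_find(data, 7))                # 2 (same callsite shape, non-block item)
```

The same `rebol2_find(data, item)` call succeeds or fails depending only on what `item` happens to hold, which is the hazard being described.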

The known existing schools of thought

1. "splicing should be the special operation" - you should have to ask for append/splice to get it, with all other appends defaulting to being /ONLY. More generically the term might be something like /multi so it works e.g. with find/multi
2. "there's something special about BLOCK!" - this could be thought of as reinforced logographically by the distinction of using brackets in the [o], that makes it work, so you just know they are weird.
2b. "there's something special about datatypes where space is the delimiter", so BLOCK! and GROUP! both count
3. "it's too late to change any of it, so leave it as is"

I was convinced #1 was probably not in Rebol's best interest. But then I disliked auto-splicing of PATH! enough that I rejected #3 as a least-favorite option. So at one point Ren-C was switched to using #2b. But...

Informing this further with a new use case

I just ran into a problem with my proposed usage of @[...] to be an irreducible capture of a datatype, that can also carry a quoting level. That doesn't work if @[...] blocks are considered a container that needs /ONLY for things like FIND:

 find reduce [integer! quoted-word!] quoted-word!

Becomes:

find [@[integer] @['word]] @['word]

Today it assumes the @[...] is to be handled the same as a regular block. So this is morally equivalent to:

find [@[integer] @['word]] ['word]

Which acts the same as:

find [@[integer] @['word]] lit 'word

That doesn't do what's intended, and doesn't match the datatype. But it feels ever more haphazard to just pick a random reasoning.

What it makes me feel is that there's just something fundamentally wrong being glossed over.

Concept: expect [...] always, splice always, leverage :[...]?

It feels like "at the source level", you want to be able to see whether what you're passing along is going to be treated atomically or not. This is a parallel to other problems, like what caused "Backpedaling on non-block branches". That was mitigated with soft quoted branching, which carries some other advantages.

One idea could be that APPEND, INSERT, FIND, etc. always take BLOCK! and always presume splicing semantics.

>> append [a b c] [1 + 2 10 + 20]
== [a b c 1 + 2 10 + 20]  ; compatible with history

>> append [a b c] 3
** Error: APPEND does not accept INTEGER! for its value argument
; Not compatible, but at least an error and not random new behavior

>> append [a b c] [3]
== [a b c 3]  ; compatible with history

Also allow GET-BLOCK! to ask for reduction along with your splicing:

 >> append [a b c] :[1 + 2 10 + 20]
 == [a b c 3 30]

Block parameters (the only type which would be tolerated) passed in would be spliced--because as mentioned it always takes a block parameter, and they're always spliced:

 >> items: [1 [2 + 3] 4]
 >> append [a b c] (second items)
 == [a b c 2 + 3]

But you could slip past this by using a GET-BLOCK! that has your expression in it...thus it would reduce and get spliced, but leaving the original alone...effectively an /ONLY:

 >> items: [1 [2 + 3] 4]
 >> append [a b c] :[second items]
 == [a b c [2 + 3]]
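As a rough illustration of the "always a block, always splice" rule, here is a Python model under stated assumptions: lists stand in for BLOCK!, `append_strict` is a made-up name, and a Python list literal plays the role of GET-BLOCK!, since Python evaluates the expressions inside a list literal before the call happens. Python has no counterpart to an unevaluated plain block, so symbols are written as strings.

```python
def append_strict(series, values):
    """Model of the proposed APPEND: only lists are accepted, and they
    are always spliced."""
    if not isinstance(values, list):
        raise TypeError("APPEND does not accept %s for its value argument"
                        % type(values).__name__)
    series.extend(values)
    return series


items = [1, [2, "+", 3], 4]  # stands in for: items: [1 [2 + 3] 4]

# append [a b c] (second items)  -> the inner block is spliced
print(append_strict(["a", "b", "c"], items[1]))    # ['a', 'b', 'c', 2, '+', 3]

# append [a b c] :[second items] -> one-item wrapper is spliced, inner block kept whole
print(append_strict(["a", "b", "c"], [items[1]]))  # ['a', 'b', 'c', [2, '+', 3]]
```

Wrapping in a one-element list is what gives the /ONLY effect: the wrapper is what gets spliced, so the inner block survives intact.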

Further radicalization - soft quote the second argument?

If you'd be willing to write append [a b c] (second items) always instead of append [a b c] second items, then all of the above is compatible with soft-quoting. You could then use literal material as-is, which could work for BLOCK! and other things:

 >> append [a b c] '[1 + 2 10 + 20]
 == [a b c [1 + 2 10 + 20]]

 >> append [a b c] '3
 == [a b c 3]

As with branching, non-quoted things would complain if you didn't give it a BLOCK!:

>> item: 10
>> append [a b c] (item)
** Error: APPEND only takes BLOCK!, GET-BLOCK!, or QUOTED!

>> append [a b c] [item]
== [a b c item]

>> append [a b c] :[item]
== [a b c 10]

>> append [a b c] '10
== [a b c 10]

>> item: [1 + 2 10 + 20]
>> append [a b c] (item)
== [a b c 1 + 2 10 + 20]
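The dispatch in the soft-quoted variant can be modeled roughly in Python. Python cannot soft-quote a callsite, so this sketch assumes an explicit wrapper: the hypothetical `Quoted` class stands in for a QUOTED! argument such as '10, and `append_soft` is a made-up name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quoted:
    """Stand-in for a QUOTED! value like '10 or '[1 + 2]."""
    value: object


def append_soft(series, arg):
    """Model of the soft-quoted APPEND: quoted values are added as one
    item, lists are spliced, and anything else is rejected."""
    if isinstance(arg, Quoted):
        series.append(arg.value)
    elif isinstance(arg, list):
        series.extend(arg)
    else:
        raise TypeError("APPEND only takes BLOCK!, GET-BLOCK!, or QUOTED!")
    return series


item = 10
print(append_soft(["a", "b", "c"], Quoted(10)))            # ['a', 'b', 'c', 10]
print(append_soft(["a", "b", "c"], [item]))                # ['a', 'b', 'c', 10]
print(append_soft(["a", "b", "c"], Quoted([1, "+", 2])))   # ['a', 'b', 'c', [1, '+', 2]]
```

Passing the bare `item` (an integer here) raises `TypeError`, mirroring the error shown for `append [a b c] (item)`.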

It would work the way IF does: the rule doesn't apply to the condition, only to the branch. Here the rule applies only to the thing being appended, not the target:

 >> target: "abcd"

 >> append target ["efg"]
 == "abcdefg"

 >> append target '{ghi}
 == "abcdefgghi"

Pros

You aren't confused when you see append x (y) about what y is going to look up to. Because if it weren't a block, that would be an error. Being introduced from day one to APPEND+INSERT+CHANGE as operations that expect a block of things to be appended... and FIND+SELECT as taking a block of things to be found, might seem strange to us now...but I think the net complexity drops compared to /ONLY and the problems it causes.
Kills off the idea of /ONLY and all the mire that accompanies it. It is so easy to make mistakes with that, no matter how experienced in Rebol code you are. I don't feel any satisfying solution has been articulated about it.
Has a good alignment with the [o] meaning BLOCK! is a special datatype. Sets up a psychological basis for working coherently, and hopefully not making mistakes down the road. Suggests people use blocks to represent groups of parameters as arguments to functions instead of single items systemically...a better principle to embrace than expecting them to realize they have to design every routine with an /ONLY option...which was a bit like the /INTO virus
Can cover REPEND-style cases where the block being repended is source-level with expressions (likely most common). If it's quoting the second argument, it could blend the evaluator into the operation, even if that evaluation is just to reduce a variable name for you to be the item to append. There's been some amount of issue about performance when the natives are reduce and append and executed in two steps, whereas a reducing append that saw the source GET-BLOCK! could build a right-sized block more efficiently.
Single element blocks are pretty efficient; more so than refinements. I've mentioned how the design has been set up such that :[a] fits in a single series node, so it's not horrible to need to say append data :[item] instead of append data item... it's actually better than append/only data item. And quoted things are efficient too, so append data '10 costs the same as append data 10.

And across the range of allowed inputs, it would be compatible with historical code. The Redbol emulation would be fairly trivial:

redbol-append: function [series value /only] [
    append series case [
        only [:[value]]  ; evaluates to a spliced BLOCK! with one item in it
        any-array? value [as block! value]  ; force path/group to block
        default [:[value]]  ; put anything else into a block
    ]
]
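The same emulation can be sketched as a Python model, to show that the historical behavior layers cleanly on top of the strict rule. The assumptions here: lists stand in for BLOCK!, tuples stand in for other arrays such as PATH! or GROUP!, and both function names are hypothetical.

```python
def append_strict(series, values):
    """Model of the proposed APPEND: takes only lists, always splices."""
    if not isinstance(values, list):
        raise TypeError("APPEND only accepts a block")
    series.extend(values)
    return series


def redbol_append(series, value, only=False):
    """Model of historical APPEND built on the strict one."""
    if only:
        return append_strict(series, [value])      # wrap: one item, kept whole
    if isinstance(value, (list, tuple)):
        return append_strict(series, list(value))  # force path/group to block
    return append_strict(series, [value])          # put anything else into a block


path = ("d", "e", "f")  # stands in for the PATH! d/e/f
print(redbol_append(["a", "b", "c"], path))             # ['a', 'b', 'c', 'd', 'e', 'f']
print(redbol_append(["a", "b", "c"], path, only=True))  # ['a', 'b', 'c', ('d', 'e', 'f')]
print(redbol_append(["a", "b", "c"], 3))                # ['a', 'b', 'c', 3]
```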

Cons

Obviously asking people to write find data [<x>] instead of find data <x> seems blocky, and find data :[var] instead of find data var is uglier. Plus append string ["stuff"] feels a bit wordy and append string '{stuff} makes you change your string delimiter. This might be mitigated some by being able to say append string 'stuff and allowing WORD!-based appends.

...Or perhaps being more lenient when the target is a string type, so the /ONLY distinction wouldn't exist in the first place...so allow any type? e.g. enforce the "must be a block rule" for the argument only when modifying or searching in arrays?

While the soft quoting is not integral to the proposal, it has that problem of making you say append data (second items) instead of append data second items. Regardless of whether one thinks the soft quoting is a good idea, there's much more information there... you know the second thing in items is a block and it will be spliced. If you saw append data :[second items] you don't know what the thing is but you know it's not going to be spliced.

For those not going with Redbol emulation, it could be a fresh start

Killing off /ONLY feels like a noble cause to me.

This line of thinking reminds me a bit of the train of thought behind saying that if you have an argument that can take ANY-VALUE!, then you do not use the blank-in-null-out convention for such arguments...you have to switch to thinking of null as some kind of nothing, or accept erroring on null.

So maybe we could consider this a variant of 2 above:

2c. BLOCK! has a special use by convention in the language as a "generic container of N things". You should generally not make a parameter take ANY-VALUE! if that argument intends to use this meaning of BLOCK!. Instead you should always take a block, even if just to provide a single thing, for clarity at the callsites.
2c.1. If your use case fits it, you can use soft-quoting to allow QUOTED! values to indicate single items as a shorthand--at the cost of having to put expressions producing BLOCK!s into GROUP!s.
2c.???. (maybe?) If you have a span of target cases where multiple items cannot be handled differently from single items (e.g. find of a TEXT! in a TEXT! has no distinct meaning from find of a BLOCK! with that TEXT! in it, where an /ONLY refinement would have no meaning) then these cases may be tolerated as missing the BLOCK! container for single elements.

BlackATTR · July 22, 2019, 2:45pm

A few quick reactions.

I was never a big fan of /only. The refinement name didn't seem intuitive to me. Easy enough to learn it, but not intuitive.
It may go against the anti-line noise philosophy of rebol, but for my eyes I prefer append data (second items) and append data :[second items] over append data second items. I think the added line-noise improves legibility rather than hinders it.
I also like that this proposal feels more consistent, and it will encourage (force?) people more in the direction of using quoting, which I think is a new/enhanced feature worthy of embracing in this version of the language.

On the other side of things, I could see how this might feel like a deep change for some developers...

hostilefork · July 22, 2019, 2:59pm

More wacky ideas: built in evaluation to block ops unless you quote

I looked at this idea briefly, which is:

 >> append [a b c] [1 + 2 10 + 20]
 == [a b c 3 30]   ; acts as repend...splicing

 >> append [a b c] '[1 + 2 10 + 20]
 == [a b c 1 + 2 10 + 20]

While this might look a bit odd, it has the advantage that there don't need to be separate names for the reducing forms of such operations.

Oddity aside, it doesn't solve all the problems. You have an issue of what to do when you want to splice a block as-is that you have in an expression. There's no way to do it:

append block1 block2  ; would splice, but would reduce block2

append block1 [block2]  ; would not splice (effectively an /ONLY)

append block1 'block2  ; would append just the word block2

This is a tough circle to square, certainly.

Part of me wonders if the ability of functions like APPEND to see the @ on things without quoting could be leveraged:

 >> append [a b c] [d e]
 == [a b c [d e]]

 >> append [a b c] @[d e]
 == [a b c d e]

 >> append [a b c] reverse [d e]
 == [a b c [e d]]

 >> append [a b c] @(reverse [d e])
 == [a b c e d]

 >> item: [d e]

 >> append [a b c] item
 == [a b c [d e]]

 >> append [a b c] @item
 == [a b c d e]

But that is completely at odds with the problem I found using @ for datatypes, and actually backwards of what you'd think a LIT-XXX! would do (I said "literally" that). Picking a rarer type that's easy to type for the splice rule sounds tempting on the surface, but maybe even raises the chances of esoteric failure.

UPDATE: The esoteric failure I propose being a problem can be taken care of by only allowing the behavior when the source is literally an @-value (not some expression incidentally evaluating to an @-value). The parameter can remain evaluative, and @-values can be passed as their actual value using quoting. See Modal Parameters.

BlackATTR · July 22, 2019, 3:02pm

Yikes.
And agreed, append data :[(second items)] is definitely a bridge too far.
I get nervous about changes like this and how they might impact Beta/One.

hostilefork · July 22, 2019, 3:30pm

The status-quo choice that solves my current problem is 2a...which would be to keep /ONLY but limit aggregate default behavior to plain old BLOCK!...not GROUP! or PATH! or anything else.

Over time I guess I've absorbed through osmosis that append [a b c] [d e] should not be [a b c [d e]]. Though in reality that comes up much less than "repend" or "join" does. There's definite friction I feel with having to say append/multi [a b c] [d e] to get [a b c d e].

"It may go against the anti-line noise philosophy of rebol, but for my eyes I prefer append data (second items) and append data :[second items] over append data second items."

It's a pretty important Rebol property to avoid noise, but if for every piece of correct code you pretty much have to say append/only data second items the noise is just getting pushed around.

"I also like that this proposal feels more consistent, and it will encourage (force?) people more in the direction of using quoting, which I think is a new/enhanced feature worthy of embracing in this version of the language."

I was feeling somewhat good about the proposal until realizing :[...] is not acting like REDUCE.

This thread is a brainstorm to try and see if there's any other option.

Fundamentally, the contention we're facing here is that it feels like there's a difference between what people can reasonably think they mean by append [a b c] [d e] versus append [a b c] item. With the former it seems like no problem to say "oh of course you meant splice, because you can see that is a block". The latter, being generic, runs you into frequent risk of failing to make the appropriate discernment.

The question is if any potential tricks to cure the /ONLY illness are worse than the disease. I don't know, but it's worth having a strong statement of why a problem can't be solved, vs. that it just wasn't because of technical laziness!

hostilefork · July 29, 2019, 7:35am

While implementing a trial of the new APPEND+INSERT+CHANGE logic for "modal parameters", I managed to get the system to boot without having to force every call to have either a /SPLICE or an /ONLY. I did it by restricting the rule to saying you only had to disambiguate if the second arg was an ANY-ARRAY!.

In light of modal parameters making a nice test case uglier, this got me to wondering if there is a compromise on the "always append blocks" rule this post was conceived with...

What if you were restricted to appending BLOCK! to ANY-ARRAY!s, unless you used /ONLY ?

>> append "abc" "def"
== "abcdef"  ; target is not array, no restriction

>> append [a b c] "def"
** Error: Must splice BLOCK! arguments with ANY-ARRAY! unless /ONLY is used

>> append [a b c] ["def"]
== [a b c "def"]

>> append/only [a b c] "def"
== [a b c "def"]

>> append [a b c] [d e f]
== [a b c d e f]

>> append/only [a b c] [d e f]
== [a b c [d e f]]

This is a curious thought. It seems like it would be more annoying for the likes of SELECT and FIND, if what you were looking in was an ANY-ARRAY! to make you put everything you were going to search for in a block, unless you used /ONLY. Maybe modal parameters could help with that:

>> item: 'b

>> find [a b c] item
** Error: FIND in a BLOCK! requires the searched-for item to be a BLOCK! if not /ONLY

>> find/only [a b c] item
== [b c]

>> find [a b c] @item
== [b c]

This matches a bit more the sense of "literally this" that the name LIT-WORD! might suggest here.

Perhaps this suggests a hybrid solution that would be less failure prone. Code that you saw work once would be unlikely to mysteriously fail.

UPDATE: I quickly turned my test around to work this way for APPEND/INSERT/CHANGE and it feels far less disruptive, while seeming to have the effect of helping avoid most of the surprising categories of error that I want to stop. It doesn't get rid of /ONLY...but leaves it unnecessary for string targets (as today) and then makes it a requirement if you don't want the argument to a block to be forced as a spliced block. Modal arguments make it more palatable, especially for things like FIND. This seems to be closing in on "definitely better".
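The hybrid rule can be summarized in a small Python model. This is only a sketch under assumptions: lists stand in for ANY-ARRAY! targets, Python strings stand in for string targets (and are returned as new values, since Python strings are immutable), and `append_hybrid` is a made-up name.

```python
def append_hybrid(target, value, only=False):
    """Model of the hybrid rule: string targets are unrestricted, while
    array targets demand a list to splice unless only=True."""
    if isinstance(target, str):
        return target + str(value)  # not an array: no restriction
    if only:
        target.append(value)        # one and only one item
    elif isinstance(value, list):
        target.extend(value)        # lists are spliced
    else:
        raise TypeError("Must splice BLOCK! arguments with ANY-ARRAY! "
                        "unless /ONLY is used")
    return target


print(append_hybrid("abc", "def"))                                # abcdef
print(append_hybrid(["a", "b", "c"], ["def"]))                    # ['a', 'b', 'c', 'def']
print(append_hybrid(["a", "b", "c"], "def", only=True))           # ['a', 'b', 'c', 'def']
print(append_hybrid(["a", "b", "c"], ["d", "e", "f"], only=True)) # ['a', 'b', 'c', ['d', 'e', 'f']]
```

Calling `append_hybrid(["a", "b", "c"], "def")` raises `TypeError`, so a callsite that worked once with a block cannot silently change meaning when the value becomes something else.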

hostilefork · August 24, 2019, 9:39pm

Back to Reboling, and pushing this idea through, I think this idea is a winner.

New users would get in the rhythm of working with ANY-ARRAY! with block arguments. This looks only a little more verbose for literals:

append block [<failure>]
; instead of
append block <failure>

But setting the pattern early means people will read:

append block thing

As knowing that thing is a block and it will be spliced. If you then say:

append/only block thing
; or
append block @thing

Here you're consciously knowing that whatever it is, it's appending one and only one item. Classic bugs just won't happen due to better conventions, helping newbies and old school users alike.

This leads to a question about whether using /only if the target is a string series should be an error. What use is the polymorphism? How often does one write code that operates on either a string or a block, that wants /only semantics on the block, but just wants it ignored on the strings? I think it's rare enough to write such code in the first place, but to me it seems any polymorphic code would be splicing the blocks just as it would the strings.

On the other hand, it seems oppressive to only allow strings to be appended to ANY-STRING!, making you write:

append text unspaced reduce [{"} name {"}]
; instead of
append text reduce [{"} name {"}]

In fact, if we believe that GET-BLOCK! should represent a reduction, you have another option:

append text :[{"} name {"}]

It seems the benefits of prohibiting /ONLY (or the equivalent @ modal parameters for string destinations) could have a strength outweighing any possible inconvenience: If you saw /ONLY or @ at the callsite, then you know the destination is an ANY-ARRAY!.

I haven't implemented this rule to see how much pain it would cause. But I'm guessing not much.

Anyway--this aims to resolve one of the most fundamental and enduring frustrations of the language. If it fails, at least there is documentation of why the perpetually bug-causing behavior has been chosen as "just the way it is." But right now, I don't think it's failing.

Odds are keeping the name as /ONLY is beneficial to continue to allow people to write code that works in Ren-C but is backwards compatible elsewhere. I'm hard pressed to think of a name change where the name is so much of an obvious win. But it's always fun to think of parallel universes where the better option was chosen. Since it would now apply solely to the case of the target being an ANY-ARRAY!, maybe... /ONE? Saves a letter, even:

>> append [a b c] [d e]
== [a b c d e]

>> append/one [a b c] [d e]
== [a b c [d e]]

Other ideas could be /SINGLE, /ATOM, /VALUE, /CELL, /WHOLE, /UNIT, etc. etc.

hostilefork · August 29, 2019, 4:17pm

As an additional twist on this idea, I wondered if we could take the same tactic as with PRINT and decide that TEXT! strings are an exception.

That idea was "Limiting PRINT to BLOCK! and TEXT!" (with modifications of BLANK! to opt out, and the single CHAR! of NEWLINE accepted).

This is because PRINT wants to break you of the habit of saying print x and getting the "surprise!" of when that suddenly runs code one day when it's a block. It wanted to discourage you from thinking that PRINT is some kind of generic debug dumper... and guide you to using DUMP (--).

But might the same simple exception of TEXT! be worth it in this rule, as well?

>> b: [d e]
>> t: "de"
>> o: make object! [x: 1020]

>> append [a b c] b
== [a b c d e]

>> append [a b c] t
== [a b c "de"]

>> append [a b c] o
** Error: must use /ONLY unless splicing block or text

This still gets you to break the habit. If you're writing generic code, what are the odds that everything you are dealing with is a TEXT!? You're likely to be appending INTEGER! and TAG! and lots of other things. Letting the narrow case of text strings slip through without complaint covers the most common case of what people try to do, e.g. collect [keep "this"].

I mention this as I think the gamble actually has worked pretty well for PRINT, on balance.
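For comparison with the stricter rule, the TEXT! exception can be modeled like this in Python. It is a sketch with assumptions: lists stand in for blocks, `str` stands in for TEXT!, a dict stands in for the OBJECT!, and `append_text_ok` is a hypothetical name.

```python
def append_text_ok(series, value, only=False):
    """Model of APPEND to a block where text may be appended without
    /ONLY, blocks are spliced, and everything else must use only=True."""
    if only or isinstance(value, str):
        series.append(value)   # single item
    elif isinstance(value, list):
        series.extend(value)   # splice
    else:
        raise TypeError("must use /ONLY unless splicing block or text")
    return series


b = ["d", "e"]
t = "de"
o = {"x": 1020}  # stands in for: make object! [x: 1020]

print(append_text_ok(["a", "b", "c"], b))             # ['a', 'b', 'c', 'd', 'e']
print(append_text_ok(["a", "b", "c"], t))             # ['a', 'b', 'c', 'de']
print(append_text_ok(["a", "b", "c"], o, only=True))  # ['a', 'b', 'c', {'x': 1020}]
```

Without `only=True`, appending `o` raises `TypeError`, matching the error in the console session above.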

hostilefork · July 11, 2022, 9:17pm

4 posts were split to a new topic: JUST X as a Synonym for QUOTE THE X

hostilefork · January 2, 2021, 7:25pm

A post was split to a new topic: JUST vs LIT/LITERAL/LITERALLY

hostilefork · July 17, 2022, 5:15am

4 posts were split to a new topic: What Are Our Options For Ditching /ONLY ?
