ENVELOP (and COMPOSE!) By Example

hostilefork · September 23, 2024, 12:43pm

Prior to splices, we were considering rethinking append/only [a b c] [d e] as append [a b c] only [d e], where ONLY would just envelop its argument in a block.

@rgchris didn't care for the name:

On another topic

Correct me if I'm wrong, but this proposed ONLY function would simply create a single value cell with the block reference, which would seem pretty efficient.

It'd be very easy to shim:
only: func [value][
    reduce [value]
]
I'm still fond of ENVELOP over BLOCKIFY as a name. I don't think ONLY would make the cut. Naming is tricky as it is sort of a hack: its purpose is to make a block a singular value, but in actuality it is creating a new value of which the old one just happens to be the only content.

As it happens, ONLY defined in this way stuck around for a while. (I actually thought it had been deleted, but it turns out it was hiding as **only*, so just finally deleted it now!)

I agree that ENVELOP is a better and more useful name for the category of operations. Today we have ENBLOCK and ENGROUP:

>> enblock [a b c]
== [[a b c]]

>> enblock <tag>
== [<tag>]

>> engroup [a b c]
== ([a b c])

>> engroup <tag>
== (<tag>)

But there's no generalized ENVELOP.

"Envelop by Example" Seems Like an Important Construct

>> something: 1020

>> word: 'something  ; demo behavior when unbound (binding from context)

>> envelop '[] word
== [something]

>> envelop '() word
== (something)

>> envelop '@[] word  ; would work with sigil-decorated types
== @[something]

>> envelop '(()) word  ; could work with nested envelopes
== ((something))

There's a big advantage in passing in a block or group "by example". It means you can implicitly pass along a binding, which can be integrated in the same step...if that's what you want. (The modern art of writing Ren-C code requires a lot of consciousness about the decision to use bound or unbound material.)

>> eval envelop '(()) word  ; quoting means no binding
** Error: something not defined

>> eval envelop $(()) word  ; if binding passed in, it's used
== 1020
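
To make the binding point concrete, here is a toy model in Python (not Ren-C, and not how the interpreter does it). The names `Lst`, `envelop`, and `evaluate` are hypothetical, and a plain dict stands in for a binding context. It only sketches the idea that the example list supplies both the nesting shape and, optionally, the binding.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Lst:
    kind: str                        # "block" or "group"
    items: list = field(default_factory=list)
    binding: Optional[dict] = None   # None models an unbound (quoted) list

def envelop(example, value):
    """Copy the nested empty lists of `example`, placing `value` innermost."""
    if not example.items:
        return Lst(example.kind, [value], example.binding)
    if len(example.items) != 1 or not isinstance(example.items[0], Lst):
        raise ValueError("example must be nested empty lists")
    inner = envelop(example.items[0], value)
    return Lst(example.kind, [inner], example.binding)

def evaluate(node, env=None):
    """Look words (plain strings) up using the nearest enclosing binding."""
    if isinstance(node, Lst):
        env = node.binding if node.binding is not None else env
        result = None
        for item in node.items:
            result = evaluate(item, env)
        return result
    if env is None or node not in env:
        raise NameError(f"{node} not defined")
    return env[node]

context = {"something": 1020}

# Models envelop '(()) word: the example carries no binding
unbound = envelop(Lst("group", [Lst("group")]), "something")

# Models envelop $(()) word: the example's tip carries a binding
bound = envelop(Lst("group", [Lst("group")], binding=context), "something")
```

In this model `evaluate(bound)` finds `something` through the binding that came in on the example, while `evaluate(unbound)` raises a `NameError`, mirroring the two transcripts above.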

ENVELOP might even support Synthetic Asymmetric Delimiters

>> envelop '(| |) word
== (| something |)

>> envelop '(|) word  ; shorthand--assume paired?
== (| something |)

>> envelop '(<*>) word  ; maybe not assume, for COMPOSE marker compatibility
== (<*> something)

ENGROUP and ENBLOCK Still Useful

I do think that ENGROUP and ENBLOCK as specializations of ENVELOP turn out to be what you'll use at least 90% of the time...so they're worth having around.

But as arity-1 functions, the returned block or group would be unbound at its tip. So you'd have to use the ENVELOP-by-example to pass in a binding.

This Overlaps the MORPH Proposal Somewhat

MORPH has the ability to change the decorations on the value you're passing in, whereas ENVELOP would assume you wanted the item as-is, just enclosed in some other stuff.

My instinct is to say that this takes the pressure off MORPH to be all things to all people... vs. the idea that we don't need ENVELOP and it should just become a subfeature of morph. But I dunno.

hostilefork · September 23, 2024, 4:23pm

I have realized that this is an incredibly useful ability...

...but even more importantly...

The Binding Aspect Motivates COMPOSE-by-Example

Since today's COMPOSE is arity-1, to get it to work at all you have to run it on a bound block (assuming the nested groups you're composing aren't somehow already bound). The tip of the binding of that block is what COMPOSE sloppily borrows to use when evaluating the inner groups.

>> x: 1, y: 2  ; let's say these are incidental definitions

>> var: 'y

>> code: compose '[x + (var)]
** Error: var is not bound

>> code: compose [x + (var)]  ; eval'd BLOCK! binds, compose borrows that binding
== [x + y]  ; but the result tip still has the binding

>> eval compose [let x: 10 let y: 20 (as group! code)]
== 3  ; let's say this is not what I meant

If you want the final result of a COMPOSE to be unbound, you still have to bind the block long enough for COMPOSE to find the bindings...and then unbind it.

Not only is that awkward, but what if you had a meaningful binding on the input that you wanted to keep? You'd have to store the binding somehow... bind to the context for your groups long enough for the COMPOSE to work, then rebind it to the stored binding...

Compose-By-Example Can Fix This!

Let's bring back an old term...and call it COMBINE.

>> code: combine $() '[x + (var)]
== [x + y]  ; worked even though we passed in an unbound block!

>> eval compose [let x: 10 let y: 20 (as group! code)]
== 30

So not only do you get the freedom to specify what delimiters (or synthetic/nested delimiters) you want to use, you can also supply an arbitrary binding.
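
A rough Python sketch of why that matters (a toy model, not Ren-C). Here `combine` and `run_sum` are hypothetical helpers, tuples stand in for GROUP! slots, and dicts stand in for binding contexts. The slots are filled from the pattern's context only, and the result carries no context of its own, so its meaning comes from wherever it is later run.

```python
def combine(pattern_env, template):
    """Fill group slots (tuples) in `template` using only `pattern_env`."""
    out = []
    for item in template:
        if isinstance(item, tuple):       # a (group) slot holding one word
            (word,) = item
            out.append(pattern_env[word])
        elif isinstance(item, list):      # descend into nested blocks
            out.append(combine(pattern_env, item))
        else:
            out.append(item)              # everything else is copied as-is
    return out

def run_sum(code, env):
    """Toy evaluator: add up the words of `code`, looked up in `env`."""
    return sum(env[word] for word in code if word != "+")

outer = {"x": 1, "y": 2, "var": "y"}          # the incidental definitions
code = combine(outer, ["x", "+", ("var",)])   # models combine $() '[x + (var)]
inner = {"x": 10, "y": 20}                    # stands in for the LETs
```

Running `run_sum(code, inner)` gives 30, because nothing from `outer` came along with the result. Running it against `outer` would give 3, which is the accidental outcome the bound version produced.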

Old COMPOSE Is Still Useful Day-To-Day

It's useful enough to keep its name, and do what it does. It works out a lot of the time.

But the strange thing here is that COMPOSE wouldn't just be a specialization of COMBINE with an unbound group '(). I think that would imply leaving the bindings on the groups as-is, not stealing the binding off of the other argument.

So COMPOSE would likely instead be an adaptation of COMBINE that would take the binding off of the thing you passed it, and put it onto the "example". Let's say the two arguments to COMBINE are PATTERN and TEMPLATE (see post on specialize:relax):

compose: adapt (specialize:relax get $combine [
    pattern: ~<removed from interface (ADAPT phase fills in)>~
]) [
    pattern: inside template '()
]
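
The relationship can be sketched in Python as a toy model (not Ren-C). The `Block` class, the dict-as-context, and the tuple-as-group convention are all assumptions made for illustration. The point is only that arity-1 COMPOSE is COMBINE with the pattern's binding taken from the template itself, which is why it cannot work on an unbound template.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Block:
    items: list
    binding: Optional[dict] = None   # None models an unbound block

def combine(pattern_env, template):
    """Evaluate slots with the pattern's context; keep the template's own binding."""
    if pattern_env is None:
        raise NameError("pattern carries no binding to evaluate slots with")
    items = [
        pattern_env[item[0]] if isinstance(item, tuple) else item
        for item in template.items
    ]
    return Block(items, template.binding)

def compose1(template):
    """Arity-1 form: borrow the pattern's binding from the template's tip."""
    return combine(template.binding, template)

ctx = {"x": 1, "y": 2, "var": "y"}
bound_result = compose1(Block(["x", "+", ("var",)], ctx))     # result stays bound
unbound_result = combine(ctx, Block(["x", "+", ("var",)]))    # result stays unbound
```

Calling `compose1` on a `Block` with no binding raises, which corresponds to the "var is not bound" error shown earlier.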

This is all quite cool. Agree, @bradrn?

hostilefork · September 23, 2024, 4:47pm

There's a bit of a missed opportunity here with being able to use sigils for COMBINE...to say the sigil is enough, you don't need a list:

>> var: 'y

>> combine '$ '[x + $var]
== [x + y]  ; unbound

Besides letting you avoid lists, it could be useful with lists when you want to generalize the same compose operation across GROUP!s and BLOCK!s (and FENCE!s)

>> combine '@ '[[some stuff] @[spread [a b]] (other stuff) @(reverse [c d])]
== [[some stuff] a b (other stuff) [d c]]

There's no binding, so it won't work. BUT I WANT IT!

Could SIGIL!s Carry a Context, like an ANY-LIST Can?

It wouldn't necessarily be hard to give them list-like properties of carrying contexts, but getting the binding on them would be annoying:

>> combine (inside [] '$) '[x + $var]
== [x + y]  ; unbound

Could $$ mean "bind the $ sigil", and $@ mean bind the @ sigil, etc?

>> $$
== $  ; bound

>> combine $$ '[x + $var]
== [x + y]  ; unbound

That could work, with $$ being its own SIGIL! with this particularly strange behavior.

Note that $ being a WORD! (which it shouldn't/can't) would likely not help--because words do not today store "contexts"/"specifiers". Once they are bound, they are glued to the thing they are bound to...this is a performance point, because it means each fetch of a bound word doesn't have to look it up again. Only the binding process itself does the lookup for words. So leveraging the "not a word" nature of SIGIL! to have a weird property like being able to store contexts might make their strangeness useful vs. just strange.

I am having a hard time thinking about how it would fit into a more general mechanic... I don't know if that implies $$word, $$block, etc. would need to exist, and I'm 99% sure I don't want them to.

But maybe just being a magical outlier is all right. It is called "SIGIL" after all:

A sigil (/ˈsɪdʒɪl/) is a type of symbol used in magic. The term usually refers to a pictorial signature of a deity or spirit (such as an angel or demon). In modern usage, especially in the context of chaos magic, a sigil refers to a symbolic representation of the practitioner's desired outcome.

bradrn · September 24, 2024, 1:14am

Yep, this looks good!

(Not sure how happy I am with bindings on sigils, though. That feels like it may open up the same can of worms as bindings on strings do.)

hostilefork · November 18, 2024, 7:16pm

hostilefork:

Let's bring back an old term...and call it COMBINE.
>> code: combine $() '[x + (var)]
== [x + y]  ; worked even though we passed in an unbound block!

>> eval compose [let x: 10 let y: 20 (as group! code)]
== 30
...

Old COMPOSE Is Still Useful Day-To-Day

It's useful enough to keep its name, and do what it does. It works out a lot of the time.

So I've been stressing over the mental overhead of having two constructs, instead of just one.

One Big Advantage of ARITY-2 COMPOSE

It's hard to think of a name for string interpolation, and expanding the scope of COMPOSE's powers seems like it would be a fit.

But that would always need you to give some capture of context as a parameter.

 >> compose $() "Strings (pick [do don't] 2) have binding"
 == "Strings don't have binding"

Composing Unbound Material Should Be Common

In the modern binding model, unbound code is very much the default currency.

Many COMPOSEs should probably be making unbound material, and just aren't. While we're getting away with not doing it, I think that's kind of accidental (and increasingly I've been running into problems where bound compose results are messing things up).

Would Stress That You Have a Choice

If parentheses aren't the best choice for your situation, you can pick something else.

>> compose ${{}} [{a} (b) [c] {{first [<d> #Z]}} {e}]
== [{a} (b) [c] <d> {e}]

A Refinement Would Be Wordy

I'm questioning trying to differentiate the operations with a name difference (COMBINE vs. COMPOSE). That's confusing.

A refinement makes more sense. But even in the ideal case (where refinements are moved to the head) we'd have:

compose:with $() [...]

And I don't know that ":WITH" is the best name. It reads as if it's adding something, when it's really overriding a default. More honestly it would be:

compose:pattern $() [...]

Weird Semantics...

The behavior of stealing the binding from the block means that understanding COMPOSE as a specialization with $() is wrong.

Stealing the binding from the tip of the block you're composing and using that in the slots is a kind of "weird backchannel". If that's what you're doing, you're the weird one, so maybe your code should reflect that.

But, It Would Drive The "Noise" Level Higher

Some (like @rgchris) do not like symbols in code at all.

Having COMPOSE frequently instantiated as compose $() [...] would be a thorn. (Though as I've stated, I think you really should be doing compose $() '[...] much more than one would think).

And it is unfortunate that you need the $ both to get the binding and to suppress evaluation. Plus, you might think the $ means the pattern being matched is [... $(...) ...]

But the game here is getting correctness. And it seems to me that if you give people a powerful fundamental, they can adapt and specialize it to do whatever they want, under any name they want.

Should "Sneaky" Capture Be The Default?

For COMPOSE to work with strings it either needs to take a pattern parameter, or be one of the sneaky constructs that uses the current context by default.

If it shifted to the "sneaky" behavior, then it would be more useful on unbound blocks:

>> x: 10, compose '[a b (x + 1) c d]
== [a b 11 c d]

>> x: 10, compose "a b (x + 1) c d"
== "a b 11 c d"

But if you move code that says compose block from point A to point B, then you will be changing what it uses for its scope. The same would be true of moving compose $() block, but at least there you can see the capturing construct... the $() should (?) call your attention to it.

I Think My Biases Lead Me To Arity-2

Looking at all the angles here, I just don't think it's that much of an imposition to pass the pattern. And you live in a murky binding world if you don't.

If you are writing enough code where the compose $() [...] pattern is frequent enough to get on your nerves, then there are probably other ways you can attack that repetition that are even larger scale than just specializing that bit out.

hostilefork · November 20, 2024, 12:32am

So I went ahead and did the transition to arity-2 COMPOSE, so I could see the effects.

Since bootstrap still needs to work, I needed a name for the arity-1 version, and I called it compose1. That actually isn't a terrible name for the arity-1 version. Maybe compose-1 would separate the e and the 1 a little better, but it also adds noise.

There's also the possibility of compose*, though adding a star usually means "lower level version". But more generally it could be argued that it's a "special version".

What it's doing is actually:

compose1 list
=>
compose (inside list '()) list

So maybe calling it compose-inside would be more informative, but that wouldn't tell you that you were using groups, so it's compose-groups-inside...

Well, when I put it that way, COMPOSE* might be good for this.

Anyway, I think the competition is between calling it COMPOSE* or COMPOSE1

If we had CHAIN! dialecting, it might permit:

>> compose [a (1 + 2) b]
== [a 3 b]

>> compose:${{}} [(a) {{1 + 2}} {b}]
== [(a) 3 {b}]

But that feature is very speculative, and binding issues may make it non-feasible.

This is a pretty tough call, but I think I'm on the side of saying that COMPOSE should be arity-2.

Beyond the necessity of providing a binding environment when you want to apply COMPOSE to strings, I'm seeing too many places where an arity-1 COMPOSE can't do what you want.

I also feel like taking away the assumption of COMPOSE being groups is good--it's not groups, it's the pattern you choose. I like {{...}} because at least now, that has no meaning and stands out a lot better.

The harsh reality is that to really make things work... binding has to be thrust into the awareness of those who are building and composing code.

Being able to evaluate templated code groups (or blocks, or fences...) in an environment that is different than the environment of the surrounding code is not something that has been easy before.

But I think it's foundational--and as I've said, the most common surrounding context is likely no context... with the intention of inheriting the meaning from wherever you are putting the composed code.

hostilefork · December 22, 2024, 8:23pm

I've had a bit of time to ruminate on this, and "deal with my feelings" about it.

I find myself lamenting things like this:

compose $(a).b
=>
compose $() $(a).b
; or...
compose $() '(a).b
; ...when would you use which ??

But it's hard to gauge because this is kind of the first year where binding has any sort of shot of working. The existing corpus has kind of a selection bias... it doesn't do very ambitious things with binding, because none were possible.

These little decisions may matter more than we know, and glossing over them could be a pretty bad idea. I think that's where my leaning is, and arity-2 compose still seems like it's probably the right direction.

What's Really Got Me Puzzled Is "Use Existing Binding"

The operator I can't quite entirely get my mind around is actually the one that doesn't bring in a binding from outside, but that uses bindings that are "already there".

How much of the binding should be used, exactly?

This one seems "obvious":

 foo: func [x y] [
     return compose-existing [a (x + y) b]
 ]

You'd want that to add the function parameters, right? We'd presume here that comes from taking the binding on the tip, which was established when the block evaluated. The contents are unbound, so that tip binding gets used...

Except what if the GROUP! had a binding? Should that be honored, or does it only take the binding from the tip?

This is a semantic problem, because we're descending down into a nested structure that may not be fit for evaluation--and may not be intended to ever be evaluated as some kind of cohesive form. We don't know what this structure is going to be used for. So how would COMPOSE know if it was a good idea to be descending the structure, using evaluator-like rules to inherit the bindings?

Seen in this light, it makes a stronger case for arity-2 COMPOSE. Which in fact faces some problems of its own, when there's a binding on the patterns you are composing already. Though that's a narrower problem, because it definitely wouldn't be deriving binding as it descends into a structure, so it really would only be cases where the specific thing it found on descent had a binding on it.

A safe default would be to error if the things you are trying to compose have already been bound. Then perhaps have a compose:override switch to say to use the passed in binding. (Or probably better to just require a predicate function that gets passed the list and you have to decide what to do with it, instead of getting involved in the details.)
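
As a sketch of that policy, here is a toy Python model (not Ren-C). `Slot`, the `on_bound` parameter, and the dict contexts are hypothetical names chosen for illustration. By default a slot that already carries a binding is an error, and a caller-supplied predicate decides which context to use instead.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Slot:
    word: str
    binding: Optional[dict] = None   # None models an unbound slot

def compose(pattern_env, template, on_bound=None):
    """Fill slots from `pattern_env`; already-bound slots need a decision."""
    out = []
    for item in template:
        if not isinstance(item, Slot):
            out.append(item)
            continue
        env = pattern_env
        if item.binding is not None:
            if on_bound is None:
                raise ValueError(f"slot ({item.word}) is already bound")
            env = on_bound(item, pattern_env)   # predicate picks the context
        out.append(env[item.word])
    return out

passed_in = {"n": 4}
prebound = Slot("n", {"n": 8})

plain = compose(passed_in, ["BE", Slot("n")])
keep_existing = compose(passed_in, ["BE", prebound], on_bound=lambda slot, env: slot.binding)
override = compose(passed_in, ["BE", prebound], on_bound=lambda slot, env: env)
```

Here `keep_existing` honors the binding found on the slot and `override` behaves like the hypothetical compose:override switch. With no predicate, the pre-bound slot raises a `ValueError`.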

But There Are Casual Cases Which Come Up

Consider for instance the ENCODE and DECODE operations. The way I'm working them up is that there's a dialected block for specifying the encoder and decoder.

 encode 'UTF-8 string  ; use default settings
 encode [UTF-8] string  ; equivalent form of dialected block
 encode [UTF-8 ...] string  ; block can pass settings

But I also thought it would be good if it "pre-composed" the block. So the meaning of GROUP!s in all encode dialects is to compose. This is an expedient convention, which frees the authors of encoders from having to worry about it and lets users assume they know what groups will do, because it's very likely that you want to supply settings from expressions or variables:

num-bytes: 4
encode [BE +/- (num-bytes)] integer

This is one of the cases where traditional "just do it" COMPOSE feels like a fit. But with arity-2 compose it winds up being a mouthful, something like:

compose (bind binding of spec '()) spec

You have to produce a GROUP! which has the same binding as the BLOCK!.

There needs to be a shorter way to say this, and I think arity-2 compose can do this for us if it uses a different part of speech:

compose @() spec

This goes along with some other uses of THE-XXX! to mean "reuse binding". It's better to do this with something that's explicitly seen by the COMPOSE operation as a signal of meaning, instead of making an assumption about what an unbound GROUP! meant:

compose '() spec  ; looks "intentional" here...
compose pattern spec  ; but an unbound group could just be an accident

This still gives the freedom to use other patterns like @{{}} or @(<*>) without having to use some other refinement to the process. These would be meaningless for strings--as there's no binding to reuse.

I don't know what it means in terms of descending. Just use the tip is probably the sanest default.

I think with this adjustment, arity-2 compose still comes out ahead.

I'm still wondering about the COMPOSE* operation with sneaky environment capture, and assumption of parentheses. Maybe COMPOSE** would look for doubled parentheses.

>> x: 1000, y: 20

>> compose* "The sum is (x + y), with an easy comma."
== "The sum is 1020, with an easy comma."

>> compose** "The sum is ((x + y)), with an easy comma."
== "The sum is 1020, with an easy comma."

>> compose $(()) "Is that ((x + y)), actually worth it?"
== "Is that 1020, actually worth it?"
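
For comparison, the string case can be modeled in a few lines of Python (a toy, not Ren-C). `compose_text` is a hypothetical helper, a dict stands in for the captured environment, and Python's own `eval` stands in for the Ren-C evaluator. It does not handle nested delimiters. The delimiter pair plays the role of the pattern, and the dict plays the role of the binding that a string cannot carry itself.

```python
import re

def compose_text(opener, closer, env, text):
    """Replace each opener...closer span in `text` with its evaluated value."""
    slot = re.compile(re.escape(opener) + r"(.*?)" + re.escape(closer))

    def fill(match):
        # Builtins are emptied so only names supplied by `env` are visible.
        return str(eval(match.group(1), {"__builtins__": {}}, env))

    return slot.sub(fill, text)

env = {"x": 1000, "y": 20}
single = compose_text("(", ")", env, "The sum is (x + y), with an easy comma.")
double = compose_text("((", "))", env, "Is that ((x + y)), actually worth it?")
literal = compose_text("((", "))", env, "Plain (parens) survive ((x + y))")
```

With the doubled pattern, ordinary parentheses in the text are left alone, which is the practical argument for choosing a pattern that stands out.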

More food for thought.