Breaking MAKE OBJECT! Into Component Operations

hostilefork · December 28, 2020, 8:53am

The behavior of the following in historical Rebol/Red strikes me as extremely buggy:

rule-ctx: make object! [num: 1020]
rule: [num 'foo]
bind rule rule-ctx

parse-obj: make object! compose [
     rule: (rule)  ; assume this is the non-splicing semantic
     run: method [data] [
          parse data rule
     ]
     ...
     num: 304  ; imagine this is just an incidental field name
]

1. The COMPOSE'd rule will be bound to PARSE-OBJ's num instead of RULE-CTX's num. The more generic the word, the more likely this is to happen by accident. You get different behavior than if you'd named the rule something else and not used COMPOSE, as with make object! [rule: a-rule, ...]. In that case the block is not physically in the body; it is only fetched through the word a-rule when the body runs. The binding wave hits the word a-rule, not the block it looks up to.
2. Not only was the COMPOSE'd rule changed, but the original rule is changed too! This is because there's nowhere to store unique binding information for the rule as seen through the lens of the object vs. not. It is mutably bound, and the only way to avoid this is to make a copy.
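
The two effects above can be observed directly in historical Rebol. This is a minimal sketch assuming a Rebol 2 style interpreter; COMPOSE/ONLY stands in for the non-splicing COMPOSE used in the example, RULE-CTX is made an object so that BIND has a context to work with, and the PROBE lines are additions for illustration.

```rebol
rule-ctx: make object! [num: 1020]
rule: [num 'foo]
bind rule rule-ctx
probe get first rule              ; 1020, NUM resolves through RULE-CTX

parse-obj: make object! compose/only [
    rule: (rule)                  ; the very same block, not a copy
    num: 304
]

probe same? rule parse-obj/rule   ; true, one block shared by both
probe get first rule              ; 304, the original was rebound (effect 2)
```

Composing in copy/deep rule instead would protect the original block, but the copy would still pick up PARSE-OBJ's num (effect 1).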

As it happens, I actually have some progress to report on #2 with virtual binding. I'm making it possible for multiple "views" of the same block to see it bound different ways (a technique that started with multiple views of a function's body seen through different frame instances, and is now extended to stacking views through arbitrary objects).

But that doesn't do anything to help #1.

MAKE OBJECT! Is Actually Three Operations

There are three distinct steps performed historically by MAKE OBJECT!.

1. COLLECT all the top-level SET-WORD!s in the BLOCK! to make an empty OBJECT!, with all those words unset.
2. BIND the block's ANY-WORD! elements to the newly created object.
3. DO the bound block.

Virtual binding means the second step can be very fast. The block is just annotated to say "you are a view as seen by this object". (It's possible to deep-walk the block and do some a-priori caching to make the execution the first time faster, but that's not necessary.)

Without virtual binding, the second step requires a deep walk... and destructively binds the block so any other references to the block are now contaminated. This is usually not what you'd want, but sometimes (as when modules are being loaded) it's intended.
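
The three steps can be performed by hand in historical Rebol, which shows where the destructive bind happens. This is a minimal sketch assuming Rebol 2 style semantics; the names body, spec and obj are illustrative, and the fields start out as NONE here rather than unset to keep the collection step short.

```rebol
body: [x: 10 y: x + 5]

; 1. COLLECT: gather the top-level SET-WORD!s into an object spec
spec: copy []
foreach item body [
    if set-word? :item [append spec :item]
]
append spec none            ; spec is now [x: y: none]
obj: make object! spec

; 2. BIND: a deep walk that mutates BODY in place
bind body obj

; 3. DO: run the bound block
do body
probe obj/y                 ; 15
```

After step 2, every other reference to body also sees x and y through obj, which is the contamination that virtual binding avoids.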

What If MAKE OBJECT! Was Just The COLLECT Step?

If we let you make an object from a block without actually binding or running it, you could tailor the steps as appropriate for your operation.

For instance: Modules don't want virtual binding...they fabricated a new block themselves from the input UTF-8. They want to destructively bind their single copy to lib and the module:

 block: transcode read module-filename  ; oversimplified...
 mod: make module! block  ; imagine this does *not* run the block
 bind block lib
 bind block mod
 do block

You get another advantage: you have access to both the result of the DO and the result of the MAKE. These could be provided as separate outputs from something like IMPORT.

If you wanted something that was a cleaned-up version of today's MAKE OBJECT! that used virtual binding to avoid contaminating the input block (e.g. because it was read-only material in a function body, or passed as a const parameter), you could say:

 obj: make object! block
 do in obj block  ; I've repurposed IN to be the virtual binding operator
 obj   ; the object as the result

This still has the possibly unintended effects of #1 above, where all COMPOSE'd components inherit the binding. But you could use a more conservative virtual bind that only applied to the top level set-words. I have that implemented, it would just need to be exposed somehow:

 obj: make object! block
 do in/shallow/set obj block
 obj
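
The effect of binding only the top-level SET-WORD!s can be approximated in historical Rebol by rebinding those words one at a time. This is a sketch under Rebol 2 style semantics, using a mutating BIND on a copy rather than the virtual bind described above; it is not how in/shallow/set is implemented, and the variable names are illustrative.

```rebol
x: 10
body: copy [x: 20 y: x + 100]
obj: make object! [x: none y: none]   ; stands in for the COLLECT step

; Rebind only the top-level SET-WORD!s; plain words keep their bindings
pos: body
while [not tail? pos] [
    if set-word? first pos [change pos bind first pos obj]
    pos: next pos
]

do body
probe obj/x    ; 20
probe obj/y    ; 110, the plain X still meant the outer X
probe x        ; 10, the outer X was not assigned
```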

This would make binding into the object optional. If MY were like METHOD and took the binding on the left and propagated it to the right, you could say either:

 make-obj-toplevel compose [rule: (rule), ...]  ; no binding effect
 make-obj-toplevel compose [rule: my (rule), ...]  ; block bound into object

And if MY were powered by virtual binding, that wouldn't have to touch the original rule.

For that matter, with only top-level words assigned, you'd get the advantage of not even needing a COMPOSE. Any normal words would be left as they were:

 make-obj-toplevel [rule: rule, ...]  ; different `rule`s, no COMPOSE needed

This Seems A Promising Direction

It needs to be hammered out in terms of the details, and names. We have names like CONSTRUCT and CLASS that we are looking at. I don't particularly like CONTEXT or OBJECT as verbs since they are type names.

There's an uneasy question about whether MAKE should be a high-level operation (due to its short name, wanting the high-level operations to be in reach). However, while its name is short, it has to be combined with a type which makes it not all that short compared to e.g. CONSTRUCT.

I experimented for a time with HAS as a seeming analogue to DOES. It didn't catch on at the time, I suspected because it sounds more like a question than a statement. But it does have brevity on its side:

obj: has [
    x: 10
    y: 20
]

I'll throw it back out there. Anyway, my natural leaning would be to say that MAKE becomes the low level operation that doesn't actually run the block you give it, then these other higher-level variations get names.

This would also seem consistent with MAKE FRAME!, which gives you an empty (seeming) context. But we could also distinguish make* from MAKE, or variations of that.

Afterthought: I'll also remind people about WRAP, which was an operator I proposed long ago that was like MAKE OBJECT! but returned the result of the evaluation...which could be more convenient now:

>> wrap [x: 1000, y: 20, x + y]
== 1020

This gives you the same result as use [x y] [x: 1000, y: 20, x + y] by automatically collecting top-level SET-WORD!s. This could make a comeback...more efficiently expressed with virtual binding.
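
For comparison, a rough WRAP can be written in historical Rebol on top of USE. This is a minimal sketch assuming Rebol 2 style semantics (no commas, and a deep copy so the caller's block is not rebound); it illustrates the collecting behavior only and is not the virtual-binding version being proposed.

```rebol
wrap: func [
    "Evaluate BODY with its top-level SET-WORD!s made local (sketch)"
    body [block!]
    /local words
][
    ; Collect the top-level SET-WORD!s as plain words
    words: copy []
    foreach item body [
        if set-word? :item [append words to word! :item]
    ]
    ; USE gives those words a fresh context and returns the evaluation result
    use unique words copy/deep body
]

probe wrap [x: 1000 y: 20 x + y]    ; 1020
```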

rgchris · December 28, 2020, 3:01pm

This indeed seems a good direction. Certainly leans against using MAKE as the means of creating derivatives:

account: new-object-constructor-that-works-the-old-way [
    name: "Flintstone"
    balance: $100
    ss-number: #1234-XX-4321
    deposit:  func [amount] [balance: balance + amount]
    withdraw: func [amount] [balance: balance - amount]
]

other-account: make account [
    name: "Rubble"
    balance: $200
    ss-num: #012-XX-3456
]

I presume a streamlined MAKE would now enforce the following form?

do in other-account: make object! account [
    name: "Rubble"
    balance: $200
    ss-num: #012-XX-3456
]

; sort of a parallel to the way the following
; similarly makes a clone
make text! "Foo"

hostilefork · January 1, 2021, 9:52pm

Over on the CONST thread, I mentioned a property of generic quoting that I found disconcerting: it subverted CONSTness, because there was no other clean way for the API to bypass an "evaluative wave of constness".

I resolved to put my discomfort aside, and accept that generic quoting offers a way for things to be put into an evaluative situation...behaving as-if they had been accessed by a WORD!. Because sometimes (e.g. in the API or when COMPOSING) you don't have a word to access through...to get that behavior if it's what you wanted.

The proposed workaround for when that const subversion was not intended was that people use @ (...) instead of '(...) in cases that they wanted the constness to match the flow of the block.

Which reminded me of the situation in this thread...

The two connected in my head, to ask the hypothetical question what if quoted things had their binding left untouched...as if the thing you get had been fetched by a word, instead of subtracting a quote level?

some-rule: [num: 10]

make object! [rule: some-rule, num: 20]
;   word access, no binding influence on RULE

make object! compose [rule: (some-rule), num: 20]
;   becomes make object! [rule: [num: 10], num: 20]
;      following the (...) is don't splice and ((...)) is splice convention
;   today this would virtually bind NUM in the rule member into the object
;   so from RULE's point of view, NUM is 20
;   but thanks to virtual binding, SOME-RULE is unaffected

make object! compose [rule: '(some-rule), num: 20]
;   becomes  make object! [rule: '[num: 10], num: 20]
;      because COMPOSE keeps the decoration on the composed thing
;   what if the quote subverted the virtual bind, as in the WORD! case?
;   so the NUM is seen however it was in SOME-RULE?

A wild--but promising--thought.

Though it raises the question of how things in quotes would ever get bound--such as the SOME-RULE inside of '(some-rule). So if this technique were used, it couldn't apply to all bindings.

But where it did apply, it would mean using a literalizing operator that didn't employ quotes (like what is proposed as JUST) more than we do now. And there would be a lot of unbound things, since most quotes aren't programmatic with QUOTE or COMPOSE.

When this was in effect, you couldn't write stuff like get 'x (which you wouldn't usually, because you could say just x or :x)...but you'd have to say get/any just x instead of get/any 'x. If you were going to write if condition [[a block]] that would be different from if condition '[a block].

It would mean that append block [x] and append block just x would mean something different from append block 'x.

It has a plus side in data exchange. I've talked about the hazards of "stray bindings", not only do they offer possible unwanted linkages to internals of things that weren't intended when using words as a kind of enum symbol, they also hold GC things live which they may have no interest in.

PARSE's mechanics for recognizing things literally wouldn't be affected, as bindings aren't heeded by the matching process:

parse [[a] [a]] [some '[a]]

Anyway... I just saw some potential synergy with the "act like you got it from a word" issue that came up with const. I've mentioned that the problems you run into with the API are no different from the ones you hit with COMPOSE when you aren't getting things out of a variable, which is why rebQ() is so important. This binding problem shows that the same issue comes up in non-API scenarios too.

hostilefork · August 20, 2024, 12:28pm

hostilefork:

If you wanted something that was a cleaned-up version of today's MAKE OBJECT! that used virtual binding to avoid contaminating the input block (e.g. because it was read-only material in a function body, or passed as a const parameter), you could say:
 obj: make object! block
 do in obj block  ; I've repurposed IN to be the virtual binding operator
 obj   ; the object as the result
This still has the possibly unintended effects of #1 above, where all COMPOSE'd components inherit the binding. But you could use a more conservative virtual bind that only applied to the top level set-words.

Bingo. Think I hit this on the nose almost 4 years ago... it just took a while for the tech to catch up!

But what we were missing at the time was a way for "methods" (FUNCs or METHs) inside such a constrained construction to reach the object and know what words it had. Pure Virtual Binding gives us the resources we need to create a whole new way to access members.

Since changing MAKE OBJECT! would be too far-reaching a change at the moment, I've retaken CONSTRUCT for this shallower version:

>> x: 10
>> obj: construct [x: x * 2, f: func [] [print ["x is" x "and .x is" .x]]]
== make object! [
    x: 20
    f: ~#[frame! []]~
]

>> obj.f
x is 10 and .x is 20

The only ANY-WORD!s in the block passed to CONSTRUCT that are bound to the object being generated are the SET-WORD!s at the top level. Everything else trickles down as normal in pure virtual binding. So it sees X inside that block as it is defined outside.

Hence the FUNC of F does not see the object's X being created. But the twist is, that when you say OBJ.F, the object is folded into the function cell being executed. That is then folded into the environment during execution. And the tuples starting with blanks, rendering as .x or .x.y or whatever, they don't search the normal binding for the symbol but they look in that "coupling" binding.

The benefits to most code by using this pattern are tremendous!

You can refer to same-name variables from outside the CONSTRUCT inside it without a COMPOSE
But should you find the need to COMPOSE material in from another place, you don't get incidental contamination of the binding of that material from the fields of the object being created
AND you can tell when reading a function's body what things are object members, in a lightweight and general fashion, using the TUPLE! datatype which is already handled by places that need to look up variables.

If we wanted to, we could say that CONSTRUCT (or MAKE OBJECT! or {...} or whatever this is) would slip the object's identity into a virtual LET binding under a name like self or this, so that it could be mentioned freely inside the code. But that wouldn't have to be a field cluttering the object itself!

>> x: 10
>> obj: construct [x: 2 x: this.x * 3 f: func [] [print ["x is" x "and .x is" .x]]]
== make object! [
    x: 6
    f: ~#[frame! []]~
]

>> obj.f
x is 10 and .x is 6

Bingo, again. Another prognostication from 3+ years ago which needed the rest of the system to catch up to make it feasible.

When the system can function as people expect when most all the code they're working with is unbound at the block level, that's a very good sign.

foo: func [x] [
    code: copy [add x]
    append code to word! "x"
    print ["Doubled:" do code]
]

>> foo 10
Doubled: 20

I'm now far, far more confident in where binding is at... and it points back to these early realizations.

It's definitely going to require people getting comfortable with new ideas... old code just won't work (e.g. get 'x would be getting something with no added binding, so it has to be get $x or get in [] 'x or whatever). People have to understand why.

For those who just want to load %file.jpg they may think they don't have to care, but the features and dialects they can experience will be limited by the feasibility of others writing those for them... and writing C and Red/System code to avoid composability problems can only get you so far.

hostilefork · November 18, 2024, 2:14am

hostilefork:

Since changing MAKE OBJECT! would be too far-reaching a change at the moment, I've retaken CONSTRUCT for this shallower version:
>> x: 10
>> obj: construct [x: x * 2, f: func [] [print ["x is" x "and .x is" .x]]]
== make object! [
    x: 20
    f: ~#[frame! []]~
]

>> obj.f
x is 10 and .x is 20

I've had some success deploying this .word as member idea, with a shallow object creation... and it seems like the right fit for many cases--enough so that I think it should be the "default" for what we would traditionally think of as the creation of traditional "objects" and "classes".

But as expected, there are cases that don't fit. For instance, in the bootstrap process, the configuration files make objects that are a bit more like modules:

REBOL [Title: "Filesystem configuration file"]

name: 'Filesystem
source: %filesystem/mod-filesystem.c

os: platform-config.name
if os = 'Windows [  ; means OS as it was just defined...
    ...
]

This looks module-like, but it's done with objects (in particular because the bootstrap executable doesn't do modules correctly). And additionally, it inherits from a base object.

So when I talk about the "component" operations, we have to consider:

1. What kind of thing are we making (OBJECT!, MODULE!, ERROR!, ...)?
2. Is there a parent to inherit from? (If so, we presume this answers 1, which is why MAKE doesn't need to know the type if you pass an instance instead... you can't MAKE ERROR! from a parent that's a MODULE!)
3. Do you want the definitions of the object fields being defined to be visible to the code that's assigning those fields, or do you want the existing bindings to be visible?

With the CONSTRUCT I mention above, that gives you an OBJECT! with no parent...and the fields of the object are not visible to the code doing the assigning. (I propose this as the default behavior for FENCE!.)

I added a CONSTRUCT:WITH to pass in a parent, though it would be better if that argument came first:

construct:with [
    something: ...
    ...
] parent

vs.

construct:with parent [
    something: ...
    ...
]

I could just go ahead and make it arity-2, though I'm not thrilled by it.

construct object! [...]  ; shallow version, no parent
construct parent [...]  ; shallow version, parent

make object! [...]   ; inside version, no parent
make parent [...]   ; inside version, parent

Why would MAKE vs. CONSTRUCT Be "Inside Binding"?

Outside of the inertia of history, I don't know how much sense this makes. It could be construct:inside vs. plain construct. :-/

>> x: 10
>> construct [x: 20, y: x + 100]
== #{x: 20 y: 110}

>> x: 10
>> construct:inside [x: 20, y: x + 100]
== #{x: 20 y: 120}

The inside version being implemented by wrap:context was one idea I had:

>> x: 10
>> wrap:context [x: 20, y: x + 100]
== #{x: 20 y: 120}

What this has going for it:

"WRAP" really does sound like it's enclosing the code in a context
It aligns with what WRAP does to a BLOCK! normally... this would just say "make me a context out of that, vs. giving a block".

But then you have the other parameterization points... MODULE! vs. OBJECT! etc., and whether you want to inherit or not.

There's simply a lot of parameters here. Anyway, there's a lot of new mechanics that need to settle out while contemplating this, just wanted to write that down.