GET+SET vs PICK+POKE - What's The Difference?

hostilefork · December 7, 2021, 3:31pm

In trying to think about what the fundamental pieces of this system really are, I've crept toward the idea that PICK and POKE are fundamentally cell-based operations... with no path processing.

This would mean that all path processing logic is actually driven by GET and SET, which implement themselves on top of PICK and POKE.

To give you an idea of what I mean by that... let's look at a hypothetical poke of an immediate value:

>> obj: make object! [d: 21-Nov-2021/18:56:45-5:00]

>> poke obj.d 'time 12:00  ; poke receives cell bits but *not* an address
== 21-Nov-2021/12:00-5:00

>> obj
== make object! [
    d: 21-Nov-2021/18:56:45-5:00  ; no change to stored cell
]

So we see POKE has the smarts to take the immediate value of a DATE! (which fits in 4 platform pointers) plus some field (e.g. time) and produce a new DATE!. But it has no way to change the original stored value, because it was never given that cell's address.

It may be that this specific case should give an error if you use it without a flag showing that you know you're not changing anything.

>> poke obj.d 'time 12:00
** Error: DATE! will not be mutated via POKE, use /IMMEDIATE if this is ok

>> poke/immediate obj.d 'time 12:00
== 21-Nov-2021/12:00-5:00

But the idea would be that when you use a SET-WORD! or a SET, the translation is different:

obj.d.time: 12:00  -or-  set 'obj.d.time 12:00
=>
poke obj 'd (poke (pick obj 'd) 'time 12:00)

So in other words, it's SET that drives the process--breaking it down into atomic PICK and POKE, taking on the burden of writing back any changed cells. But a lone POKE itself would not be able to do any writeback of an immediate...as it received only a cell and not an address.
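To make the writeback idea concrete, here's a rough model in Python (not Ren-C). Everything in it is an assumption for illustration: a frozen `Date` dataclass stands in for an immediate DATE! cell, a dict stands in for an OBJECT!, and `pick`/`poke` are hypothetical helpers rather than the real natives. It only shows why a lone POKE can't change the stored cell, while the nested form that SET would generate can.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Date:
    """Hypothetical stand-in for an immediate DATE! cell (a value, no identity)."""
    day: str
    time: str
    zone: str

def pick(container, field):
    """Return the cell stored under `field`; immediates come back as plain values."""
    if isinstance(container, dict):
        return container[field]
    return getattr(container, field)

def poke(container, field, value):
    """Return the updated container.

    A dict models an OBJECT! (it has identity, so it is changed in place).
    A Date models an immediate: poke can only build a new value.
    """
    if isinstance(container, dict):
        container[field] = value
        return container
    return replace(container, **{field: value})

obj = {"d": Date("21-Nov-2021", "18:56:45", "-5:00")}

# A lone POKE gets the cell's bits, not its address, so obj is untouched.
lone = poke(pick(obj, "d"), "time", "12:00")
unchanged = obj["d"].time  # still "18:56:45"

# What SET would do for obj.d.time: poke the new DATE! back into obj.
poke(obj, "d", poke(pick(obj, "d"), "time", "12:00"))
changed = obj["d"].time
```

The design point is that `poke` never needs to know where its first argument came from. Only the outer call, which holds something with identity, makes the change stick.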

How Does This Relate To "Subcell Addressing"?

I talk about cases like date.time.hour: xxx because it gives a case where date.time synthesizes a TIME! value which does not have a source cell of its own...so it has to be poked back.

It may not be too hard to accommodate such cases. The real problem would be things like an FFI abstraction, like:

>> struct.million_ints_field.10: 20

If the clause struct.million_ints_field generates a BLOCK! of a million integers out of the compressed form, and then you change the 10th one to 20...and then write the million integers back... that's a pretty inefficient way to do it.

I've mentioned before that this case is not made up. Shixin's FFI tried to parrot the methodology of GOB! with its compressed size field, so that the struct would be able to tell it was being asked for (million_ints_field.10), and be able to do a GET or SET of that without blowing up into a BLOCK! of a million integers.

Trying to generalize this complicates the system immensely, and we are probably better off asking the datatypes which want such granularity to not expect the field selection mechanic to bear the design burden. Perhaps you write instead:

>> struct.[million_ints_field 10]: 20

This puts a bit of a syntax burden on those using custom datatypes. But trying to legitimize Carl's GOB! trick / Shixin's FFI trick has led to what I consider to be more of a mess than it is actually worth.
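For a feel of the cost difference between the two spellings, here's a rough Python model (again not Ren-C, and not how the FFI actually stores anything). The packed little-endian int32 layout and both function names are assumptions made up for the sketch. The first function mimics plain step-by-step PICK and POKE, which expands the whole field and repacks it. The second mimics the struct receiving the compound step and writing the one element directly.

```python
import struct

N = 1_000_000  # element count, matching the "million ints" in the example

def set_via_block(buf, index, value):
    """Model of struct.million_ints_field.10: 20 using plain PICK/POKE steps."""
    ints = list(struct.unpack(f"<{N}i", buf))  # PICK expands into N cells
    ints[index - 1] = value                    # 1-based index, as in Rebol
    buf[:] = struct.pack(f"<{N}i", *ints)      # POKE repacks all N cells
    return len(ints)                           # number of cells materialized

def set_via_compound_step(buf, index, value):
    """Model of struct.[million_ints_field 10]: 20 handled by the struct itself."""
    struct.pack_into("<i", buf, 4 * (index - 1), value)  # touch 4 bytes only
    return 1                                             # one cell involved

a = bytearray(4 * N)
b = bytearray(4 * N)
cells_a = set_via_block(a, 10, 20)
cells_b = set_via_compound_step(b, 10, 20)
```

Both buffers end up identical. The difference is only in how many cells had to exist along the way, which is the burden the bracketed form moves onto the datatype instead of onto the path mechanic.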

If you think you need efficiencies out of "subcell addressing", the likely truth is that you need to break your data model into more cells.

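What I'm saying here is that the "recursive" nature of path processing instead becomes a backpropagation in SET. Having picked its way forward along the path, SET can just go linearly backwards across the steps it has processed, poking each result into the cell before it.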
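A minimal sketch of that backward pass, written in Python as a model rather than as Ren-C. The `Time` and `Date` named tuples, the dict standing in for an object, and the `set_path` helper are all hypothetical names for illustration. The forward loop PICKs each intermediate cell, and the backward loop POKEs each new value into the cell that came before it.

```python
from typing import NamedTuple

class Time(NamedTuple):
    """Hypothetical immediate TIME! value, synthesized with no cell of its own."""
    hour: int
    minute: int

class Date(NamedTuple):
    """Hypothetical immediate DATE! value."""
    day: str
    time: Time

def pick(cell, step):
    """Fetch the value for one path step."""
    return cell[step] if isinstance(cell, dict) else getattr(cell, step)

def poke(cell, step, value):
    """Return the updated cell: dicts change in place, immediates are rebuilt."""
    if isinstance(cell, dict):
        cell[step] = value
        return cell
    return cell._replace(**{step: value})

def set_path(root, steps, value):
    """Model of SET on a path: pick forward, then poke backward linearly."""
    cells = [root]
    for step in steps[:-1]:           # forward pass: one PICK per inner step
        cells.append(pick(cells[-1], step))
    for cell, step in zip(reversed(cells), reversed(steps)):
        value = poke(cell, step, value)   # backward pass: write each result back
    return value                      # the (possibly updated) root

obj = {"d": Date("21-Nov-2021", Time(18, 56))}
set_path(obj, ["d", "time", "hour"], 12)   # models obj.d.time.hour: 12
```

Note that this version always writes back at every level. A real implementation would presumably stop as soon as a POKE reports that nothing upstream needs updating, but that signal is left out here to keep the sketch short.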