Why We Don't Use Copy-on-Write

hostilefork · May 5, 2019, 5:25pm

I plugged in an old computer I hadn't used for a while, which had a file on it with a heading that said: "HOW COPY ON WRITE IN REN/C WORKS".

"But Ren-C doesn't use copy on write!" - you might point out. (You might also point out that we haven't typed it with a slash for a long time.) No, it doesn't use copy-on-write...and this piece of old unfinished writing was part of the patchwork of reasoning for why it was ultimately not pursued.

Long story short: what we've ended up with are the much better CONST and MUTABLE. It addresses the motivation with a relatively simple mechanic, and provides means for Redbol to run under "traditional" rules. But as long as I'm deleting this file I might as well archive some notes on it and the issue...

Reminder of What the Problem Was

With series being mutable by default, everyone would get bitten by forgetting to COPY a series.

rebol2>> loop 3 [x: "" append x "a"]
rebol2>> x
== "aaa"

It can be counter-intuitive to realize you are writing self-modifying code. It's easier to see if you make a function out of it:

rebol2>> foo: func [] [
    x: ""
    append x "a"
    return x
]

rebol2>> loop 3 [foo]
== "aaa"

rebol2>> source foo
foo: func [][x: "aaa" append x "a" return x]

For people who like puzzles, this can be an intriguing mechanic to understand. For people who dislike bugs, it's kind of horrifying.

What if SET-WORD! Assignments Copied Literal Arguments Implicitly?

After the open-sourcing of R3-Alpha, people could try out whatever pet theories they had. An early thing that Andreas (@earl) looked into was how to make the REB_SET_WORD case notice it had a literal series argument, and COPY it. So x: "abc" would make a copy, but x: some-string would not.

He did some performance analysis on typical cases and said it didn't seem to be that bad. But I pointed out that it wasn't COPY/DEEP. It seeme kind of arbitrary to have this give you a fresh block every time:

loop 2 [
    x: [a b c]
    append x 'd
    probe x
]
[a b c d]
[a b c d]

But as soon as you go a level deep it doesn't:

loop 2 [
    x: [a [b] c]
    append x/2 'd
    probe x
 ]
 [a [b d] c]
 [a [b d d] c]

This made me wonder if there could be some kind of system-wide mechanic, where any COPY/DEEP you did (implicit or explicit) wouldn't actually make the copy until there was a change. That's what copy-on-write is.

Here's what the file said

The file was starting to outline some of the work that was tried:

Rebol has historically maintained the "backing store" of data behind a STRING! or BLOCK! or other SERIES! type inside of a data structure that is a dynamically growable array, called a REBSER.

(Note: Saying that a REBSER is a "REBol SERies" is a little misleading in terminology, because if something is a SERIES! as the user experiences it then it also has an index position. There is no index inside of a REBSER itself; that is established in a REBVAL which is of series type that has a REBSER plus an index. Whether we want to call it "REBSER_DATA" or "The Series! Payload" or the series's "Tape" is up for debate, but we'll just call it a REBSER for the time being.)

In Rebol there are frequently cases where data is copied or even COPY/DEEP "just to be safe". This includes things like function bodies or specs... for fear that a user might unknowingly modify something they intended to reflect. These copies can be expensive in terms of memory consumed as well as in the time spent making them; and it might turn out that they are never modified at all.

There is already a hook point established that can trap requests for modification. This is used to implement the PROTECT feature. Ren/C adapts this hook point to implement a Copy-on-Write strategy.

(Note: Ren/C is more aggressive in its application of language features like C's const to make sure that operations like PROTECT cannot be subverted. This gives more confidence to PROTECT as well as that the Copy-on-Write will work.)

At first glance the implementation of Copy-on-Write seems simple. Add a pointer into each REBSER to the next REBSER that is a member of its "clone set"...circularly linked. Should a modifying operation be requested on a clone, the copy is made of the data at that time. The modified series is removed from the list...and once the last clone vanishes, the pointer on the final remaining copy is set to NULL.

If it seems so easy, one might wonder "why didn't someone already do this." Reasonable question. However, the real benefit comes when you can do Copy-on-Write on COPY/DEEP. That's trickier, because if you are navigating a series and descend into it, you need to remember which point of view you had in order to decide whether to copy or not.

Imagine a series like [a [b [c [d]]]] that you were to COPY/DEEP. To some clients you give your version...the source of the copy, which they should mutate freely and affect the source reference you have. To other clients you give the copy... and if they tinker in the middle it should not mess with your source. It should only be affecting other clients who got the COPY/DEEP version.

This means that series values themselves need to somehow carry along the knowledge of which clone they are traversing. If you expected the REBSER* that is poked into a series value (along with index as mentioned) to be enough information, it means that REBSER* exists... so you already did the Copy/Deep. At the outset of a Copy/Deep of a REBSER that contains 10,000 REBSERs inside it, you only want one cloned REBSER. Otherwise "you paid too much" and it's hard to really understand what the benefit was; only terminal series with no sub-series in them will have the Copy-on-Write advantage.

Inside a Series_Ref there is enough room for a pointer. That pointer is used to say which clone a series reference is in. If you extract a REBSER from a series reference the default is to get a const pointer which cannot be modified. If you extract using a mutable access, then the deep copy will happen at that time.

The cloning status has to be checked each time to see if it is still in effect for that value. A REBVAL which is writable can be adjusted by the mutating accessor.

...and it stopped there. Rebol's series model (with cycles, even) does not make it easy to synchronize and "detach" COPY/DEEP structures coherently. And the tracking structures for doing so are on the same order of magnitude of size as the structures themselves. Then, it's more code, too.

It did too much behind your back...CONST is better

Although the idea would have applied to any COPY or COPY/DEEP, it was geared toward this notion of helping avoid the bugs cheaply. But the SET-WORD! on literals was a red herring. Is there anything you can really say is different between:

 x: "abc"
 append x "d"

...and...

 x: ("abc")
 append x "d"

It's problems waiting to happen. I think that when you read through the examples of how CONST is working, it's much more in the right spirit.

So an interesting retrospective on ideas from then, vs now.