Value (vs. Series) Modification Bit: CONST and MUTABLE

CONST and MUTABLE have been around for a while now, and I think the chosen balance has worked out rather well.

One historical problem point with these mutability features is that there was no compile-time checks to make sure code wasn't violating it. There were tons of cases of PROTECT bits not being honored, simply because there wasn't a check for mutability in some routine. The person hacking on the C code to REVERSE or SORT a series would have to explicitly remember to think that was a mutating operation and check the bit.

The obvious-sounding way to stop these problems from creeping in would be to leverage the const annotation in C and C++. All the routines that modified series would require the caller to have a non-const pointer in their hand...while routines that could be done on read-only series could take either a const or non-const pointer.

So consider the simple example of getting an element at a position in an array:

 Cell* Array_At(Array* array, Index n)
     { ...lots of code... }

Historically this would take in a mutable Array (the only kind there was) and give back a mutable Cell. But what we want is for mutable arrays to give back mutable cells, and const arrays to give const cells. So we could simply create a wrapper that calls into the mutable implementation but reskins the result as const for const input:

 Cell* Array_At(Array* array, Index n)
     { ...lots of code... }

 inline const Cell* Array_At(const Array* array, Index n)
     { return Array_At(m_cast(Array*, array), n); }

There's just one problem... C doesn't support overloading. You can't have two functions with the same name and different signatures and have the compiler pick between them. There'd have to be two different names:

 Cell* mutable_Array_At(Array* array, Index n)
   { ...lots of code... }

 inline const Cell* Array_At(const Array* array, Index n)
   { return mutable_Array_At(m_cast(Array*, array), n); }

This might not seem like that big a deal, but the combinatorics add up. Because now you can't write a generic macro that speaks about array positions...you have to have macros with different names that call the differently named accessors. And consider there are lots of these routines (Array_Head, Array_Tail, Array_Last... Binary_Head, Binary_Tail... Series_Data, etc. etc. etc.) It's pretty horrific when you start having this explode with mutable_XXX variations and mutable_XXX variations of everything that calls them.

I came up with a trick to get around it. Basically, the trick is to sacrifice some amount of const checking in C. First, define a macro for something that resolves to const in C but vaporizes in C++:

#ifdef __cplusplus
    #define const_if_c
#else
    #define const_if_c const
#endif

Then, define the functions like this:

Cell* Array_At(const_if_c Array* array, Index n)
  { ...lots of code... }

#ifdef __cplusplus
    inline const Cell* Array_At(const Array* array, Index n)
         { return Array_At(m_cast(Array*, array), n); }
#endif

So the C build will give you back a mutable array no matter whether your input array was const or not. But the C++ build only gives back const arrays for const input.

This makes systemic enforcement of mutability checking practical. If you're inside the implementation with a const array, string, or binary... you won't be able to make a call to a C routine that will mutate it. The only way you can get mutable arrays is through specific entry points that extract the array with a runtime check to make sure it's mutable.

It's all in the implementation guts...so it only affects those using the core API, not libRebol. The only thing you need to do is make sure you at some point build the code with a C++ compiler, and it will tell you where any problems are.

2 Likes