Every imperative language has the problem of when you modify a data structure, you run the risk of confusing other code that is looking at that data structure at the same time. Whether it's threads competing for access, or just a loop whose body modifies the data being looped over...the problem comes up.
The options for dealing with it are:
-
Crash - You can just crash. R3-Alpha had this in several places--for instance PARSE, where a data pointer (REBVAL*) into the rules series was incremented along the rules...even though code was permitted to change the rules as you went (via GROUP!s). The rule series might grow and be retargeted at a larger memory block, leaving the old pointer recycled and invalid.
-
Nonsense - You can make the iteration never crash, but be protected in such a way that it does something to avoid crashing. Basically this means you modify only in ways that can be checked by iterations. However, this means additional checking must run in the iterations to make sure it picks up on the signal that something has changed. For instance: in the PARSE example above there could be a check of the length and a re-fetch of the pointer by index; if the length is out of bounds, it could error. If still in bounds--even if between rules--it could take a shot at keeping running, and just fail as if the nonsense had been how it were written all along.
-
Predictable - Some languages offer guarantees about what kind of modifications are safe, and which ones are not. For instance it may say if you are iterating a map, then anything you remove that you haven't visited yet won't be seen...or if you insert then anything you add won't cause something already visited to be seen again. They might tell you explicitly when all bets are off--and say the effects may vary, or even if they may crash. (C++ has container guarantees, and things outside the guarantees may crash.)
-
Illegal - This has been Ren-C's strategy so far...to simply say you cannot modify things that are being iterated. Some routines (like FOR-NEXT) aren't considered to be iterating, because they are just running NEXT on a variable each time, and so have the same between loop invariants as if you did it yourself.
It's likely that most people agree that crashing is not a good choice. Python says:
"I'm not saying it's uncrashable. I'm saying that if you crash it, it's a bug unless proven harebrained."
So what they get are exceptions, e.g.:
my_list = [1,2,3,4,5,6,7,8,9]
for i in range(len(my_list)):
if my_list[i] == 8:
del my_list[i]
IndexError: list index out of range
Rebol (and Red's) history seems to aspire to non-crashing Nonsense. There's not a lot of emphasis on what the promised invariants are, but they give lip service to the idea of "crashing is bad". So if you point out a crash, a patch is made at that point based on some testable property (index out of bounds, etc.) and a non-crashing behavior is picked. Repeat as crashes are discovered.
I've felt like illegal has worked out pretty well for Ren-C. It's conservative...but the general rule of thumb is that modification during iteration is a so-called "code smell". You can always do it another way. And if this ever became a real performance problem, a dedicated native that caters to a certain explicit stylized modification could optimize for that particular case (e.g. REMOVE-EACH).
But stackless introduces a problem for locking series: it means locks are no longer necessarily taken in a stack-like manner, so locking cannot be done via a single bit.
Previously series had a single bit on them... SERIES_INFO_HOLD. This bit is in the same bitflag set as other sources of read-only-ness, such as SERIES_INFO_PROTECTED (can be turned on and off) and SERIES_INFO_FROZEN (permanently immutable). So all three could be conveniently checked in one masking operation on modification.
What would happen in an operation that wanted to take a hold is that it would look to see if the series already had SERIES_INFO_HOLD. If so, it would do nothing. If not, it would set a flag on itself to say that it took a hold it needed to release...and release when it was done.
This does not work in a stackless model. A sort of minimum would be that the series would need to have a lock count that was incremented and decremented. The miserly design for series does not currently have space to put that in the series node. Something will have to grow, somewhere.
Python's approach seems to please people enough...with questions on Stack Overflow about modifying while iterating having people say "avoid it". They don't have locks, and the code just does what it does.
This is a deeper question for Rebol than for Python, because the locking questions apply to source code itself (source blocks, parse rules, etc.)...not just user data structures.
I hate to give up putting iterative holds on series, and letting chaos win. We wouldn't want that in our filesystems, why allow it in programming languages? The problem of releasing locks is something I think that the DEFER mechanism can probably handle pretty well.
I've been thinking about some hybrid approaches which use a bit if it's sufficient, but only break out into a locking table if there's more than one iterator. Build options could decide: no locks (allow crashes), cheap locks (reference counts), heavy locks (be able to tell you which frame took the lock).
I'll probably give an enhanced locking method a shot, and stay the course on prohibiting modification during iteration (unless using an iterator that specifically accounts for that).