Most languages with arrays have you keep the index separate. Languages that do something fancier typically abstract the process into "iterators"...which may be more complicated than simple integers.
But Rebol made a strange decision to fold an integer into its arrays and strings (the ANY-SERIES!). It is summarized by someone who didn't like it on the C2 Wiki page "Why isn't Rebol Popular":
>> s: "I hate this approach"
== "I hate this approach"
>> s: next s
== " hate this approach"
But this 'I' isn't lost.
>> back s
== "I hate this approach"
>> s
== " hate this approach"
>> s: head s
== "I hate this approach"
Whether you love it or hate it, having a hidden index opens a tremendous can of worms.
If you ask around, few people would know the answers to semantic questions involving this. You get an inkling of how complex it is if you just COPY a series that's not at its head...the data before the index is not copied.
>> x: next "abcd"
== "bcd"
>> y: copy x
== "bcd"
>> back x
== "abcd"
>> back y
== "bcd"
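For readers who think more easily in a mainstream language, here is a rough Python model of that transcript. It is only a sketch: the `Series` class, its 0-based `index`, and the `visible` helper are hypothetical names invented for illustration, not anything Rebol exposes.

```python
from dataclasses import dataclass


@dataclass
class Series:
    """Hypothetical stand-in for an ANY-SERIES! value: shared data plus an index."""
    data: list       # underlying storage, shared between values that point into it
    index: int = 0   # 0-based here for simplicity (Rebol itself counts from 1)

    def next(self):
        # Same data, position moved one step forward (clamped at the tail).
        return Series(self.data, min(self.index + 1, len(self.data)))

    def back(self):
        # Same data, position moved one step backward (clamped at the head).
        return Series(self.data, max(self.index - 1, 0))

    def copy(self):
        # Only the data from the index onward is copied, so the copy starts at its own head.
        return Series(self.data[self.index:], 0)

    def visible(self):
        return "".join(self.data[self.index:])


x = Series(list("abcd")).next()
y = x.copy()

print(x.back().visible())  # abcd
print(y.back().visible())  # bcd
```

The copy has no "a" to go back to, because the data before the index never made it into the new storage.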
This issue crops up everywhere. What about when you REDUCE a block into another? What about if you COMPOSE?
(UPDATE 1/23/2021: Red has some acknowledgment of this category of problems, see Red Issue #4810 and the linked tickets, where differences between MAKE and COPY are mentioned.)
It also means ANY-SERIES! are actually iterators on data that can be mutated through other references. If you are pointing into a string or block at index 1000, and someone clears all the data out of that string, your value cell still holds the index at 1000. What should the semantics be?
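The situation can be sketched in Python by treating each value as a (shared data, index) pair. This models the question only; it does not show what any Rebol implementation actually does, and the dictionary layout is an assumption made up for the illustration.

```python
# Hypothetical model: a series value is a (shared data, index) pair.
data = list("x" * 2000)                 # underlying storage, shared by reference
head = {"data": data, "index": 0}
deep = {"data": data, "index": 1000}    # a second value pointing far into the same data

head["data"].clear()                    # the data is cleared through the first value

print(deep["index"])                      # 1000, the cell itself is untouched
print(len(deep["data"]))                  # 0, the shared data is gone
print(deep["index"] > len(deep["data"]))  # True: the index now points past the end
```

Nothing notified the second value that its position became stale, which is exactly the state whose semantics need to be decided.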
Firstly: why does Rebol have this feature at all?
For such a weird thing you'd have to think it would be good in some way. And it does have several concrete advantages:
- It cuts memory use by more than half for series + index. The index is slipped into an otherwise-unused spot in a Rebol "cell". That spot is the size of a platform pointer. (R3-Alpha actually had another spot available, but Ren-C utilizes this for "binding", which is how blocks representing function bodies can stay connected to the specific instance of a function invocation they represent, to give "specific binding" on the words nested underneath them). Storing an independent index would require another INTEGER! cell. But it's worse than that, because that cell would need to live in a variable, meaning there'd have to be a context key cell for it.
- It means you don't need multiple return values to return a series and an index.
- It reduces the amount of code you have to write. It's not just the storage saved by putting the series and index into the same variable. It's also the storage and processing for the code at every place they are referenced. Where today you can just pass series, you would have to pass series index.
So weird though it may be, it's pragmatic. The language would be pretty different without it, and would need an iterator concept or it would be far worse.
Beyond saving space, there's no "magic" involved
There's not any kind of "strong theoretical basis" for Rebol's inclusion of an index in an ANY-SERIES! value. It has all the weaknesses of an independent integer index in a C/JavaScript/Python-type language. Sticking it in the value itself solves nothing.
So I think it's a bad idea to have the behavior be any different from if the index was being held independently.
This means series should be able to hold an arbitrary integer... negative, or past the end. back back back "abc" should take 3 steps back from the head at 1, to be at index -2. And it should take next next next to get it forward to the head.
Every operation that can be done on a series with its internal index should thus be a synonym for doing that operation on (at series index), with the index held externally. This means instead of defining two sets of behaviors, we only have to define one.
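A small Python sketch of the proposed rule follows. It models the behavior argued for here, not what Rebol does today, and the `Series` class and `at` function are hypothetical names for illustration. The index is 1-based to match the numbers above and is deliberately never clamped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """Hypothetical series value whose index is a plain, unclamped integer."""
    data: str
    index: int = 1   # 1-based: the head is at index 1

    def skip(self, n):
        # Pure integer arithmetic, exactly as with an external index.
        return Series(self.data, self.index + n)

    def next(self):
        return self.skip(1)

    def back(self):
        return self.skip(-1)


def at(data, index):
    """Pair some data with an index that was held externally."""
    return Series(data, index)


s = Series("abc").back().back().back()
print(s.index)                                  # -2
print(s.next().next().next() == Series("abc"))  # True

external = 1      # the same walk done with an ordinary integer
external -= 3
print(s == at("abc", external))                 # True
```

Because stepping the internal index and doing arithmetic on an external one land on the same value, any operation only has to be specified once, in terms of data plus integer.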