Should modifying variables during iteration affect it?

hostilefork · December 2, 2018, 11:18am

Here is the behavior of changing a REPEAT variable in Rebol2 and R3-Alpha:

rebol2>> repeat n 3 [print "What happens?" n: 3]
What happens?
== 3

Here's that same behavior in Red:

red>> repeat n 3 [print "What happens?" n: 3]
What happens?
What happens?
What happens?
== 3
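
As a rough way to see the difference, here is a Python model of the two behaviors. It is only a sketch, not how either interpreter is implemented: the `ctx` dictionary stands in for the loop variable's binding, and the names `repeat_refetch`, `repeat_private`, and `count_calls` are made up for illustration.

```python
def repeat_refetch(count, body):
    """Model of the Rebol2 result: the variable itself is the loop state."""
    ctx = {"n": 1}
    while ctx["n"] <= count:
        body(ctx)
        # The increment reads the variable back, so a write made by the
        # body changes where the loop goes next.
        ctx["n"] += 1


def repeat_private(count, body):
    """Model of the Red result: the counter is private, the variable is a copy."""
    ctx = {}
    for i in range(1, count + 1):
        # The variable is only written here and never read back.
        ctx["n"] = i
        body(ctx)


def count_calls(loop):
    """Run the example body from above (set n to 3) and count how often it ran."""
    calls = 0

    def body(ctx):
        nonlocal calls
        calls += 1
        ctx["n"] = 3

    loop(3, body)
    return calls


print(count_calls(repeat_refetch))  # 1, like the Rebol2 transcript
print(count_calls(repeat_private))  # 3, like the Red transcript
```

In the first model the body's write becomes the loop's state; in the second it is discarded when the next iteration overwrites the variable.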

One might then ask what happens if you set the loop variable to something not even an integer. In Rebol2, it just exits the loop. In R3-Alpha, it errors:

r3-alpha>> repeat n 3 [print "What happens?" n: <boo>]
What happens?
** Script error: tag! type is not allowed here

Is there always a clear meaning?

Not necessarily. What about FOR-EACH?

foreach x [1 2 3] [x: "what should this do?"]

If you put on your imagination hat, you might imagine this as a way to mutate the series being iterated. In that world, changing X would actually change the element in the array. But Rebol's model isn't completely geared for this...though it could be done (by marking the variable with a bit, observing if that bit is cleared in an overwrite, and doing a write-back if clear).

That's not necessarily a bad imagination-hat to have on--though I'd worry about implementing a feature exploiting bit-level knowledge that a user couldn't do themselves. (So it would have to rely on equality, and write back non-equal things--just having an optimization that avoided the test if the bit hadn't been mucked with.)

(Note: C++ can do this through "references", and it is the difference between for (auto x : collection) {...} and for (auto &x : collection).)
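
To make the equality-based write-back idea concrete, here is a minimal Python sketch. It models only the fallback described above (compare after the body runs, write back if different), not the bit optimization. The name `for_each_writeback` and the `ctx` dictionary are hypothetical, and it assumes the body does not resize the series.

```python
def for_each_writeback(series, body):
    """Run body once per element, writing changed values back into the series."""
    ctx = {}
    for i in range(len(series)):
        original = series[i]
        ctx["x"] = original
        body(ctx)
        # Write back only if the body left something non-equal in the variable.
        if ctx["x"] != original:
            series[i] = ctx["x"]


def double(ctx):
    ctx["x"] = ctx["x"] * 2


data = [1, 2, 3]
for_each_writeback(data, double)
print(data)  # [2, 4, 6]
```

A body that only uses the variable as a scratch temporary would also trigger a write-back here, which is one reason the meaning is debatable.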

But even if you could stretch and come up with a meaning in FOR-EACH, the broader point is it's not always possible. Some iterations that put values in a variable may not be able to sensibly understand a change made to the iteration variable, and interpret it in a way that affects the next iteration.

What about performance?

Red's decision on REPEAT gives it a certain freedom from trying to figure out the meaning of any changes to the variable. The iteration holds internal state and "images" it into a variable for you; if you change the variable, that's just for your temporary purposes in the loop body. There's no fetch of the loop variable on each iteration, only a write.

But Red clearly thinks it's important enough in FORALL to pay attention to changes to the iterated variable, because that works:

red>> forall data [probe data data: next data]
[1 2 3 4]
[3 4]
== [4]

In fact, in a fairly rare case of Red not repeating a bug in R3-Alpha, they fully fetch the variable each time through the FORALL.
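
Fetching the variable each time through can be pictured with a small Python sketch. This is a model only: a Rebol series value carries its own position, which is represented here by an index in `ctx["pos"]`, and the names `forall_refetch` and `probe_and_skip` are invented for the example.

```python
def forall_refetch(series, body):
    """Advance a position over series, re-reading it after every body run."""
    ctx = {"pos": 0}
    while ctx["pos"] < len(series):
        body(ctx)
        # NEXT is applied to whatever position the body left behind.
        ctx["pos"] += 1


data = [1, 2, 3, 4]
seen = []


def probe_and_skip(ctx):
    seen.append(data[ctx["pos"]:])  # like PROBE DATA
    ctx["pos"] += 1                 # like DATA: NEXT DATA


forall_refetch(data, probe_and_skip)
print(seen)  # [[1, 2, 3, 4], [3, 4]]
```

The two probed values match the Red transcript above, because the loop's own advance is applied on top of the one made in the body.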

Paying attention to changes to the variable is not going to be the most performant choice. By ignoring writes to N in the REPEAT loop above, Red avoids re-fetching the variable on each iteration and checking that you didn't change it into a TAG!, or whatever.

But I think it is worth noting that it is now 6 years since R3-Alpha was open sourced, and 7 years since Red's announcement. When you think about making decisions based on "performance", e.g. the speed of a REPEAT loop, choosing something less flexible for purposes of marginal efficiency makes less sense than ever.

Solutions?

Being able to change loop variables allows more patterns of solution. The FORALL case is a good one (Ren-C calls this category FOR-NEXT because you can change the series to whatever you like, and it runs a NEXT on it at each iteration). I think it would be less expressive if the iteration could not respond to a change, and by extension one could argue that changing the index in a REPEAT (or plain FOR) should also have an effect.

Yet I've pointed out it doesn't always make sense--such as in FOR-EACH.

One idea that comes to mind could be to PROTECT any loop variables whose modification cannot semantically be mapped back into meaning for the next iteration. If you find you can't change a variable, then that means that the loop wouldn't be able to react if you did.

I like that because it feels like it's giving feedback to the user about the contract of the loop. Though it might appear to "waste" a variable slot you could otherwise use as a temporary.
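
Here is a small Python sketch of what that contract might feel like. It is loosely modeled on the idea of PROTECT rather than on any actual implementation, and `LoopContext`, `for_each_protected`, and the choice of `PermissionError` are all assumptions made for illustration.

```python
class LoopContext:
    """Holds loop variables; user writes to protected names are rejected."""

    def __init__(self):
        self._values = {}
        self._protected = set()

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        # User code goes through here, so protected variables cannot change.
        if name in self._protected:
            raise PermissionError(f"{name} is protected in this loop")
        self._values[name] = value

    def loop_write(self, name, value):
        # Only the loop itself uses this path, bypassing the protection.
        self._values[name] = value
        self._protected.add(name)


def for_each_protected(series, body):
    """A for-each that cannot interpret changes to x, so it protects x."""
    ctx = LoopContext()
    for item in series:
        ctx.loop_write("x", item)
        body(ctx)


def try_overwrite(ctx):
    ctx.set("x", "what should this do?")


try:
    for_each_protected([1, 2, 3], try_overwrite)
except PermissionError as err:
    print(err)  # x is protected in this loop
```

The error on the first write is the feedback: it tells the user that this loop could not have reacted to the change.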

What do you think? Should loops lock any variables from modification if they can't meaningfully interpret a modification to affect the next iteration?

BlackATTR · December 2, 2018, 4:56pm

I’m in favor of the PROTECT approach.

IngoHohmann · December 3, 2018, 2:15pm

Me too, if it's not too costly.

hostilefork · December 3, 2018, 3:11pm

The next question comes down to which loops should allow the mutations, and how they should react.

I don't know if REPEAT specifically (now COUNT-UP in Ren-C) can make a clear decision, such as if you set the number to a negative value...should that start it over?

If you run the following in R3-Alpha:

flag: true
repeat n 3 [
    if flag [n: -3 flag: false]
    print ["n is" n]
]

You'll get:

n is -3
n is -2
n is -1
n is 0
n is 1
n is 2
n is 3

This kind of exposes the guts of the implementation in a way that seems to break the contract. REPEAT doesn't naturally have a definition for what to do with a negative count.

But FOR is talking more specifically about a "bump" and a range, so perhaps it is different.

So...should REPEAT (COUNT-UP) not allow modifications? Or should it allow them, and treat anything outside the range as an error, otherwise proceeding in step? Or should it just be a specialization of a FOR loop, and inherit whatever FOR does...exposing this aspect of its implementation?
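
To make the middle option concrete (allow modifications, error on anything outside the range, otherwise proceed in step), here is a minimal Python sketch. The name `count_up_checked` and the use of `ValueError` are assumptions for the example, not a proposal for the actual interface.

```python
def count_up_checked(limit, body):
    """Count from 1 to limit, honoring in-range changes made by the body."""
    ctx = {"n": 1}
    while ctx["n"] <= limit:
        body(ctx)
        n = ctx["n"]
        # Anything that is not an integer in 1..limit breaks the contract.
        if not isinstance(n, int) or not (1 <= n <= limit):
            raise ValueError(f"loop variable out of range: {n!r}")
        ctx["n"] = n + 1


seen = []


def skip_two(ctx):
    seen.append(ctx["n"])
    if ctx["n"] == 1:
        ctx["n"] = 2  # an in-range change: the loop continues from here


count_up_checked(3, skip_two)
print(seen)  # [1, 3]
```

Under this policy, the `n: -3` example above would raise an error on the first iteration instead of counting up through zero.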

iArnold · December 4, 2018, 9:57am

If REPEAT and COUNT-UP get different behaviour with respect to protection of the running index/counter, then I imagine REPEAT to have the protective mode and COUNT-UP just continues counting up even if the counter was set back, making it perfect for simulating Game of the Goose.

hostilefork · December 4, 2018, 10:12am

It is a good point I guess that COUNT-UP does say specifically the variable is being counted up to 3...a nuance the word REPEAT does not have. But that makes it all the more sensible that the current plan is for REPEAT to not offer a counter (hence reclaiming "LOOP"):

>> repeat 3 [print "repeating"]
repeating
repeating
repeating

This seems consistent with the use of the word in lists of directions. ("repeat the process 3 times" is rarely coupled with "and on the 2nd time, do something different". That's the exception that proves the rule.)

So by that definition, the counter would be protected in REPEAT and internal to the implementation, because you couldn't access it at all! It could be counting up, or down, or using whatever mechanism it liked (0-based instead of 1-based, for instance).

Then COUNT-UP can be disrupted and still have a mission to fulfill: it was counting up. So if you change the variable out from under it, it can still keep adding one and measure if it's gotten to 3 or not. But COUNT-DOWN is a little more confusing, though I guess it could just seek towards 1.

The question would be what to do when you go out of range. If you COUNT-DOWN from 3 to 1, but interrupt it to set the variable to 0, does it consider itself finished...or loop infinitely since it never counts down to 1...does it error...or does it suddenly start counting up to try to get to 1 again?
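
Two of those outcomes fall out of how the termination test is written, which a Python sketch can show. This is a model under assumptions: `count_down`, its `exact` flag, and the `max_steps` guard (there only so the runaway case returns instead of hanging) are all hypothetical.

```python
def count_down(start, body, exact=False, max_steps=1000):
    """Count down toward 1, stopping by an ordered or an exact test."""
    ctx = {"n": start}
    for _ in range(max_steps):
        # Ordered test: anything below 1 counts as having passed the end.
        if not exact and ctx["n"] < 1:
            return "finished"
        body(ctx)
        # Exact test: only landing precisely on 1 ends the loop.
        if exact and ctx["n"] == 1:
            return "finished"
        ctx["n"] -= 1
    return "runaway"


def jump_to_zero(ctx):
    if ctx["n"] == 2:
        ctx["n"] = 0  # interrupt the countdown by going out of range


print(count_down(3, jump_to_zero))                            # finished
print(count_down(3, jump_to_zero, exact=True, max_steps=10))  # runaway
```

With an ordered comparison the loop considers itself finished; with an equality comparison it never sees 1 again and would run forever without the guard.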

iArnold · December 4, 2018, 11:32am

The strength of REPEAT over LOOP has always been the availability of the index variable, which LOOP does without.
COUNT-UP and COUNT-DOWN should stop when/after the limit value has been reached. Use cases can reveal more about how to implement the functions.
COUNT-UP and COUNT-DOWN could have start and end values and a step size (always > 0), and those could have internal references that the developer can use to manipulate the loop's state...but I think I prefer a keep-it-simple approach. What would kids do when using COUNT-UP and COUNT-DOWN?

And why would you want to reclaim loop by repeat? To make LOOP the more complex one?

hostilefork · December 9, 2018, 1:14am

Yes. LOOP is a short word and there is precedent of a "loop dialect" in Lisp, which I'd like to see tackled in Rebol:

http://www.gigamonkeys.com/book/loop-for-black-belts.html

Imagine something designed to be tailor-made to the domain of looping, much like PARSE is for parsing. Doesn't it make sense that would get the most generic and short name?

I don't know exactly what Rebol's take on this would look like. A good start might be to copy Lisp as verbatim as possible. It would be like making a query language...start from something that has proven itself successful like SQL, and then improve.