I've explained why we need stacklessness:
"Switching to Stackless: Why this, why now?"
It's taken a lot of thinking. But as proof that it's real, you can try this:
>> counter: func [value] [
       if 0 = modulo value 1000 [print [value]]
       counter value + 1
   ]
How many stack levels you get will depend on the available memory of your platform--not your CPU stack.
In the WebAssembly build that runs in the browser, I got this on Chromium on Linux:
>> counter 0
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
11000
12000
13000
14000
15000
16000
17000
It just terminates, but the error that shows up in the browser console is:
Uncaught RuntimeError: Aborted(Cannot enlarge memory arrays to size 16867328 bytes (OOM). Either (1) compile with -sINITIAL_MEMORY=X with X higher than the current value 16777216, (2) compile with -sALLOW_MEMORY_GROWTH which allows increasing the size at runtime, or (3) if you want malloc to return NULL (0) instead of this abort, compile with -sABORTING_MALLOC=0)
I got the same result on Firefox, probably because Emscripten requests the same amount of Wasm memory space in each browser.
Point Is, It Ran Out of Memory... not Stack!
We can see there's work to do on responding to out-of-memory errors (we've never really had a great story for that). But at least we can respond to them (!) -- as I've explained already, stack overflows can't be caught the way a failed memory allocation can. That's a huge motivator for the change.
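To make "catchable" concrete: the goal is for a runaway recursion to surface as an ordinary trappable error rather than as a hard abort. A purely hypothetical sketch of where this should land (assuming TRAP-style error interception, and assuming the out-of-memory condition gets raised as an ERROR! instead of aborting--which is exactly the part that still needs work):

; hypothetical: the failed allocation becomes a trappable ERROR!
>> e: trap [counter 0]
>> print ["Recursion stopped with:" mold e]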
But still, over 17000 stack levels is kind of neat in and of itself!
So how do desktop executables "stack up" on my 16-core Lenovo ThinkPad i7 with 64GB of memory...?
As a quick comparison: for Red on Windows, it doesn't make it to 2000:
red>> counter 0
0
1000
*** Internal Error: stack overflow
*** Where: %
*** Stack: counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter counter coun...
Running an actual count got me to 1943 and then it crashed.
(I take it that encountering a stack overflow -during- a PRINT leads them to a harsher situation than just during a recursion. But this is all the more reason you can't play fast-and-loose with the CPU stack: it cannot be trapped like an out-of-memory error! Right now we may be crashing too, but we're substantially closer to being able to manage it.)
For Rebol2 on Windows it makes it to 14262, and does not crash. (So Red's an order of magnitude worse, and crashes.)
I hadn't built a 32-bit optimized Windows executable for a while (and actually had to fix a couple of compiler warnings to get one). But I figured I'd do so for an apples-to-apples comparison with the Rebol2 and Red EXEs.
32-bit Ren-C makes it to...well, jeez. Printing every level one by one could go on forever, so I only printed on modulo 10,000. It starts to slow down a bit around 2 million, but ticks past 3 million pretty easily...and makes it to 4 million. It eventually terminates without a message at 4,780,000 (it's an optimized build, and our out-of-memory handling needs work...this would likely be an assert in a debug build).
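(For reference, the variant used for that test is just the same counter with the print throttled--something along these lines:)

>> counter: func [value] [
       if 0 = modulo value 10000 [print [value]]
       counter value + 1
   ]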
So generously rounding Red up to 2,000... Ren-C can handle over 2000x the stack levels, at least.
Now, To Reap The Benefits...
- I need to rework the JavaScript extension to use this intrinsic stackless property, instead of "Asyncify". DONE!

  - Asyncify gave us the pseudo-stackless ability of suspending arbitrary non-stackless code in order to run browser events. This let us do things like ASK for input while in the middle of a loop. With real stackless, we (shouldn't) need that.

  - Asyncify added bloat to our code generation that I recall making the .wasm about double the size. We'll see how accurate my memory is on that, but hopefully the size of libr3.wasm will go way down.

- One by one, the natives have to be redone as stackless. Only then can we truly ditch Asyncify, because if something like FOR-EACH isn't stackless, we wouldn't be able to do something like PRINT or ASK inside that kind of loop... it would raise an error. (See the sketch after this list.) MOSTLY DONE!

  - We can probably punt on some things and drop Asyncify even if not everything is stackless yet. You might not be able to do something like

        foo.(ask "Field Name?"): 10

    until TUPLE! lookup has stackless processing bubble up through its GROUP! evaluations, but I think we can live with that being an error for a while--if it halves the size of the wasm.

- Generators and Yielders need to be resurrected. They were pretty well designed before, but locking issues created headaches...and it's time to attack those again.

- Out-of-memory errors have to be reined in--not just so stack overflows can be handled gracefully, but so that any kind of out-of-memory is handled gracefully!

  - We also might want users to be able to set a policy (if they wish) to limit the stack before memory runs out, just so that infinite recursions get caught earlier in casual programs that aren't intentionally using big stacks.
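To make the FOR-EACH + PRINT + ASK point concrete, here is the kind of thing that needs Asyncify today but should work naturally once the relevant natives are stackless: the loop suspends for browser input at each ASK, lets the event loop run, and resumes where it left off. (A minimal sketch; the exact ASK call form has shifted over time, so treat it as illustrative.)

; each ASK suspends the loop for browser input, then resumes it
>> for-each question ["What's your name?" "What's your quest?"] [
       print ["You answered:" ask question]
   ]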
In any case, it feels good to get the core of the work from two years ago hammered into shape...and to see it running in the browser!