So... browsers aren't making it easy to use SharedArrayBuffer--a requirement for the pthreads web build. This is due to something called COOP/COEP. It makes the already tough job of turning on threads even tougher.
My ranting response on the topic:
(What I see is a lot of half-baked security theater, where the answer is putting flags on things saying "yes, I meant to use that resource...the one I said to use". You're not able to put this canon list of "I meant to do that" in a nice central location...instead you have to weave the list around piecemeal in various places that express the flag (sometimes in different cases). Then some places just haven't been parameterized to allow the flag, which breaks the whole thing in a frustrating way.)
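To make the "flags" concrete: as far as I know, a page only gets SharedArrayBuffer if its top-level document is served with two specific response headers. Here is a minimal sketch of a checker for them. The function name `missingIsolationHeaders` is made up for illustration, and it only checks the `require-corp` value for COEP (some browsers accept other values).

```typescript
// Header names are lowercased here because HTTP header names are case-insensitive.
const REQUIRED: Record<string, string> = {
  "cross-origin-opener-policy": "same-origin",
  "cross-origin-embedder-policy": "require-corp",
};

// Hypothetical helper: returns the names of the isolation headers that are
// absent or have an unexpected value in a response's header map.
function missingIsolationHeaders(headers: Record<string, string>): string[] {
  const lowered: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lowered[name.toLowerCase()] = String(headers[name]).trim().toLowerCase();
  }
  return Object.keys(REQUIRED).filter((name) => lowered[name] !== REQUIRED[name]);
}
```

And that's only the top-level document. Cross-origin subresources have to opt in separately (via CORS or a `Cross-Origin-Resource-Policy` header), which is where the "weaving the list around piecemeal" comes in.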
At one time, the pthreads build was quite a lot faster than the "emterpreter" build. But emterpreter was replaced with the far superior "asyncify" method (more specifically, version 2 of it, which went by the name "Bysyncify"). So right now the performance difference is negligible...and the principal advantage of the pthreads build is that the .wasm generated is less than half the size of the asyncify build.
But if stackless is implemented (which I intend), even asyncify should not be necessary...because we'd be able to unwind our own stack. We wouldn't need asyncify's magical backbone woven into the source that assists in teleporting out-of and back-into any point of the evaluator. So the size difference would go in the reverse direction: pretty much all of the pthreads extra hassle would be make-work, and all those files and code would make it the bigger build.
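For anyone unsure what "unwinding our own stack" means, here's a toy sketch. It is not the Ren-C evaluator, just an illustration of the idea: the "stack" is an ordinary array of frames kept as data, so the evaluator can return to its caller after any step and pick up again later. The `StacklessSum` class and its nested-sum "language" are invented for this example.

```typescript
// A toy expression: either a number, or a block whose value is the sum of its items.
type Expr = number | Expr[];

interface Frame {
  items: Expr[]; // the block being summed
  index: number; // next item to visit
  total: number; // running sum for this block
}

class StacklessSum {
  private stack: Frame[] = [];
  result: number | null = null;

  constructor(root: Expr) {
    if (typeof root === "number") this.result = root;
    else this.stack.push({ items: root, index: 0, total: 0 });
  }

  // Do one small unit of work, then return to the caller.
  // Returns true once evaluation has finished.
  step(): boolean {
    const top = this.stack[this.stack.length - 1];
    if (!top) return true;
    if (top.index < top.items.length) {
      const child = top.items[top.index++];
      if (typeof child === "number") top.total += child;
      else if (Array.isArray(child)) this.stack.push({ items: child, index: 0, total: 0 });
      return false;
    }
    // This frame is done: pop it and hand its total to the parent frame.
    this.stack.pop();
    const parent = this.stack[this.stack.length - 1];
    if (!parent) {
      this.result = top.total;
      return true;
    }
    parent.total += top.total;
    return false;
  }
}
```

Because no native call stack is held between calls to `step()`, the host (e.g. the browser's event loop) can run in between, with no need for a second thread or for asyncify's instrumentation.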
I'm seriously considering simplifying life and the build/test matrix by dropping the pthreads build. I believed it would be available in all browsers by default, and that part is shaping up to be true. But what I didn't anticipate was the mire of serving concerns that makes it such a hassle--it won't get better, and it may get worse.
Whether it's jsfiddle or any other page where you can't bend every server in the chain to your will...you just can't use pthreads. We're going to need a non-pthreads build. And if we make that the one build that we focus on, deploy, and test...it's just easier.
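To show why two builds mean extra moving parts, here's a rough sketch of the decision a loader would have to make on every page. The names `PageCapabilities` and `pickBuild` are hypothetical, not anything in the actual loader.

```typescript
type BuildKind = "pthreads" | "asyncify";

// Hypothetical description of what the hosting page allows.
interface PageCapabilities {
  hasSharedArrayBuffer: boolean; // is the SharedArrayBuffer global defined?
  crossOriginIsolated: boolean;  // did the page get cross-origin isolation?
}

// The pthreads build is only usable when both conditions hold;
// everything else has to fall back to the non-pthreads build.
function pickBuild(caps: PageCapabilities): BuildKind {
  return caps.hasSharedArrayBuffer && caps.crossOriginIsolated ? "pthreads" : "asyncify";
}
```

In a browser, I'd expect the two inputs to come from something like `typeof SharedArrayBuffer !== "undefined"` and the `crossOriginIsolated` global. On a page like jsfiddle where you don't control the headers, this always lands on the fallback anyway.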
Remember that pthreads was only a mechanical tool for switching stacks; it had nothing to do with concurrency. You can't make a single-threaded interpreter suddenly able to run in parallel by linking it to a thread library. We were only running one thread at a time: one for JavaScript, and one for Ren-C...so they could pass off to each other without losing their place in what they were doing.
This isn't to be confused with features related to concurrency...which is something being looked into. But that would be with techniques like "green threading", that don't actually ever have multiple CPU cores potentially competing at a simultaneous moment for the same byte of memory.
JavaScript doesn't have "threads" either: it has "workers"...which are isolated from each other in most every way, to the point you can just about think of them as separate processes that just pass messages back and forth--without sharing data structures. The single example of them actually sharing a data structure (SharedArrayBuffer) has created a lot of hullabaloo: it got disabled due to Spectre, and was then quarantined with all kinds of crazy flags that make it rather difficult to use.
For the sake of sanity and ease of deployment, I'm leaning toward killing off the pthreads build. Asyncify changed the game by closing the performance gap, and Stackless will be an even better answer. Let's make life (and testing) easier.
One Lingering Thought...
The thing that pthreads actually gives us is not the ability to run Ren-C code in parallel, but to run Ren-C code in parallel with JavaScript code. We weren't doing that with it. But theoretically we could have.
We could still do it...if you load the library into a worker. Then your main thread is all JavaScript, posting requests in JavaScript format that are picked up by the JavaScript in the worker...which then makes calls to the interpreter and proxies the answers back.
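A rough sketch of what that proxying could look like, with the transport abstracted away so it runs anywhere. Everything here (`EvalRequest`, `MainThreadProxy`, `makeWorkerHandler`, and the `interpret` callback standing in for the interpreter) is hypothetical naming for illustration, not an existing API.

```typescript
interface EvalRequest { id: number; code: string; }
interface EvalResponse { id: number; result: string; }

// Main-thread side: tags each request with an id so that replies arriving
// later can be matched to the callback that is waiting for them.
class MainThreadProxy {
  private nextId = 1;
  private pending: { [id: number]: (result: string) => void } = {};
  private post: (req: EvalRequest) => void;

  constructor(post: (req: EvalRequest) => void) {
    this.post = post;
  }

  evaluate(code: string, onResult: (result: string) => void): void {
    const id = this.nextId++;
    this.pending[id] = onResult;
    this.post({ id: id, code: code });
  }

  handleResponse(res: EvalResponse): void {
    const callback = this.pending[res.id];
    if (callback) {
      delete this.pending[res.id];
      callback(res.result);
    }
  }
}

// Worker side: receives a request, calls into the interpreter (a stand-in
// function here), and sends the answer back with the same id.
function makeWorkerHandler(
  interpret: (code: string) => string,
  reply: (res: EvalResponse) => void
): (req: EvalRequest) => void {
  return (req) => reply({ id: req.id, result: interpret(req.code) });
}
```

In a real page, `post` would presumably wrap the worker's `postMessage`, and the two handlers would be hooked to the `message` events on each side. Note that only strings cross the boundary in this sketch, which is the crux of the limitation: the code running in the worker never touches page objects directly.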
But this means the Rebol code would not have access to the DOM, and that's not a very interesting working model. (That would be an issue anytime you are trying to run code concurrently...only one thread can access the DOM.)
Another thing pthreads gives that stackless does not is the ability to "signal" code to wake up, instead of using a polling strategy. That might sound more efficient, but unfortunately a lot of the time, when you look under the hood, the way these "signals" are implemented actually involves a bunch of timeouts that are isomorphic to polling.
I'm still hedging a bit. What I'll probably do is start ripping out the pthreads code and then pause to reflect if I find anything and think "hmm, that's valuable and would be hard to put back".