Influences On Startup Time And Memory Use

Right now it's not ideal to be focusing on things like startup time and memory use. There are a lot of fundamental features being rethought--and recall that the rules of optimizing code at the cost of clarity and flexibility of design are:

  • Rule #1: Don't Do It

  • Rule #2 (Experts Only): Don't Do It... Yet.

...BUT, the issues can't be ignored forever. And it's reasonable for one to ask why there's been a dramatic increase in boot time and memory use between the build being used for bootstrap and a current commit.

So it's worth having a thread here to track some of what's involved.

ENCAP Detection

By default we still run encap detection on all desktop builds, scanning the executable. On Windows I think Shixin's version loads the whole binary into memory, and on Linux it still does quite a lot.

You can skip the detection by using --no-encap.

But the encap and de-encapping tools will still be bundled in the executable. They're not an extension, so if you don't want to pay for them, you'd need to entirely remove the early-boot modules like encap and unzip, which are built in another way.

Obviously platform-specific C code would be faster and lighter than PARSE. And there was some before, but it entangled things in the core with FILE I/O...and it was dedicated finicky C for a purpose we're not really focusing on, especially in the web build.

The decision to move encapping to userspace tools was mine, and not something I regret. But since we're not using it, all it's really doing is acting as a test. I've made a separate thread to talk about the fate of Encap, and whether we should depend on it more or distance from it further:

Boot Footprint: Giant String Literal vs. Encap?

A Big Cost Is Going To Come From UPARSE

UPARSE right now is an elaborate exercise in the ability to build complex, feature-filled dialects in userspace. And it does so at great cost to the evaluator.

Of course the plan is to cut that down, because COMBINATORs are just functions. They could be written as natives. And even more importantly, the process of combinating itself needs to be native.
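To make the idea that "COMBINATORs are just functions" concrete, here's a minimal sketch in Python (not Ren-C; all names here are hypothetical, not UPARSE's actual API). Each combinator is just a function that takes an input and returns either a match plus the remaining input, or a failure:

```python
# Minimal parser-combinator sketch (hypothetical, not UPARSE's actual design).
# A combinator is just a function: input -> (value, rest) or None on failure.

def literal(text):
    """Match an exact string at the head of the input."""
    def parse(s):
        if s.startswith(text):
            return text, s[len(text):]
        return None
    return parse

def seq(*parsers):
    """Run parsers one after another; fail if any one fails."""
    def parse(s):
        results = []
        for p in parsers:
            hit = p(s)
            if hit is None:
                return None
            value, s = hit
            results.append(value)
        return results, s
    return parse

def some(parser):
    """Match one or more repetitions of a parser."""
    def parse(s):
        hit = parser(s)
        if hit is None:
            return None
        values = []
        while hit is not None:
            value, s = hit
            values.append(value)
            hit = parser(s)
        return values, s
    return parse

rule = seq(literal("a"), some(literal("b")))
print(rule("abbc"))  # (['a', ['b', 'b']], 'c')
```

When each of these small steps is run through a general-purpose evaluator rather than as a direct function call, every step pays full dispatch overhead--which is why nativizing the combinating process itself matters, not just the individual combinators.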

I have done some experiments with this:

Progress on Nativizing Parser Combinators

But those experiments are currently inactive, because the design needed more work. And it's easier to churn through that work with userspace code.

What can we do about it? Well until UPARSE goes through an optimization phase, we can just use PARSE3 in boot...or at least for whatever subsetted codebase is in this metric. The main thing is just to get it measured so we know how much of this is known UPARSE-ism vs. other unknowns. I'm going to bet it's a lot...even though it's not used all that terribly much in boot, it's going to be big.

Cutting it out for the moment would at least help focus on the next bigger things.

Another Pain Point Is Going To Be GET+SET Atop PICK+POKE

I spent quite a while working through what a GET and SET and PICK and POKE actually were. Ultimately I concluded:

  • GETs are just sequences of individual PICK steps (where a GET of a WORD! starts the chain with the binding of the word, and PICKs the word out of that object)

  • SETs are a sequence of PICK steps which are kept track of...followed by a POKE. That POKE can return nothing (in which case you're done) or it can return an adjusted value. An adjusted value gets POKE'd into the cell one step back in the chain, and this ripples backward for as long as the bit patterns in the cells need to be adjusted.
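As an illustrative sketch of that write-back ripple (in Python, with hypothetical names--this is a model, not Ren-C's actual code), a SET can be pictured as a chain of PICKs, a POKE at the tip, and then POKE-ing any adjusted value back toward the root:

```python
# Sketch of SET as PICK steps followed by POKE, with write-back ripple.
# (Hypothetical model for illustration, not the actual implementation.)

def pick(container, key):
    return container[key]

def poke(container, key, value):
    """Store value; return None if stored in place, or an adjusted
    container if the container itself had to change (here, a tuple
    stands in for a cell whose bit pattern must be rewritten)."""
    if isinstance(container, tuple):           # immutable: must rebuild
        adjusted = list(container)
        adjusted[key] = value
        return tuple(adjusted)
    container[key] = value                     # mutable: done, no ripple
    return None

def set_path(root, keys, value):
    """Model of `root.key1.key2...: value` as PICKs then POKEs."""
    chain = [root]
    for key in keys[:-1]:                      # sequence of PICK steps
        chain.append(pick(chain[-1], key))
    adjusted = poke(chain[-1], keys[-1], value)
    i = len(keys) - 2
    while adjusted is not None and i >= 0:     # ripple adjusted cells back
        adjusted = poke(chain[i], keys[i], adjusted)
        i -= 1

obj = {"point": (10, 20)}
set_path(obj, ["point", 1], 99)
print(obj)  # {'point': (10, 99)}
```

The tuple plays the role of a cell that can't be updated in place, showing why a POKE may hand back an adjusted value that then has to be POKE'd into its own container, and so on up the chain.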

I haven't gone back to this prototype and optimized it. That means it quite literally is building evaluation chains of PICK and POKE every time it does tuple processing (what would be "path picking", e.g. variables out of objects). I wasn't sure if this was the answer or not, so it seemed best to keep it general to be able to play with it.

It's tough to know how much "hardening" should be done on this. It's nice to be able to hijack and hook and bend things. I think I still want to consider it to be calls to PICK and POKE, but we can do those calls via frames built just for those functions...and not generic evaluation. I'll have to look at it.

Each Extension Adds Memory Use, But Also Has Startup Code

By default the desktop build includes every extension, whether or not a given use case actually needs them all.

If one wants to make a non-kitchen-sink test build of Ren-C: obviously use debug: none, and chop extensions out with - instead of +, for starters. Note that extensions can also be built as separate DLL/.so modules by using *.

Other Factors Need Managing On a Case-by-Case Basis

Those would be among the only things that can be done without some attention to the C, which hasn't been vetted for this metric in years. But it isn't a priority right at this exact moment--there are much more important things.

(If you want some of my general philosophy about why Ren-C will be competitive with R3-Alpha despite "increased complexity", then seeing some old stats on SPECIALIZE might be illuminating)


To collect data on startup times and memory use, I use "valgrind --tool=massif".

Thanks. I occasionally run Valgrind (catches slightly different things than address sanitizer) but was not aware of this particular tool.