The Sea Of Words

hostilefork · March 18, 2021, 5:19pm

 === I HAVE SORT OF A PLAN-OF-ATTACK ===

 echo [For how to deal with the explosion of WORD!s
     that would result if you start sanctioning people
     doing things like this...]

TO RECAP: the user context proactively creates variables for every word that you even so much as mention. The idea is that it doesn't know if that word will come into existence later, and it wants to be able to facilitate things like:

foo: func [] [bar]  ; doesn't leave unbound, in case definition comes later
bar: func [] [print "Hello"]  ; hooks up to previously existing bar variable
foo  ; prints hello

We've seen that there are several downsides to this strategy:

Even without pathological cases like sentences-of-word-using === and echo, the user context swells to a very large size. You get definitions for function arguments that were never intended to have a global variable for them, every LET variable gets a user context binding it doesn't use, etc.
You have no way of knowing when a variable isn't valid or intended. Any stray assignment or misspelling writes into the global space:
```
 foo: func [argument] [
     argment: default [10]
     print ["You just defaulted the wrong" argument]
 ]
```
The aggressive creation of variables that are local copies of everything from the lib context means that you won't see changes if the lib definition changes.
```
 >> foo: func [] [newfangled "Hello"]
 >> append lib compose [newfangled (:print)]
 >> foo
 ** Script Error: newfangled is ~unset~
```

Proposal: "Attachment" => Binding Without Creating a Variable

What if there were a way to say that WORD!s could be attached to a context, but not actually have memory for a variable behind that reference?

If a new variable came into existence, those references would see it. But those references themselves would not create the variables...at least not without some extra effort.

>> foo: func [] [y: 10]

>> foo
** Error: y is attached to a context, but no definition exists for it

Going with the current way for making variables appear in contexts:

>> foo: func [] [y: 10]

>> append system/contexts/user [y: _]

>> foo
== 10

>> y
== 10

But we could imagine there being some new LET-like construct which would enforce the existence of the variable. It would peek ahead at the word, and then force a variable if there wasn't one already, then vanish. Let's call it EMERGE for now.

This addresses the issue of references that exist "back in time":

>> bar: func [] [y: 20]  ; Y scanned and bound before the emerge
; (so calling bar right now would error)

>> foo: func [] [emerge y: 10]

>> foo
== 10

>> y
== 10

>> bar  ; was allowed to assign and overwrite
== 20
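
To make the mechanics concrete, here is a toy model in Python. It is not how the interpreter implements this. The `Context` and `Word` classes and the `emerge` helper are names invented for illustration, and the model ignores inheritance from lib entirely. It only shows the idea that a word reference can point at a context without owning storage, that plain assignment fails until a variable exists, and that an EMERGE-like step makes earlier references usable.

```python
class Context:
    """Toy context: just a dict of variables (no inheritance in this sketch)."""

    def __init__(self, name):
        self.name = name
        self.vars = {}


class Word:
    """A word reference attached to a context. It owns no storage itself."""

    def __init__(self, spelling, context):
        self.spelling = spelling
        self.context = context

    def _require_variable(self):
        if self.spelling not in self.context.vars:
            raise NameError(
                f"{self.spelling} is attached to a context, "
                "but no definition exists for it"
            )

    def get(self):
        self._require_variable()
        return self.context.vars[self.spelling]

    def set(self, value):
        self._require_variable()  # plain assignment never creates a variable
        self.context.vars[self.spelling] = value
        return value


def emerge(word):
    """Force a variable to exist for an attached word (hypothetical EMERGE)."""
    word.context.vars.setdefault(word.spelling, None)
    return word


user = Context("user")
y_in_bar = Word("y", user)  # scanned and attached before any emerge
y_in_foo = Word("y", user)

try:
    y_in_bar.set(20)        # like calling bar too early
except NameError as error:
    print("bar:", error)

emerge(y_in_foo).set(10)    # like `emerge y: 10`
print(y_in_bar.get())       # the earlier reference sees the new variable
y_in_bar.set(20)            # now allowed to assign and overwrite
print(y_in_foo.get())
```

Both `Word` objects share the single slot in `user.vars`, which is why the reference created "back in time" starts working once the variable has emerged.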

I've started a small hacked-together proof-of-concept. It seems to have potential, but there are a lot of questions...

When Are SET-WORD!s Implicitly Gathered?

Right now, modules use the tactic of only considering top-level SET-WORD!s to be gathered.

But what if something isn't a module? What if it's a string of code, like:

>> do "x: 10, print [x]"

Had you written that as a BLOCK!, it wouldn't work...

>> do [x: 10, print [x]]

...because in that case, the actual loading process the console ran was for the string "do [x: 10, print [x]]", and that doesn't have a top-level SET-WORD!.

My feeling is that implicitly "emerge"-ing top-level SET-WORD!s should be a choice a module makes, not a basic behavior of DO or TRANSCODE.
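
The difference between the two cases can be sketched with a toy scanner in Python. The representation is an assumption made for illustration: a block is a Python list, a SET-WORD! is a string ending in a colon, and the comma is dropped. `gather_top_level` is a hypothetical helper, not a real function in the system.

```python
def is_set_word(item):
    """In this toy representation, a SET-WORD! is a string ending in ':'."""
    return isinstance(item, str) and item.endswith(":")


def gather_top_level(block):
    """Collect SET-WORD! spellings at the top level only (no recursion)."""
    return [item[:-1] for item in block if is_set_word(item)]


# What DO sees after loading the string "x: 10, print [x]"
loaded_string = ["x:", 10, "print", ["x"]]

# What the console loaded for the line `do [x: 10, print [x]]`
console_line = ["do", ["x:", 10, "print", ["x"]]]

print(gather_top_level(loaded_string))  # x is found at the top level
print(gather_top_level(console_line))   # x is nested, so nothing is gathered
```

In the second case the `x:` sits one level down inside the block passed to `do`, so a top-level-only scan never sees it.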

How Do Top-Level SET-WORD!s Mix With SET-BLOCK!, Anyway?

Not all contents of SET-BLOCK!s are words being assigned, e.g.

[x (first word-list)]: some-func ...

So we don't want declarations for FIRST and WORD-LIST, since they're only there to compute the name of the word being assigned. If a "scan for SET-XXX" process bound them as new variables, they would stop working for that purpose, because they'd lose their bindings to the FIRST function and the WORD-LIST variable.

Additionally, not all SET-BLOCK!s at the top level necessarily mean assignment. They can be used for dialected purposes. That's true of SET-WORD!s too...even at the top level.

How Do You Mix EMERGE with LET?

There was a problem with LET wanting to do multi-returns, e.g. where one variable needs a new definition but another one already exists. That is being resolved with quoting. You add a tick to say you want to pass through the existing definition of a variable:

let [new-variable 'reused-variable]: some-func ..

(This same concept of quote use is being applied to say not to create a new loop variable for things like FOR-EACH, e.g. for-each 'reuse [1 2 3] [...])

But now we have the same problem with EMERGE. I guess you could put it in a GROUP! and then say to reuse the emerged product, like:

let [new-variable '(emerge global-variable)]: ...

What Does BIND Mean When You Bind to Everything?

It's generally the case that BIND BLOCK LIB or BIND BLOCK USER is something only done at the beginning of constructing code from raw material. Most people expect LOAD to do this for them.

If these kinds of contexts consider themselves candidates for all words, they'd never not-bind.

It may be that this category of context (synonymous with MODULE! ?) is something you have to subset into a collection of words before using in binding operations...and if you bind without that subsetting then you just reset everything in the material you're working with.

Despite The Questions, This Is Probably What's Needed

The small demo I have working makes me reasonably optimistic that this is the right direction. I'll keep looking at it.

hostilefork · March 18, 2021, 9:14pm

As a show of promisingness...

m1: module [] [
    talk: does [print "M1 talking"]
    show: does [newfangled]
]

m2: module [] [
    print: func [x] [write-stdout "?!?#@"]
    talk: does [print "M2 talking"]
]

The idea is that module isolation will be cheap-as-free. So there won't even be an [Isolate] option: every module is isolated from whatever other modules choose to do to their own versions of things in LIB.

>> m1/talk
M1 talking

>> m2/talk
?!?#@

But this isolation doesn't mean they are frozen and unable to see new things that appear. Previously, isolation would have meant that the NEWFANGLED reference in M1 would be unbound, and could never become bound.

Yet in this new model, M1's NEWFANGLED is actually bound to M1 with no variable. And since M1 inherits from LIB, this non-variable is able to receive an inherited word at the time it shows up!

>> m1/show
** Script Error: newfangled word is bound but not declared

>> append lib compose [newfangled (does [print "Picked up!"])]

>> m1/show
Picked up!

This is already quite a significant improvement over history! There are an intense number of headaches caused by things not working this way, and once they can do this, it cleans up a lot.

Lots of questions will need to be answered before this can be usable, but fingers crossed...
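
For readers who want a rough mental model of "isolated, but still able to see new things in LIB", Python's `collections.ChainMap` behaves similarly: lookups fall through to the shared mapping, while writes land only in the module's own mapping. This is only an analogy under that assumption, not a description of the actual implementation. The names `lib`, `m1`, `m2` and the helper functions mirror the example above but are otherwise made up.

```python
from collections import ChainMap

# Shared library definitions (a stand-in for LIB)
lib = {"print": lambda text: f"printed: {text}"}

# Each module gets its own empty mapping layered over lib
m1 = ChainMap({}, lib)
m2 = ChainMap({}, lib)

# M2 redefines print; ChainMap writes go to the first mapping only
m2["print"] = lambda text: "?!?#@"


def m1_talk():
    return m1["print"]("M1 talking")


def m2_talk():
    return m2["print"]("M2 talking")


def m1_show():
    return m1["newfangled"]()  # looked up at call time, not definition time


print(m1_talk())  # M1 still uses lib's print
print(m2_talk())  # M2 uses its own print

try:
    m1_show()
except KeyError:
    print("newfangled is not defined anywhere yet")

# A definition shows up in lib later, and M1 picks it up
lib["newfangled"] = lambda: "Picked up!"
print(m1_show())
```

M2's override never touches `lib`, and M1 never needed a slot of its own for `newfangled` in order to find it once it appeared.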

BlackATTR · March 18, 2021, 9:49pm

I'm still trying to process this. I think Rebol bindology is a hindrance for adoption, at least for someone at my (low-ish) level of sophistication.
I keep thinking some of the recent forum topics will generate some fervent reactions. We'll see about this one. I think it's wild.

gchiu · March 19, 2021, 7:27am

Looks promising. Modules have never been solved before.

hostilefork · August 16, 2021, 3:43am

I've been away from working on Sea of Words for a while...because I started attacking really far out features like string interpolation and getting rid of rebArg() by having JS-NATIVE and natives be able to mention locals directly in the text code they scan. There were several cans of worms opened by all of that.

This held up the promising work on Sea of Words, which includes a lot of simplifications to LOAD and MODULE mechanics that I made while hacking on it.

So I've rebased up to present, and factored out the "working sea of words stuff" from the speculative stuff. They're two branches now, and I want to get the good stuff in.

Cleaning Up The User Context

If you start up R3-Alpha, you can ask it "what's in the user context?"

r3-alpha>> words-of system/contexts/user
== [system words-of contexts user]

Errr... well it was empty at startup. But the mere act of asking populated it with things. This will go on indefinitely:

r3-alpha>> data: [apple banana orange]
== [apple banana orange]

r3-alpha>> words-of system/contexts/user 
== [system words-of contexts user data apple banana orange]

You might think you've declared one variable. But you've declared 8. (I explain why in the original post, as well as in other places.)

Can Sea of Words do better? You bet your Sea of Fishies it can:

>> words of system.contexts.user
== [intern export import]

That needs a bit of explanation, because the user context is now a "user module". (Modules are what have the sea of words behavior.) Each module gets its own version of the import and export functions, because it has to know where to import to or where to export from.

But words and of and system and contexts and user are notably absent. Yet the code still ran!

Let's keep going:

>> data: [apple banana orange]

>> words of system.contexts.user
== [intern export import data]

Well that seems like a much more sane result. Assign one variable, get one variable. But can it still handle the idea of "declarations that show up after their references?"

>> foo: func [] [bar]

>> words of system.contexts.user
== [foo intern export import data]  ; no bar mentioned, hmmm...

>> bar: func [] [print "Hello"]

>> words of system.contexts.user
== [foo intern export bar import data]  ; now there's a bar, is it too late?

>> foo
Hello  ; not too late!!! bar in `func [] [bar]` was pre-emptively "attached"

Well that sure looks like great news, doesn't it?

Blocking Issue: What Does BINDING OF Mean Anymore?

Let's look at the darker side of the mechanic behind that behavior:

Proposal: "Attachment" => Binding Without Creating a Variable
What if there were a way to say that WORD!s could be attached to a context, but not actually have memory for a variable behind that reference?

Starting again from a fresh interpreter:

>> words of system.contexts.user
== [intern export import]

Let's pursue a different line of questioning:

>> type of :append
== #[datatype! action!]

>> words of binding of 'append
== [intern export import]

BINDING OF gave us the answer of the context that append word was ATTACHED to, not the context where its actual variable lives.

Assigning APPEND will create a definition in the User context, so the word will no longer see the definition in the Lib context.

>> append: does [print "Overwritten."]

>> words of binding of 'append
== [append intern export import]

Well, BINDING OF gave us the same context as before...but now that context has an APPEND definition in it, vs. just inheriting the one from Lib.

How do we reconcile this notion with what people are trying to get at when they ask the question BINDING OF?
Do we need a menagerie of more questions, like ATTACH OF? INHERITED-WORDS OF to get the full list?

These Are Big Questions and I Don't Know The Answers

I feel like the direction is solid. But so long as Sea of Words is off on some side branch, I'm not facing the implications day to day.

Maybe we go with BINDING OF giving back an answer that's not NULL, but also not the context, when something is merely attached. This is a nice application of BAD-WORD!s actually... How about ~attached~ ? That's informational and not misleading, and can also be acted on. Then have ATTACH OF to answer the more general question.
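
One possible reading of that split, sketched as a toy in Python. Everything here is an assumption made for illustration: the `Context` and `Word` classes, the `ATTACHED` marker standing in for ~attached~, and in particular the rule that `binding_of` returns the context only when the attached context itself holds the variable. The real answer may end up different.

```python
ATTACHED = "~attached~"  # stand-in for the proposed BAD-WORD! result


class Context:
    """Toy context with its own variables and an optional parent (e.g. lib)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.vars = {}
        self.parent = parent


class Word:
    """A word reference attached to a context, owning no storage."""

    def __init__(self, spelling, context):
        self.spelling = spelling
        self.context = context


def attach_of(word):
    """The general question: which context is this word attached to?"""
    return word.context


def binding_of(word):
    """The narrow question: the context only if it really holds the variable."""
    if word.spelling in word.context.vars:
        return word.context
    return ATTACHED


lib = Context("lib")
lib.vars["append"] = "lib's append"
user = Context("user", parent=lib)

append_word = Word("append", user)
print(binding_of(append_word))       # only attached; the variable lives in lib
print(attach_of(append_word).name)   # attached to user either way

user.vars["append"] = "overwritten"  # like `append: does [...]`
print(binding_of(append_word).name)  # user now has a real variable
```

The point of the sketch is that the two questions stop giving the same answer exactly in the inherited case, which is the case that made `words of binding of 'append` surprising.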

In any case, I'm going to try to pin something down so we can start using the darn thing.