Contemplating the GENERIC Mechanism

hostilefork · September 1, 2021, 5:24am

In traditional object-oriented languages you put the thing to act on to the left of a dot:

whatever.append("thing to append")

Rebol syntactically turns this around:

append whatever "thing to append"

When code was styled this way, things like APPEND were what historical Rebol called "ACTIONS". I have called these "GENERICS" to align with the same concept coming from Lisp:

Generic function - Wikipedia

Are These Just Two Ways of Saying the Same Thing?

In practice, no.

Rebol's way means the interpreter encounters the word APPEND first. And that word could resolve to a variable, or to a so-called "free function"...a global function that's not a member of any object.

So how does the interpreter know whether to dispatch APPEND to a method in WHATEVER, or to call a global function APPEND that takes a WHATEVER as a parameter?

It doesn't know. This means an extra global function called APPEND that takes a WHATEVER parameter has to be made...that re-dispatches the call to the method in the WHATEVER.

Hence (whatever.append ...) vs (append whatever ...) is NOT just a syntax difference.

If you add a new method to your whatever object, then whatever.newmethod will find it. But newmethod whatever won't work automatically. There has to be an additional generic definition for NEWMETHOD...a global function that the evaluator finds, and uses to do the dispatch to the object/port/datatype's method.
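The asymmetry can be modeled outside of Rebol. What follows is a minimal Python sketch, not Rebol code: the class `Whatever`, its `newmethod`, and the forwarding global `append` are all hypothetical names chosen to mirror the prose above.

```python
class Whatever:
    """Hypothetical noun with two methods."""

    def __init__(self):
        self.items = []

    def append(self, value):
        self.items.append(value)
        return self

    def newmethod(self):
        return "newmethod ran"


def append(target, value):
    """Global 'generic': the evaluator finds this first, then it forwards."""
    return target.append(value)


whatever = Whatever()

# Noun-first: both methods are found on the object itself.
whatever.append("thing to append")
dot_result = whatever.newmethod()

# Verb-first: APPEND works only because the forwarding global exists.
append(whatever, "another thing")

# Verb-first NEWMETHOD fails: nobody declared a global forwarder for it.
try:
    newmethod(whatever)
    verb_first_found = True
except NameError:
    verb_first_found = False
```

Adding `newmethod` to the class cost nothing in the dot form, but the verb-first form stays broken until someone also writes a second, global definition.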

A Deadly Cost To Evolution and Agility

Having to add this extra global declaration to forward to the port or type method isn't just added work. It has been impossible work: there has never been a means of adding a new generic definition in Rebol.

Not only that, how can you say up front, on a general definition of APPEND, what every target will consider legal to append? Maybe you can APPEND a GOB! to a GOB!, but it seems like it should be an error to add one to a file. And adding a PORT! to a string seems like it should error. The type signatures become unmaintainable and meaningless...everything has to take ANY-VALUE!.

Then there are refinements. Historically Rebol has been in a lose-lose situation... if a refinement is not on the generic's interface, then no PORT! or series or class could add it. But if there were a bunch of premade refinements on the generic definition, then there's no way for a caller to know if the recipient actually paid attention to them or not... (see R3-Alpha's no-op of adding /LINES to READ on URL!s)

No doubt about it: This aesthetic syntax twist to put the verb in the driver's seat has wound up holding Rebol back tremendously.

New nouns come up constantly, and verbs are interpreted differently and parameterized differently for each. The object.method syntax has handled this just fine. But hardcoding a table of verbs has been a dead-end street.

Can We Keep The Syntax, But Lose The Downsides?

Err. Well....

We could say that we try to do an evaluation step, and if it fails, we don't report the error immediately. Instead, we evaluate the next thing and see if it is a PORT!, a FILE!, or a URL!.

Hence let's start from the idea that there is no definition at all for READ:

>> read
** Script Error: read word is attached to a context, but unassigned

Historically, putting something after it would make no difference:

>> read http://example.com
** Script Error: read word is attached to a context, but unassigned

But what if it did make a difference? Instead of reporting an error, the evaluator would do another evaluation step...to come back with http://example.com. It would look in the HTTP scheme and see that READ was available. So it would create a frame for that READ method (with whatever arguments and refinements the HTTP READ offered) and fill its arguments from there.
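As a rough model of that deferred lookup, here is a small Python sketch (not Rebol). The `SCHEMES` registry, the empty `GLOBALS` table, and `eval_step` are hypothetical, and the argument is taken as an already-evaluated URL string to keep things short.

```python
from urllib.parse import urlparse

# Hypothetical scheme registry: scheme name -> {verb name: method}.
SCHEMES = {
    "http": {"read": lambda url: f"<contents of {url}>"},
}

GLOBALS = {}  # deliberately holds no definition for "read"


def eval_step(verb, arg):
    """Run `verb arg`, deferring the unbound-word error by one step."""
    if verb in GLOBALS:
        return GLOBALS[verb](arg)
    # The verb is unbound: look at the next value before giving up.
    scheme = urlparse(arg).scheme
    methods = SCHEMES.get(scheme, {})
    if verb not in methods:
        raise LookupError(
            f"{verb} not bound and {scheme!r} has no {verb} method"
        )
    return methods[verb](arg)
```

With this, `eval_step("read", "http://example.com")` reaches the HTTP scheme's READ even though no global READ exists, while a verb the scheme lacks still ends in an error.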

Is Doing Another Evaluation Like That Safe?

One thing that kind of sucks about this is that an undefined word couldn't report an error immediately. It would have to go through another step to decide if it was really an error.

A pathological bad case for that:

>> ensure-safe-to-delete: does [  ; function with no arguments
       ... else fail "not safe to delete global-file"
   ]

>> ensure-safe-to-delte delete global-file  ; typo...
** Error: file port has no ENSURE-SAFE-TO-DELTE method
    ^-- but the DELETE happened!

Here I'm assuming that DELETE FILE returns a file port. So the evaluator actually even had a PORT! to try to look the action up in. But regardless of what it returned, you'd get an error.
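The ordering hazard can be reproduced in a few lines of Python (a model only, not Rebol). `FilePort`, `delete`, and `deferred_call` are hypothetical stand-ins, and the sketch covers just the unbound-verb path.

```python
deleted = []  # records the irreversible side effects


class FilePort:
    """Hypothetical port value returned by delete()."""

    def __init__(self, name):
        self.name = name


def delete(name):
    deleted.append(name)  # the damage is done here
    return FilePort(name)


def deferred_call(verb, arg_thunk):
    """Model an unbound verb: evaluate the next expression, then look
    for a method named `verb` on whatever value came back."""
    value = arg_thunk()  # side effects happen before any error
    method = getattr(value, verb, None)
    if method is None:
        raise LookupError(f"{type(value).__name__} has no {verb} method")
    return method()


try:
    deferred_call("ensure_safe_to_delte", lambda: delete("global-file"))
    error = None
except LookupError as exc:
    error = str(exc)
```

By the time the lookup fails, `deleted` already contains `"global-file"`, which is the behavior the transcript above warns about.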

This problem can be thought of as "picking the verb first, before you know if the noun is available...when the generation of the noun to check can have side effects".

We can come up with pathologies the other way, too...picking the noun first doesn't save you in any general sense:

// reactor will overheat and explode if it runs for more than 10 seconds
getNuclearReactorAndStartTimer().stoppReactorTimer(10);

(Note: Using security/safety oriented examples like these is silly, as it's a pretty ridiculous suggestion to make that baseline Rebol evaluation--or JavaScript evaluation, or Python evaluation--is the kind of thing people interested in rigor would use. It's playing with fire like coding by hand in assembly is playing with fire, and that's just how it is. You might build a secure dialect on top of it by controlling when and how evaluation is done, but...probably not.)

A more relevant issue is that if we're running things out of sequence, people might wonder about the order of errors:

>> negtae multiplyy 10 20
** Error: multiplyy not bound and INTEGER! has no multiplyy method

Why did you find out about the multiplyy before knowing that there's no negtae? Because the evaluator was busy trying to figure out if there was a negtae method on the thing it was receiving.

What About HELP and Discoverability?

The port schemes are registered in a list the system knows about. So instead of saying "I don't know what READ is", the error could mention that "READ is available for the following object types and port schemes".

You could also get that list when you say help read. And you could get the specific entry when you say help [read file://] or help read 'file or help read file or whatever winds up being the syntax supported for that question (HELP READ would presumably tell you what to type to get at the specific READ help you were interested in from the list).
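A registry-driven HELP along these lines might look like the following Python sketch. It is only an illustration: the `SCHEME_HELP` table, its help strings, and `help_for` are invented here, and the real syntax for asking the question is left open above.

```python
# Hypothetical registry: scheme name -> {verb name: one-line help text}.
SCHEME_HELP = {
    "http": {"read": "READ url: fetch the resource over HTTP"},
    "file": {
        "read": "READ file: return the file's contents",
        "append": "APPEND file data: write data at the end of the file",
    },
}


def help_for(verb, scheme=None):
    """List schemes offering `verb`, or return one scheme's own entry."""
    if scheme is not None:
        return SCHEME_HELP[scheme][verb]
    return sorted(name for name, verbs in SCHEME_HELP.items() if verb in verbs)
```

Here `help_for("read")` plays the role of help read, and `help_for("read", "file")` plays the role of the more specific query.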

The issue comes up with HELP APPEND as well. You're now in a situation where the APPEND for series could be entirely different from the APPEND for FILE! ports, etc.

This triggers all kinds of design issues, like how would you be able to make actions that apply to sets of types... like ANY-SERIES!, vs having to make separate actions for each individual datatype.

How Would You Do Something Like APPLY a READ Method?

Um. Hmm. Well, here we see the problem with magic.

So APPLY has the same problem as HELP. It has to know what kind of READ you are talking about before it can pick the frame it's specializing.

apply ['read 'file] [/source %foo.txt /seek 10]

But remember it's worse than that, because this would hit anything that was a GENERIC...that means APPEND and many other things people are used to "treating like functions".

This shows something those GENERIC definitions bought us...the ones that forward-define the number of arguments something like APPEND would take. It served as a sort of minimum contract that you could use for things like APPLY. It's like an interface definition, which says anything that is "applyable" obeys the contract.
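To make the "minimum contract" idea concrete, here is a Python sketch (not Rebol) in which a generic's declared signature is what an APPLY-like helper checks against before dispatching. The generic `append`, its `dup` option, the `Block` class, and this `apply` helper are all hypothetical.

```python
import inspect


class Block:
    """Hypothetical series type with its own APPEND implementation."""

    def __init__(self):
        self.items = []

    def append_impl(self, value, *, dup=1):
        self.items.extend([value] * dup)
        return self


def append(series, value, *, dup=1):
    """Hypothetical generic: fixes arity and options, then forwards."""
    return series.append_impl(value, dup=dup)


def apply(generic, args, options):
    """Check the call against the generic's signature, then invoke it."""
    # bind() raises TypeError if the arguments don't fit the contract.
    bound = inspect.signature(generic).bind(*args, **options)
    return generic(*bound.args, **bound.kwargs)


block = Block()
apply(append, [block, "x"], {"dup": 2})
```

Because `apply` only needs the generic's signature, it never has to know which type will end up handling the call. That is the convenience that goes away if the forward definitions are dropped.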

Without these generic definitions, you have to say things like:

 apply :any-series!.append [...]

I don't know if that's so terrible... but it does make it seem appealing to be able to have something that can act as a common denominator to do the dispatch automatically.

Main Thrust of This Post Is "How Do We Eliminate the Liability"

I just want to emphasize how this verb-first approach has been hurting things, and that it has to be mitigated.

No answers yet. Please feel free to throw in thoughts.