How to bridge arguments to user natives / JS natives

The current plan for calling JavaScript from Rebol involves the creation of "JavaScript natives". These would have specs that are BLOCK!s, just like the specs of ordinary functions. Their bodies would be TEXT! strings of JavaScript code.

A seemingly unambitious example which used JavaScript to perform addition might look like:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.UnboxInteger("a");
    var b = reb.UnboxInteger("b");
    return reb.Integer(a + b);
}

So that would mean from Rebol you could say jadd 10 20 and it would actually perform that addition in JavaScript.

Let's talk about some axes of potential improvement.

Generic Unboxing

Because C lacks dynamic typing, we need separate routines for unboxing integers and strings. But JavaScript could conceivably do it automatically: you would call reb.Unbox() instead of reb.UnboxInteger(), and it would detect that the value is an integer and give back a JavaScript "Number":

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return reb.Integer(a + b);
}

On the downside of this, you don't get an assertion or check that the thing you extracted is an integer...so you might get back a string or other object, if the type isn't what you expect. It's probably best to offer both and let people decide which they want.
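To make the trade-off concrete, here is a minimal sketch in plain JavaScript. It does not use the real libRebol API: mockFrame, unbox and unboxInteger are hypothetical stand-ins that only model the difference between unchecked and checked extraction.

```javascript
// Hypothetical stand-in for a native's frame; NOT the real libRebol API.
// Each argument is modeled as a { type, value } record.
const mockFrame = {
  a: { type: "integer!", value: 10 },
  b: { type: "text!", value: "20" },
};

// Generic unboxing: returns whatever JavaScript value the cell holds.
function unbox(name) {
  const cell = mockFrame[name];
  if (cell === undefined) throw new Error("no such argument: " + name);
  return cell.value;
}

// Typed unboxing: same extraction, plus a type check.
function unboxInteger(name) {
  const cell = mockFrame[name];
  if (cell === undefined) throw new Error("no such argument: " + name);
  if (cell.type !== "integer!") {
    throw new TypeError(name + " is " + cell.type + ", not integer!");
  }
  return cell.value;
}

// The generic form silently hands back a string, so + concatenates.
const unchecked = unbox("a") + unbox("b");  // "1020"

// The typed form fails loudly instead.
let checkedError = null;
try {
  unboxInteger("b");
} catch (e) {
  checkedError = e;
}
```

With the generic form the mistake only shows up later as a surprising "1020"; with the typed form it shows up at the point of extraction.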

Auto-conversion of JS Number non-Objects

It might seem cool if the API could automatically convert JavaScript numbers into Rebol values, and not have to use rebInteger(n) or rebI(n). For instance, this seems good:

 var n = 5;
 var s = reb.Text("Hello World");
 reb.Elide("loop", n, "[print", s, "]");

But right now, the problem with this is that s is a pointer to a Rebol value on the WebAssembly heap. And that's some big random-looking Number. There's no way to tell that 5 isn't meant to be a pointer too.

One trick I thought of, though, involves realizing that JavaScript has both primitives and primitive objects. For reference, see this article on JavaScript's Primitive Wrapper Objects. These lightweight objects come into existence as a means of being able to call methods, e.g. n.toString().

Hence--what if Rebol handles were passed back not as JavaScript numbers, but as number objects? These would presumably be more lightweight than an ordinary object, so not very costly. That way, when plain JavaScript numbers were used, it could be assumed that they are numeric values to be converted automatically, and not handles.
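The distinction is detectable with standard JavaScript alone. In this sketch, classify is a hypothetical helper (not part of libRebol) and 70144 is a made-up heap address:

```javascript
// Sketch only: `classify` is a hypothetical helper, not part of libRebol.
function classify(x) {
  if (typeof x === "number") return "plain number";        // e.g. 5
  if (x instanceof Number) return "handle (Number object)";
  return "other";
}

const n = 5;                       // primitive number
const handle = new Number(70144);  // wrapper object standing in for a heap address

const kindOfN = classify(n);            // "plain number"
const kindOfHandle = classify(handle);  // "handle (Number object)"

// The wrapper still yields the raw address when it is needed.
const address = handle.valueOf();  // 70144
```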

Our adding example could then simplify to:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return a + b;
}

Auto-conversion of JS String objects

Strings might benefit from the distinction of primitives vs. primitive objects as well. Today, plain non-object strings are LOAD-ed and executed as code:

var v = reb.Value("Hello World");

What happens there is that it treats that as two WORD!s. But if you said:

var v = reb.Value("{Hello World}");

That would be a TEXT! string. You could also use rebText() or rebT() if your string is in a variable.

But string objects could be handled differently, and assumed to be string literals. So you couldn't say return "hello" from a user native, but you could say return new String("hello");

That's more typing than just return reb.Text("hello"). But where it might come in handy is that you could write a generic JavaScript routine that could return a string to be passed unmodified to either JS or Rebol.

function genericName(...) { ... return new String(...); }
console.log("used direct from JS: " + genericName(...));
reb.Elide("print [{used direct from Rebol:}", genericName(...), "]");
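As a rough sketch of how an API could tell the two kinds of string apart, using only standard JavaScript (treatAs is a hypothetical helper, not part of libRebol, and this genericName is a trivial placeholder):

```javascript
// Sketch only: `treatAs` is a hypothetical helper, not part of libRebol.
function treatAs(x) {
  if (typeof x === "string") return "code to LOAD";  // "Hello World"
  if (x instanceof String) return "TEXT! literal";   // new String("Hello World")
  return "other";
}

// Placeholder routine that returns a String object rather than a primitive.
function genericName() {
  return new String("Hello World");
}

const asCode = treatAs("Hello World");    // "code to LOAD"
const asLiteral = treatAs(genericName()); // "TEXT! literal"

// Ordinary JavaScript concatenation still works on the String object.
const fromJs = "used direct from JS: " + genericName();
```

One caveat: a String object is not strictly equal (===) to the primitive with the same characters, so JavaScript callers comparing such results would need == or .valueOf().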

Parameter Unpacking as JavaScript

What I have in mind for both C user natives and JavaScript natives is not to try and give the generated JavaScript function any actual arguments. So back to the example:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return a + b;
}

We're generating and running a JavaScript function with no arguments, so if it wants to get the values of a and b it has to go through Rebol code (automatically bound into the function frame) to access them.

It is possible to give that function arguments. These arguments could be the raw Rebol values, or they could be pre-rebUnbox'd. At the extreme of pre-unboxing, you could write just:

jadd: js-native [a [integer!] b [integer!]] {
    return a + b;
}

I think it's better to not have arguments to the function.

  • The TCC-based C natives don't have the luxury of being able to do things like this in a platform-independent fashion. Feeding arguments to a C function varies from platform-to-platform based on the Application Binary Interface (ABI). It's more consistent between C and JavaScript extensions to not do it.

  • There's no support in the EM_ASM() bridge for variadic calls. Doing it in JavaScript and calling from C involves jumping through a lot of hoops, possibly using eval() when it wouldn't otherwise be necessary, and is less performant. If the function took zero parameters and returned an integer heap address it would be a lot cleaner.

  • JavaScript variable naming is more limited than Rebol parameter naming. So there'd have to be some invented mapping between what name you used for the parameters in your spec and the JavaScript names.

  • You don't really know what properties the JavaScript code wants from its parameters, and pre-extracting would be presumptuous.
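On the naming point in the list above, a quick check shows how easily a Rebol-style parameter name falls outside what JavaScript accepts. This is only a sketch: needsMapping is a hypothetical helper, and the pattern is an ASCII-only approximation that ignores Unicode identifiers and reserved words.

```javascript
// ASCII-only approximation of a JavaScript identifier; real identifiers also
// allow many Unicode characters, and reserved words are not checked here.
const JS_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Hypothetical helper: true if the name could not be used as-is in JavaScript.
function needsMapping(rebolName) {
  return !JS_IDENTIFIER.test(rebolName);
}

const names = ["a", "value", "only?", "new-line"];
const awkward = names.filter(needsMapping);  // ["only?", "new-line"]
```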

So I think JavaScript natives should be running 0-arity functions, and have to go through libRebol APIs to get at their arguments. That will require some new mechanics.

Another thing that has been kind of on my mind is the relationship between JavaScript's undefined and Rebol's unset or bad states.

I've set it up so that a JS-NATIVE with no return statement returns a BAD-WORD!, and the same goes for a bare return; statement. This makes it consistent with the behavior of Rebol now (you can say if true [return] and the absence of an argument causes it to return a ~void~ isotope).

However, it seems to me that reb.Value("print {Hello}"); should not return undefined the way that reb.Value("select [a 10 b 20] 'c"); returns null. One reason is that JavaScript is bad about conflating falsey things, and it considers undefined to be falsey. So it is probably good to have only one falsey result that can come back from running arbitrary code.
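The conflation is easy to see in plain JavaScript, with no Rebol involved:

```javascript
// Both values are falsey, so a truthiness test cannot tell them apart.
const results = [undefined, null];
const truthiness = results.map((r) => Boolean(r));  // [false, false]

// Loose equality conflates them too; only strict equality distinguishes them.
const loose = (undefined == null);    // true
const strict = (undefined === null);  // false
```

Any caller writing if (!result) or result == null would treat the two outcomes identically, which is why a single falsey result is easier to reason about.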

It also seems like it's probably better to error on undefined when given as an argument:

function foo() { }
reb.Elide("append data", foo())

Would you rather that run without error, appending a BAD-WORD! to data, or have reb.Elide() complain that an argument is undefined?
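For comparison, the "complain" option might look something like this. mockElide is a hypothetical stand-in written for illustration; it is not how the real reb.Elide() is implemented.

```javascript
// Hypothetical stand-in for a variadic API entry point; NOT the real reb.Elide().
function mockElide(...args) {
  args.forEach((arg, i) => {
    if (arg === undefined) {
      throw new TypeError("argument " + (i + 1) + " is undefined");
    }
  });
  return args.length;  // stand-in for "ran the code"
}

function foo() { }  // no return statement, so calling it yields undefined

let message = null;
try {
  mockElide("append data", foo());
} catch (e) {
  message = e.message;  // "argument 2 is undefined"
}
```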

Guess my point is that it seems like an area where noticing the parallel is useful, but care should be taken in where to make the mapping apply...

This was the initial concept for how JavaScript natives would be able to access arguments...just by name. It means you could have just as easily written:

jadd: js-native [a [integer!] b [integer!]] {
    var sum = reb.UnboxInteger("a + b");
    return reb.Integer(sum);
}

(That would be a bit of a waste, since the JavaScript isn't doing any of the work, but the point is that there's nothing special about arguments...they're visible to code execution.)

Binding didn't really offer any answers for how to do this. So we ended up with the ugly reb.Arg() and reb.ArgR() functions. It was a compromise that had to be made because binding was weak.

The problems have gotten even bigger now...as the JS-NATIVEs are being declared in modules. Those modules will presumably want to have visibility of plain FUNCs that you declare in that same module (or that you import) to call.

On both points, we should not compromise. So I've worked up a prototype that does both.

So hopefully it means the end of reb.ArgR() and reb.Arg() is coming soon!

I think that will also solidify the case for eliminating the Q() forms of the APIs (like reb.ValueQ(), reb.DidQ()).

Two years after asking the question, the model has changed. (Though note that I retroactively update posts to reflect modern terminology if it helps make the points easier to understand.)


One thing isn't an option at the moment: passing isotopes unquoted to APIs.

REBVAL *isotope = rebValue("~baddie~");  // evaluates to ~baddie~ isotope
rebElide("print mold type of", isotope);  // rebElide will error on this

You have to use rebQ() to quote (actually meta) an isotope to a plain BAD-WORD! for it to be legal in the instruction stream. Because what functions like rebElide() get are conceptually blocks.

They have to be conceptually blocks... because imagine you are a debugger and you want to break between the PRINT and the MOLD in the code passed to rebElide. Then you want to show the user the code state, as a block. You can't, because the closest you could get is:

[print mold type of ~baddie~]  ; !!! Plain BAD-WORD!, where's the isotope?

There's no way to capture a block and give it back to represent the isotope state, because isotopes can't live in blocks.

(Note: Though NULL can't be represented in a block either, API functions like rebElide make a special exception...NULL will be spliced in as the plain ~null~ BAD-WORD!. Then (@ ~null~) becomes pure NULL. It's all very clever, I assure you. :crazy_face: But we don't want all bad words to do this...it's only because of the null isotope's special behavior of becoming null on assignment that this compromise is plausible.)

So At Least We Know What The Answer Isn't...

Passing an isotope is off the table, so what's left?

One fairly straightforward option is that we could make reb.Q(undefined) in JavaScript produce the non-isotope BAD-WORD! ~undefined~

This means that:

function foo() { }
reb.Elide("print type of", foo())  // rebElide() error, "~undefined~ isotope"

function foo() { }
reb.Elide("print type of", reb.Q(foo()))  // BAD-WORD!, e.g. ~undefined~

That doesn't editorialize the JavaScript undefined concept into being anything it's not.

It's a conservative choice, so I think I'll go with it. It's easier to search and replace ~undefined~ later than to tease out which ~unset~ are which...

This ideal concept has been tried in various forms. The problem is that if you use the native as the binding context, and the native's body calls JavaScript service routines, there's no way for the system to know that such calls have occurred. This means "a" and "b" would resolve to the native's arguments not just within the JS-NATIVE body, but also inside any other JavaScript service function you call that uses the API.

So reb.Arg() and reb.ArgR() have stuck around as a somewhat ugly solution.

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox(reb.ArgR("a"));  // arg with auto-Release
    var b = reb.Unbox(reb.ArgR("b"));
    return a + b;
}

But if strings had binding, we could do better...

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox(@a);
    var b = reb.Unbox(@b);
    return a + b;
}

Since @ isn't legal in JavaScript, the JS-NATIVE could scan for them and then hook them up to substitute handles appropriate to the parameters (effectively automatic reb.ArgR() calls). Though it would have to do so with an awareness of JavaScript string syntax, so it didn't try to do such substitutions inside of strings!
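A minimal sketch of such a scan follows. Everything here is an assumption made for illustration: substituteArgs is a hypothetical name, names are limited to ASCII letters, digits and underscores, and emitting the text reb.ArgR("...") is just one way the substitution could work. It understands '...' and "..." literals with backslash escapes, but not template literals, comments, or regex literals.

```javascript
// Sketch only: rewrites @name to reb.ArgR("name") outside of string literals.
function substituteArgs(source) {
  let out = "";
  let quote = null;  // the open quote character, or null when outside a string
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (quote !== null) {
      out += ch;
      if (ch === "\\" && i + 1 < source.length) {
        out += source[i + 1];  // copy the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === quote) quote = null;  // string literal closed
      i += 1;
    } else if (ch === '"' || ch === "'") {
      quote = ch;  // string literal opened
      out += ch;
      i += 1;
    } else if (ch === "@") {
      const match = /^[A-Za-z_][A-Za-z0-9_]*/.exec(source.slice(i + 1));
      if (match === null) {
        out += ch;  // a lone @ is left alone
        i += 1;
      } else {
        out += 'reb.ArgR("' + match[0] + '")';
        i += 1 + match[0].length;
      }
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const body = 'var a = reb.Unbox(@a); var s = "mail@b";';
const rewritten = substituteArgs(body);
// 'var a = reb.Unbox(reb.ArgR("a")); var s = "mail@b";'
```

A real implementation would more likely lean on an actual JavaScript tokenizer than on a hand-rolled loop like this one.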

This could also allow access to other variables in visibility of the string that weren't function parameters.

But what happens when you append strings to each other or otherwise mutate them? :-/ This is another epicycle of the question of how "scopes" get composed. But unlike when structured operations are performed, there's nowhere to define these merges. :frowning:

So big questions arise, but still a promising-seeming concept.
