How to bridge arguments to user natives / JS natives

hostilefork · September 15, 2018, 11:04am

UPDATE 2024: This problem now has a solution!

API Breakthrough: Scope Detection In JavaScript and C !

The current plan for calling JavaScript from Rebol involves the creation of "JavaScript natives". These would have specs that are BLOCK!s, which would be familiar as in functions. Then they would have bodies that were TEXT! strings of JavaScript code.

A seemingly unambitious example which used JavaScript to perform addition might look like:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.UnboxInteger("a");
    var b = reb.UnboxInteger("b");
    return reb.Integer(a + b);
}

So that would mean from Rebol you could say jadd 10 20 and it would actually perform that addition in JavaScript.

Let's talk about some axes of potential improvement.

Generic Unboxing

Because C lacks dynamic typing, we need separate routines for unboxing integers and strings. But JavaScript could conceivably do it automatically, calling rebUnbox() instead of rebUnboxInteger()...and then know if it was an integer type, to give back a JavaScript "Number":

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return reb.Integer(a + b);
}

On the downside of this, you don't get an assertion or check that the thing you extracted is an integer...so you might get back a string or other object, if the type isn't what you expect. It's probably best to offer both and let people decide which they want.

Auto-conversion of JS Number non-Objects

It might seem cool if the API could automatically convert JavaScript numbers into Rebol values, and not have to use rebInteger(n) or rebI(n). For instance, this seems good:

 var n = 5;
 var s = reb.Text("Hello World");
 reb.Elide("loop", n, "[print", s, "]");

But right now, the problem with this is that s is a pointer to a Rebol value on the webassembly heap. And that's some big random-looking Number. There's no way to tell that 5 isn't meant to be a pointer too.

One trick I thought of, though, involves realizing that JavaScript has both primitives and primitive objects. For reference, see this article on JavaScript's Primitive Wrapper Objects. These lightweight objects come into existence as a means of being able to call methods, e.g. n.toString().

Hence--what if Rebol handles were passed back not as JavaScript numbers, but as number objects? These would presumably be more lightweight than an ordinary object, so not very costly. That way, when plain JavaScript numbers were used it could be assumed that they should be automatically treated as if they were numbers.

Our adding example could then simplify to:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return a + b;
}

Auto-conversion of JS String objects

Strings might benefit from the distinction of primitives vs. primitive objects as well. Today, plain non-object strings are LOAD-ed and executed as code:

REBVAL *v = reb.Value("Hello World");

What happens there is that it treats that as two WORD!s. But if you said:

REBVAL *v = reb.Value("{Hello World}");

That would be a TEXT! string. You could also use rebText() or rebT() if your string is in a variable.

But string objects could be handled differently, and assumed to be string literals. So you couldn't say return "hello" from a user native, but you could say return new String("hello");

That's more typing than just return reb.Text("hello"). But where it might come in handy in that you could write a generic JavaScript routine that could return a string to be passed unmodified to either JS or Rebol.

function genericName(...) { ... return new String(...); }
console.log("used direct from JS: " + genericName(...));
reb.Elide("print [{used direct from Rebol:}", genericName(...), "]");

Parameter Unpacking as JavaScript

What I have in mind for both C user natives and JavaScript natives is not to try and give the generated JavaScript function any actual arguments. So back to the example:

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return a + b;
}

We're generating and running a JavaScript function with no arguments, so if it wants to get the values of a and b it has to go through Rebol code (automatically bound into the function frame) to access them.

It is possible to give that function arguments. These arguments could be the raw Rebol values, or they could be pre-rebUnbox'd. At the extreme of pre-unboxing, you could write just:

jadd: js-native [a [integer!] b [integer!]] {
    return a + b;
}

I think it's better to not have arguments to the function.

The TCC-based C natives don't have the luxury of being able to do things like this in a platform-independent fashion. Feeding arguments to a C function varies from platform-to-platform based on the Application Binary Interface (ABI). It's more consistent between C and JavaScript extensions to not do it.
There's no support in the EM_ASM() bridge for variadic calls. Doing it in JavaScript and calling from C involves jumping through a lot of hoops, possibly using eval() when it wouldn't otherwise be necessary, and is less performant. If the function took zero parameters and returned an integer heap address it would be a lot cleaner.
JavaScript variable naming is more limited than Rebol parameter naming. So there'd have to be some invented mapping between what name you used for the parameters in your spec and the JavaScript names.
You don't really know what properties the JavaScript code wants from its parameters, and pre-extracting would be presumptuous.

So I think JavaScript natives should be running 0-arity functions, and have to go through libRebol APIs to get at their arguments. That will require some new mechanics.

hostilefork · February 27, 2024, 1:21am

2 posts were split to a new topic: The Meaning of JavaScript's "Undefined" in the API

hostilefork · March 23, 2021, 6:30pm

hostilefork:

A seemingly unambitious example which used JavaScript to perform addition might look like:
jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.UnboxInteger("a");
    var b = reb.UnboxInteger("b");
    return reb.Integer(a + b);
}

This was the initial concept for how JavaScript natives would be able to access arguments...just by name. It means you could have just as easily written:

jadd: js-native [a [integer!] b [integer!]] {
    var sum = reb.UnboxInteger("a + b");
    return reb.Integer(sum);
}

(That would be a bit of a waste for why you didn't do anything with the JavaScript, but, point is that there's nothing special about arguments...they're visible to code execution.)

Binding didn't really offer any answers for how to do this. So we ended up with the ugly reb.Arg() and reb.ArgR() functions. It was a compromise that had to be made because binding was weak.

The problems have gotten even bigger now...as the JS-NATIVEs are being declared in modules. Those modules will presumably want to have visibility of plain FUNCs that you declare in that same module (or that you import) to call.

On both points, we should not compromise. So I've worked up a prototype that does both.

So hopefully it means the end of reb.ArgR() and reb.Arg() are coming soon!

I think that will also solidify the case for eliminating the Q() forms of the APIs (like reb.ValueQ(), reb.DidQ()).

hostilefork · October 27, 2021, 7:17am

hostilefork:

What I have in mind for both C user natives and JavaScript natives is not to try and give the generated JavaScript function any actual arguments. So back to the example:
jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox("a");
    var b = reb.Unbox("b");
    return a + b;
}

This ideal concept has been tried in various forms, with a problem that if you use the native for context you run up against problems that if it calls service routines, there's no way for the system to know that such calls have occurred. This means your "a" and "b" resolution won't just be from within the JS-NATIVE body, but also any other JavaScript service function that uses the API you call.

So reb.Arg() and reb.ArgR() have stuck around as a somewhat ugly solution.

jadd: js-native [a [integer!] b [integer!]] {
    var a = reb.Unbox(reb.ArgR("a"));  // arg with auto-Release
    var b = reb.Unbox(reb.ArgR("b"));
    return a + b;
}

There's no particular safety in them, in the sense that you could call reb.ArgR() from inside a service routine too, and it wouldn't know the difference. But it's fairly visible when you're doing it.