How to bridge arguments to user natives / JS natives


#1

The current plan for calling JavaScript from Rebol involves creating “JavaScript natives”. Their specs would be BLOCK!s, familiar from ordinary functions, and their bodies would be TEXT! strings of JavaScript code.

A seemingly unambitious example which used JavaScript to perform addition might look like:

jadd: js-native [a [integer!] b [integer!]] {
    var a = rebUnboxInteger("a");
    var b = rebUnboxInteger("b");
    return rebInteger(a + b);
}

So that would mean from Rebol you could say jadd 10 20 and it would actually perform that addition in JavaScript.

Let’s talk about some axes of potential improvement.

Generic Unboxing

Because C lacks dynamic typing, we need separate routines for unboxing integers and strings. But JavaScript could conceivably do it automatically: you would call rebUnbox() instead of rebUnboxInteger(), and it would detect an integer type and give back a JavaScript “Number”:

jadd: js-native [a [integer!] b [integer!]] {
    var a = rebUnbox("a");
    var b = rebUnbox("b");
    return rebInteger(a + b);
}

The downside is that you don’t get an assertion or check that the thing you extracted is an integer, so you might get back a string or other object if the type isn’t what you expect. It’s probably best to offer both and let people decide which they want.
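To make the tradeoff concrete, here is a minimal sketch that runs on its own in plain JavaScript. It does not use the real libRebol API: box(), unbox() and unboxInteger() are hypothetical stand-ins, and a boxed value is modeled as a tagged object purely for illustration.

```javascript
// Mock model only: a boxed Rebol value is represented here as a plain
// object with a type tag. Real libRebol handles do not look like this.
function box(type, payload) {
    return { type: type, payload: payload };
}

// Typed extraction: fails loudly if the value is not an INTEGER!.
function unboxInteger(value) {
    if (value.type !== "integer!") {
        throw new TypeError("expected integer!, got " + value.type);
    }
    return value.payload;
}

// Generic extraction: returns whatever JavaScript type matches, unchecked.
function unbox(value) {
    switch (value.type) {
        case "integer!": return value.payload;          // JS number
        case "text!":    return String(value.payload);  // JS string
        default:
            throw new TypeError("no generic unboxing for " + value.type);
    }
}

const ten = box("integer!", 10);
const label = box("text!", "20");

console.log(unbox(ten) + unbox(label));   // silently concatenates: "1020"

try {
    unboxInteger(label);
} catch (e) {
    console.log(e.message);               // expected integer!, got text!
}
```

With the generic version a mismatched type slips through and the + quietly becomes string concatenation; the typed version stops at the point of extraction.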

Auto-conversion of JS Number non-Objects

It might seem cool if the API could automatically convert JavaScript numbers into Rebol values, and not have to use rebInteger(n) or rebI(n). For instance, this seems good:

 var n = 5;
 var s = rebText("Hello World");
 rebElide("loop", n, "[print", s, "]");

But right now, the problem with this is that s is a pointer to a Rebol value on the WebAssembly heap. And that’s some big random-looking Number. There’s no way to tell that 5 isn’t meant to be a pointer too.

One trick I thought of, though, involves realizing that JavaScript has both primitives and primitive objects. For reference, see this article on JavaScript’s Primitive Wrapper Objects. These lightweight objects come into existence as a means of being able to call methods, e.g. n.toString().

Hence, what if Rebol handles were passed back not as JavaScript numbers, but as Number objects? These would presumably be more lightweight than an ordinary object, so not very costly. That way, any plain JavaScript number could be assumed to be an actual numeric value and converted automatically.

Our adding example could then simplify to:

jadd: js-native [a [integer!] b [integer!]] {
    var a = rebUnbox("a");
    var b = rebUnbox("b");
    return a + b;
}
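The distinction this relies on can be checked in plain JavaScript. The sketch below is only an illustration of the idea: makeHandle() and classify() are hypothetical names, and the address is made up.

```javascript
// Hypothetical: wrap a heap address so it is distinguishable from a plain
// number. makeHandle() and classify() are illustrative names only.
function makeHandle(address) {
    return new Number(address);   // typeof is "object", not "number"
}

function classify(arg) {
    if (typeof arg === "number") return "integer literal";
    if (arg instanceof Number) return "rebol handle";
    return "other";
}

const n = 5;
const s = makeHandle(5263120);    // made-up address for illustration

console.log(classify(n));         // integer literal
console.log(classify(s));         // rebol handle
console.log(s + 0);               // 5263120 (valueOf() recovers the address)
console.log(s === 5263120);       // false: strict equality does not unwrap
```

One thing to watch with this approach is that Number objects are always truthy, even new Number(0), so handles could not be tested with a bare if the way raw numbers can.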

Auto-conversion of JS String objects

Strings might benefit from the distinction of primitives vs. primitive objects as well. Today, plain non-object strings are LOAD-ed and executed as code:

 REBVAL *v = rebRun("Hello World");

What happens there is that it treats that as two WORD!s. But if you said:

REBVAL *v = rebRun("{Hello World}");

That would be a TEXT! string. You could also use rebText() or rebT() if your string is in a variable.

But string objects could be handled differently, and assumed to be string literals. So you couldn’t say return "hello" from a user native, but you could say return new String("hello");

That’s more typing than just return rebText("hello"). But it might come in handy when you want to write a generic JavaScript routine whose string result can be passed unmodified to either JS or Rebol.

function genericName(...) { ... return new String(...); }
console.log("used direct from JS: " + genericName(...));
rebElide("print [{used direct from Rebol:}", genericName(...), "]");
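Here is a runnable sketch of how that split could be detected. interpret() and greetingName() are hypothetical names invented for this example; the only real JavaScript behavior being relied on is that typeof and instanceof tell string primitives and String objects apart, and that String objects concatenate like ordinary strings.

```javascript
// Hypothetical helper: decides how an API argument would be interpreted.
function interpret(arg) {
    if (typeof arg === "string") return { kind: "code", source: arg };
    if (arg instanceof String) return { kind: "text", value: arg.valueOf() };
    throw new TypeError("unsupported argument");
}

// Stand-in for the genericName() idea: returns a String object.
function greetingName() {
    return new String("World");
}

console.log("used direct from JS: " + greetingName());  // used direct from JS: World
console.log(interpret("print").kind);                    // code
console.log(interpret(greetingName()).kind);             // text
```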

Parameter Unpacking as JavaScript

What I have in mind for both C user natives and JavaScript natives is not to try and give the generated JavaScript function any actual arguments. So back to the example:

jadd: js-native [a [integer!] b [integer!]] {
    var a = rebUnbox("a");
    var b = rebUnbox("b");
    return a + b;
}

We’re generating and running a JavaScript function with no arguments, so if it wants to get the values of a and b it has to go through Rebol code (automatically bound into the function frame) to access them.

It is possible to give that function arguments. These arguments could be the raw Rebol values, or they could be pre-rebUnbox’d. At the extreme of pre-unboxing, you could write just:

jadd: js-native [a [integer!] b [integer!]] {
    return a + b;
}

I think it’s better to not have arguments to the function.

  • The TCC-based C natives don’t have the luxury of being able to do things like this in a platform-independent fashion. Feeding arguments to a C function varies from platform-to-platform based on the Application Binary Interface (ABI). It’s more consistent between C and JavaScript extensions to not do it.

  • There’s no support in the EM_ASM() bridge for variadic calls. Doing it in JavaScript and calling from C involves jumping through a lot of hoops, possibly using eval() when it wouldn’t otherwise be necessary, and is less performant. If the function took zero parameters and returned an integer heap address it would be a lot cleaner.

  • JavaScript variable naming is more limited than Rebol parameter naming. So there’d have to be some invented mapping between what name you used for the parameters in your spec and the JavaScript names.

  • You don’t really know what properties the JavaScript code wants from its parameters, and pre-extracting would be presumptuous.

So I think JavaScript natives should be running 0-arity functions, and have to go through libRebol APIs to get at their arguments. That will require some new mechanics.
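As a rough picture of what those mechanics might look like, here is a self-contained mock in plain JavaScript. It is a sketch under assumptions, not the actual implementation: frameStack, mockUnbox() and jsNative() are hypothetical, and arguments are plain JS values rather than Rebol cells.

```javascript
// Mock of the mechanics only; none of this is the real libRebol API.
const frameStack = [];

// Looks up an argument by its Rebol name in the frame of the running native.
function mockUnbox(name) {
    const frame = frameStack[frameStack.length - 1];
    if (frame === undefined || !(name in frame)) {
        throw new Error("no argument named " + name);
    }
    return frame[name];
}

// Pairs a list of Rebol parameter names with a zero-arity JavaScript body.
function jsNative(paramNames, body) {
    return function (...args) {
        const frame = Object.create(null);   // no inherited keys
        paramNames.forEach((name, i) => { frame[name] = args[i]; });
        frameStack.push(frame);
        try {
            return body();                   // the body takes no parameters
        } finally {
            frameStack.pop();                // unwind even if the body throws
        }
    };
}

// "left-side" is a legal Rebol word but not a legal JavaScript identifier,
// which is fine here because it is only ever used as a string key.
const jadd = jsNative(["left-side", "right-side"], function () {
    return mockUnbox("left-side") + mockUnbox("right-side");
});

console.log(jadd(10, 20));   // 30
```

Because names are only ever looked up as strings, no mapping from Rebol words to JavaScript identifiers is needed, and the body decides for itself what to extract.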


#2

Another thing that has been kind of on my mind is the relationship between JavaScript’s undefined and Rebol’s VOID!.

I’ve set it up so that a JS-NATIVE without a return statement returns a VOID!, as does one that says just return; on its own. This makes it consistent with the behavior of Rebol now (you can say if true [return] and the absence of an argument causes it to return a VOID!).

However, it seems to me that rebRun("print {Hello}"); should not return undefined the way that rebRun("select [a 10 b 20] 'c"); returns null. One reason is that JavaScript is bad about conflating falsey things, and it considers undefined to be falsey. So it is probably good to have only one falsey result that can come back from running arbitrary code.

It also seems like it’s probably better to error on undefined when given as an argument:

function foo() { }
rebElide("append data", foo())

Would you rather that run without error, appending a VOID! to data, or have rebElide() complain that an argument is undefined?
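To show the two directions side by side, here is a small mock in plain JavaScript. It assumes one possible policy (map undefined to VOID! on the way out of a native, reject it on the way in as an argument); VOID, nativeResult() and checkArgs() are hypothetical names, not libRebol functions.

```javascript
// Hypothetical stand-in for a value the API would hand back as VOID!.
const VOID = Object.freeze({ type: "void!" });

// Result side: a native that returns nothing maps undefined to VOID!.
function nativeResult(fn) {
    const result = fn();
    return result === undefined ? VOID : result;
}

// Argument side: undefined is rejected instead of being converted.
function checkArgs(...args) {
    args.forEach((arg, i) => {
        if (arg === undefined) {
            throw new TypeError("argument " + (i + 1) + " is undefined");
        }
    });
    return args;
}

function foo() { }

console.log(nativeResult(foo) === VOID);   // true

try {
    checkArgs("append data", foo());
} catch (e) {
    console.log(e.message);                // argument 2 is undefined
}
```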

Guess my point is that it seems like an area where noticing the parallel is useful, but care should be taken in where to make the mapping apply…