No Preprocessing, No FFI, Just Awesome: rebFunction()

You can now create your own amazingly powerful Rebol natives in plain C, powered by the new binding, in a way that is OUT OF THIS WORLD.

:ringer_planet:

Here's a full C program using Ren-C's libRebol.

The mechanics heavily rely on Pure Virtual Binding II, and having it look so clean is due to macro tricks involving shadowed variables as a proxy for knowing the C function stack:

#define LIBREBOL_BINDING  binding

#include "rebol.h"
typedef RebolValue Value;
typedef RebolContext Context;
typedef RebolBounce Bounce;

static Context* binding = nullptr;  // default inherit of LIB

void Subroutine(void) {
    rebElide(
        "assert [action? print/]",
        "print -{Subroutine() has original ASSERT and PRINT!}-"
    );
}

const char* Sum_Plus_1000_Spec = "[ \
    -{Demonstration native that shadows ASSERT and PRINT}- \
    assert [integer!] \
    print [integer!] \
]";
Bounce Sum_Plus_1000_Impl(Context* binding)
{
    Value* thousand = rebValue("fourth [1 10 100 1000]");  // fourth item is 1000
    Subroutine();
    return rebValue("print + assert +", rebR(thousand));
}

int main() {
    rebStartup();

    Value* action = rebFunction(Sum_Plus_1000_Spec, &Sum_Plus_1000_Impl);

    rebElide(
        "let sum-plus-1000: @", action,
        "print [-{Sum Plus 1000 is:}- sum-plus-1000 5 15]"
    );

    rebRelease(action);
    rebShutdown();
    return 0;
}

This outputs:

Subroutine() has original ASSERT and PRINT!
Sum Plus 1000 is: 1020

:ringer_planet:

If you use C++, It Gets Niftier, But Same Internals!

  • Raw strings R"(...)" mean you don't need backslashes

  • Lambdas mean you don't need to name your implementation function

  • Variadic Template Packing allows custom conversions to Value* from int (no rebI() needed!) or any other datatype! Add your own converters for any C++ class!

    Value* action = rebFunction(R"([
        -{Demonstration native that shadows ASSERT and ADD}-
        assert [integer!]
        add [integer!]
    ])",
    [](Context* binding) -> Bounce {
        int thousand = Subroutine();
        return rebValue("add + assert +", thousand);
    });
    

But it's better than that because we can make Value a smart pointer that automatically gets released when the last reference goes away. RenCpp did that, but we can do it in a much more lightweight way in libRebol...coming soon!

Elegant Mechanics, Without Resorting to FFI

The smart part of API macros like rebElide() and rebValue() is that they pick up the binding through the name you give in LIBREBOL_BINDING, so you don't have to pass it every time. When you're inside your native's implementation, the argument shadows the global variable and overrides it.

And of course being able to do this at all hinges on throwing out the playbook from Rebol's historical binding, and doing something coherent and useful.
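To make the shadowing trick concrete, here is a minimal standalone sketch in plain C. It does not use libRebol at all: DemoContext, demo_run(), demoElide() and DEMO_BINDING are hypothetical stand-ins invented for illustration, and the real macros are surely more involved. It only shows that a macro which names a variable without qualifying it will resolve to whatever is in scope at the call site.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a binding context: just a label here. */
typedef struct { const char *name; } DemoContext;

static DemoContext lib_context = { "LIB" };
DemoContext frame_context = { "FRAME" };

static DemoContext *binding = &lib_context;  /* file-scope default */

/* Hypothetical stand-in for a real API entry point that needs a context. */
const char *demo_run(DemoContext *ctx, const char *code) {
    printf("[%s] %s\n", ctx->name, code);
    return ctx->name;  /* report which context was picked up */
}

/* The macro mentions `binding` by bare name, so ordinary C scoping
 * decides which variable that is at each call site. */
#define DEMO_BINDING  binding
#define demoElide(code)  demo_run(DEMO_BINDING, (code))

const char *outside_native(void) {
    return demoElide("print 1");  /* resolves to the file-scope variable */
}

const char *inside_native(DemoContext *binding) {
    return demoElide("print 2");  /* resolves to the shadowing parameter */
}

void demo_shadowing(void) {
    assert(strcmp(outside_native(), "LIB") == 0);
    assert(strcmp(inside_native(&frame_context), "FRAME") == 0);
}
```

Calling demo_shadowing() from a main() exercises both cases. Compilers may warn about the shadowed name under -Wshadow, which in this design is the whole point rather than an accident.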

The Function Gets a Definitional Return. But...Why?

So you might think there's no good reason to have a definitional return. Because how would you ever run it?

const char* Illegal_Return_Spec = "[ \
    -{Showing that you "can't" use RETURN in a rebFunction()}- \
    arg [integer!] \
]";
Bounce Illegal_Return_Impl(Context* binding)
{
    rebElide("return arg + 1000");
    DEAD_END;
}

When you call rebElide(), it crosses the API boundary and the C code is still on the stack. You can't unwind across it... unless you use longjmp or exceptions, and that's very thorny and brittle.

But Ren-C has Continuations :play_or_pause_button:

Note that the function you supply to do the native's work doesn't return a Value*, it returns something called a Bounce.

Bounce is a superset of Value* that includes the ability to encode other instructions. One of those instructions is to ask the evaluator to do more work on the C function's behalf, even though it's no longer on the stack, before returning a value. You can ask to be called back again after that work is done (rebContinue())...or you can just transfer control to some additional code and let what it does be the answer (rebDelegate()).

And within that code, it can use the definitional RETURN to deliver the value to the caller of your native!

const char* Working_Return_Spec = "[ \
    -{Showing that you *can* use RETURN in an API Continuation}- \
    return: [tag!] \
    arg [integer!] \
]";
Bounce Working_Return_Impl(Context* binding)
{
    int bigger = rebUnboxInteger("arg") + 1000;  // whatever C processing

    return rebDelegate(
        "if", rebI(bigger), "> 10000 [return <big>]",
        "print -{It wasn't big!}-",
        "return <small>"
     );
}
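For anyone who wants to see the general shape of the idea without libRebol, here is a rough standalone sketch of a trampoline in plain C. Everything in it (DemoBounce, DemoFrame, demo_trampoline() and so on) is hypothetical and made up for illustration; it is not how Ren-C encodes a Bounce. It only shows a native returning an instruction, and the driver doing the requested work after the native's C frame has already returned.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_DONE, DEMO_DELEGATE, DEMO_CONTINUE } DemoKind;

typedef int (*DemoWork)(int input);  /* stands in for "evaluator work" */

typedef struct {
    DemoKind kind;
    int value;      /* the result, when kind == DEMO_DONE */
    DemoWork work;  /* work to run, for DELEGATE and CONTINUE */
    int input;      /* argument to pass to that work */
} DemoBounce;

typedef struct {
    int arg;
    int state;        /* 0 on first entry, 1 when called back */
    int last_result;  /* result of work requested via CONTINUE */
} DemoFrame;

typedef DemoBounce (*DemoNative)(DemoFrame *frame);

static int native_depth = 0;  /* number of native C frames currently live */

int add_thousand(int n) {
    assert(native_depth == 0);  /* the requesting native already returned */
    return n + 1000;
}

/* Like a delegate: hand off the work, its result becomes the answer. */
DemoBounce delegating_native(DemoFrame *f) {
    DemoBounce b = { DEMO_DELEGATE, 0, add_thousand, f->arg };
    return b;
}

/* Like a continue: ask for work, then get called back with the result. */
DemoBounce continuing_native(DemoFrame *f) {
    if (f->state == 0) {
        DemoBounce more = { DEMO_CONTINUE, 0, add_thousand, f->arg };
        f->state = 1;
        return more;
    }
    DemoBounce done = { DEMO_DONE, f->last_result * 2, NULL, 0 };
    return done;
}

int demo_trampoline(DemoNative native, int arg) {
    DemoFrame frame = { arg, 0, 0 };
    for (;;) {
        native_depth++;
        DemoBounce b = native(&frame);
        native_depth--;  /* the native is off the C stack from here on */
        switch (b.kind) {
        case DEMO_DONE:
            return b.value;
        case DEMO_DELEGATE:
            return b.work(b.input);  /* never re-enters the native */
        case DEMO_CONTINUE:
            frame.last_result = b.work(b.input);
            break;  /* loop around and call the native again */
        }
    }
}
```

With these definitions, demo_trampoline(delegating_native, 5) gives 1005 and demo_trampoline(continuing_native, 5) gives 2010. Because the work runs with no native frame underneath it, nothing has to be unwound with longjmp or exceptions.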

I believe this is one of the most clever C language bridging ideas ever made - bringing still more uniqueness to Rebol's already very unique offering. And of course, C++ can throw in many improvements (not needing rebI(...) and just using integers directly and getting values, lifetime management for API handles with smart pointers so you don't need to rebRelease() them, etc.)

So much is enabled by this new binding, it's light years ahead of what we're used to.

:ringer_planet:


I should mention that RenCpp could register C callbacks as function implementations nine years ago:

auto watchFunction = Function::construct(
    " {WATCH dialect for monitoring and un-monitoring in the workbench}"
    " :arg [word! get-word! path! get-path! block! group! integer! tag!]"
    "     {word to watch or other legal parameter, see documentation)}"
    " /dialect {Interpret as instruction to WATCH vs. raw value}",

    [this](
        AnyValue const & argOriginal, AnyValue const & dialect
    )
        -> optional<AnyValue>
    {
        WatchList & watchList = *getTabInfo(repl()).watchList;

        AnyValue arg = argOriginal;

        optional<Tag> label;

        if (hasType<Block>(arg) || hasType<Group>(arg)) {
    .....

But the mechanics to try and get it to pass the values as parameters to a function like that (a C++ lambda in that case) were horrific.

That function.hpp file is a blight! But of course, today's techniques were far from possible, so it seemed like the only way to do it. Also I was spending a lot of time just whipping Rebol into shape so it could do anything like this.

Overall RenCpp was pretty well on track for what the API should look like and how it should function. But it took some clever innovations in Cell and Series design--plus a revolution in binding--plus me realizing what not to do--to make it happen in a truly good way.


Comparison to redRoutine()

So libRed's whole model is pretty much a dead end. But speaking just about redRoutine() specifically, it's more or less the same principle for getting called as RenCpp's was.

libRed - Registering a callback function

#include "red.h"
#include <stdio.h>

red_integer add(red_integer a, red_integer b) {
    return redInteger(redCInt32(a) + redCInt32(b));
}

int main(void) {
    redRoutine(redWord("c-add"), "[a [integer!] b [integer!]]", (void*) &add);
    printf("%ld\n", (long) redCInt32(redDo("c-add 2 3")));  // printf needs a format string
    return 0;
}

You write a C function with a certain arity, and then line it up with a spec that has the same arity. The function receives the arguments as multiple C function arguments...so individual Cell pointers.

Here's the Red/System implementation that creates the routine:

red/libRed/libRed.red at dbc93da47047667023a66c5edf1aa1d63ff6f0d0 · red/red · GitHub

But let's get to what actually runs the routine, EXEC-ROUTINE:

red/runtime/interpreter.reds at dbc93da47047667023a66c5edf1aa1d63ff6f0d0 · red/red · GitHub

They don't actually know how many arguments the function takes (RenCpp could know by recursive decomposition, but that required C++). Since they don't know the number of args, if there's a mismatch between your implementation function's args and how many args they pass in the spec it will likely crash.

A very weak point of this is lost on the casual observer: all the API red_values (pointers) are kept in a "ring". You can see it being pushed with push red/ext-ring/store arg. This is a fixed number of API handles that are given out with no lifetime management. After you've allocated 50 handles, one of your previously known handles goes bad. Yup, 50:

red/libRed/libRed.red at dbc93da47047667023a66c5edf1aa1d63ff6f0d0 · red/red · GitHub

So if you write a redRoutine and receive your arguments, those arguments can go bad as soon as you do anything else that uses the ring. For instance, if you call another redRoutine that takes 5 arguments 10 times, the arguments you received are now corrupt. And other libRed functions allocate on this ring too. Definitely broken.
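Here is a small standalone C sketch of that failure mode. It is not Red's code: DemoCell, demo_ring_store() and demo_first_handle_after() are hypothetical names, and the only detail taken from above is the ring size of 50. It hands out slots round-robin with no liveness tracking, so the first handle silently changes meaning once 50 more have been allocated.

```c
#include <assert.h>

#define DEMO_RING_SIZE 50  /* the count mentioned above */

typedef struct { int payload; } DemoCell;

static DemoCell demo_ring[DEMO_RING_SIZE];
static int demo_ring_next = 0;

/* Hand out the next slot, wrapping around with no liveness tracking. */
DemoCell *demo_ring_store(int payload) {
    DemoCell *slot = &demo_ring[demo_ring_next];
    demo_ring_next = (demo_ring_next + 1) % DEMO_RING_SIZE;
    slot->payload = payload;
    return slot;
}

/* Take one handle holding -1, allocate `extra` more handles, then report
 * what the first handle points at now. */
int demo_first_handle_after(int extra) {
    demo_ring_next = 0;
    DemoCell *first = demo_ring_store(-1);
    for (int i = 0; i < extra; i++)
        demo_ring_store(i);
    return first->payload;
}

void demo_ring_corruption(void) {
    assert(demo_first_handle_after(49) == -1);  /* ring not yet wrapped */
    assert(demo_first_handle_after(50) == 49);  /* slot 0 was reused */
}
```

The pointer itself stays valid in C terms, which is what makes this nasty: there is no crash, just a handle that now refers to someone else's value.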

Anyhow... being able to access the local variables and arguments by name in the C function as part of textual code is many orders of magnitude better... but it builds on a LOT of design and implementation work. libRebol is there for Red to steal from if they wish--which they should wish--but it would probably take them years (another decade?) to get parity in functionality.


While this is kind of cool, I wondered if it was a bit superfluous, and whether it couldn't just be done with CATCH and THROW, defined with rebLambda() instead of rebFunction():

// Note: the RETURN: line in this spec is the problem, discussed below
const char* Use_Catch_Spec = "[ \
    -{Showing that you can use CATCH in an API Continuation}- \
    return: [tag!] \
    arg [integer!] \
]";
Bounce Use_Catch_Spec_Impl(Context* binding)
{
    int bigger = rebUnboxInteger("arg") + 1000;  // whatever C processing

    return rebDelegate("catch [",  /* here is the catch */
        "if", rebI(bigger), "> 10000 [throw <big>]",  /* a throw */
        "print -{It wasn't big!}-",
        "throw <small>"  /* another throw */
     "]");
}

The problem is the longstanding one that lambdas don't have a notation to say they do type checking. And type checking is a nice feature to have. But we don't have a way to denote it besides [return: ...] in specs, which implies that there is a RETURN function available. :thinking:

Could we ignore this, and say the RETURN: can be dissociated from the idea that a variable in the frame is also called return?

>> /foo: lambda [return: [integer!] return] [return + 1]

>> foo 5
== 6

>> /bar: lambda [return: [block!] return] [return + 1]
** Error: BAR is supposed to return [block!]

This doesn't seem as crazy as it might have seemed at one time. There's kind of a similar question about yielders, which currently do it with:

 >> /y: yielder [yield: [integer!] x] [yield x + 1, yield x + 2]

 >> y 10
 == 11

 >> y 20
 == 22

Would it read better as:

 >> /y: yielder [return: [integer!] x] [yield x + 1, yield x + 2]

In that case, I kind of feel like yield: is the better choice. It reinforces the fact that RETURN stays free if you want to give that name to an argument or local for some reason (and you well might, e.g. as a specialization of YIELD:FINAL).

Anyway... the existence of LAMBDA has gone from an appeasement of people who wanted functions to drop out their last result to instead being a building block of necessity--for making higher level constructs like FUNC that have alternative definitions of RETURN, or for simply writing service code that wants to use RETURN as defined in its calling context. That can apply to API functions as well, so there will need to be a rebLambda().