C++ Interface Reinvented

I believe libRebol to be one of the most interesting pure C language bindings out there. However, when you have a C++ compiler on hand it seems it should be able to compile something like:

REBVAL *ten = rebValue("add", 4, 6);

But rebValue has been based on "sniffing" items in C's va_list, so it has required everything passed to conform to a bit-pattern of something it could "sniff". They all had to be pointers to something with a compatible layout ("add" is a pointer to UTF-8, leaving room for alternatives in the illegal UTF-8 space). So in the above code, 4 and 6 would get interpreted as pointers, and dereferencing memory location 4 or 6 would crash.
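To make the "sniffing" idea concrete, here is a minimal sketch of a variadic function that tells C strings apart from tagged structures by looking at the first byte each pointer refers to. Everything in it (FakeCell, CELL_MARKER, END_BYTE, describe, as_arg, sketch_end) is a hypothetical stand-in and not libRebol's actual layout; it only assumes that bytes which can never begin valid UTF-8 text are free to be used as tags.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical tags: neither byte can be the first byte of valid UTF-8 text.
static const unsigned char CELL_MARKER = 0xF8;  // never valid in UTF-8
static const unsigned char END_BYTE = 0x80;     // continuation byte, can't lead

// Hypothetical stand-in for an API value; the tag must be the first byte.
struct FakeCell {
    unsigned char marker;
    int payload;
};

inline const void *as_arg(const FakeCell &cell) { return &cell; }
inline const void *sketch_end() { return &END_BYTE; }

// Walk the va_list, classifying each pointer by the byte it points at.
inline std::string describe(const void *first, ...) {
    std::string out;
    va_list va;
    va_start(va, first);
    for (const void *p = first; ; p = va_arg(va, const void*)) {
        unsigned char lead = *static_cast<const unsigned char*>(p);
        if (lead == END_BYTE)
            break;  // explicit terminator reached
        if (lead == CELL_MARKER) {
            const FakeCell *cell = static_cast<const FakeCell*>(p);
            out += "<cell " + std::to_string(cell->payload) + ">";
        } else {
            out += "<text " + std::string(static_cast<const char*>(p)) + ">";
        }
    }
    va_end(va);
    return out;
}
```

Passing a bare 4 to something like describe would make it read memory at address 4, which is the crash described above.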

To get around this in C, you must use helpers that allocate detectable things, like rebI() for making a "transient integer" (i.e. one that is freed by rebValue as it processes it):

REBVAL *ten = rebValue("add", rebI(4), rebI(6));

For C++ we expect more! So I'm pleased to say that a solution for passing C++ values (one even better than RenCpp's) has been implemented.

The Very Generic TO_REBARG

After some acrobatics involving C++'s variadic template parameter packs, I've come up with a solution for C++ that works by building on top of the mechanics used by the C API.

If you want to make a type passable by value, you just write an overload of to_rebarg. Integer is a simple example:

inline static const void *to_rebarg(int i)
    { return rebI(i); }

While the C++ template is recursively unpacking the parameters, it calls to_rebarg to process each one. You can make the translation as simple or as elaborate as you like...and you don't have to edit the library to add support for a new type. Imagine:
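For anyone curious what the recursive unpacking looks like, here is a minimal sketch of the general technique. It is not the library's code: Packed, to_rebarg_sketch, pack_all, pack and Point are hypothetical names, and a std::string stands in for the const void* that the real to_rebarg returns. Note that the Point overload is declared after the template and still gets picked up, which is the "no need to edit the library" property.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the `const void*` a real to_rebarg would return.
using Packed = std::string;

// Overloads for built-in types must be visible before the template.
inline Packed to_rebarg_sketch(int i) { return "int:" + std::to_string(i); }
inline Packed to_rebarg_sketch(const char *s) { return std::string("text:") + s; }

// Base case: nothing left to unpack, so the recursion stops here.
inline void pack_all(std::vector<Packed> &) {}

// Recursive case: convert the first parameter, then recurse on the rest.
template <typename First, typename... Rest>
void pack_all(std::vector<Packed> &out, const First &first, const Rest &... rest) {
    out.push_back(to_rebarg_sketch(first));
    pack_all(out, rest...);
}

template <typename... Ts>
std::vector<Packed> pack(const Ts &... args) {
    std::vector<Packed> out;
    pack_all(out, args...);
    return out;
}

// A user type added later; argument-dependent lookup finds its overload.
struct Point { int x; int y; };

inline Packed to_rebarg_sketch(const Point &p) {
    return "point:" + std::to_string(p.x) + "," + std::to_string(p.y);
}
```

A call like pack("add", 4, 6) runs each argument through the matching overload in order, much as rebValue("add", 4, 6) would run each through to_rebarg.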

class MyDate {
   friend const void* to_rebarg(const MyDate &date);  // allow access to private
   private:
       int day;
       int month;
       int year;
   public:
      // your C++ class code here
};

const void* to_rebarg(const MyDate &date) {
    return rebR(  // autorelease
        rebValue("make date! [", date.day, date.month, date.year, "]")
    );
} 

Now all you would have to do if you wanted to pass a date in would be something like:

void OutputDate(const MyDate &date) {
    rebElide("print [{The date is}", date, "]");
}

C++ Helping Catch Bugs in C89 Builds

Being able to build with a C89 compiler is not really that relevant, but we still shoot for it. That means we can't take variadic macros for granted, so in the core all API calls have to be explicitly terminated with rebEND. If we'd wanted to compile the code from up top there, we'd need to say:

REBVAL *ten = rebValue("add", rebI(4), rebI(6), rebEND);

The way that C++ templates "unpack" variadic parameters is via compile-time recursion, which actually needs to differentiate the last step to terminate it. So as a special case, when you use the C++ build on a file with REBOL_EXPLICIT_END (e.g. anything from core), it warns you if a rebEND is not in that last slot.
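Here is one way a "last slot must be the terminator" check could be written with compile-time recursion. This is a sketch under assumptions, not how the library does it: it uses a hypothetical EndTag type so the terminator can be recognized by type, and the names last_is_end, SKETCH_END and count_before_end are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct EndTag {};                // hypothetical marker type, not the real rebEND
constexpr EndTag SKETCH_END{};

// Primary template covers the empty pack: no arguments, so no terminator.
template <typename... Ts>
struct last_is_end : std::false_type {};

// One type left: this is the last slot, so test it.
template <typename T>
struct last_is_end<T>
    : std::is_same<typename std::decay<T>::type, EndTag> {};

// Two or more types: drop the first and keep recursing.
template <typename T, typename U, typename... Rest>
struct last_is_end<T, U, Rest...> : last_is_end<U, Rest...> {};

// Refuses to compile unless the final argument is the end marker.
template <typename... Ts>
std::size_t count_before_end(const Ts &...) {
    static_assert(last_is_end<Ts...>::value,
                  "explicit-end build: last argument must be the end marker");
    return sizeof...(Ts) - 1;  // everything except the terminator
}
```

With this, count_before_end("add", 1, 2, SKETCH_END) compiles, while leaving off SKETCH_END is a compile-time error instead of a crash to track down at runtime.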

It caught a few mistakes, though not all that many, because a missing rebEND causes a crash, and any in active code would have been fixed already. But those crashes take time to investigate.

UPDATE: I forgot but rediscovered that there actually is a reason to care about rebEND at some callsites. This is when you want to use preprocessor directives like #ifdef inside a call. You can't put directives inside the arguments of a macro invocation. So for example, from the FFI:

ffi_abi abi = (ffi_abi)rebUnboxInteger(
  "switch", rebQ(word), "[",

     "'default [", rebI(FFI_DEFAULT_ABI), "]",

  #ifdef X86_WIN64

     "'win64 [", rebI(FFI_WIN64), "]",

  #elif defined(X86_WIN32) \
       || defined(TO_LINUX_X86) || defined(TO_LINUX_X64)

    "'sysv [", rebI(FFI_SYSV), "]",

...

You can't do that unless you fall back on the C89-like mechanism. The way you do this now is:

ffi_abi abi = (ffi_abi)LIBREBOL_NOMACRO(rebUnboxInteger)(

In C99+, that gives you a version without the convenience of an automatically supplied rebEND, so you have to add it yourself. But the C++ version is not a macro, so it can have #ifdefs inside it and not require a rebEND.
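The reason the C++ form tolerates this is that a variadic function template is an ordinary call, so the preprocessor is free to add or drop arguments before the compiler ever sees the list. A tiny sketch, where SKETCH_EXTRA_ABI, count_args and abi_choice_count are hypothetical names invented for the example:

```cpp
#include <cassert>
#include <cstddef>

#define SKETCH_EXTRA_ABI 1  // hypothetical platform flag, defined here for the demo

// An ordinary variadic template: it just reports how many arguments it got.
template <typename... Ts>
std::size_t count_args(const Ts &...) {
    return sizeof...(Ts);
}

inline std::size_t abi_choice_count() {
    // Directives inside the argument list are fine because count_args is a
    // function template, not a function-like macro.
    return count_args(
        "'default [...]",
      #ifdef SKETCH_EXTRA_ABI
        "'extra [...]",
      #endif
        "'fallback [...]"
    );
}
```

If count_args were a function-like macro instead, the #ifdef lines would fall inside a macro's argument list, which the C and C++ standards leave undefined.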

Still More To Go

This is a necessary step in making the C++ API pleasant to work with. The next big question is how to make a class that wraps up a value pointer and manages its lifetime, so you don't have to worry about rebR(), rebRelease(), rebUnmanage(), and other lifetime-related calls. But getting this done was a prerequisite to looking at that.
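Just to illustrate the shape such a wrapper might take (this is an open design question, not something that exists), here is a sketch using std::unique_ptr with a custom deleter. FakeValue, fake_alloc, fake_release, ValueHandle and make_handle are all hypothetical stand-ins; a real version would presumably call rebRelease() in the deleter.

```cpp
#include <cassert>
#include <memory>

struct FakeValue { int payload; };  // hypothetical stand-in for REBVAL

static int live_handles = 0;        // counts outstanding allocations for the demo

inline FakeValue *fake_alloc(int payload) {
    ++live_handles;
    return new FakeValue{payload};
}

inline void fake_release(FakeValue *v) {  // plays the role of a release call
    --live_handles;
    delete v;
}

// Deleter that routes destruction through the release function.
struct FakeReleaser {
    void operator()(FakeValue *v) const { fake_release(v); }
};

// The handle frees its value automatically when it goes out of scope.
using ValueHandle = std::unique_ptr<FakeValue, FakeReleaser>;

inline ValueHandle make_handle(int payload) {
    return ValueHandle(fake_alloc(payload));
}
```

The appeal is that leaving scope (normally or via an exception) releases the value, so no explicit release call appears at the callsite.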

RenCpp explored several other ideas that are a mixed bag. One thing it tried to do was establish a constraint on what type a value was. Although a superclass of Any_Value existed, the idea was that if you could static_cast to something narrower you'd get a class that offered specialized methods.

Here's a simple example from that era, showing types like Logic, Block, and Word:

int main(int, char **) {
    std::string data {"Hello [Ren C++ Binding] World!"}; 

    Word variable {"foo"};

    Block rule {
        "thru {[}",
        "copy", variable, "to {]}",
        "to end"
    };

    auto result = static_cast<Logic>(*runtime("parse", data, rule));

    if (result) 
        std::cout << "Success and target was " << variable() << "\n";
    else
        std::cout << "PARSE failed.";
}

If you look at variable you see an example of a "weird" idea in action. It is of type Word, and holds the Rebol WORD! foo. A plain reference to variable gives the word as-is (like when the rule block is being constructed), while applying it as a function, as in variable(), acts like get 'foo.

A lot of things like that were looked at, such as what block[n] would do (is that a PICK or a SELECT?), or if ++block would act like block: next block.

Some of this might be interesting, but I don't think it represents a use case anyone has...namely, using C++ to puppeteer a pokey interpreter in an obscure way. Most of the time you're looking at throwing data over the fence and getting an answer back.

To me, a central idea is that the interpreter you are hooking up to might be styled to act nothing like the default out-of-the-box Rebol. This means that giving people APIs like rebAppend() (or the infinite regression of rebAppendDup() and rebAppendOnlyDup()) that are "fast" is not of much use if that isn't what your dialect means by the word append. The same argument could apply to trying to pin down the behaviors of block[n] or similar.
