Code Review Solicitation: C/C++ interface %rebol.h

The API has been relatively stable for a while, with the only major change this year being quite a good one.

The actual %rebol.h file is generated by a piece of Rebol code that analyzes the definitions in %a-lib.c.

There I explain the premise of the API, and mention that the trickery is accomplished via non-strict-aliasing, endian-sensitive first-byte access: patterns are chosen for Cells and Stubs that do not overlap the valid leading bytes of UTF-8.

(Also explained in the Rebol 2019 video: Abusing UTF-8 For Fun And Profit)
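
For anyone who wants to see the first-byte idea in isolation, here is a minimal sketch in C. It is not the real Ren-C layout: `DemoCell`, `DEMO_CELL_HEADER`, and `demo_is_cell()` are hypothetical names, and the choice of a `10xxxxxx` header byte is an assumption made for illustration. It relies only on the fact that a UTF-8 continuation byte can never begin a valid UTF-8 string.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cell; NOT the real Ren-C layout. */
typedef struct {
    unsigned char header;  /* assumed to always match the pattern 10xxxxxx */
    int payload;
} DemoCell;

#define DEMO_CELL_HEADER 0x80u

/* True if the pointer looks like a DemoCell, false if it looks like UTF-8. */
static bool demo_is_cell(const void *p) {
    /* Reading through unsigned char* is permitted for any object type,
       so this first-byte peek does not violate strict aliasing. */
    unsigned char first = *(const unsigned char *)p;
    return (first & 0xC0u) == 0x80u;
}

/* Usage: one detector handles both kinds of pointer. */
static void demo_first_byte_check(void) {
    DemoCell cell = { DEMO_CELL_HEADER, 42 };
    assert(demo_is_cell(&cell));
    assert(!demo_is_cell("if not"));
    assert(!demo_is_cell(""));  /* a lone NUL terminator is not a cell */
}
```

Because the check looks at a single byte at the lowest address, the header field has to be laid out so that this byte is the same regardless of the platform's byte order, which is presumably where the endian sensitivity comes in.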


This is the "external" API, and %rebol.h contains its exported
definitions.  That file (and %make-librebol.r which generates it) contains
comments and notes which will help understand it.

What characterizes the external API is that it is not necessary to #include
the extensive definitions of `struct Series` or the APIs for dealing with
all the internal details (e.g. Push_GC_Guard()), which are easy to get
wrong.  Not only does this simplify the interface, but it also means that
the C code using the library isn't competing as much for definitions in
the global namespace.

Also, due to the nature of the Node superclass (see %sys-node.h), it's
possible to feed the scanner with a list of pointers that may be to UTF-8
strings or to Rebol values.  The behavior is to "splice" in the values at
the point in the scan that they occur, e.g.

    RebolValue* item1 = ...;
    RebolValue* item2 = ...;
    RebolValue* item3 = ...;

    RebolValue* result = rebValue(
        "if not", item1, "[\n",
            item2, "| print {Close brace separate from content}\n",
        "] else [\n",
            item3, "| print {Close brace with content}]\n"
    );

(Note: C can't count how many arguments a variadic takes, so this is done
by making things like rebValue() a macro that uses __VA_ARGS__ and tacks
a rebEND onto the tail of the list.  There are lots of tricks in play--see
%make-librebol.r for the nitty-gritty details.)
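
The sentinel technique in that note can be sketched on its own. This is only an illustration of the general pattern, not how %rebol.h defines things: `DEMO_END`, `demo_count_impl()`, and `demo_count()` are hypothetical names, and the real rebEND is defined differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical end marker: the address of a private object, so it cannot
   collide with any pointer the caller passes. */
static const char demo_end_marker = 0;
#define DEMO_END ((const void *)&demo_end_marker)

/* Walks the variadic list until it reaches the sentinel. */
static size_t demo_count_impl(const void *first, ...) {
    size_t n = 0;
    va_list va;
    va_start(va, first);
    for (const void *p = first; p != DEMO_END; p = va_arg(va, const void *))
        ++n;
    va_end(va);
    return n;
}

/* The macro tacks the sentinel onto the tail so callers never write it. */
#define demo_count(...) demo_count_impl(__VA_ARGS__, DEMO_END)

/* Usage: the caller passes only the real arguments. */
static void demo_count_check(void) {
    assert(demo_count("if not", "[", "]") == 3);
    assert(demo_count("x") == 1);
}
```

The sketch only passes `char*` and `void*` arguments, which `va_arg` may legally read back as `const void *`.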

While the approach is flexible, any token must be completed within its
UTF-8 string component.  So you can't--for instance--divide a scan up like
("{abc", "def", "ghi}") and get the TEXT! {abcdefghi}.  On that note,
("a", "/", "b") produces `a / b` and not the PATH! `a/b`.
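
To make the per-component rule concrete, here is a toy sketch. It is not the Rebol scanner: it is a hypothetical whitespace tokenizer (`demo_tokens_in()` and `demo_tokens_across()` are made-up names) that, like the rule above, finishes every token inside the fragment where it started.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts whitespace-delimited tokens inside one fragment. */
static size_t demo_tokens_in(const char *s) {
    size_t count = 0;
    while (*s != '\0') {
        while (*s != '\0' && isspace((unsigned char)*s))
            ++s;                      /* skip leading whitespace */
        if (*s == '\0')
            break;
        ++count;                      /* a token starts here... */
        while (*s != '\0' && !isspace((unsigned char)*s))
            ++s;                      /* ...and ends in this fragment */
    }
    return count;
}

/* Each fragment is scanned independently; no token spans two fragments. */
static size_t demo_tokens_across(const char *const *fragments, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += demo_tokens_in(fragments[i]);
    return total;
}

/* Usage: splitting "a/b" across fragments changes what is scanned. */
static void demo_tokens_check(void) {
    const char *split[] = { "a", "/", "b" };
    const char *joined[] = { "a/b" };
    assert(demo_tokens_across(split, 3) == 3);   /* three separate tokens */
    assert(demo_tokens_across(joined, 1) == 1);  /* one token */
}
```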

I think @iArnold may be the only person besides me to have used the API at the C level. But several people have engaged with the JavaScript version, which bridges via WebAssembly to run the exact same code in %a-lib.c through some wrappers.


I'm ready for feedback on it, and this post can be a thread for that. (How's your C/C++, @bradrn?)

Since the rebol.h file is generated and not committed to the repository, I have made a gist of what it compiles to at this moment:

Snapshot of auto-generated Ren-C rebol.h file, 20-Aug-2024 · GitHub

There are a couple of bad names (e.g. REBDNG) which are left bad as a reminder that those are parts that need review, so skip those. The rest is about as good as I've been able to make it.

Pretty good. I can read it easily, and write it without too many difficulties (excepting that I can’t quite get my head around the intricacies of modern C++ memory management).

There have been a lot of new posts here and I’m trying to read them all through… I’ll have a look at this code when I get around to it.
