How Is An Integer Represented in the Interpreter?

When an integer! is created, its value is kept in binary form. Suppose this is entered in the console:

>> x: 256
== 256

The text 256 is converted to a binary representation so the integer! value can be stored in memory.
Afterwards, the result of the operation is returned: the value that was assigned to x is converted from its binary form back to the display text 256.

Where exactly in the code is this conversion taking place?

The console has something called INPUT-HOOK. This is what it calls to get a string. The default is simply ASK TEXT!:

input-hook: meth [
    {Receives line input, parse/transform, send back to CONSOLE eval}

    return: "null if canceled, otherwise processed text line input"
        [<opt> text!]
][
    ask text!
]

It uses this to read one line at a time. Each time a line is read, it tries to LOAD the text gathered so far. If the LOAD fails because of a missing ], ), or }, it calls INPUT-HOOK again for another line of text.
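To make the "ask for another line" decision concrete, here is a minimal C sketch. It is an assumption-laden illustration, not the console's code: the real console finds out by attempting a full LOAD, whereas the hypothetical helpers below just count delimiters and ignore strings, comments, and character literals.

```c
#include <stdbool.h>

// Hypothetical helper: nesting depth left open by the text gathered so far.
// A positive result means some [ ( or { has not been closed yet.
static int unclosed_delimiters(const char *text) {
    int depth = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '[' || *p == '(' || *p == '{')
            ++depth;
        else if (*p == ']' || *p == ')' || *p == '}')
            --depth;
    }
    return depth;
}

// Hypothetical predicate: true when another line of input would be needed
// before the accumulated text could load successfully.
static bool needs_more_input(const char *text) {
    return unclosed_delimiters(text) > 0;
}
```

With this sketch, needs_more_input("x: [1 2") is true, so a console loop would append the next line and try again, while needs_more_input("x: 256") is false and the text can go on to be loaded.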

Once there is a successful LOAD, the code is represented in memory as a BLOCK!. Hence the block you get is [256] in this case.

There is another step you can hook here, which is DIALECT-HOOK. This is any transformation code you want that will process blocks after they've been fully loaded. The default is to do nothing.

After this, the code is evaluated. The evaluation actually happens from C, with no console code on the stack. The reason for this has to do with cancellation... there is no way in usermode to "catch" cancellation, and we wouldn't want the console implementation itself being canceled. Only the code the console is asked to run.

Finally the result is printed with a hook called PRINT-RESULT.

How Does LOAD Work?

LOAD is built on top of TRANSCODE, which is the scanner. It's basically just C code that builds bricks of memory for each cell in a block, recursively.

Some hand-written and chaotic C code recognizes patterns and emits "tokens". The token scanner finds the start and the end of anything that looks like an integer, and emits TOKEN_INTEGER.
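As a rough illustration of "find the start and the end of something that looks like an integer", here is a small C sketch. Only the name TOKEN_INTEGER comes from the description above; the function name, the TOKEN_NONE_SKETCH value, and the exact rules are assumptions. The real scanner handles many more token types and also checks what follows the digits.

```c
#include <ctype.h>

// Hypothetical token kinds for this sketch; only TOKEN_INTEGER is a name
// taken from the text above.
typedef enum { TOKEN_NONE_SKETCH, TOKEN_INTEGER } TokenKind;

// Examines the text at `start`. If it begins with an optional sign followed
// by at least one digit, *end is set to one past the last digit and
// TOKEN_INTEGER is returned. Otherwise *end is set to `start`.
static TokenKind locate_integer_token(const char *start, const char **end) {
    const char *p = start;
    if (*p == '+' || *p == '-')
        ++p;
    if (!isdigit((unsigned char)*p)) {
        *end = start;
        return TOKEN_NONE_SKETCH;
    }
    while (isdigit((unsigned char)*p))
        ++p;
    *end = p;
    return TOKEN_INTEGER;
}
```

The token step only marks a span of characters. Turning that span into a number is a separate step, which is why the scanner and the conversion routine are described separately here.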

A cell is made by Scan_Integer(), which relies on the C functions _atoi64() or strtoll() to do the conversion.
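The conversion itself can be sketched with the standard strtoll() call. The wrapper name and its error handling below are assumptions made for illustration; this is not the actual Scan_Integer() signature or body.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical wrapper: converts decimal text to a 64-bit integer.
// Returns false if no digits were found or the value is out of range.
static bool text_to_int64(const char *text, int64_t *out) {
    char *end = NULL;
    errno = 0;                               // strtoll reports overflow via errno
    long long value = strtoll(text, &end, 10);
    if (end == text)                         // nothing was consumed
        return false;
    if (errno == ERANGE)                     // too large or too small
        return false;
    *out = (int64_t)value;
    return true;
}
```

Calling text_to_int64("256", &n) stores 256 in n and returns true, while text that has no leading digits or that overflows 64 bits returns false.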

In terms of bits of memory, the cell is 4 platform pointers in size. On a 32-bit platform, 2 of those are used for the cell payload...a 64-bit integer. One is the cell header, marking that it's an INTEGER! and carrying other information about the memory that cell is formatted for. And then another is unused.
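A cell of that shape might be sketched in C as below. The field names, their order, and the tag value are assumptions for illustration only; the actual cell definition in the interpreter is more involved. The sketch also assumes uintptr_t is the same size as a pointer, which holds on common platforms.

```c
#include <stdint.h>

// Hypothetical tag value standing in for "this cell is an INTEGER!".
#define KIND_INTEGER_SKETCH ((uintptr_t)1)

// Four pointer-sized slots: one header, one spare, two for the payload.
typedef struct {
    uintptr_t header;          // datatype tag plus other bookkeeping bits
    uintptr_t extra;           // not needed by an integer
    union {
        int64_t   i64;         // the integer value itself
        uintptr_t slots[2];    // keeps the payload two pointers wide
    } payload;
} CellSketch;

// Formats a cell as an integer holding `value`.
static void init_integer_cell(CellSketch *cell, int64_t value) {
    cell->header = KIND_INTEGER_SKETCH;
    cell->extra = 0;
    cell->payload.i64 = value;
}

// Reads the 64-bit value back out of an integer cell.
static int64_t cell_integer(const CellSketch *cell) {
    return cell->payload.i64;
}
```

On a 32-bit build this comes to 16 bytes, with the 64-bit integer filling both payload slots. On a 64-bit build the struct is 32 bytes and the integer occupies only half of the payload.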

How Does PRINT Work?

When cells need to be converted to strings, it's a "Mold Function" that does it. The mold function handles both forming and molding, based on a flag it gets passed. Many types mold and form the same way; integer is one of them.

The MF_Integer() function extracts the 64-bit integer from the cell with the macro VAL_INT64(). Then it passes it to Emit_Integer(), which is based on INT_TO_STR, which in turn uses _i64toa() from C or an equivalent reimplementation.
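For the reverse direction, here is a portable C sketch of turning a 64-bit integer back into text. It uses snprintf() rather than the platform-specific routine mentioned above, and the helper name is an assumption. It is not the body of Emit_Integer().

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: writes the decimal text of `value` into `buffer`.
// Returns the number of characters the full text needs, not counting the
// terminator. A 64-bit integer needs at most 20 characters plus the
// terminator.
static int int64_to_text(int64_t value, char *buffer, size_t size) {
    return snprintf(buffer, size, "%" PRId64, value);
}
```

Given a 32-byte buffer, int64_to_text(256, buffer, sizeof buffer) leaves the three characters 256 in the buffer, which is the text the console shows after the == marker.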

There's really no magic in the "string to integer" or "integer to string" bits. These are common C routines. But the console hooks show Ren-C being used in its own implementation, where features are being developed aggressively.
