BINARY! vs. BYTES! vs. BLOB!

It has occurred to me that it might be nice to have a datatype which represents a string of bits, where the length doesn't have to be an exact multiple of 8...and where positions are at the bit level.

If there were such a type, it might be called BITS! to not conflict...but either way it does point out that BINARY! is kind of a misleading name.

Since today's BINARY! is a series managed in multiples of 8 bits, wouldn't it be better to call it BYTES! ?
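To make the contrast concrete, here is a minimal sketch in Python (not Rebol, and not a proposed implementation). The `Bits` class and its methods are hypothetical names invented for illustration; the only point is that its length and positions are counted in bits, while a byte series always holds a whole number of 8-bit units.

```python
class Bits:
    """Hypothetical bit-level series: length and positions are counted in bits."""

    def __init__(self, text=""):
        # Accept a string such as "10110"; reject anything that isn't 0 or 1.
        if any(ch not in "01" for ch in text):
            raise ValueError("expected only '0' and '1' characters")
        self._bits = [int(ch) for ch in text]

    def __len__(self):
        return len(self._bits)

    def __getitem__(self, index):
        return self._bits[index]

    def insert(self, index, bit):
        # Insertion happens at a bit position, not a byte position.
        if bit not in (0, 1):
            raise ValueError("a bit must be 0 or 1")
        self._bits.insert(index, bit)

    def __str__(self):
        return "".join(str(b) for b in self._bits)


bits = Bits("10110")          # five bits: not a multiple of 8
bits.insert(2, 1)             # bit-level position
data = bytes.fromhex("FFFF")  # a byte series: always whole bytes

print(len(bits), str(bits))   # 6 101110
print(len(data) * 8)          # 16
```

Under this reading, "BYTES!" describes what today's type actually stores, and a name like "BITS!" stays available for something like the class above.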

I'll add that I'm kind of leaning toward saying that the representation be ${FFFF} or $"FFFF" as opposed to using & to replace #. This is more of a nod to the historical $ sign for hex...it leaves & free for other applications...and it gives the ability to put characters in quoted strings cleanly, like:

rebElide("some-char: #{c}");

Although that particular case can be done without delimiters as #c ... but we need to work through what is and isn't legal to do without quotes or braces.

In COBOL the BINARY got replaced by COMPUTATIONAL or COMP.

It is indeed a little remarkable that there is no arbitrary-length bit field, with just enough bits for all the flags in use. But then again, all of memory is divided into words that are some number of bytes long.

I think I do remember the $ hex notation being used....

I'm going to bring this back up, because binary is a very overused term.

If I said "Hey, can you get me the binary for that?" I'd generally be asking for an executable format file.

The common language use of binary is to say "one of two options". One might in this context imagine that the BINARY! type was actually LOGIC!.

And of course it can mean base-2, e.g. hexadecimal "FE" is "11111110" in binary.

With all the other renamings, I don't think it would hurt to have a better name for the BINARY! type. Redbol can of course stick with that and PAREN! and whatever else.


I actually think BYTES! is not bad.

  • It's plural.

  • It's shorter.

  • It's unambiguous (well, on today's machines; a "byte" could be 7 bits or otherwise on old machines. But that's not what anyone would mean today--while the confusions over binary are still very present in use)

  • It directly mentions the granularity at which you can step through, insert, or remove items--you can manipulate it on the byte level, and only have round numbers of 8-bit bytes
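As an illustration of that last point about granularity, here is a small sketch using Python's built-in `bytearray` as a stand-in (an analogy only, not how BINARY! is implemented). Every step, insert, and removal moves by one whole byte, and the bit count is always a multiple of 8.

```python
blob = bytearray.fromhex("DECAFBAD")  # four bytes

first = blob[0]             # stepping through yields whole bytes: 0xDE
blob.insert(1, 0xFF)        # insert one byte      -> DE FF CA FB AD
del blob[-1]                # remove one byte      -> DE FF CA FB
bit_count = len(blob) * 8   # always a round number of 8-bit bytes

print(blob.hex().upper())   # DEFFCAFB
print(bit_count)            # 32
```

There is no way to insert "half an item" here: `blob.insert(1, 0x1FF)` raises a `ValueError`, because each element has to fit in 8 bits.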

But if you want to talk about trends out and about in the world, it seems that BLOB! is pretty common for data of arbitrary content.

I can't think of too many other obvious possibilities. DATA! is really too generic. CDATA! is an XML-ism that deserves to die along with the rest of XML.

Anyone have a visceral feeling about BLOB! vs. BYTES! vs. your-suggestion-here?

So the number of reasons to call the datatype BLOB! has gone up.

With the distinction between LIST! and array, we now talk about a Rebol "List" as having an array (and an index, and a binding)... not being an array.

  • Not only is choosing the more "nebulous" term better for suggesting its weird additional properties... but the shorter term is friendlier.

  • The terminology helps users, but also greatly helps the implementation be clear--you know what an Array is (a "Flex"-derived flexible data Array, implemented via "Stub") and know it as distinct from a List (a "Cell")

We stand to reap a similar benefit by using Binary inside the implementation for the data store (the "Flex"-derived Binary), and calling the Cell a BLOB!
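The split can be sketched roughly like this, in Python rather than the actual C implementation. `BinaryStore` and `BlobCell` are hypothetical stand-ins for the "Flex"-derived Binary and the BLOB! Cell; the field layout (a store plus an index) is an assumption made for illustration, by analogy with a List having an array and an index.

```python
from dataclasses import dataclass


class BinaryStore:
    """Stand-in for the implementation-side data store (the "Binary")."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)


@dataclass
class BlobCell:
    """Stand-in for the user-facing value (the BLOB!): it *has* a store."""

    store: BinaryStore
    index: int = 0

    def skip(self, n: int) -> "BlobCell":
        # A new cell at a different position over the same store.
        return BlobCell(self.store, self.index + n)

    def visible(self) -> bytes:
        # The bytes this cell sees, from its own index onward.
        return bytes(self.store.data[self.index:])


store = BinaryStore(bytes.fromhex("DECAFBAD"))
head = BlobCell(store)
tail = head.skip(2)       # same store, different index
store.data[3] = 0x00      # a change to the store shows through both cells

print(head.visible().hex().upper())  # DECAFB00
print(tail.visible().hex().upper())  # FB00
```

With two separate names, it is clear whether you mean the shared data or a particular positioned view onto it.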

Giving users a shorter name that doesn't trip over the loaded term of "binary" seems like a clear win to me.