BINARY! vs. BYTES!

hostilefork · January 27, 2021, 12:47am

It has occurred to me that it might be nice to have a datatype which represents a string of bits, where you don't need an exact multiple of 8 of them, and where positions are at the bit level.

If there were such a type, it might be called BITS! to not conflict...but either way it does point out that BINARY! is kind of a misleading name.
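
To make the idea a bit more concrete, here is a rough sketch in Python (not Ren-C) of what a bit-granular series could look like. The `Bits` class and its methods are hypothetical names made up for illustration; it stores one integer per bit for simplicity and says nothing about how a real BITS! would be implemented.

```python
class Bits:
    """Hypothetical bit string: length and positions are counted in bits."""

    def __init__(self, text=""):
        # One int (0 or 1) per bit; simple rather than compact.
        if any(ch not in "01" for ch in text):
            raise ValueError("expected only '0' and '1'")
        self._bits = [int(ch) for ch in text]

    def __len__(self):
        return len(self._bits)  # length in bits, not bytes

    def __getitem__(self, index):
        return self._bits[index]  # index is a bit position

    def insert(self, index, bit):
        if bit not in (0, 1):
            raise ValueError("bit must be 0 or 1")
        self._bits.insert(index, bit)

    def __str__(self):
        return "".join(str(b) for b in self._bits)


flags = Bits("10110")   # 5 bits: not a multiple of 8
flags.insert(2, 1)      # insert a single bit at bit position 2
print(len(flags), str(flags))
```

The point is only that the length (6 after the insert) and the insertion position are both measured in bits, which a byte-oriented series can't express.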

Since today's BINARY! is a series managed in multiples of 8 bits, wouldn't it be better to call it BYTES! ?

I'll add that I'm leaning toward the representation being ${FFFF} or $"FFFF", as opposed to using & to replace #. This is more of a nod to the historical use of the $ sign for hex. It also leaves & free for other applications, and gives the ability to put characters in quoted strings cleanly, like:

rebElide("some-char: #{c}");

Although that particular case can be done without delimiters as #c, we still need to work through what is and isn't legal to do without quotes or braces.

iArnold · January 27, 2021, 8:20pm

In COBOL the BINARY got replaced by COMPUTATIONAL or COMP.

It is indeed a little remarkable that there is no arbitrary-length bit field, with just enough bits for all the flags used. But then again, all of memory is divided into words that are some number of bytes long.

I think I do remember the $ hex notation being used....

hostilefork · July 3, 2022, 3:17am

I'm going to bring this back up, because binary is a very overused term.

If I said "Hey, can you get me the binary for that?" I'd generally be asking for an executable format file.

The common language use of binary is to say "one of two options". One might in this context imagine that the BINARY! type was actually LOGIC!.

And of course it can mean base-2, e.g. hexadecimal "FE" is "11111110" in binary.

With all the other renamings, I don't think it would hurt to have a better name for BINARY!. Redbol can of course stick with that, and PAREN!, and whatever else.

I actually think BYTES! is not bad.

It's plural.
It's shorter.
It's unambiguous. (Well, on today's machines: a "byte" could be 7 bits or some other size on old machines. But that's not what anyone would mean today, while the confusion over "binary" is still very much present in use.)
It directly mentions the granularity at which you can step through, insert, or remove items: you manipulate it at the byte level, and it always holds a whole number of 8-bit bytes.
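
As a quick illustration of that last point, here is what byte granularity looks like in practice. This is Python's built-in `bytearray` standing in for the idea, not Ren-C code, so treat it as an analogy rather than a description of how BINARY! behaves in every detail.

```python
data = bytearray.fromhex("FFFF")   # two bytes
data.insert(1, 0x00)               # insert one whole byte at byte position 1
del data[0]                        # remove one whole byte
print(len(data), data.hex())       # length is counted in bytes

# Each element must fit in exactly one 8-bit byte.
try:
    data.append(256)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Every position and every length is a count of bytes, and there is no way to add or remove anything smaller than one, which is what the name BYTES! would be advertising.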

But if you want to talk about trends out and about in the world, it seems that BLOB! is pretty common for data of arbitrary content.

It's one more character shorter.
It's what JavaScript has decided to call such things.
A lot of SQLs call it this too.

I can't think of too many other obvious possibilities. DATA! is really too generic. CDATA! is an XML-ism that deserves to die along with the rest of XML.

Anyone have a visceral feeling about BLOB! vs. BYTES! vs. your-suggestion-here?