A lingering feeling: Rebol may not want a LOGIC! type at all

There has been a "no keywords" mantra to the Rebol language, which has driven the decision that true and false should not be anything other than WORD!.

Instead, literals expressed historically as #[true] and #[false] were defined, and then these definitions provided:

true: on: yes: #[true]
false: off: no: #[false]

The unpalatability of the literal forms means Rebol2/R3-Alpha/Red choose to render the literals as if they were the words. Hence:

rebol2>> type? first [true]
== word!

rebol2>> type? true
== logic!

rebol2>> true
== true  ; !!! very deceptive...

rebol2>> on
== true  ; !!! also very deceptive, and lossy to boot!

rebol2>> form true
== "true"  ; well, whatever FORM is, it's lossy for strings, so...

rebol2>> mold yes
== "true"  ; !!! this strikes me as bad.

r3-alpha>> mold/all yes
== "#[true]"  ; in R3-Alpha, not Red which is "true"

Without delving into the "what is MOLD vs MOLD/ALL" or "what is FORMing, anyway" discussion, I'll just say this feels like another one of those design errors that needs a better solution.

In trying to square this circle, I've had a nagging feeling in my mind that LOGIC! as a primitive type in Rebol may be just a bad idea in general. Rebol is a messaging language, so what if in my message for calendar appointments I have a ternary set of words like "yes / no / maybe"...if I want to capture that then why is my problem so different if it's just "yes / no" or "true / false"?

Don't confuse me saying I have this lingering feeling with saying I have a solution. But I thought a thread to brainstorm about it, collecting observations made over time, might be useful.

Case Study: What did C do, when it went from not having a boolean type (C89) to having one (C99)?

One famous language that did not define true or false is C. 0 was falsey, all other integers were truthy regardless of how big an integer type they were stored in (8-bit char, platform-size int, bit fields, etc.).

Semantically, though, interfaces frequently wanted to define parameters or variables that could only hold true or false. This led to a diverse set of ways in which people defined TRUE, FALSE, true, or false.

Yet regardless of what syntactic sugar you put on it, they couldn't take anything "smaller" than integers. You had no guarantee that value would be only 1 or only 0, because any integer was truthy. So you could write one_or_zero = some_boolean ? 1 : 0 to be assured of turning a "boolean" (actually integer) argument into 1 or 0. This was "cleverly" condensed via "not not", e.g. one_or_zero = !!some_boolean;. That gave you something you could meaningfully use to flip a bit in a mask or somesuch, whereas unpredictable integer values were messy if used anywhere outside of the condition of an if or while.
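A minimal sketch of why the "not not" normalization mattered, assuming a hypothetical one-bit flag named FLAG_VISIBLE (the flag and the function names are invented for illustration, not taken from any real codebase):

```c
/* Hypothetical flag bit, chosen only for illustration. */
#define FLAG_VISIBLE 0x01u

/* Collapse any truthy int to exactly 1, and 0 to 0 (the "not not" idiom). */
unsigned normalize(int some_boolean) {
    return (unsigned)!!some_boolean;
}

/* Buggy on purpose: treats a truthy int as if it were already 0 or 1. */
unsigned set_visible_naive(unsigned mask, int visible) {
    return (mask & ~FLAG_VISIBLE) | (unsigned)visible;
}

/* Normalizes first, so only the FLAG_VISIBLE bit is ever touched. */
unsigned set_visible(unsigned mask, int visible) {
    return (mask & ~FLAG_VISIBLE) | (normalize(visible) * FLAG_VISIBLE);
}
```

Passing a truthy 4 to set_visible_naive flips bit 2 and leaves bit 0 clear, while set_visible sets bit 0 as intended.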

But it was hard to remember to do, and ugly. C++ went ahead and added a boolean type to the language. Assign any non-zero integer to a value of boolean type and it becomes true. Assign zero and it's false. If this is ever used in a context where an integer conversion is applied, you get out 0 for false and 1 for true, regardless of which non-zero value went in. true, false, and bool were C++ keywords, not overridable by the user.

C99 wanted to add it and get the C++ feature, but they didn't want to break backwards compatibility. Here's what they did as a workaround:

  • Continue the history of 1989 (and before), and don't reserve the words "true" or "false" or "bool", keep them open to user definition.

  • ...but build on the fact that the 1989 spec told users they should not be defining any identifiers that begin with an underscore followed by a capital letter. So _myVariable is okay, but _MyVariable is not. Using the latter may mean the compiler implementation or C standard itself will someday define _MyVariable and your previous working program will get a new version of the compiler and fail.

  • Introduce a new _Bool datatype (screwing anyone over who didn't read the C standard and had defined that themselves to mean something, but that was their fault).

  • Make it so assigning 1020 to a _Bool makes it "truthy" but then trying to later use that value in an integer context gets it as 1.

  • Give people a header they can include (or not) called <stdbool.h>, in which you get true defined as 1, false defined as 0, and bool defined as _Bool.

  • In C there is no printf format specifier for bool. C++ iostreams can be put in a mode where they are rendered as the words true and false instead of 1 and 0, but you have to ask for that.
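The C99 behavior in the list above can be sketched as follows. The function names are hypothetical, and the unsigned char contrast assumes 8-bit chars:

```c
#include <stdbool.h>  /* C99: supplies bool, true, false on top of _Bool */

/* Storing any non-zero value in a _Bool yields 1 when read back as an int. */
int bool_roundtrip(int value) {
    _Bool b = value;   /* conversion to _Bool: non-zero -> 1, zero -> 0 */
    return b;          /* integer context: always 0 or 1 */
}

/* Contrast: a narrow integer type truncates the bits instead. */
int uchar_roundtrip(int value) {
    unsigned char c = (unsigned char)value;  /* keeps only the low 8 bits */
    return c;
}

/* With no printf specifier for bool, the words have to be chosen by hand. */
const char *bool_name(bool b) {
    return b ? "true" : "false";
}
```

Note that 256 survives as truthy through _Bool, but truncates to a falsey 0 through an 8-bit unsigned char, which is the kind of surprise the dedicated type was meant to remove.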

Some Observations

  • C went with the notion of having a logic datatype, but no logic literal. Rebol struggles with this due to the idea that you're supposed to be able to round-trip data and get back structures that are equal? without writing some kind of serialization or de-serialization. But my examples above show this isn't working...it's just frequently being lossy from LOGIC! to a WORD! instead of lossy from LOGIC! to an INTEGER!.

  • Though if one is going to go with this mindset, I think in Ren-C that NULL serves as a better "universal falsey thing" than 0 does. And it's interesting that in Rebol, absolutely everything else can be truthy. Making it even harder to pick what the universal truthy canon value would be (maybe 0, just to be contrarian?)

  • Underscore followed by a capital letter in C reserves lexical space somewhat in the way that construction syntax does. _True offers some of the same "language designer's space, not user's space" situation that #[true] does.

  • C did not go with this approach due to a fear of being a language with keywords. This is a workaround for a "mistake". If they had a time machine, they would probably go back and make true, false, and bool keywords in 1989, as opposed to saying _If was the real "if", _While was the real "while"...and having people #include <c-language.h> to get #define if _If and #define while _While?

To sum up a few points:

  • Just reminding people of this development: true? foo and false? bar in their historical definitions were bad mojo as they didn't mean the same things as foo = #[true] or bar = #[false]. Wanting to avoid bad didactic ideas, this gave rise to truthy? and falsey? as a compromise.

  • I've talked about how I think if 0 [print "this is truthy and prints even though it's zero"] vs. if null [print "this is falsey and won't print"] seems a good thing, not bad. It makes it easier to pick "something" from "nothingness", which is what things like ALL and ANY seem most useful for.

  • People are presumably quite happy with if 'true [print "the WORD! true is truthy and this prints"] along with if 'false [print "the WORD! false is truthy and this prints"].

Radical Thought Experiment

I keep having an eerie idea to get rid of #[true] and #[false] but bring back true? and false?. They could be coded as native and be fast, but look something like this:

true?: func [x] [
    either find [true on yes] x [
        0  ; or some other truthy thing
    ][
        if not find [false off no] x [
            fail "TRUE? and FALSE? only work on true|false|on|off|yes|no"
        ]
        null  ; the now only falsey thing.
    ]
]

false?: func [x] [
    not true? x
]

Then, don't define any of these words to mean anything. If you write foo: true or bar: no you get an error. print yes is an error, none are bound nor defined. Instead, foo: 'true and bar: 'no.

Not defining them is safer than the alternative, namely true: 'true and false: 'false, which seems all fine and good until you say if false [print "it's annoying if this prints, isn't it?"]

The idea is that you would write foo: 'false and then if false? foo [print "now you're treating it like an enumerated type of sorts"], if you really wanted.

A lingering feeling: Rebol may not want a LOGIC! type at all

I said this isn't a solution, but a "feeling":

  • If C can get by with 0 being falsey and other integers being truthy, might Rebol get more mileage out of embracing null as the one falsey state?

  • Is the generality and accessibility of null such that it effectively adds something like C++'s std::optional to the language, to the point where very few real cases of booleans are needed? Do booleans in reality often come paired with another value, like has-label: yes, label: "foo" and can you cut them out with just label: null vs. label: "foo"?

  • Should LOGIC! really be a steep slope away from an enumerated type with 3 values? Wouldn't having an approach that thinks otherwise make everyone consider how to deal with evaluative contexts vs. non, preserving the "on" vs. "yes" vs. "true" distinction instead of discarding it to produce a literal no one likes?

  • Should math that demands 1 and 0 be offloaded into a dialect somewhere? Is boolean logic too much of a low-level domain when the goal is to match intent?
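The has-label question above has a rough analogue in C, where a NULL pointer can play the role that null plays in the Rebol phrasing. This is only a sketch: the struct and function names are invented for illustration.

```c
#include <stddef.h>  /* NULL */

/* Paired form: a flag plus a value that only matters when the flag is set. */
struct item_paired {
    int has_label;        /* C89-style "boolean" */
    const char *label;    /* meaningful only if has_label is non-zero */
};

/* Collapsed form: NULL itself means "no label". */
struct item_optional {
    const char *label;    /* NULL = absent, otherwise the label text */
};

/* Convert the paired form to the collapsed one; the flag disappears. */
struct item_optional collapse(struct item_paired p) {
    struct item_optional o;
    o.label = p.has_label ? p.label : NULL;
    return o;
}

/* Pick the label if present, else a caller-supplied fallback. */
const char *label_or(struct item_optional o, const char *fallback) {
    return o.label ? o.label : fallback;
}
```

In the collapsed form there is no way for the flag and the value to disagree, which may be part of why so many booleans turn out to be optional values in disguise.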

Sloppily phrased as it is, this is all tied into my "lingering feeling". I'm curious to see if anyone has observations to bring more clarity to the question.

(One very broad critique is that this idea of cutting-the-evaluator out of the equation has no clear limit. When do we go from red: 255.0.0 to having the WORD! red be undefined, and need to use color-of 'red just to prevent the evaluator from "forgetting what the user originally wrote"? It starts getting similar to what other languages feel like when they are doing string processing.)

I'd say the main argument for retaining logic! is that it is a way of describing something that isn't really covered in any other way. Yes—the language may flow without them, but as with a few types in Rebol, they're not always there for language flow, rather there for semantic correctness.

Unfortunately a literal type will never be as concise as the words true and false—it would though be desirable to have some way to distinguish words from values (see also datatypes) that isn't as ugly as #[true] and #[false].

Have mentioned before that using 0 and 1 in some literal form of true/false might sidestep the keyword issue and at least transcends English:

#(0)
#(1)