A lingering feeling: Rebol may not want a LOGIC! type at all

There has been a "no keywords" mantra to the Rebol language, which has driven the decision that true and false should not be anything other than WORD!.

Instead, literals expressed historically as #[true] and #[false] were defined, and then these definitions provided:

true: on: yes: #[true]
false: off: no: #[false]

The unpalatability of the literal forms is reflected in the choice by R3-Alpha (and today's Red) to render the literals as if they were the words. Hence:

>> type? first [true]
== word!

>> type? true
== logic!

>> true
== true ;-- very deceptive...!

>> on
== true ;-- also deceptive, and lossy to boot!

>> form true
== "true" ;-- well, whatever FORM is, it's lossy for strings, so...

>> mold yes
== "true" ;-- okay that strikes me as bad.

>> mold/all yes
== "#[true]" ;-- in R3-Alpha, not Red which is "true"

Without delving into the "what is MOLD vs MOLD/ALL" or "what is FORMing, anyway" discussion, I'll just say this feels like another one of those design errors that needs a better solution.

In trying to square this circle, I've had a nagging feeling in my mind that LOGIC! as a primitive type in Rebol may be just a bad idea in general. BLANK! is already "logically false", with everything else true (and void being opt-out or error). Rebol is a messaging language, so what if in my message for calendar appointments I have a ternary set of words like "yes / no / maybe"...if I want to capture that then why is my problem so different if it's just "yes / no" or "true / false"?

Don't confuse my saying I have this lingering feeling with saying I have a solution. But I thought a thread to brainstorm about it, collecting observations made over time, might be useful.

Case Study: What did C do, when it went from not having a boolean type (C89) to having one (C99)?

One famous language that did not define true or false is C. 0 was falsey, all other integers were truthy regardless of how big an integer type they were stored in (8-bit char, platform-size int, bit fields, etc.).

Semantically, though, interfaces frequently wanted to define parameters or variables that could only hold true or false. This led to a diverse set of ways in which people defined TRUE, FALSE, true, or false.

Yet regardless of what syntactic sugar you put on it, these definitions couldn't be anything "smaller" than integers. You had no guarantee that a value would be only 1 or only 0, because any non-zero integer was truthy. So you could write one_or_zero = some_boolean ? 1 : 0 to be assured of turning a "boolean" (actually integer) argument into 1 or 0. This was "cleverly" condensed via "not not", e.g. one_or_zero = !!some_boolean;. That gave you something you could meaningfully use to flip a bit in a mask or somesuch, whereas unpredictable integer values were messy if used anywhere outside of the condition of an if or while.
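To make the "not not" trick concrete, here is a minimal C89-flavored sketch. The helper names normalize_flag and set_bit_from_flag are made up for illustration; they are not from any standard header.

```c
/* C89 style: there is no boolean type, so "flags" are plain ints. */

/* Hypothetical helper: collapse any truthy int to exactly 1. */
int normalize_flag(int some_boolean)
{
    return !!some_boolean;   /* same result as: some_boolean ? 1 : 0 */
}

/* Hypothetical helper: copy a flag into bit number `bit` of `mask`.
 * The flag is normalized first, so a truthy value such as 1020
 * cannot spill into neighboring bits when it is shifted. */
unsigned set_bit_from_flag(unsigned mask, unsigned bit, int flag)
{
    unsigned one_or_zero = (unsigned)!!flag;
    return (mask & ~(1u << bit)) | (one_or_zero << bit);
}
```

Without the normalization step, shifting a raw "truthy" 1020 into the mask would set several unrelated bits, which is the messiness described above.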

But it was hard to remember to do, and ugly. C++ went ahead and added a boolean type to the language. Assign any non-zero integer to a value of boolean type and it becomes true; assign zero and it's false. If the value is ever used in a context where an integer conversion is applied, you get 0 for false and 1 for true, regardless of what non-zero value went in. true, false, and bool were C++ keywords, not overridable by the user.

C99 wanted to add it and get the C++ feature, but they didn't want to break backwards compatibility. Here's what they did as a workaround:

  • Continue the history of 1989 (and before), and don't reserve the words "true" or "false" or "bool", keep them open to user definition.

  • ...but build on the fact that the 1989 spec told users they should not be defining any identifiers that begin with an underscore followed by a capital letter. So _myVariable is okay, but _MyVariable is not. Using the latter may mean the compiler implementation or C standard itself will someday define _MyVariable and your previous working program will get a new version of the compiler and fail.

  • Introduce a new _Bool datatype (screwing anyone over who didn't read the C standard and had defined that themselves to mean something, but that was their fault).

  • Make it so assigning 1020 to a _Bool makes it "truthy" but then trying to later use that value in an integer context gets it as 1.

  • Give people a header they can include (or not) called <stdbool.h>, in which you get true defined as 1, false defined as 0, and bool defined as _Bool.

  • In C there is no printf format specifier for bool. C++ iostreams can be put in a mode where they are rendered as the words true and false instead of 1 and 0, but you have to ask for that.
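A small C99 sketch of the points in this list, assuming a compiler that provides <stdbool.h>. The helper names through_bool and bool_word are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>   /* C99: bool is _Bool, true is 1, false is 0 */

/* Hypothetical helper: store any int in a bool, then read it back as an int. */
int through_bool(int value)
{
    bool b = value;    /* any non-zero value converts to 1 */
    return b;          /* in an integer context it is always 0 or 1 */
}

/* Hypothetical helper: printf has no bool specifier, so the word
 * has to be picked by hand (or the value printed with %d as 0 or 1). */
const char *bool_word(bool b)
{
    return b ? "true" : "false";
}
```

So through_bool(1020) gives back 1, and something like printf("%s\n", bool_word(flag)) is how you'd get the word rather than the number.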

Some Observations

  • C went with the notion of having a logic datatype, but no logic literal. Rebol struggles with this due to the idea that you're supposed to be able to round-trip data and get back structures that are equal? without writing some kind of serialization or de-serialization. But my examples above show this isn't working...it's just frequently being lossy from LOGIC! to a WORD! instead of lossy from LOGIC! to an INTEGER!.

  • Though if one is going to go with this mindset, I think in Rebol that BLANK! serves as a better "universal falsey thing" than 0 does. And it's interesting that in Rebol, absolutely everything else can be truthy, which makes it even harder to pick what the universal truthy canon value would be (maybe 0, just to be contrarian?). BAR! is another option, which has the advantage of being another unit type.

  • Underscore followed by a capital letter in C reserves lexical space somewhat in the way that construction syntax does. _True offers some of the same "language designer's space, not user's space" situation that #[true] does.

  • C did not go with this approach due to a fear of being a language with keywords. This is a workaround for a "mistake". If they had a time machine, they would probably go back and make true, false, and bool keywords in 1989, as opposed to saying _If was the real "if", _While was the real "while"...and having people #include <c-language.h> to get #define if _If and #define while _While?

[begin devil's advocacy]

What does "no keywords" mean? How realistic is the idea, where else do you see this in Rebol?

DATE! literals certainly contain English months, you can't override it with months from other languages without changing the way the code acts... you can't say ahora: :now | mes: 'month and then write fecha: ahora | fecha/mes.

Sure in that case you could say fecha/:mes, but that's twisting the callsite. The point is about where some things have been hardcoded with bias to some words, and those words are English. Isn't that what a "keyword" is?

Why put up a fight? Every other language takes some words away from the user, there's an obvious English bias. Here we have clear evidence from history that even MOLD wants to spit out something WORD!-looking for a LOGIC!. Why not give in, if the system is already breaking the "spirit of the law" and making it impure in other ways?

It's just a couple of patterns of letters: true and false. People will understand those are LOGIC!, learn it, and accept they can't override them. If it's a dialect, you can give TRUE and FALSE any meaning you like; it's only DO where the non-WORD!ness is a physical liability. But who's going around saying true: 'banana and false: func [x] [return x + 17] anyway, and if they are, aren't we better off without those users?

[/end devil's advocacy]

(As evidenced by the title of this thread, I doubt that TRUE and FALSE should be taken out of "word space" and made LOGIC! literals. But I am interested to see if this can provoke a reasoned response.)

To sum up a few points:

  • Just reminding people of this development: true? foo and false? bar in their historical definitions were bad mojo as they didn't mean the same things as foo = #[true] or bar = #[false]. Wanting to avoid bad didactic ideas, this gave rise to truthy? and falsey? as a compromise.

  • I've talked about how I think if 0 [print "this is truthy and prints even though it's zero"] vs. if _ [print "this is falsey and won't print"] seems a good thing, not bad. It makes it easier to pick "something" from "nothingness", which is what things like ALL and ANY seem most useful for. (In this way, BLANK! seems to have an eerie kinship with the apparently-not-completely-intuitive invention of zero itself.)

  • People who didn't buy into the devil's advocacy I made above are presumably quite happy with if 'true [print "the WORD! true is truthy and this prints"] along with if 'false [print "the WORD! false is truthy and this prints"].

Radical Thought Experiment

I keep having an eerie idea to get rid of #[true] and #[false] but bring back true? and false?. They could be coded as native and be fast, but look something like this:

true?: func [x] [
    either find [true on yes] x [
        0 ;-- or some other truthy thing
    ][
        if not find [false off no] x [
            fail "TRUE? and FALSE? only work on true|false|on|off|yes|no"
        ]
        _ ;-- the now only falsey thing.
    ]
]

false?: func [x] [
    not true? x
]

Then, don't define any of these words to mean anything. If you write foo: true or bar: no you get an error. print yes is an error, none are bound nor defined. Instead, foo: 'true and bar: 'no.

Not defining them is safer than the alternative, namely true: 'true and false: 'false, which seems all fine and good until you say if false [print "it's annoying if this prints, isn't it?"]

(...or is it par for the course if first [false] [print "...in a language where this is going to print anyway?"])

The idea is that you would write foo: 'false | if false? foo [print "now you're treating it like an enumerated type of sorts"] if you really wanted.
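For comparison with the C case study, here is a rough C analogue of treating the words as an enumerated type. It is only a sketch: the names word_logic and classify_word are hypothetical, and an "invalid" result code stands in for the FAIL in the Rebol version.

```c
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum word_logic { WORD_INVALID = -1, WORD_FALSE = 0, WORD_TRUE = 1 };

/* Hypothetical helper: interpret a word the way TRUE? above would. */
enum word_logic classify_word(const char *word)
{
    static const char *const truthy[] = { "true", "on", "yes" };
    static const char *const falsey[] = { "false", "off", "no" };
    size_t i;

    for (i = 0; i < 3; i++) {
        if (strcmp(word, truthy[i]) == 0) return WORD_TRUE;
        if (strcmp(word, falsey[i]) == 0) return WORD_FALSE;
    }
    return WORD_INVALID;   /* where the Rebol version would FAIL */
}
```

The caller keeps the original word ("on" vs. "yes" vs. "true") and only asks for its logical reading at the point where a decision is needed.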

A lingering feeling: Rebol may not want a LOGIC! type at all

I said this isn't a solution, but a "feeling":

  • If C can get by with 0 being falsey and other integers being truthy, might Rebol get more mileage out of embracing BLANK! as the one falsey value?

  • Is the generality and accessibility of BLANK! such that it effectively adds something like C++'s std::optional to the language, to where very few real cases of booleans are needed? Do booleans in reality often come paired with another value, like has-label: yes | label: "foo" and can you cut them out with just label: _ vs. label: "foo"?

  • Should LOGIC! really be a steep slope away from an enumerated type with 3 values? Wouldn't having an approach that thinks otherwise make everyone consider how to deal with evaluative contexts vs. non, preserving the "on" vs. "yes" vs. "true" distinction instead of discarding it to produce a literal no one likes?

  • Should math that demands 1 and 0 be offloaded into a dialect somewhere? Is boolean logic too much the low level domain when the goal is to match intent?
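The has-label question in the list above has a familiar shape in C, sketched here with a NULL pointer standing in for BLANK!. The struct and function names are hypothetical, used only to illustrate the two styles.

```c
#include <stddef.h>

/* Flag-plus-value style: has_label must be kept in sync with label. */
struct item_with_flag {
    int has_label;
    const char *label;
};

/* Optional style: the absence of a value carries the "no" by itself. */
struct item_optional {
    const char *label;   /* NULL means "no label" */
};

/* Hypothetical helper: truthiness of the pointer replaces the boolean. */
const char *label_or_default(const struct item_optional *item,
                             const char *fallback)
{
    return item->label ? item->label : fallback;
}
```

In the second struct there is no separate boolean to forget to update, which is roughly the trade being asked about with label: _ vs. label: "foo".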

Sloppily phrased as it is, this is all tied into my "lingering feeling". I'm curious to see if anyone has observations to bring more clarity to the question.

(One very broad critique is that this idea of cutting-the-evaluator out of the equation has no clear limit. When do we go from red: 255.0.0 to having the WORD! red be undefined, and need to use color-of 'red just to prevent the evaluator from "forgetting what the user originally wrote"? It starts getting similar to what other languages feel like when they are doing string processing.)

I'd say the main argument for retaining logic! is that it is a way of describing something that isn't really covered in any other way. Yes—the language may flow without them, but as with a few types in Rebol, they're not always there for language flow, rather there for semantic correctness.

Unfortunately a literal type will never be as concise as the words true and false—it would though be desirable to have some way to distinguish words from values (see also datatypes) that isn't as ugly as #[true] and #[false].

Have mentioned before that using 0 and 1 in some literal form of true/false might sidestep the keyword issue and at least transcends English:

#(0)
#(1)

There is something emerging in the "ISSUE!s are immutable and unified with CHAR!" world...

The empty ISSUE! # is thus truthy and otherwise fairly "ornery" (it can't be appended to TEXT! because it represents the elusive "zero codepoint" that is illegal in ANY-STRING!, and it can't be itself mutated).

So now there's a juxtaposition of ISSUE!'s # as meaning "opt in" and BLANK!'s falsey _ as meaning "opt out".

This is much cleaner than previous attempts searching for a canon truthy value (like [], or more fancifully [o]). There's no associated series or storage for #, it's just a cell with no more overhead than _. It looks like a "filled in box" where blank looks like a not-filled-in one:

radio-buttons: [
    [_] "This option seems blank"
    [#] "This option seems marked"
    [_] "Another blank-looking option"
]

This is far from sealing the deal that 1 > 2 returns _ and 1 < 2 returns #. But I feel like it's the closest we've come to a viable pairing of non-LOGIC! types that could be argued for.

So presumably it would imply:

true: #
false: _

There'd be a cascade of things that would have to happen, with helpers for turning WORD!s to # or _, and vice versa. But these are needed anyway; people want the words true and false in their dialects, not some underlying type...and this would make that more obvious.

It would ruin the idea of using BLANK! as "NAN" for chains of opt outs, which hinges on a distinction between LOGIC! and BLANK!.

More generally, LOGIC!-taking routines couldn't leverage BLANK!-in-NULL out... but remember this is kind of saying there's no such thing as a "logic-taking" routine. A dialected function takes a WORD!, and if those words are either true or false that's like any other "enum" you might handle.

Just wanted to point out that we are probably reaching the limit of semiotics here for a non-LOGIC! world. It's not likely to get any better. So it may be time to reason about it, or to decide whether it's time to give in to $true and $false or similar.