Naming of bitwise OR, AND, XOR

hostilefork · January 16, 2018, 8:07pm

People frequently asked for the plain words OR and AND (and XOR, NOR, etc.) to be made "conditional":

Early on, Ren-C implemented this...as well as the notion that the prefix forms would be the behavior of INTERSECT, UNION, etc. (Then much later, it went down the path of how exactly to handle short-circuiting, which is its own topic...and it's why you have to parenthesize the second argument of AND and OR.)

But the question came up of what to name the infix bitwise operations. What were chosen were or+ and and*...which were allusions to the symbols for the operations used in boolean arithmetic:

https://www.allaboutcircuits.com/textbook/digital/chpt-7/boolean-arithmetic/

But since that early decision, * has taken on a kind of special meaning...what one might often think of as "/ONLY" or the more "core" or fundamental version of an operation. (e.g. select [a 10 b 20] 'c gives back a BLANK! for convenience, even though that is conflated with select [a 10 b 20 c _] 'c. It is built on top of select* which would give a void in the first case, and a blank in the second.)

select: redescribe [
    {Variant of SELECT* that returns BLANK when not found, instead of void}
](
    chain [:select* | :to-value]
)

This usage of * has become so pervasive in the naming scheme, that it makes and* seem misleading.

Hence I think or+ and and+ may be a better choice to say "the bitwise mathy or" and "the bitwise mathy and", rather than bow to this kind-of-obscure use of symbolism from boolean math.

Haven't changed it just yet, but plan to, unless anyone has better ideas.

Mark-hi · January 17, 2018, 12:12am

(1) and&, which is just as confusing as + (which should be v) and therefore perfect
(2) and^, since that is the actual mathematical symbol used for and

hostilefork · January 17, 2018, 12:32am

We've discussed allowing escaping in WORD!, e.g. spaces^-between (I'm an advocate for ^- being space instead of tab, else it would be ^_), which would take away caret as a simple word character.

I'm still partial to moving to & for something, possibly characters to line up with HTML entities. It's an ugly word character, and this would help with the extreme overloading of meanings for #:

if &c = first "cat" [
   print ["first character in cat" space &RightArrow space "c"]
]

BrianOtto · January 17, 2018, 1:05am

I actually quite like this, it helps you figure out the logic as you read it, but I see your point about * being used elsewhere.

+ makes the most sense to me, to give it a "mathy" look. I think all other characters have either been ruled out (here and in that ticket) or don't give it that "mathy" feeling.

What about a different word? I think I still like + better, but throwing this out here ...

band (bitwise and)
bor (bitwise or)

gchiu · January 17, 2018, 4:21am

To make it even more topical we could call them

bitand, bitor

and now to find a use for bitcoin

iArnold · January 17, 2018, 2:32pm

I like bitand and bitor.
Then bitcoin should flip bits randomly?

BrianOtto · January 23, 2018, 5:59am

+1 for this, I like it.

hostilefork · January 23, 2018, 11:08am

One of the goals of the change is that essentially:

 and+: enfix tighten :intersect
 or+: enfix tighten :union
 xor+: enfix tighten :difference

That is to say that these operations are the abstract mathematical concepts, like DIFFERENCE is the Symmetric Difference. When applied to scalar numbers, there's no real interpretation other than the bitwise one. But they're not constrained to that, they work on sets too...

 >> [a b c e g] and+ [b e f]
 == [b e]

So putting "bit" in the name would suggest it's more limited than it is.
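As a loose analogy in Python (not Ren-C; the helper name `and_plus` is made up for this sketch), one operation can cover both the integer reading and the set reading:

```python
def and_plus(left, right):
    """Hypothetical stand-in for AND+: bitwise on integers, intersection on lists."""
    if isinstance(left, int) and isinstance(right, int):
        return left & right
    # Keep the order of the left series, matching the Ren-C output shown above
    return [item for item in left if item in right]

print(and_plus(0b1100, 0b1010))  # 8
print(and_plus(["a", "b", "c", "e", "g"], ["b", "e", "f"]))  # ['b', 'e']
```

This only models the dispatch idea; it says nothing about how Ren-C implements INTERSECT.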

For the moment I'm just switching AND* to AND+, but going ahead and getting the intended change committed, so the issues related to moving plain AND and OR to being conditional can march forward. (And killing the definitely-badly-named AND~, OR~, XOR~.) By all means, people can keep thinking about the relevant issues.

hostilefork · July 25, 2018, 2:07am

To re-summarize the trick of today for plain AND and OR: achieving short-circuitness requires some kind of array on the right hand side of an AND or an OR. Given that an array needs to be there, the idea was that GROUP! vs. BLOCK! would give different behaviors.

 <left> and (<right>) => #[true] ;-- GROUP! means force result to LOGIC!
 <left> and [<right>] => <right> ;-- BLOCK! means act like `all [<left> <right>]`

...well, here's something that didn't come up yet--for some reason--to let the GROUP!'d form be UNION/INTERSECT-like. They work on LOGIC!, and now we have the nice word DID. Even in the worst case scenario that x and y aren't LOGIC!, and neither of them already needed a NOT on them anyway, you'd have:

flag: did x and (did y) ;-- interpreted as `(did x) and (did y)`

So with this, we might be able to have our cake and eat it too. AND and OR could be synonyms for INTERSECT and UNION when used with a GROUP!, and act like ALL and ANY when used with a BLOCK!.

AND+ and OR+ are still kind of ugly and never really settled well
I'm not finding a huge need for AND/OR making LOGIC! variables
GROUP!s being used for COMPOSE means BLOCK! is often better for conditionals anyway
It would be a bit of a nod to historical Rebol

What would be a bit annoying about it is that the blocks have a chunky aesthetic on a conditional line:

if blah blah and [blah blah blah] [
    ...
]

if blah blah and (blah blah blah) [
    ...
]

But as I mention, you've got your DID and your NOT to let you use the second form if what you're testing isn't LOGIC! already.

I guess one must weigh it all in the balance, and if killing off AND+ and OR+ is worth it. Maybe? I imagine it would be more error-prone, which is probably a good argument against it. But just weird it hadn't come up yet at all.

hostilefork · October 9, 2020, 5:54am

We've had quite a lot of time with this change.

...and I've been wondering if we might want to revert it... Maybe even to make them bit-operations and set-operations and prefix.

This isn't firmed up, but it's a thought that has been on my mind. Here's some reasoning.

Reason One: The Need For Non Mutating INTERSECT/UNION/etc.

In "Planning Ahead for BigNum Arithmetic" I discuss the tricks involved in defining variants of addition in mutating forms like ADD with a non-mutating variant like PLUS, and then enfixing PLUS as +.

I won't rewrite that post here. I'll just say that a similar issue comes up with things like INTERSECT, UNION, and DIFFERENCE...which I always thought had "mutating-sounding" names anyway. They're going to need an optimized non-mutating form that doesn't create a new numeric identity if what it makes fits in a cell.

So rather than making AND and OR and XOR some logic-heeding enfix knock-offs of THEN and ELSE, they might be better purposed to those non-mutating forms. Possibly they'd be best not even being enfix, to keep people from using them wrongly.

After all, you can make anything enfix with SHOVE:

1 >- and 2

Reason Two: Other Places To Get Our Enfix Fix

One of the arguments for why people wanted enfix conditional AND and OR is that they just fit the pacing in people's mind better. Using ANY or ALL for short-circuit evaluation has you looking at this:

if any [condition1 condition2] [
    ...
]

This has two prefix conditions and then two argument blocks. There's nothing breaking it up into a more natural grammar. So it's a bit of a speedbump.

But now we can say:

any [condition1 condition2] then [
    ...
]

I think the difference is particularly better for long condition lists:

either any [
    condition1
    condition2
    ...
    conditionN
][
    ...
][
    ...
]

any [
    condition1
    condition2
    ...
    conditionN
]
then [
    ...
]
else [
]

So you get a bit of a better pacing...something more "natural language-like". It's getting a word-in-edgewise between the blocks.

Reason Three: No Satisfying Alternative Names

I don't like bitand and friends, and I've been unhappy with and+, or+....

Reason Four: Lack of Use

Empirically I'm just not using the infix AND / OR / etc. much.

The fact that short-circuiting requires putting the right hand side in a block kills a lot of the benefits.
Then you add that to the annoyance of having to put any interesting expressions on the left in parentheses.

Pretty much everything I want to do is covered by THEN, ELSE, ALSO, and the other things that have come along. Reasoning about the AND and OR variants in these cases hasn't seemed worth the mental tax.

Reason Five: Favoring Status Quo If Benefit Not Obvious

So Ladislav made some good arguments for why this would be nice in theory...and why NOT being conditional and shoving off the non-conditionality to COMPLEMENT offers a hint that users just expect conditional operators from AND, OR, and XOR.

I guess what I'm saying here is...the benefit hasn't seemed to come across yet.

Maybe the problem is that it hasn't been done right? Perhaps AND and OR belong in a dialected situation where their expressions aren't being processed by the rules of the evaluator. LOGICAL as an analogue to MATH (or maybe just part of it?)

 if math [
     (some expression and some other expression)
     or some expression on the next line
 ][
     ...
 ]

Maybe the default implementations of AND and OR don't exist at all, but point you to using this MATH dialect... where divisions also happen before additions, and things are "how people expect"?

I don't know, but I wanted to come back to this and put some of my qualms on the radar.

hostilefork · October 17, 2020, 10:01pm

Contemplation Point Six: INTERSECT and UNION etc. Have /SKIP

From the tests, I'd noticed that operations like OR differed from their "set-based" counterparts like UNION on BINARY!:

r3-alpha>> #{0FF0} or #{FFF0}
== #{FFF0}

r3-alpha>> union #{0FF0} #{FFF0}
== #{0FF0FF}

OR was acting in a bitwise fashion... interpreting the binary as a collection of bits. While UNION was treating the BINARY! bytewise... interpreting it as a collection of bytes.
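To make the two readings concrete, here is a small Python sketch (an analogy only; the function names are hypothetical and this is not how R3-Alpha implements either operation) that reproduces both results from the same inputs:

```python
def bitwise_or(left: bytes, right: bytes) -> bytes:
    """OR corresponding bytes; assumes equal lengths for simplicity."""
    return bytes(a | b for a, b in zip(left, right))

def bytewise_union(left: bytes, right: bytes) -> bytes:
    """Collect each distinct byte value once, in order of first appearance."""
    seen = []
    for value in left + right:
        if value not in seen:
            seen.append(value)
    return bytes(seen)

a = bytes.fromhex("0FF0")
b = bytes.fromhex("FFF0")
print(bitwise_or(a, b).hex().upper())      # FFF0
print(bytewise_union(a, b).hex().upper())  # 0FF0FF
```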

If you notice that the set operations have a /SKIP parameter, it gets even more nuanced:

 r3-alpha>> intersect/skip "abcd" "cd" 2
 == "cd"

 r3-alpha>> intersect/skip "abcd" "bc" 2
 == none

So you can treat a series as a collection of subunits of a certain size--even strings and binaries.
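A rough Python model of the /SKIP idea, assuming it simply means "compare fixed-size records instead of single elements" (the name `intersect_skip` is made up here, and this sketch returns an empty string where the r3-alpha session shows none):

```python
def intersect_skip(left: str, right: str, size: int) -> str:
    """Intersect two strings treated as sequences of fixed-size records."""
    def records(text):
        return [text[i:i + size] for i in range(0, len(text), size)]

    right_records = records(right)
    return "".join(rec for rec in records(left) if rec in right_records)

print(repr(intersect_skip("abcd", "cd", 2)))  # 'cd'
print(repr(intersect_skip("abcd", "bc", 2)))  # ''
```

With a record size of 2, "abcd" is the records "ab" and "cd", so "bc" matches nothing even though both of its characters appear in the string.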

This makes Ladislav's point that INTERSECT is like AND...or that UNION is like OR...seem not all that relevant. Operations that act on series with series-like properties are a rather distinct concept, and really shouldn't be wedged into the same space.

Ideas Of The Moment

Here's a possible game plan:

conditional but non-short-circuiting AND/OR/XOR
implement bitwise AND/OR in the BITWISE dialect. Instead of writing x: y and z you could say that as x: bitwise [y and z]. It looks better than x: y bitand z, and reads out loud like what it does.
set operations are mutating, unrelated to the bitwise ops, and don't have non-mutating forms (copy input if you need to). COMPLEMENT would be mutating and not applicable to integers, so you'd say x: bitwise [not y] instead of x: complement y.
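For a feel of what a dialect like that does, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `bitwise`, passing the expression as a string, and the use of Python's own parser (so Python precedence applies, not Ren-C's left-to-right evaluation). The only point is that and/or/not get reinterpreted as bit operations inside the dialect:

```python
import ast

def bitwise(source: str, **names: int) -> int:
    """Evaluate `source` with and/or/not reinterpreted as bitwise operators."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BoolOp):
            values = [walk(v) for v in node.values]
            result = values[0]
            for value in values[1:]:
                if isinstance(node.op, ast.And):
                    result = result & value
                else:
                    result = result | value
            return result
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return ~walk(node.operand)
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        raise ValueError("unsupported syntax in bitwise dialect")

    return walk(ast.parse(source, mode="eval"))

print(bitwise("y and z", y=0b1100, z=0b1010))  # 8
print(bitwise("y or z", y=0b1100, z=0b1010))   # 14
print(bitwise("not y", y=0))                   # -1
```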

Trying to take all the evidence in, the argument for the appeal of conditional operators to newbies is still strong. Yet if the operations don't have short-circuit behavior, they will not fit most people's expectations...

 y: [a b c]
 if (integer? y) and (y + 1 > 5) [
      print ["oops, left false but still ran right...the addition fails."]
 ]
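The same trap can be shown in Python, where the built-in `and` short-circuits but an ordinary function call does not. This is only an analogy for the evaluator behavior; `eager_and` is a made-up name:

```python
def eager_and(left, right):
    """Both arguments are already evaluated by the time this runs."""
    return bool(left) and bool(right)

y = ["a", "b", "c"]

# Python's built-in `and` short-circuits, so `y + 1` is never attempted
print(isinstance(y, int) and (y + 1 > 5))  # False

try:
    eager_and(isinstance(y, int), y + 1 > 5)
except TypeError:
    print("right side ran anyway and the addition failed")
```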

But what I'm weighing is that maybe the lack of short-circuiting is a good teachable moment for the evaluator...and why ANY and ALL are different. And there are plenty of cases...e.g. checking multiple refinements...where not being short-circuiting isn't some huge catastrophic performance or logistical problem:

foo: func [arg /ref1 /ref2] [
    if ref1 and ref2 [
         fail "Would it be so bad if this worked?"
    ]
    ...
]

Besides such cases being common...used intelligently, a non-short-circuiting AND can fit in as a tool for things that short-circuiting ALL can't express as cleanly. Plenty of applications in code golf...

So I think the real referendum here isn't on conditional AND/OR/XOR...but on trying to wedge short-circuiting into it.

It's time to go ahead and make the BITWISE dialect, even if it isn't great at first...to try it and see what the use cases dictate. It looks like the right solution to this problem.

IngoHohmann · October 18, 2020, 7:11am

How do "only mutating" set operations fit into bignum plans?

hostilefork · October 18, 2020, 4:21pm

Not sure what you mean...as bignums are different and addition is much more common than intersecting sets, so it would be untenable if + mutated. It's more important to have an answer there than for set unions.

But I'm still contemplating mutation. The issue is that implementation algorithms tend to involve making new sets. But it just seems ever-random in a language that mutates by default on things like APPEND or UPPERCASE to be picking other foundational things that don't mutate. The topic needs a broader philosophy. :-/

In other news: it may be possible to do short-circuiting AND via a refinement like /SHORT. Then since infix doesn't dispatch from a path (unless you use SHOVE >-)...make it a modal argument that controls the refinement.

>> 1 and 2
== #[true]

>> x: [...]
>> (integer? x) and @[x + 1 < 2]
== #[false]

>> (integer? x) >- and/short [x + 1 < 2]  ; long way to write it
== #[false]

It's a feature people could use if they wanted to, or avoid if they didn't.

hostilefork · November 11, 2020, 4:11pm

All right, I think we have the actual, final answer (at least for the conditional operators, not the bitwise ones. :-/)

The answer comes from the notion that "Plain GROUP! branches only run if branch taken"

Basically, we have precedent now that:

>> branchy: func [flag] [either flag '[<a>] '[<b>]]

>> either true (print "a" branchy true) (print "b" branchy false)
a
== <a>

Most people seem to think this makes perfect sense...and if you want the less-intuitive behavior of running both branches you can get it with GET-GROUP!.

If you think that makes perfect sense, why be surprised at this:

>> false and (print "right" false)
== #[false]

>> true and (print "right" true)
right
== #[true]

Not exactly universe-shattering to absorb that consequence. Then, if you want the right hand side to run unconditionally, you use a GET-GROUP!:

>> false and :(print "right" false)
right
== #[false]

That looks pretty sensible to me. It just means that the right hand side of an AND has to be in a GROUP! or a GET-GROUP!, which you'd probably want a lot of the time anyway.
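One way to picture the rule, sketched in Python rather than Ren-C: a callable plays the role of the plain GROUP! (run only if needed), and an already-computed value plays the role of the GET-GROUP!. The names `cond_and` and `right_side` are hypothetical, and this models only the evaluation order, not the real implementation:

```python
log = []

def right_side(value):
    """Stand-in for (print "right" value): records that it ran."""
    log.append("right")
    return value

def cond_and(left, right):
    """A callable `right` runs only when `left` is truthy (plain GROUP!);
    any other value was already evaluated by the caller (GET-GROUP!)."""
    if callable(right):
        return bool(left) and bool(right())
    return bool(left) and bool(right)

print(cond_and(False, lambda: right_side(False)), log)  # False []
print(cond_and(True, lambda: right_side(True)), log)    # True ['right']
log.clear()
print(cond_and(False, right_side(False)), log)          # False ['right']
```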

iArnold · November 11, 2020, 6:40pm

This was my suggestion in the other thread.