Issues with "Invisibles": a truly disappearing COMMENT


#1

So @MarkEye brought back up an idea that has crossed my mind every few months, about what it would take to make something that was truly “less than null”. Some way of returning a complete-absence-of-information, including even information about the absence of a value. :-/

The most “obvious” application most people would jump to (which turns out–in fact–not to be so obvious at all) would be COMMENT. So imagine:

9 = do [1 + comment "a" comment "b" 2 * 3]
9 = do [1 comment "a" + comment "b" 2 * 3]
9 = do [1 + comment "a" comment "b" + 2 * 3]

Despite the simple appearance, there are a lot of holistic concerns about such a thing showing up in the Rebol ecology. Here are some:

"I’ve made an acid that can eat through anything…"

This can’t come down to returning a new type of value (e.g. a COMMENT!). Because what would happen in your function when you said return make comment! ...? It would be skipped.

How would you test for them? if comment? c [print "it's a comment"] would turn into if comment? [print "it's a comment"]

Having it as a value type is not an option, so it would have to be some new character of the function definition itself.

You can’t “GROUP! them” and keep their semantics

One might ask if there should be a difference between these two statements:

 1 + comment "a" 2
 1 + (comment "a") 2

COMMENT isn’t a very motivating scenario: it’s single-arity and it quotes. But what if you had a more complex operation in this class, with multiple arguments, including evaluated ones?

Rebol has used () as a “null generator” for a long time. But might it be revisited so that GROUP!s that wind up containing no content–or just comments–vaporize? That would mean all these were the same when running DO:

1 + comment "a" 2
1 + (comment "a") 2
1 + () 2
1 + 2

The short answer is No. The long answer is N: (o)

UPDATE: Later it was decided the real answer is actually much longer–it rules out these particular cases, while allowing groups to vaporize in interstitial positions. Hence you can group them and keep their (absence of) content, but you can’t put those invisibled groups in some spots you could have put them without the group.

One key reason for using groups in the first place is to show the structure in a stream of varying arity. It provides an anchor to be able to say “that one GROUP! will turn into exactly one complete value, or it will error”. So if o vaporized in the (o) above, should N be 1 now?

Pulling the rug out from under that with “zero or one values” would have to be very worth it. And it’s very not. If an expression wants to be invisible and look convenient, make it a dialect and let it take a block:

1 i'm-invisible a <b> #c 'd + 2    ;-- don't define it like this
1 (i'm-invisible a <b> #c 'd) + 2  ;-- b/c this is void, not invisible

1 i'm-invisible [a <b> #c 'd] + 2 ;-- define it like this

Interaction with DO/NEXT…all invisible functions are effectively enfix

This is the biggest issue. Basically, a DO/NEXT cannot finish until it has consumed all these “invisible” expressions. Consider:

pos: _
do/next [1 + 2 comment "a" * 3] 'pos

For COMMENT to be truly “invisible”, that should act as 1 + 2 * 3. And the only way it can do so is if, when it reaches the comment "a", it eagerly continues processing, so it can find out if there’s anything on the other side.

Furthermore, the only way to be actually “invisible” is not to damage the evaluator stack at all. You don’t want the presence of the comment above to suddenly turn 1 + 2 * 3 into the semantics of 1 + (2 * 3). That means the comment needs to be dissolved right at the moment the 2 is evaluated, so it can be seen past.
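The two requirements above (keep going past a trailing invisible, and dissolve it without changing how the surrounding expression groups) can be mimicked in a toy model. This is a rough Python sketch, not Rebol and not the actual evaluator: `Word`, `skip_invisibles`, and `do_next` are made-up names, the block is a flat Python list, and the only operators are `+` and `*`, applied strictly left to right.

```python
import operator

class Word(str):
    """Hypothetical marker: a word token, as opposed to a string literal."""

OPS = {"+": operator.add, "*": operator.mul}

def is_word(item, spelling):
    return isinstance(item, Word) and item == spelling

def skip_invisibles(block, pos):
    """Step past any run of `comment <arg>` pairs starting at pos."""
    while pos < len(block) and is_word(block[pos], "comment"):
        pos += 2  # the word itself plus its single quoted argument
    return pos

def do_next(block, pos=0):
    """Evaluate one expression; return (value, new_pos).

    value is None when nothing but invisibles (or nothing at all) was left.
    """
    pos = skip_invisibles(block, pos)
    if pos >= len(block):
        return None, pos
    value = block[pos]
    # Dissolve comments right after each value, so an operator that
    # follows still sees the value to its left.
    pos = skip_invisibles(block, pos + 1)
    while pos < len(block) and isinstance(block[pos], Word) and block[pos] in OPS:
        op = OPS[block[pos]]
        pos = skip_invisibles(block, pos + 1)
        value = op(value, block[pos])
        pos = skip_invisibles(block, pos + 1)
    return value, pos

W = Word
block = [1, W("+"), 2, W("comment"), "a", W("*"), 3]
result = do_next(block)  # behaves like 1 + 2 * 3, evaluated left to right
```

In this model `result` is `(9, 7)`: the whole block is consumed in one step, and the comment does not regroup the arithmetic. A block holding only a comment yields `(None, 2)`, which is the model's stand-in for "nothing was there".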

Technically this is easy enough to do, but the results might surprise someone. Let’s imagine you think it would be cool to modify something like the variable DUMP function to be one of these “invisibles”. So you might write:

 x: 1 + 2 dump [x] * 3

That seems pretty cool, and intuitive in this case that when you dump X it hasn’t been assigned yet–the expression isn’t completed. But would it be as intuitive if you saw:

x: 1 + 2
dump [x]

One might expect 3. But as the example above shows, you can’t get that invisible property that way. If you did, then DO/NEXT would treat that as two expressions.

Is it best to be honest and just call these enfix functions?

Rather than getting into the complex details of defining a new category of functions that are “kind of exactly like enfix functions”, should we just say that’s what they are? They’re basically enfix functions which can pipe their left hand argument to the output in a transparent way. Says @MarkEye:

For the purposes of explication, can one consider COMMENT to be a tight infix operator that “returns” its left-hand side? (haha and its left-hand side is allowed to be empty!) Example: do/next [comment "thrillsville"] should behave exactly like do/next [], shouldn’t it?

There are a few technical challenges to implementing true transparency in this way, given that there is no END! datatype (yet behaviors can be distinguished internally to the evaluator between end and null). It could be worked past with some kind of return/proxy function that you just point at the argument you want to telegraph, and the evaluator takes care of it.

We don’t want to increase the number of parts in the box unnecessarily, so piggybacking on ENFIX may be okay. And also, making it a generic enfix mechanism means someone could design such an abstraction with non-tight semantics as well (if they’re okay with non-total-invisibility).

But it may be “weird”, and surprise someone who types HELP COMMENT and wonders why it’s not the “naive” form. Or, as @MarkEye says, it “explicates” the situation. Hiding the “latching” behavior on the previous result would only obscure the process.

Thoughts??


#2

So the reason I’m asking these questions is I actually wrote a first take on “invisible” functions. You would depict them by putting in their function spec that they really returned no type at all (e.g. return: []). This necessarily meant that they couldn’t return any other types… (this is why that doesn’t mean void, because sometimes you want to return a void as well as other types as a set, hence <opt>).

The code I wrote is nuanced differently than enfix, and optimized, with interesting aspects. I’m a bit loath to backpedal on those developments and surrender to the kind-of-meh-excuse of “let them use ENFIX”.

a hybrid option…


Introduce a tricky enfix operator, maybe with a tricky name like ELIDE

COMMENT was long ago changed to disallow active parameters like GROUP!, because comment (1 + 2) looks confusing. So does comment print x (was that what you actually wanted?). It was easy enough to put things in blocks–so it seemed better to put it in a block that shows more clearly what you meant.

So hybrid plan step one: make an enfix operator in the spirit that @MarkEye was mentioning, that’s fancy and pipes left tight to output regardless of args. For now I’ll call it ELIDE.

Imagine if you give it a BLOCK! it won’t evaluate the contents, but if you give it a GROUP! it will, etc. (More or less, a version of EVAL that “elides” its result value). This could provide invisible injections in the middle of any location in your evaluation stream:

x: 10
y: 1 + elide (print x) 2

(Tech note: To make ELIDE with today’s enfix, one has to effectively “see an end” on its “left”–which is actually “up” into the argument acquisition. It needs to have special behavior w.r.t. simulating 2 as the only thing there. We could imagine such magic telegraphing powers being given to all enfix, er, somehow.)

ELIDE can also be the multi-line arbitrary endpoint COMMENT you always wanted but were afraid to ask for:

 x: 10
 y: 1 elide [+ 2
 z: 30] + 7

(Note you cannot do that by wrapping the code in a curly-brace string.)

If a user looks at the definition and implementation of ELIDE, and sees it’s tricky with a grab and route of enfix left tight to output, they will feel they’re getting what they pay for–and be impressed. If they meet it someday in the debugger, they won’t be shocked by what it does…because it has to do that.

But for the modest user who was just trying to use COMMENT…


Make plain-old COMMENT look plain-old, but back it with new tech

The above ELIDE operation can be done roughly with existing ENFIX, tweaked slightly. But let’s say COMMENT’s definition is:

comment: func [
    return: []
    :value [block! any-string! binary! any-scalar!]
][
    ; nada
]

And let’s say that having return: [] (or however we want to spec this) means COMMENT forces completion of the left hand side, or it will error. Like an expression barrier would.

So unlike ELIDE, COMMENT disrupts order of an in-progress evaluation, to the point where it never acts enfix. This isn’t going to jolt anyone’s world–because no one used COMMENT mid-stream in evaluations before (it would leave behind voids and corrupt the expression).

But like ELIDE, this new COMMENT would not disrupt the value dropping out.

>> do [x: 10 + 20 comment "KOY4GOFF"]
== 30

So rather than acting “enfix”, it acts more like today’s expression barrier. (Not only does today’s expression barrier force left expressions to finish, it optimizes itself out by flushing… if you say do/next [| | | 1 + 2] they all get processed in that same DO/NEXT operation.)


An even deeper motivation: -avoiding- a new parameter type

I’ve tried to sell the above on its plausible-merits-to-the-layman, without talking about the “esoteric” case that actually made me come up with it. But in the beginning when I was making the <| and |>, I wanted:

 >> x: 1 + 2 * 3 <| print ["x is" x] blah blah blah
 x is 9
 == 9

I wanted an operator that could ask for the left hand side to be completed fully, and this was my motivating case. I didn’t like this being interpreted as:

 x: 1 + 2 * (3 <| print ["x is" x] blah blah blah)

And as @MarkEye will remember, I very much wanted a model of enfix that permitted it. But really, this is the only operator in that class, and it doesn’t even care what the right hand side evaluates to, nor does it want to see the left hand result. It doesn’t want to be parameterized by the left hand side, it wants to elide itself.

Left-completion could be a way to do it. But ordinary evaluative enfix could not force completion of the left, because of:

 return if x < 10 [20] else [304]

If more than one unit of expression got evaluated on the left–taken to its logical conclusion–you’d get return happening before ELSE had a chance:

 (return if x < 10 [20]) else [304]

And the #tight parameter class was deemed necessary for other reasons. So would we be needing @full parameters?

 >> foo: func [@x] [print x]
 >> foo "a" "b" "c"
 == "c" ;-- Variadics can do this, but, @x... seriously?

With this proposal, we dodge a new left enfix convention, and get a perfectly sensible definition for <|:

<|: func [
    {Evaluate any # of expressions, but completely elide the results.}

    return: []
        {Returns nothing, not even void (like COMMENT)}
    expressions [<opt> any-value! <...>]
         {Any number of expressions.}
][
    do expressions ;-- yes, you can DO or DO/NEXT a VARARGS!
]

…Any questions?..

I think this looks like the best of both worlds.


#3

I want to make a quick point:

pos: _
do/next [1 + 2 comment "hi" 7 + 9] 'pos

You might think it’s equally viable for POS to come back as [comment "hi" 7 + 9] and leave the comment for the next processing step, as it is for it to take care of the comment in the first step. You can still have the first step evaluate to 3 and the second step evaluate to 16.

But that only works because there was something after it. The stated goal is that do [1 + 2 comment "hi"] come back with 3. So what if the 7 + 9 hadn’t been there? You’d wind up with POS as just [comment "hi"], and that would become void…a DO/NEXT of that can’t fabricate 3 out of thin air. Hence running several DO/NEXTs on the block would have a different outcome from DO…which is bad.
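The mismatch described above can be seen in a toy model that supports both policies. This is a rough Python sketch, not Rebol: `step`, `run_steps`, and the `eager` flag are invented for illustration, `None` stands in for void, and the only operator is `+`. With `eager=True` a step swallows the comments that follow it; with `eager=False` a comment is left behind as a step of its own.

```python
class Word(str):
    """Hypothetical marker: a word token, as opposed to a string literal."""

def is_word(block, pos, spelling):
    return (pos < len(block) and isinstance(block[pos], Word)
            and block[pos] == spelling)

def skip_comments(block, pos):
    while is_word(block, pos, "comment"):
        pos += 2  # the word plus its one quoted argument
    return pos

def step(block, pos, eager):
    """One DO/NEXT-like step; returns (value, new_pos)."""
    if eager:
        pos = skip_comments(block, pos)  # leading comments
    elif is_word(block, pos, "comment"):
        return None, pos + 2  # lazy: the comment is a step of its own
    if pos >= len(block):
        return None, pos
    value = block[pos]
    pos += 1
    while is_word(block, pos, "+"):
        value += block[pos + 1]
        pos += 2
    if eager:
        pos = skip_comments(block, pos)  # trailing comments join this step
    return value, pos

def run_steps(block, eager):
    """Step to the end of the block, keeping only the last step's value."""
    pos, last = 0, None
    while pos < len(block):
        last, pos = step(block, pos, eager)
    return last

W = Word
tail = [1, W("+"), 2, W("comment"), "hi"]
more = [1, W("+"), 2, W("comment"), "hi", 7, W("+"), 9]
```

For `more`, both policies end on 16, so the difference is hidden. For `tail`, `run_steps(tail, eager=True)` ends on 3, while `run_steps(tail, eager=False)` ends on `None`, because the final step saw only the comment and had nothing to hand back.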

It might seem a little weird to consider all COMMENTs as being part of an expression that precedes it. But that’s how the directional arrow of Rebol’s evaluation dictates it.

UPDATE: This was overturned some months later with a clever reimagination of DO/NEXT…allowing invisible constructs not to be glued to the evaluation on their left. It’s an important development that has made invisibles act as one expects in control flow, making their use much easier.


#4

Both ELIDE and COMMENT seem to be working, though the comment style is not as strict as to only work in expression-barrier slots. It forces the left to completion as far as it can, but accepts being in a parameter slot when it reaches the limit. We can decide if this is too permissive, and it should raise an error if it can’t complete all the parameters of a function.

Though it only has a few explicit tests at the moment, it has survived boot and a bootstrap build. And some other abstractions, some of which are committed. So please add more tests, or propose “can it do…this?” cases.

I’m curious what kinds of ideas people have for what to do with these mechanics…besides just debug output / breakpoints / comments…

It seems like it could be useful in code-generating systems, where you want a side-effect to happen at certain moments but you don’t want to break the emitter stream of the expression you’re building. I don’t know.

Maybe it could be getting more competitive with Forth in the stack area, with the ability to invisibly PUSH expressions without having to worry about that being observed by your current expression pipe. But then a non-invisible POP could get the expressions back off?

Hopefully food for thought.


#5

I wound up swapping my stance on this, in terms of which-operator-should-get-which-technique.

ELIDE’s unpredictability of moment-of-evaluation made it really only useful if it had no side-effects. Which is to say, it was really only useful when it acted like a truly-invisible comment.

When ELIDE was changed to use the “simpler” (yet not truly invisible) mechanic, it became easy to use…and it could drop the requirement for its argument to be in a GROUP!. This simple mechanic is applicable to those who wish to study it and make things like DUMP…all they have to do is make it return [].

Meanwhile, the number of COMMENT-like abstractions is likely to be very few…because COMMENT pretty much covers it. It can have a somewhat more wily definition with the argument that it needs to, because it seeks the “true invisibility”.

UPDATE: It seems I spoke too soon on COMMENT being the only comment-like thing you need. It turns out even that had more directions to go, with being variadic and detecting the end of line. Who knows how many more uses might come up?

This opened up a number of doors, including to retake ALSO and kill AFTER, since a usable ELIDE is more flexible than either:


#6

One idea for using invisibles is to chain functions where you make some functions disappear from the chain if they’re not needed.

Imagine an image processor where you apply sequential effects, but only the ones specified as refinements.

imagemagick: func [ data 
    /monochrome
    /sharpen
    /blur
    /pixelise
][
    ... code ..
    sharpen: me ?? :sharpenf !! :nihil
    blur: me ?? :blurf !! :nihil
    pixelise: me ?? :pixelf !! :nihil

    return sharpen blur pixelise data
]

imagemagick/sharpen imagedata

giving a much cleaner look. If the refinements are absent, the functions just disappear.


#7

So I do think these examples of GROUP!s vanishing without feedback are bad. But the reason that they are bad is because they are in argument fulfillment positions.

When you are between expressions, it’s not really an issue if something vaporizes or not. And probably preferable if it does.

So what I’ve got going is this:

>> 1 + comment "a" 2
== 3

>> 1 + (comment "a") 2
** Script Error: + is missing its value2 argument

>> 1 + () 2
** Script Error: + is missing its value2 argument

>> 1 + 2
== 3

But then, also, this…

>> block: [1 + comment [2 * 3] 4 elide print "Outside" (comment "inside") | |]
== [1 + comment [2 * 3] 4 elide print "Outside" (comment "inside") | |]

>> do block
Outside
== 5

>> block: try evaluate/set block 'val
== [elide print "Outside" (comment "inside") | |]

>> val
== 5

>> block: try evaluate/set block 'val
Outside
== _

>> val
== 5

It may be the best of both worlds. GROUP!s don’t synthesize any values that aren’t there and act like invisibles. But if they’re empty, they have the behavior of expression barriers.
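The split between argument positions and interstitial positions can be imitated in a toy model. This is a rough Python sketch, not Rebol and not how the evaluator is implemented: every name in it (`do`, `fetch_arg`, `skip_interstitial`, `MissingArg`) is invented for illustration, a Python tuple plays the role of a GROUP!, `None` stands for "no value", and `+` is the only operator.

```python
class Word(str):
    """Hypothetical marker: a word token, as opposed to a string literal."""

class MissingArg(Exception):
    """Raised when an operator's argument slot ends up with no value."""

def is_word(item, spelling):
    return isinstance(item, Word) and item == spelling

def value_at(block, pos):
    """Value of the item at pos; a tuple plays the role of a GROUP!."""
    item = block[pos]
    return do(list(item)) if isinstance(item, tuple) else item

def skip_interstitial(block, pos):
    """Between expressions, bare comments and vanishing groups are skipped."""
    while pos < len(block):
        if is_word(block[pos], "comment"):
            pos += 2
        elif isinstance(block[pos], tuple) and value_at(block, pos) is None:
            pos += 1
        else:
            break
    return pos

def fetch_arg(block, pos):
    """Fulfill the right-hand argument of `+`; return (value, new_pos)."""
    while pos < len(block) and is_word(block[pos], "comment"):
        pos += 2  # a bare comment may still vanish here
    if pos >= len(block):
        raise MissingArg("+ is missing its value2 argument")
    value = value_at(block, pos)
    if value is None:  # a group that produced nothing
        raise MissingArg("+ is missing its value2 argument")
    return value, pos + 1

def do(block):
    """Evaluate every expression; return the last value (None if there is none)."""
    pos, last = 0, None
    while True:
        pos = skip_interstitial(block, pos)
        if pos >= len(block):
            return last
        last = value_at(block, pos)
        pos += 1
        while pos < len(block) and is_word(block[pos], "+"):
            arg, pos = fetch_arg(block, pos + 1)
            last += arg

W = Word
bare = [1, W("+"), W("comment"), "a", 2]
grouped = [1, W("+"), (W("comment"), "a"), 2]
trailing = [1, W("+"), 2, (W("comment"), "inside")]
```

In this model `do(bare)` gives 3 and `do(trailing)` gives 3, while `do(grouped)` and `do([1, W("+"), (), 2])` raise `MissingArg`, mirroring the four console results shown earlier in this post.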