Behavior of Plain GROUP! in PARSE

Historically, a plain GROUP! in PARSE runs the conventional evaluator. But it discards its result and doesn't put it into the stream of parse instructions.

That means you can't do things like this:

parse "aaa" [some (either condition ["a"] ["b"])]

Because the "discarding" behavior of GROUP!s in PARSE is so pervasive, it seemed hard to challenge. So Ren-C adopted the GET-GROUP! :(...) as the non-discarding form, which you could use to splice arbitrary arguments or instructions into the PARSE stream.

But to @rgchris's taste, having the colons may be worse than rethinking the rule...replacing it with one in which you have to explicitly throw away the result. So we'd presume PARSE would have an ELIDE keyword.

parse "aaa" [some ["a" elide (print "found an A")]]

In a new way of working...a failure to put the ELIDE there would mean you'd be trying to splice a #[void] into the instruction stream, which we'd imagine is an error. (You'd need a quoted void to match an actual void value).

The choice to elide can come from inside the group as well:

parse "aaa" [some ["a" (elide print "found an A")]]

Or you could use something that returns NULL...since a NULL splice is ignored by the parse stream. This is the mechanism by which branching constructs can be used to good effect:

parse "aaa" [some ["a" (if false ["b"])]]
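To make the three outcomes above concrete (splice the result, ignore a NULL, error on an un-elided void), here is a toy model in Python. It is not Ren-C and not how PARSE is implemented; it only matches literal strings. The names `match_rules`, `ELIDE`, `VOID`, and `noisy` are hypothetical, invented for this sketch, and a zero-argument callable stands in for a GROUP!.

```python
VOID = object()   # hypothetical stand-in for a void evaluation result
ELIDE = object()  # hypothetical stand-in for the proposed ELIDE keyword


def match_rules(text, rules):
    """Return True if `rules` consume all of `text` (toy model only).

    A rule is a literal string, a zero-argument callable standing in for
    a GROUP!, or the ELIDE marker followed by such a callable.
    """
    pos = 0
    pending = list(rules)
    while pending:
        rule = pending.pop(0)
        if rule is ELIDE:
            group = pending.pop(0)
            group()                      # run the group, discard the result
        elif callable(rule):
            result = rule()
            if result is VOID:
                raise ValueError("cannot splice a void into the rules")
            if result is not None:       # None models NULL: nothing spliced
                spliced = result if isinstance(result, list) else [result]
                pending[0:0] = spliced   # splice the product in as next rules
        elif text.startswith(rule, pos):
            pos += len(rule)             # literal matched, advance
        else:
            return False
    return pos == len(text)


def noisy():
    print("found an A")
    return VOID  # models a group whose evaluation produces void


match_rules("a", ["a", ELIDE, noisy])     # runs noisy, discards its result
match_rules("ab", ["a", lambda: "b"])     # the group's product is used as a rule
match_rules("a", ["a", lambda: None])     # a NULL-like result is ignored
```

In this model, leaving out `ELIDE` in front of `noisy` raises an error, which mirrors the idea that a void splice should not pass silently.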

Note: In the early days of expression barriers, I mused this kind of approach could be done with an idiom that would look like what some languages call "banana clips":

parse "aaa" [some ["a" (| print "found an A" |)]]

Under that proposal you wouldn't need the barrier on both ends, but it might help. Today's expression barriers only block evaluations and don't overwrite evaluations, though. But it's a bit interesting, and shows the proposal is not fully "new".

A Significant Change, But... Is It More Intuitive, and Better?

Using GROUP!s to splice material into parse rules...or to act as arguments to rules...is a desirable feature.

If you imagine people being introduced to GROUP! as not discarding, it would just seem to make more sense to work with them generically. PARSE rules would be more creative with less repetition from the get-go.

And with ELIDE functioning in the evaluator, they could build on that knowledge in PARSE to know they have a way to wedge value-generating code into a place where they need that value to disappear.

The other worldview...that GET-GROUP! is a non-discarding form of GROUP!...and plain GROUP! discards the value...doesn't have a basis in anything.

Note: I'm thinking that @ groups will continue to have their "as-is" meaning, which is to say that if you keep @[some integer!] then you are asking to keep the actual two words some integer!, while keep [some integer!] would act as a rule that would match the input for some number of integers. Similarly, keep @(reverse [integer! some]) would keep the two words some integer! as well, while keep (reverse [integer! some]) would be a request to match input.

Concerns To Address

I'm not so much concerned about backwards compatibility, as I am about two issues:

  1. Is it actually, measurably, knowably better?

  2. Will the machinery bend to allow emulation of the Rebol2 semantics?

I think at first I was reluctant to think it could be better, based on thinking "there's a lot of points in parse rules that trigger code, and having to put ELIDE on all of them sounds burdensome".

But the game has changed in some ways. For instance, as invisibility is on the rise, you don't have to worry about injecting (-- pos) to dump a parse position variable, because the same invisibility that works inside something like CASE or ALL works here.

I'm interested in (2) because I kind of insist on being able to implement Redbol. So I'd like there to be some kind of hook for this. Ideally it would be the kind of hook that would have allowed a motivated individual to add something like the behavior of COLLECT and KEEP as it acts now to PARSE. I'm still trying to work that out.

I do feel like the idea deserves a shot, because when I look at it now it seems like how PARSE probably should have been done to start with. And we want to show best practices to people trying to implement dialects--and build infrastructure to help with those best practices--which are now informed by things like NULL-ness and invisibility...


I hope we get some good community feedback here. I like the new functionality and I'm fine with the proposed consistent GROUP! behavior. OTOH I'm not tied to a legacy codebase to convert.

Stick with any other programming language over a longer period of time, and chances are you will have to make changes in your codebase to keep up with its progression. Many projects in my working history have had this as a common denominator.

Trying it out in practice, I do think it is conceptually more "healthy" to see the discarding case as the behavior requiring a notation. You don't generally conceive of things in parentheses being associated with discarding the result...it "groups" things.

Yet it does get wordy. Let's say you start with something like:

        parse skip executable string-header-offset [
            (mode: 'read) pos: section-header-rule
            (
                assert [sh_offset = string-section-offset]
                sh_size: sh_size + (1 + length of encap-section-name)
            )
            (mode: 'write) :pos section-header-rule
            to end
        ]

Becoming:

        parse skip executable string-header-offset [
            elide (mode: 'read) pos: section-header-rule
            elide (
                assert [sh_offset = string-section-offset]
                sh_size: sh_size + (1 + length of encap-section-name)
            )
            elide (mode: 'write) :pos section-header-rule
            to end
        ]

Sidenote for those who don't know: ASSERT is now an invisible.

Seeing the wordiness makes me wonder if an alternative notation should be available to mean the same thing. I don't think :(...) makes sense, because colons don't have any association with discarding.

/(...) is another option:

        parse skip executable string-header-offset [
            /(mode: 'read) pos: section-header-rule
            /(
                assert [sh_offset = string-section-offset]
                sh_size: sh_size + (1 + length of encap-section-name)
            )
            /(mode: 'write) :pos section-header-rule
            to end
        ]

@rgchris would hate it. Period is more subtle:

        parse skip executable string-header-offset [
            .(mode: 'read) pos: section-header-rule
            .(
                assert [sh_offset = string-section-offset]
                sh_size: sh_size + (1 + length of encap-section-name)
            )
            .(mode: 'write) :pos section-header-rule
            to end
        ]

But I'm looking at reserving .even? and such for predicates in parse, which is important:

parse [1 2] [some .even?]

We could go for the doubling-up of GROUP! to force a discard:

        parse skip executable string-header-offset [
            ((mode: 'read)) pos: section-header-rule
            ((
                assert [sh_offset = string-section-offset]
                sh_size: sh_size + (1 + length of encap-section-name)
            ))
            ((mode: 'write)) :pos section-header-rule
            to end
        ]

It has the quality that you can tell on both ends that the discard is happening. But it doesn't line up very well with the splice-meaning for the same notation in COMPOSE.


All that said, I don't think the :(...) for PARSE splicing is that terrible. It does seem the rarer operation, it feels learnable, and it doesn't require breaking compatibility with history. I'd also considered that it might be done with ((...)) which could be seen as asking to "splice" the parse rule in...and maybe it stands out better and looks prettier and more symmetrical.


I'm all for consistency. But GET-WORD!s are commonly used to set/inject a new position into PARSE, so it doesn't seem much of a cognitive stretch to use GET-GROUP! to splice results in. I don't know what the chances are for getting a notation or shorthand for ELIDE... I don't care much for the forward slash, and I'd rather be able to use a tilde or something else that provides a decent visual cue.

I had to think this over, because I wasn't sure whether my reluctance was just because of change.

In my experience adjusting rules is an important usage of groups in parse, but not the most frequent usage.

Furthermore the number of elides just doesn't look good.

So I would vote to keep plain groups as vanishing and get-groups as splicing. Double groups could be used as well, but I think that might be harder when you are constructing rules programmatically.


One thing that I have believed all along is that groups needed to act consistently. e.g. if a rule was going to see the result of an expression, it couldn't be in a plain group...because plain groups were discarded.

@rgchris was less concerned about this, and didn't see a problem with the likes of:

parse "aa" [some (first ["a" "b"]) (x: "c")]

...with the feeling that you should know how many arguments rules take, and so if a group is in a position that a rule expects an argument, then the value is seen.

That didn't feel right to me...PARSE rules are hard enough to read as it is. So at the very least I preferred to see:

parse "aa" [some :(first ["a" "b"]) (x: "c")]

This cues you to realize when an evaluation result is being actually used.

However...this pre-dated the idea of rule-splicing. So when I suggested it, I was only meaning that :(...) would be "passing an argument to a rule", and it would be an error otherwise.

But if we're talking about it always splicing content, then things start to get murkier again...making me wonder if :(...) should be for slipping arguments to rules but an error in other places, ((...)) should be for splicing rules but an error in arguments, and (...) should be discarding and an error in arguments too.
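The three-way split being wondered about here can be written down as a small lookup table. This is only a Python sketch of the idea as stated, not anything PARSE does today; the labels `get-group`, `double-group`, `plain-group`, `argument`, `rule`, and the function `group_action` are hypothetical names chosen for illustration.

```python
# Hypothetical labels for the three notations and the two places a group can sit.
ALLOWED = {
    ("get-group", "argument"): "pass result to the rule",   # :(...)
    ("double-group", "rule"): "splice result as a rule",    # ((...))
    ("plain-group", "rule"): "run and discard result",      # (...)
}


def group_action(notation, position):
    """Look up what a group does, or raise if that pairing is an error."""
    try:
        return ALLOWED[(notation, position)]
    except KeyError:
        raise ValueError(f"{notation} is an error in {position} position") from None


print(group_action("get-group", "argument"))
```

Every pairing missing from the table is an error under this idea, so a plain group sitting where a rule expects an argument would be rejected rather than silently used.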

I do feel that the :(...) with the colon is a bit slight and easy to miss for something that is fundamentally putting code into the parse stream...so my mind does go back to ((...)) which was the first implementation.


This example doesn't quite sit right:

If a group's product is to be used as a rule, it should be explicit, say:

parse "aa" [some use (first ["a" "b"]) (x: "c")]

This could be a GET-GROUP! where this type is available, I suppose (I'm not wholly convinced of GET-GROUP!'s necessity), though my leaning would be to solve through words.

Okay, well that would make it sound like we are in agreement.

My historical stance was that all plain GROUP!s should discard. Hence plain groups in positions as arguments to rules should raise errors.

If the result of a GROUP! is to be "used" then that would require some form of decoration.

Right now I kind of feel like the light-and-easy-to-miss GET-GROUP! is a fit for arguments to rules, while a doubled-up group might be better when splicing a new rule when it isn't an argument. But I don't know.


For a construct like KEEP we have the distinction of whether you want to keep via matching a rule, or whether you just want to add some raw material that isn't in the input series.

My current plan is that @value means to match a value literally (e.g. if VALUE is a block then it doesn't use it as a rule the way plain value does). But then @value is a rule, so keep @value should honor the idea of saying you're keeping if it matches that exact value. Hence that eliminates the idea of saying that keep @([<some> <stuff>]) would literally keep that data without matching.

GET-BLOCK! would be a possibility for that, though it would perhaps be weird to be running DO code inside PARSE in a BLOCK! vs. a GROUP!:

keep :[reverse [<stuff> <some>]]


Anyway, no shortage of things to think about. Hope you're finding some free time to come help think about it. Lots of interesting stuff afoot, which I think is validating the design path taken.
