What to Call Historical "SKIP" in PARSE?

SKIP suggests you're not using the result. Yet historical SKIP doesn't really do that:

rebol2> parse [1] [set x skip]
rebol2> x
== 1  ; How was this "skipped" exactly?

We could use another name for "match next input element". What should it be?

Moving the thread here, starting with @IngoHohmann's original proposal that SKIP mean what I've called "elide", i.e. don't use the result in a block assignment:

>> uparse [1 2] [x: [skip integer!, integer!]]
>> x
== 2

Seems to me that skip would be the natural name for your elide.

The current meaning of SKIP has always seemed a bit off to me (e.g. item: skip looks weird, why would you be using a result you skipped?). I wouldn't be that opposed to changing it.
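To make the two meanings concrete, here is a toy model in Python. It is only a sketch: the names any_one, of_type, elide, block and parse are made up for illustration and are not how Ren-C implements UPARSE. It assumes a block's result is the result of its last non-elided rule, which is what the x: [skip integer!, integer!] example above implies.

```python
# Toy model only: every name here is hypothetical, not Ren-C's implementation.

ELIDED = object()  # sentinel: the rule matched but contributes no result


def any_one(data, pos):
    """Match the next element, whatever it is, and yield it (historical SKIP)."""
    if pos < len(data):
        return data[pos], pos + 1
    return None  # failure: no input left


def of_type(t):
    """Build a rule that matches the next element only if it is an instance of t."""
    def rule(data, pos):
        if pos < len(data) and isinstance(data[pos], t):
            return data[pos], pos + 1
        return None
    return rule


def elide(inner):
    """Build a rule that matches inner but discards its result (proposed SKIP)."""
    def rule(data, pos):
        matched = inner(data, pos)
        if matched is None:
            return None
        return ELIDED, matched[1]
    return rule


def block(*rules):
    """Run rules in sequence; the result is that of the last non-elided rule."""
    def rule(data, pos):
        result = ELIDED
        for r in rules:
            matched = r(data, pos)
            if matched is None:
                return None
            value, pos = matched
            if value is not ELIDED:
                result = value
        return result, pos
    return rule


def parse(data, rule):
    """Return the rule's result if it consumes all the input, else None."""
    matched = rule(data, 0)
    if matched is None or matched[1] != len(data):
        return None
    return matched[0]
```

In this model parse([1], any_one) gives 1, so the "skipped" item is very much used, while parse([1, 2], block(elide(of_type(int)), of_type(int))) gives 2 because the first integer is matched and then dropped.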

Supporting the idea that not everyone loves SKIP, Topaz uses * as a synonym for "match any single item". I'm not crazy about it because it runs against the grain of what * tends to mean in matching dialects--which is "match many arbitrary things"--you'd think it would be a synonym for ANY. The more typical symbol for "match one thing" would be ?, which Ren-C has left open for dialects vs. canonizing for purposes of help. E.g. item: ?, items: some ?

Ren-C has an experiment where I thought it would be interesting to try using BLANK!, because of some precedent of its use for "match anything here"...but the idea hasn't exactly taken the world by storm. It was a thinking point for how the source-level use of something in a dialect might differ from the fetched use...which is something I still wonder about.

Block rules could use ANY-VALUE!, for which I've proposed a shortened form, VALUE!. That doesn't generalize to string or binary parsing, where I've been thinking datatypes would run a TRANSCODE-like operation. (e.g. parse "a" [c: char!] wouldn't work, but parse {#"a"} [c: char!] would.) This also runs up against the long-pondered issue of "what is a datatype, anyway".

Long story short... I think you're right that SKIP would look nice there, but we'd have to cook up another name for what today's SKIP does.

Just a thought: maybe OMIT
Then you'd have EMIT and OMIT.

(and perhaps a more plain-speech keyword could be: DROP)

OMIT is a possible alternative term for ELIDE... a word more people know. But I personally find the word ELIDE more pleasing.

I changed the existing SET to only allow you to set to one item. This broke a bit of parse code in TLS on some servers, due to this rule:

        set cmd [
            <client-hello> (
                client-hello/version ctx [1.0 1.2]  ; min/max versioning
            )
            | <client-key-exchange> (
                client-key-exchange ctx
            )
            | <change-cipher-spec> (
                change-cipher-spec ctx
            )
            | <finished> (
                encrypted-handshake-msg ctx finished ctx
            )
            | #application set arg [text! | binary!] (
                application-data ctx arg
            )
            | <close-notify> (
                alert-close-notify ctx
            )
        ]

The problem is that #application set arg [text! | binary!] was expecting to set the command to #application.

Anyway, here was a real-world case that was taking advantage of the "SET just sets to the first thing" property. It's a case where ELIDE would come in handy.
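A rough way to picture the breakage, again as a Python sketch rather than anything from the TLS code: the old SET is modeled as "capture the first element of whatever the rule matched", and the changed SET as "refuse any rule that matched more than one element". That second reading is my assumption about what "only allow you to set to one item" means, and match_application, set_first, set_single and ParseError are hypothetical names.

```python
# Hypothetical toy model of the SET change; not the TLS code or Ren-C internals.


class ParseError(Exception):
    """Raised when SET is handed a rule that matched more than one element."""


def match_application(data, pos):
    """Match "#application" followed by one str or bytes payload.

    Returns the position just past the match, or None on failure.
    """
    if (
        pos + 1 < len(data)
        and data[pos] == "#application"
        and isinstance(data[pos + 1], (str, bytes))
    ):
        return pos + 2
    return None


def set_first(data, pos, rule):
    """Old SET: capture the first element of the matched span, however long."""
    end = rule(data, pos)
    if end is None:
        return None
    return data[pos]


def set_single(data, pos, rule):
    """Changed SET (assumed reading): accept only a one-element match."""
    end = rule(data, pos)
    if end is None:
        return None
    if end - pos != 1:
        raise ParseError("SET rule matched %d elements, not 1" % (end - pos))
    return data[pos]
```

With message = ["#application", "hello"], set_first(message, 0, match_application) returns "#application", which is what the TLS rule relied on, while set_single(message, 0, match_application) raises ParseError because the rule spans two elements.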

Regardless of whether or not skip is the right name for elide (it feels like it may be), I'm continuing to be bothered by the likes of:

>> uparse "a" [x: skip]
>> x
== #a  ; how was that "skipped"?

>> uparse [1] [x: skip]
>> x 
== 1  ; how was that "skipped"?

Skipping just doesn't seem like it should be "value-bearing". And Haskell uses skip in the sense you would expect: not gathering the data.

What we want is more along the shade of meaning of ACCEPT. TAKE is a good short word, but it suggests mutation.

My suggestion that a TAG! (or something similar) might be an instruction gives us <?>, which might look better than ?:

>> uparse [1] [x: ?]  ; seems thin
>> x 
== 1

>> uparse [1] [x: <?>]  ; a little more "heft"
>> x 
== 1

Actually, you could put it in its own block if you wanted to, and it should work the same:

>> uparse [1] [x: [?]]  ; slower/costlier but could be optimized
>> x 
== 1

The asterisk isn't terrible, and it may be silly to avoid it on the basis of its usual "match many things" meaning. COMMA! can make it easier to see how it plugs into the rules:

>> uparse [1 2] [x: *, y: *]
>> x 
== 1

>> uparse [1 2] [x: [*], y: [*]]
>> x 
== 1

Right now * seems like probably the best of the ASCII offerings. And at least one other person who has thought about it has decided on it, so that's one vote from someone not present.

I'm going to switch SKIP to * for now and we can see how it feels.

I dunno, this is breaking with a near-universal pattern-matching convention. Most devs spend their time in other languages, not Ren-C.

If we're interested in bowing to "universal" conventions, we could use dot. It doesn't go along fantastically with commas, but you have brackets available there too:

>> uparse [1 2] [x: ., y: .]
>> x 
== 1

>> uparse [1 2] [x: [.], y: [.]]
>> x 
== 1

BLANK! remains an option, which was what Ren-C had been using:

>> uparse [1 2] [x: _, y: _]
>> x 
== 1

>> uparse [1 2] [x: [_], y: [_]]
>> x 
== 1

But this gets into the contentious meaning of what BLANK! is... as to whether it's a proxy for nothingness or a reified thing in its own right.

I need to reread my writing on the topic.

I'm not that concerned with following conventions, but I am concerned with avoiding clashes with them.
So the "this is kind of like what you may be familiar with in other languages, except if you use it like that it will bite you" is not a great flex. :)

Yeah, I don't care for dot. I do not mind plain old ? and BLANK! works better than I'd expect-- visually it looks like an empty "slot".

Maybe one of:
Next, This, Item

WHAT should it be...

It could vary depending on the type used: binary! => byte(s); any-string! => char(s); any-block! => value(s), or a universal name like atom(s).