Literal Matching with the @ Types In UPARSE

hostilefork · August 2, 2021, 6:58pm

I mentioned that the @ types were slated for use for literal matching. The most frequent example I have given is:

>> block: [some "a"]

>> uparse [[some "a"] [some "a"]] [some @block]
== [some "a"]  ; success gives result of last matching rule

Works with all types:

>> num: 1

>> uparse [1 1 1] [some @num]
== 1

I didn't mention things like @(gr o up) but those work too:

>> uparse [1 1 1] [some @(3 - 2)]
== 1

I realized I actually do not know how to write the above two cases in Red or Rebol2. You can't use the number as a plain variable in Red, since it acts as a repeat rule. (UPARSE prohibits that: since it's a rule that takes an argument, you must use REPEAT for such behavior.)

red>> num: 1

red>> parse [1 1 1] [some num]
*** Script Error: PARSE - invalid rule or usage of rule: 1

Also in Red, I'm not clear on why the following isn't an error, since the GROUP! product is just discarded:

red>> parse [1 1 1] [some (3 - 2)]
== false

This is something that would work in R3-Alpha, but doesn't in Red or Rebol2:

red>> parse [1 1 1] [some quote (3 - 2)]
== false

Your guess is as good as mine. Whatever the answer in their world is, it's not obvious. But I think the @ types give a clean answer in UPARSE.
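To make the idea concrete outside of Ren-C, here is a toy model in Python of what "match this value literally" means. It is only a sketch: the names lit, some, and parse are hypothetical, and it does not reflect how UPARSE is actually implemented. A rule is modeled as a function from (data, position) to (new position, product), and lit stands in for an @ value.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# A rule takes (data, pos) and returns (new_pos, product), or None on failure.
Result = Optional[Tuple[int, Any]]
Rule = Callable[[Sequence[Any], int], Result]


def lit(value: Any) -> Rule:
    """Hypothetical analogue of @value: match one item equal to `value`."""
    def rule(data: Sequence[Any], pos: int) -> Result:
        if pos < len(data) and data[pos] == value:
            return pos + 1, data[pos]
        return None
    return rule


def some(inner: Rule) -> Rule:
    """Match `inner` one or more times; the product is that of the last match."""
    def rule(data: Sequence[Any], pos: int) -> Result:
        result = inner(data, pos)
        if result is None:
            return None
        while True:
            nxt = inner(data, result[0])
            # Stop on failure, or if no input was consumed (avoids looping forever).
            if nxt is None or nxt[0] == result[0]:
                return result
            result = nxt
    return rule


def parse(data: Sequence[Any], rule: Rule) -> Any:
    """Succeed only if `rule` consumes all of `data`; return its product."""
    result = rule(data, 0)
    if result is None or result[0] != len(data):
        return None
    return result[1]


block = ["some", "a"]  # a list that merely looks like a rule; it is matched as data
num = 1

print(parse([["some", "a"], ["some", "a"]], some(lit(block))))  # ['some', 'a']
print(parse([1, 1, 1], some(lit(num))))                         # 1
print(parse([1, 1, 1], some(lit(3 - 2))))                       # 1
```

The point of the model is that lit never interprets its argument as a rule, so a number is not treated as a repeat count and a list is not treated as a sub-rule.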

But What About @[bl o ck] ?

We might say that it means match a block literally:

>> uparse [[some "a"] [some "a"]] [some @[some "a"]]
== [some "a"]

This seemed a bit wasteful, since we already have a way to match blocks literally by quoting them:

>> uparse [[some "a"] [some "a"]] [some '[some "a"]]
== [some "a"]

But UPARSE has changed the game for why @[...] and '[...] can mean different things... because block rules synthesize values. And who's to say you might not want to match a rule and use its product as the literal thing to match against?

>> uparse [1 1 1 2] [@[some '10, (10 + 10) | some '1 (1 + 1)]]
== 2

In other words your rule can match and provide an answer for the thing to match next. We have zero experience with how often that might be useful, but maybe it is?
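The "match a rule, then match its product literally" behavior can be sketched with a toy model in Python. This is an illustration under assumptions, not the UPARSE implementation: every name here (lit, some, group, seq, alt, match_product, parse) is hypothetical. A rule is a function from (data, position) to (new position, product), and match_product plays the role of @[...].

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# A rule takes (data, pos) and returns (new_pos, product), or None on failure.
Result = Optional[Tuple[int, Any]]
Rule = Callable[[Sequence[Any], int], Result]


def lit(value: Any) -> Rule:
    """Match one item equal to `value`."""
    def rule(data, pos):
        if pos < len(data) and data[pos] == value:
            return pos + 1, data[pos]
        return None
    return rule


def some(inner: Rule) -> Rule:
    """Match `inner` one or more times; product is that of the last match."""
    def rule(data, pos):
        result = inner(data, pos)
        if result is None:
            return None
        while True:
            nxt = inner(data, result[0])
            if nxt is None or nxt[0] == result[0]:
                return result
            result = nxt
    return rule


def group(fn: Callable[[], Any]) -> Rule:
    """Analogue of (...): run code, consume no input, product is the result."""
    return lambda data, pos: (pos, fn())


def seq(*rules: Rule) -> Rule:
    """Match rules in order; product is that of the last rule."""
    def rule(data, pos):
        product = None
        for r in rules:
            result = r(data, pos)
            if result is None:
                return None
            pos, product = result
        return pos, product
    return rule


def alt(*rules: Rule) -> Rule:
    """Analogue of |: the first alternative that matches wins."""
    def rule(data, pos):
        for r in rules:
            result = r(data, pos)
            if result is not None:
                return result
        return None
    return rule


def match_product(inner: Rule) -> Rule:
    """Hypothetical analogue of @[...]: match `inner`, then match its product literally."""
    def rule(data, pos):
        result = inner(data, pos)
        if result is None:
            return None
        pos, product = result
        return lit(product)(data, pos)
    return rule


def parse(data: Sequence[Any], rule: Rule) -> Any:
    """Succeed only if `rule` consumes all of `data`; return its product."""
    result = rule(data, 0)
    return None if result is None or result[0] != len(data) else result[1]


rule = match_product(alt(
    seq(some(lit(10)), group(lambda: 10 + 10)),
    seq(some(lit(1)), group(lambda: 1 + 1)),
))
print(parse([1, 1, 1, 2], rule))  # 2
print(parse([1, 1, 1, 3], rule))  # None
print(parse([10, 10, 20], rule))  # 20
```

In the first call, the second alternative consumes the three 1s and produces 2, and that 2 is then what must appear next in the input.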

Lone @ For Quoting Next Item Literally

I've defined @ to synthesize the value after it:

>> uparse "a" [collect [keep @ keep, keep <any>]]
== [keep "a"]

So it's acting like this:

>> uparse "a" [collect [keep ('keep), keep <any>]]
== [keep "a"]

This can be helpful when you're building a rule with COMPOSE, where the contention over GROUP! (and trying to nest one inside another) becomes a bummer.

As an example, let's say you're trying to build a KEEP parse rule inside a parse rule (which I am actually at this moment trying to do in the whitespace interpreter project):

name: "Binky"
uparse ... [... collect [
     keep (compose [keep (to word! name)])
] ...]

Okay, so that gives you [keep Binky]. That's not what you wanted, since running it will give an error that Binky is not defined. You can quote it...

keep (compose [keep '(to word! name)])

Now you've got [keep 'Binky]. But you're not trying to match Binky in the input, you're trying to synthesize it out of thin air. Imagine we need it in a GROUP!... let's just go ahead and use the wacky engroup operator to do that:

keep (compose [keep (engroup quote to word! name)])

So we got our [keep ('Binky)]... but that was annoying.

Since @ can literally synthesize a value without matching in parse, we can make something less head-scratchy:

keep (compose [keep @ (to word! name)])

And that gives us [keep @ Binky] which is much nicer. The key to seeing why this breaks us out of the problem is that it lets us get at literal values without a GROUP!, which means we aren't trying to COMPOSE inside of COMPOSE groups.

hostilefork · August 15, 2021, 2:05am

There's a slightly unsettling disconnect in this (one I've kind of learned to live with), but I'll mention it.

Defined this way, the @ operator itself has a parity. In the normal evaluator:

>> var: @ x
== x

>> var
== x

And in UPARSE:

>> uparse "" [var: @ x]
== x

>> var
== x

All's good so far... this is the Gordian-knot slicer I talked about. Synthesizing values out of thin air without a GROUP!, so it plays nicely with COMPOSE.

But the parity only exists for the standalone @. It breaks down after that. In the normal evaluator:

>> var: @('x)
== @('x)

But in UPARSE:

>> uparse [x] [var: @('x)]
== x

While we accept that UPARSE and regular code act differently, I'm a bit uncomfortable that @ would act the same but @(...) wouldn't. Just something to wonder about.

hostilefork · February 9, 2024, 12:34am

hostilefork:

I mentioned that the @ types were slated for use for literal matching. The most frequent example I have given is:
>> block: [some "a"]

>> uparse [[some "a"] [some "a"]] [some @block]
== [some "a"]  ; success gives result of last matching rule

This week I've been turning over in my head how the change in meaning of @ under normal evaluation should affect its meaning in dialects.

We are free to choose new meanings entirely. But when the dialect's needs intersect the meanings generally understood by the evaluator, it seems they should be related.

Hence these questions:

If @ is to be some kind of "pick up a binding" thing, what binding is it talking about? The binding of the parse rules going by? Or the binding of the data being parsed (assuming it's a block)?
How would the important function of literal matching be accomplished if @ is taken away from that purpose?

Binding Of Data Being Parsed Is Likely More Useful

If you're using PARSE to implement a dialect, you probably need to keep track of one or more binding contexts in effect as you recurse.

I've shown this being applied with the single-arity operator tentatively called *in*.

This assumes there's something in the parse state which carries a relevant notion of a "current environment", which could be the context of the input. But the person in charge of interpreting the dialect may have ideas about this that are different.

Should this instead be the behavior of @ ?

parse [[word: 10]] [
   subparse @block! [
       let word: @set-word! let val: integer! (
           set word val
       )
   ]
]

It's different, because it isn't literally quoting its argument in the context of the parse rules. If you wanted that, then you could theoretically accomplish it by putting it in a group, like (@block!). This assumes the combinator for GROUP! uses the parse rules' current environment.

It's a bit mind bendy... and suggests that if you used @(...) then it would use DO to run the GROUP! and then bind the result of that.

So What Would You Use For Literal Matching?

Hrrrrm.

The writing on the wall may be that this is what GET-GROUP! / GET-WORD! / GET-TUPLE! do.

Those have been used for substituting rules, but maybe a keyword like REPARSE (like REEVAL?) should be doing that.

 parse "aaa" [some :(either true ["a"] ["b"])]

 parse "aaa" [some reparse (either true ["a"] ["b"])]

In the past I've suggested generalizing this operator and calling it INLINE.

 >> inline (second [a: b:]) 10
 == 10

 >> b
 == 10

So parse "aaa" [some inline (either true ["a"] ["b"])] would work.

To go the other direction and try to make "match this value literally" require an operator, it would be a weird quoting operator. I think it's more coherent if the "no I mean literally" operator is GET-XXX!, because that lines up with "don't run this function" of GET-WORD! in the main evaluator.