Should PARSE Rules Always Be BLOCK!s?

hostilefork · November 7, 2020, 5:26am

At the top level, PARSE allows only a BLOCK! rule.

>> parse ["abc"] "abc" 
** Script Error: parse does not allow #[datatype! text!] for its rules argument

But when a rule is being invoked by reference, you get the option of that reference not having to be a BLOCK!:

>> rule: "abc"

>> parse ["abc"] [rule]
== ["abc"]

Being able to break down your parse rules into named subexpressions is one of the huge advantages PARSE has over things like RegEx.

But there are questions raised about whether certain fetched contents from names should be matched as value, matched as rule, or considered errors.

As an example: What do you think this should do?

>> x: first [skip]
== skip

>> parse [skip skip] [some x]
???

There seem to be three possibilities for what [some x] does in this context:

First Possibility: match x as rule, e.g. act like [some skip]

This is what R3-Alpha does, allowing "rule indirection": if a word looks up to another word that is itself a rule, it gets used as that rule:

r3-alpha>> x: first [skip]
== skip

r3-alpha>> parse [skip skip] [some x]  
== true

On the surface, this has the appeal of generality and the substitution principle. It implies any term that could occur in the rules as source could be put behind a WORD!.

>> x: '(print "Hi!")
== (print "Hi!")

>> parse "aa" [some ["a" x]]
Hi!
Hi!
== "aa"

I am not a fan of this. It may be what you meant when it's something like a word or group, but I think the more likely intent with a generic X would be that you were trying to match the content literally.

(And retriggering rules looked up via WORD! is very sketchy...especially if the rule takes arguments. Beyond being sketchy, it simply doesn't work in the UPARSE combinator model.)

Second Possibility: match x as value, e.g. act like [some 'skip]

Despite this being more likely what you meant, there's no need to guess. We have a new means of accomplishing this with the @-types that is much more general:

>> parse [skip skip] [some @x]
== skip

This lets you fetch a value and match it literally. So BLOCK! is not matched as a rule either:

>> block: [some "a"]
== [some "a"]

>> parse [[some "a"] [some "a"]] [some @block]
== [some "a"]

Third (Best) Possibility: Error

I feel a twinge of prescriptivism in saying that you either use the @xxx syntax at your reference sites or you form your subrules as proper blocks -or- inert values (strings and binaries).

So I think this is the option we should go with.
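To make the consequence concrete, here is a minimal sketch of how the earlier `x: first [skip]` example would be spelled under the error policy. It only reuses forms already shown in this thread and assumes a Ren-C build where the @-types are available; the exact error raised by the undecorated case is not shown because it isn't specified here.

```rebol
; Sketch only, assuming the "error" policy described above.
x: first [skip]               ; x holds the WORD! skip, which is not a rule

; Option 1: decorate the reference site to match the fetched value literally
parse [skip skip] [some @x]

; Option 2: form the subrule as a proper BLOCK!, then reference it undecorated
rule: ['skip]
parse [skip skip] [some rule]

; An undecorated reference to the WORD!-valued variable is the case that
; would raise an error instead of guessing between "rule" and "value":
; parse [skip skip] [some x]
```

Either spelling makes the intent visible where the rule is defined or where it is referenced, so nothing has to be inferred from the type of whatever the word happens to hold.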

hostilefork · June 26, 2023, 6:17pm

While words-looking-up-to-words were prohibited (along with words looking up to integer, or group, etc.), I thought maybe supporting QUOTED! would be all right. It seemed like variables holding quoted things were kind of rare, and it could help with efficiency.

>> rule: first ['skip]
== 'skip

>> parse [skip skip skip] [some rule]
== skip

This optimization over rule: ['skip] only applies when the user of the rule is using it abstractly, e.g. they don't know whether it's something complex with a lot of rules in it or whether it just matches a single value. What we would lose without it is the ability for authors of such rules to avoid the overhead of a BLOCK! while users still write rule undecorated, instead of needing to psychically choose between @rule and rule based on how the rule is expressed.

So this exception was carried over from PARSE3 to UPARSE.

But nowadays, QUOTED!s aren't so rare. You might be intending to actually match quoted skip, and giving an error could provide guidance to use the @ operations.

Also, making these "optimized rules" isn't the most intuitive thing in the world. e.g. the person making the rule has to quote the quoted somehow:

rule: ''something
rule: first ['something]

Cleanliness-wise, using a block is more obvious, and looks more "rule-like" in that it's how you would write the match inline.

rule: ['something]

In practice, I've just about never used the QUOTED!-to-avoid-a-block optimization. It's a very rare scenario...now that @ exists, that covers most cases in a much clearer way (because what you're matching in the variable can be as-is). So it's an optimization for an almost-never-occurring case of using a rule written by someone else that has nothing to match but a single literal item.

Actually...I can't think of this having come up, ever. Hence the potential for confusion of not catching when someone actually meant to match a QUOTED! is almost certainly too great. I'm going to kill it.
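For anyone who did lean on the QUOTED! exception, a rough sketch of the respelling follows. The variable name `item` is made up for illustration, and this assumes the exception is removed as described; it is not taken from any actual commit.

```rebol
; Before: relied on a QUOTED! variable being matched as a literal
; rule: first ['something]
; parse [something something] [some rule]

; After, option 1: wrap it in a BLOCK! so it reads like an inline rule
rule: ['something]
parse [something something] [some rule]

; After, option 2: keep the plain value in the variable and use @
item: 'something              ; evaluates to the WORD! something (hypothetical name)
parse [something something] [some @item]
```

With the second form the variable holds exactly what gets matched, so a variable that really does contain a QUOTED! can be matched with @ as well, without any ambiguity about which was meant.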

As usual...I'm glad we had this talk.