At the top level, PARSE allows only a BLOCK! rule.
>> parse ["abc"] "abc"
** Script Error: parse does not allow #[datatype! text!] for its rules argument
But when a rule is being invoked by reference, you get the option of that reference not having to be a BLOCK!:
>> rule: "abc"
>> parse ["abc"] [rule]
== ["abc"]
Being able to break down your parse rules into named subexpressions is one of the huge advantages PARSE has over things like RegEx. But there are questions raised about whether certain fetched contents from names should be matched as value, matched as rule, or considered errors.
As an example: What do you think this should do?
word: 'reject
parse [reject reject] [2 word]
There seem to be 3 possibilities for what [2 word] does in this context:
1. act like [2 reject]
2. act like [2 'reject]
3. raise an error
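To make the three candidates concrete, here is a toy model in Python. It is only a sketch and not how PARSE is implemented: the `Word` and `Quoted` classes, the `parse_repeat` function, and the policy names are all made up for illustration, and REJECT is simplified to "fail the match".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Toy stand-in for a plain word such as reject."""
    name: str


@dataclass(frozen=True)
class Quoted:
    """Toy stand-in for a quoted word such as 'reject."""
    name: str


def run_rule(rule, series, pos):
    """Apply one rule at pos; return the new position, or None on failure."""
    if isinstance(rule, Quoted):
        # A quoted word matches the same word literally in the input.
        if pos < len(series) and series[pos] == rule.name:
            return pos + 1
        return None
    if rule == Word("reject"):
        # Simplified REJECT keyword (assumption): this toy just fails the match.
        return None
    raise ValueError(f"toy model has no rule for {rule!r}")


def parse_repeat(series, count, fetched, policy):
    """Model `parse series [count name]`, where name holds `fetched`."""
    if policy == "error":        # option 3: a non-block fetch is refused
        raise TypeError("fetched rule must be a block")
    if policy == "as-rule":      # option 1: like [2 reject]
        rule = fetched
    elif policy == "as-value":   # option 2: like [2 'reject]
        rule = Quoted(fetched.name)
    else:
        raise ValueError(f"unknown policy {policy!r}")
    pos = 0
    for _ in range(count):
        pos = run_rule(rule, series, pos)
        if pos is None:
            return False
    return pos == len(series)


data = ["reject", "reject"]
word = Word("reject")            # what `word: 'reject` leaves in the variable

print(parse_repeat(data, 2, word, "as-rule"))    # False
print(parse_repeat(data, 2, word, "as-value"))   # True
```

Under the "error" policy the same call raises a `TypeError` instead of returning a result.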
I think #2 should be out of the question. It's just too inconsistent with how the reference would have been interpreted as a rule if it had held a BLOCK!.
What I have proposed to get #2 would be:
word: 'reject
parse [reject reject] [2 @word]
This lets you fetch a value and match it literally. So a BLOCK! fetched this way would be matched as a value too, not run as a rule.
#1 has the appeal of generality and the substitution principle. It implies any term that could occur in the rules as source could be put behind a WORD!.
>> sub: '(print "Hi!")
>> parse "aa" [some ["a" sub]]
Hi!
Hi!
== "aa"
That seems kind of interesting. Though if we're truly allowing anything, it means QUOTED!s would match things at one level of quote below themselves.
>> sub: ''reject
>> parse [reject reject] [2 sub]
== [reject reject]
Though having to decorate the value you're matching to say "I'm meant as a match" is awkward. This is why I like the @word concept... it lets you put the "match literally" annotation on the reference, while leaving the thing you are matching at the right quote level.
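The quote-level bookkeeping can be sketched with another toy Python model. Again, this is an illustration under assumptions, not PARSE itself: the `Value` class and the `evaluate`, `matches_as_rule`, and `matches_literally` helpers are hypothetical names, and only quoted words are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """Toy value: a word name plus how many quote marks it carries."""
    name: str
    quotes: int = 0   # 0 is reject, 1 is 'reject, 2 is ''reject


def evaluate(source):
    """Model evaluation of a quoted literal: one quote level is removed."""
    if source.quotes == 0:
        raise ValueError("toy model only evaluates quoted literals")
    return Value(source.name, source.quotes - 1)


def matches_as_rule(fetched, item):
    """Option 1: a fetched quoted value matches input one quote level lower."""
    return fetched.quotes > 0 and Value(fetched.name, fetched.quotes - 1) == item


def matches_literally(fetched, item):
    """The @word idea: the fetched value must equal the input item as-is."""
    return fetched == item


item = Value("reject")                      # an element of [reject reject]

sub = evaluate(Value("reject", quotes=2))   # sub: ''reject  leaves 'reject
word = evaluate(Value("reject", quotes=1))  # word: 'reject  leaves reject

print(matches_as_rule(sub, item))      # True
print(matches_literally(word, item))   # True
print(matches_literally(sub, item))    # False
```

The last line shows the point of putting the annotation on the reference: the value being matched stays at the same quote level as the input, with no extra quote needed on the stored value.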
I really think the sub: ''reject example is clearer as:
>> sub: ['reject]
>> parse [reject reject] [2 sub]
== [reject reject]
It's enough clearer that I feel a twinge of prescriptivism in saying that you either use the @xxx syntax at your reference sites or you form your subrules as proper blocks. Which is why I'm bringing up option #3.
But the flexibility of the substitution principle is kind of hard to deny. You might want to define:
match-op: either condition ['any] ['some]
parse "aaaa" [some "a", match-op "b"]
If all rules are forced into blocks, you wouldn't get that parameterization. Because this wouldn't work:
match-op: either condition [[any]] [[some]]
parse "aaaa" [some "a", match-op "b"] ; some "a", [any] "b"
With a block-rules-only policy for fetched WORD!s, we could say that you get that kind of keyword substitution only if you use :match-op as a GET-WORD!.
I Guess #1 Should Win
What I've mostly done here is explain why #2 is not going to happen, and why you should be happy about the new @(...) matching.
(Note: @foo will not allow voids, but if you do @(get/any 'foo) and it's void, it will match that void symbol... because inside the GROUP! you had to go through whatever rigamarole it takes to get at a void. This protects against typos, so that @some-undef doesn't try to match ~undefined~ and silently fail.)
Note About Interoperability With The @[datatype] Proposal
Despite this new meaning for @a, @a.b, @a/b, and @(a b)... there'd not be any reason for @[a b] to mean the same thing as '[a b] as a parse rule.
This outlier status was the original premise guiding why @[integer] and friends were "relatively useless" enough to take for datatypes.
But the outlier status is also kind of confusing. Plus, now that @[...] is a common branch type, and wanting append [a b c] @[d e] to give [a b c [d e]], it complicates the picture.
With VOID! representation having undergone a eureka moment, it would be nice if datatypes have such a moment soonish.