I wanted to make a REWORD variation that would look for escaped parts of strings and extract them as words. So:
Input: "abc$(def)ghi"
Output: ["abc" def "ghi"]
It's a common-seeming and not entirely trivial task. The first thing I came up with is a bit convoluted...perhaps because I tried to not repeat the "$(" and ")" strings in the rule:
parse text [
    collect [
        try some [
            not <end>
            (capturing: false)
            try keep between <here> ["$(" (capturing: true) | <end>]
            :(if capturing '[
                let inner: between <here> ")"
                keep (as word! inner)
            ])
        ]
    ]
]
It basically alternates between a non-capturing mode and a capturing mode, using a variable to decide whether the capturing mode needs to run.
It has to throw in a NOT <END> for reasons I explain in another post: it's running alternating rules that may both opt out.
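To cross-check what the rule is expected to produce, here is a rough model of the same alternation in plain Python. It is not UPARSE, just a reference for the expected results. The ("text", ...) and ("word", ...) tuples are my own stand-ins for strings and WORD!s, and raising an error on an unterminated "$(" is an assumption, not something the rule above is claimed to do.

```python
def reword_split(text):
    """Alternate between a literal run and a $(...) capture, like the rule above."""
    out = []
    pos = 0
    while pos < len(text):                 # plays the role of NOT <END>
        capturing = False
        start = text.find("$(", pos)       # BETWEEN <here> ["$(" | <end>]
        if start == -1:
            literal, pos = text[pos:], len(text)
        else:
            literal, pos = text[pos:start], start + 2
            capturing = True
        out.append(("text", literal))      # kept even when empty
        if capturing:
            close = text.find(")", pos)    # BETWEEN <here> ")"
            if close == -1:
                # Assumption: treat a missing ")" as an error in this model.
                raise ValueError("unterminated $( ... )")
            out.append(("word", text[pos:close]))
            pos = close + 1
    return out
```

Calling reword_split("abc$(def)ghi") gives the three items from the example at the top, and the loop condition is what stops a final empty literal from being emitted when the input ends right after a ")".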
I use a GET-GROUP! spliced conditional rule, as UPARSE doesn't have any loop-interrupting constructs yet. So you can't say "Stop running this rule, but consider it to have matched." There's only LOGIC! of #[false] which means what FAIL used to mean...e.g. the overall rule did not match (so any collected material would be forgotten).
Since it can't break out of the rule and report success, it has to have a way to skip over a rule. So the rule for capturing inside the parentheses conditions itself out with an IF statement and a generated rule. I could have instead written that as an alternate rule, where if NOT CAPTURING is true the first alternative matches and the capture code is bypassed:
parse text [
    collect [
        try some [
            not <end>
            (capturing: false)
            try keep between <here> ["$(" (capturing: true) | <end>]
            [:(not capturing) |
                let inner: between <here> ")"
                keep (as word! inner)
            ]
        ]
    ]
]
That feels more convoluted to me because of the inverse logic of the NOT, though.
It produces more empty strings than I would like:
Input: "$(abc)$(def)$(ghi)"
Output: ["" abc "" def "" ghi]
It would technically be possible for a rule like BETWEEN to succeed and give a NULL result if there were no content, instead of an empty string:
>> parse "()" [between "(" ")"]
== ~null~ ; anti
But this then means you can't tell what happened when the rule is optional:
>> parse "" [try between "(" ")"]
== ~null~ ; anti...so were there parentheses or not?
So I guess it's another situation where if you want to filter out the empty strings, you have to capture into a variable and filter it.
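As a sketch of that "capture into a variable, then filter" idea, again in plain Python rather than UPARSE: each piece is held in a variable first, and literal runs are only kept when non-empty. The tuple tags are my own stand-ins for strings and WORD!s. Note that this version uses a regex split, so unlike the parse rule it treats an unterminated "$(" as ordinary text.

```python
import re

def reword_split_nonempty(text):
    """Split on $(...) and drop empty literal runs between captures."""
    # With a capture group, re.split returns literal runs at even
    # indices and the captured $(...) bodies at odd indices.
    parts = re.split(r"\$\(([^)]*)\)", text)
    out = []
    for i, piece in enumerate(parts):
        if i % 2 == 1:
            out.append(("word", piece))    # body of a $(...) group
        elif piece:
            out.append(("text", piece))    # literal run, kept only if non-empty
    return out
```

With this filter, "$(abc)$(def)$(ghi)" yields just the three words, while "abc$(def)ghi" is unchanged from the unfiltered result.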
I think UPARSE helps out here...but it's not quite the slam dunk I'd hope for.
Because it has two rules that may both opt themselves out, it's a good test case for asking whether the NOT <END> makes sense with TRY SOME. Or is it better off baking that into the TRY SOME rule and having another construct? Intuitively I feel like the tax of having two slightly different versions and explaining the use of one vs. the other is worse than just having the more general construct.
If there were a loop-ending construct that indicated the overall rule was a success (e.g. didn't discard the KEEPs), then we might avoid the capturing flag:
uparse text [
    collect [
        try some [
            try keep between <here> ["$(" | <end>]
            [<end> break |
                let inner: between <here> ")"
                keep (as word! inner)
            ]
        ]
    ]
]
But I don't know if BREAK is the right name for a loop-accepting operation. In ordinary evaluator loops such as WHILE, BREAK typically causes the loop to return NULL, so I'd expect it to discard anything kept. Perhaps STOP would be more consistent, and it could be value-bearing as well, e.g. stop (...)