So far there's only a little bit of UPARSE functionality related to errors. One piece is the FURTHEST return result:
>> [v furthest]: uparse "abbbabbabcabab" [some ["a" | "b"]]
; null
>> furthest
== "cabab"
It records the high-water mark: the furthest input position at which any combinator reported success.
It's better than nothing, I guess. But for parsers that scan ahead it might be worthless. (I'll point this out to @Brett, since he suggested the feature...)
>> [v furthest]: uparse "[ababbbcabbab]" [
"[" ahead to "]" ; this pushes the high water mark to the ]
some ["a" | "b"]
"]"
]
; null
>> furthest
== "]"
So here we are not implicating the "c", which people would think of as the actual culprit. But it's harder than one might think to figure out who that is.
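To make the high-water-mark behavior concrete, here is a toy model in Python. It is only a sketch under assumptions: the helper names (`Tracker`, `lit`, `alt`, `some`, `seq`, `ahead_to`, `run`) are hypothetical and say nothing about how UPARSE is actually implemented. It just shows why a lookahead that counts as a success drags the mark past the real culprit.

```python
# Toy model only: hypothetical helpers, not UPARSE's implementation.

class Tracker:
    """Remembers the furthest position any successful match reached."""
    def __init__(self):
        self.furthest = 0

def lit(s):
    def parser(text, pos, tr):
        if text.startswith(s, pos):
            end = pos + len(s)
            tr.furthest = max(tr.furthest, end)  # success pushes the mark
            return end
        return None
    return parser

def alt(*parsers):
    def parser(text, pos, tr):
        for p in parsers:
            end = p(text, pos, tr)
            if end is not None:
                return end
        return None
    return parser

def some(p):
    def parser(text, pos, tr):
        end = p(text, pos, tr)
        if end is None:
            return None          # SOME needs at least one match
        while True:
            nxt = p(text, end, tr)
            if nxt is None:
                return end
            end = nxt
    return parser

def seq(*parsers):
    def parser(text, pos, tr):
        for p in parsers:
            pos = p(text, pos, tr)
            if pos is None:
                return None
        return pos
    return parser

def ahead_to(s):
    """Lookahead: succeed if `s` occurs later, but consume nothing."""
    def parser(text, pos, tr):
        i = text.find(s, pos)
        if i < 0:
            return None
        tr.furthest = max(tr.furthest, i)  # the scan counts as a success too
        return pos
    return parser

def run(rule, text):
    """Return (matched-to-end?, input remaining at the high-water mark)."""
    tr = Tracker()
    end = rule(text, 0, tr)
    return end == len(text), text[tr.furthest:]

a_or_b = some(alt(lit("a"), lit("b")))
print(run(a_or_b, "abbbabbabcabab"))
print(run(seq(lit("["), ahead_to("]"), a_or_b, lit("]")), "[ababbbcabbab]"))
```

In this model the first call reports the mark at the "c", while the second reports it at the "]", because the lookahead's scan was recorded as progress before the a/b matching ever ran.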
Recap of the New FAIL Feature
With the new FAIL in UPARSE, you get a little bit of support for implicating the point in the input you want to complain about.
The idea is that you make sure the parse position is where you want to implicate, by making the FAIL an alternate to that position:
>> uparse "{ababcababa}" [
into between "{" "}" [
some ["a" | "b"] <end>
| fail @["Between braces should be just a and b"]
]
]
** User Error: Between braces should be just a and b
** Near: "ababcababa"
(If you've forgotten why FAIL's argument needs the @, it's because the PARSE dialect has a meaning for BLOCK! already...and for the purposes of "regularity" in the dialect this tries not to override that. But this is an open issue if FAIL wants to break the rules.)
For demonstration purposes here, I didn't implicate the "c". I wrote it so the FAIL is an alternate to the whole sequence, which means it backtracks to where the a/b matching started. You get a different result if you make the FAIL an alternate to just the <end>:
>> uparse "{ababcababa}" [
into between "{" "}" [
some ["a" | "b"]
[<end> | fail @["Between braces should be just a and b"]]
]
]
** User Error: Between braces should be just a and b
** Near: "cababa"
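The difference between the two placements can be mimicked with a toy model in Python. This is a sketch under assumptions, not UPARSE code: all the helper names are hypothetical, and it works directly on the text between the braces rather than modeling INTO and BETWEEN. The point it illustrates is that an alternate always restarts at the position where its group began, so that is the position FAIL reports.

```python
# Toy model in Python; every name here is hypothetical, not UPARSE's API.

class ParseFailure(Exception):
    """Raised by fail(); remembers the input remaining at the raise point."""
    def __init__(self, message, near):
        super().__init__(message)
        self.near = near

def lit(s):
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def at_end(text, pos):
    return pos if pos == len(text) else None

def fail(message):
    def parser(text, pos):
        raise ParseFailure(message, text[pos:])
    return parser

def alt(*parsers):
    def parser(text, pos):
        for p in parsers:
            end = p(text, pos)   # every alternate restarts at the same pos
            if end is not None:
                return end
        return None
    return parser

def seq(*parsers):
    def parser(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parser

def some(p):
    def parser(text, pos):
        end = p(text, pos)
        if end is None:
            return None          # SOME needs at least one match
        while True:
            nxt = p(text, end)
            if nxt is None:
                return end
            end = nxt
    return parser

def near_of(rule, text):
    """Run the rule and return the NEAR text of a failure, or None."""
    try:
        rule(text, 0)
    except ParseFailure as e:
        return e.near
    return None

MSG = "Between braces should be just a and b"
a_or_b = alt(lit("a"), lit("b"))

whole = alt(seq(some(a_or_b), at_end), fail(MSG))   # FAIL vs. the whole rule
tail = seq(some(a_or_b), alt(at_end, fail(MSG)))    # FAIL vs. just the end

print(near_of(whole, "ababcababa"))   # ababcababa
print(near_of(tail, "ababcababa"))    # cababa
```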
A New Fuzzy Concept: ENSURE
We have ENSURE for values outside of PARSE. It runs a test and passes through the result if it matches, or stops and errors:
>> x: 10
>> ensure integer! x
== 10
>> ensure tag! x
** Error: ENSURE failed with argument of type integer!
It seems appealing to make PARSE able to do that too:
>> uparse [<x> 10 #y 20] [collect [while [
keep ensure tag!
keep ensure integer!
]]]
** Error: ENSURE failed with argument of type ISSUE!
** Near: [... 10 \\ #y \\ 20]
So a similar idea to FAIL, where you get some feedback on the input location causing the problem.
But also similar to FAIL, this doesn't work within the model of having alternates. It sees something it doesn't like and errors in the moment, without giving any | options in the rest of the rules a chance. That's a bit harsh, but it might still fit a lot of scenarios.
The historical ENSURE only works on datatypes. Could this work on values, or alternate values?
Far-out idea:
>> uparse "abbbcababa" [some ensure ["a" | "b"]]
** Error: ENSURE would have expected:
"a"
"b"
But it received "c"
The idea would be that once ENSURE started, it might have some way of collecting the "leaf nodes" of failed rules. But I have no idea how such a thing could actually work.
More generally I wonder how alternates figure into any system of error delivery.
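One possible shape for the "leaf node" collection, borrowed from how some parser combinator libraries build "expected X" messages, is to have each failing leaf note what it wanted and where, and to keep only the notes from the furthest failing position. The Python below is a toy sketch under that assumption. `Expectations`, `EnsureError`, `ensure`, and `report` are hypothetical names, and nothing here reflects how UPARSE would actually do it.

```python
# Toy model in Python; ENSURE here is a hypothetical stand-in, not UPARSE code.

class Expectations:
    """Keeps the leaf rules that failed at the furthest position seen."""
    def __init__(self):
        self.pos = -1
        self.expected = []

    def note(self, pos, what):
        if pos > self.pos:
            self.pos, self.expected = pos, [what]
        elif pos == self.pos and what not in self.expected:
            self.expected.append(what)

class EnsureError(Exception):
    def __init__(self, expected, received):
        super().__init__(f"expected one of {expected}, received {received!r}")
        self.expected = expected
        self.received = received

def lit(s):
    def parser(text, pos, exp):
        if text.startswith(s, pos):
            return pos + len(s)
        exp.note(pos, s)         # a leaf failed here: remember what it wanted
        return None
    return parser

def alt(*parsers):
    def parser(text, pos, exp):
        for p in parsers:
            end = p(text, pos, exp)
            if end is not None:
                return end
        return None
    return parser

def some(p):
    def parser(text, pos, exp):
        end = p(text, pos, exp)
        if end is None:
            return None
        while True:
            nxt = p(text, end, exp)
            if nxt is None:
                return end
            end = nxt
    return parser

def ensure(p):
    def parser(text, pos, exp):
        local = Expectations()   # only failures inside this ENSURE count
        end = p(text, pos, local)
        if end is None:
            at = max(local.pos, pos)
            received = text[at] if at < len(text) else "end of input"
            raise EnsureError(local.expected, received)
        return end
    return parser

def report(rule, text):
    """Return (expected leaves, received item) on error, else None."""
    try:
        rule(text, 0, Expectations())
    except EnsureError as e:
        return e.expected, e.received
    return None

rule = some(ensure(alt(lit("a"), lit("b"))))
print(report(rule, "abbbcababa"))   # (['a', 'b'], 'c')
```

Note that in this toy model `some ensure [...]` also raises once the input runs out, since the ENSURE has no way to know the enclosing SOME would have been satisfied: `report(rule, "ab")` gives `(['a', 'b'], 'end of input')`. That is the same harshness toward alternates and enclosing rules described above.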
Random Weird Dialect Idea: BAD-WORD!
Just wanted to write down a strange idea I had, to use BAD-WORD! to indicate a shorthand for FAIL with a message. The idea was to make it come after a complete rule and imply a message to give if the rule to its left didn't match:
>> uparse "[ababbbcabbab]" [
"[" ahead to "]"
some ["a" | "b"] ~a-or-b-expected~
"]"
]
** Error: a-or-b-expected
** At parse input location: "cabbab]"
It sucks, but it was just a brainstorming idea as a shorthand for:
>> uparse "[ababbbcabbab]" [
"[" ahead to "]"
[some ["a" | "b"] | fail ~a-or-b-expected~]
"]"
]
Maybe this points to the need for an ELSE construct, as it might be a bit smoother than having to enclose everything in blocks:
>> uparse "[ababbbcabbab]" [
"[" ahead to "]"
some ["a" | "b"] else fail ~a-or-b-expected~
"]"
]