Replacing R3 PARSE's "DO" Rule w/UPARSE's EVALUATE

hostilefork · January 27, 2021, 9:11am

R3-Alpha PARSE had a partially implemented feature that Red did not carry forward: the do keyword. I'm pretty sure nobody used it--because it didn't work. When I first looked at PARSE's code, it took me a while to even figure out what it was supposed to do.

It was trying to perform a "DO/NEXT" step on the input as a code block, and match the following rule against the evaluative result:

r3-alpha>> parse [reverse copy "aab"] [do "baa"]
== true

I gather that things like the following were likely intended to work, but didn't:

r3-alpha>> parse [reverse copy "aab"] [do ["b" some "a"]]
== false

It seems this feature was added to make dialect processing easier when the dialect has code inline in the block being processed...but not wrapped in a block or group. In that case only the evaluator knows how far to advance for one step of processing. So it could be useful.
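To make the "only the evaluator knows how much to advance" point concrete, here is a rough sketch in Python rather than Ren-C. Everything in it is an assumption for illustration: `eval_step`, the `INFIX` table, and the token format are hypothetical, and the real evaluator handles far more than left-to-right arithmetic.

```python
import operator

# Hypothetical toy operator table; the real evaluator is far richer.
INFIX = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_step(tokens, pos):
    """Evaluate one expression starting at pos; return (value, next_pos)."""
    value = tokens[pos]
    pos += 1
    # Infix operators chain left to right with no precedence, so the
    # length of one "step" depends on what follows the first value.
    while pos + 1 < len(tokens) and tokens[pos] in INFIX:
        value = INFIX[tokens[pos]](value, tokens[pos + 1])
        pos += 2
    return value, pos

tokens = [1, "+", 2, 10, "*", 3, "+", 1]
first, pos = eval_step(tokens, 0)     # consumes 1 + 2
second, end = eval_step(tokens, pos)  # consumes 10 * 3 + 1
```

A parser that only counts items can't tell that the first step spans three tokens and the second spans five; it has to ask the evaluator where the step ended.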

But the implementation was completely broken (partially because R3-Alpha PARSE evolved organically and non-trivial rules could not compose):

r3-alpha>> parse [1 + 2] [set var do [quote 3]]
== true

r3-alpha>> var
== 1  ; not 3, which you might expect

r3-alpha>> parse [1 + 2] [thru do integer!]
== false

hostilefork · May 23, 2022, 4:30pm

With UPARSE, I realized this had become trivial!

Meet UPARSE-EVALUATE

It's everything R3-Alpha's PARSE-DO wanted to be and more...and it's two lines of code!

uparse-evaluate: combinator [
    {Run the evaluator one step to advance input, and produce a result}
    return: "Result of one evaluation step"
        [<opt> any-value!]
][
    if tail? input [return null]

    return [# (remainder)]: evaluate input
]

(Note: EVALUATE is a multi-return: the first output is the evaluated-to value and the second is the position after evaluation. In [# (remainder)]: the # says we want the main output but don't want to have to name it, and that output becomes the overall result of the expression...which is the combinator's product. The (remainder) says the name of the variable to receive the second output is specified by remainder, and this is the protocol for how combinators tell UPARSE the amount of input consumed. So it's perfect!)
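The "result plus remainder" protocol described in the note can be sketched loosely in Python, where a tuple stands in for the multi-return. This is an analogy under stated assumptions, not how UPARSE is implemented: `evaluate_combinator`, `toy_step`, and `run_some` are hypothetical names, and `toy_step` is a stand-in for the real evaluator.

```python
def evaluate_combinator(tokens, pos, step):
    """Hypothetical combinator: return (result, remainder), or None on failure."""
    if pos >= len(tokens):        # mirrors `if tail? input [return null]`
        return None
    return step(tokens, pos)      # the step already yields (value, next position)

def toy_step(tokens, pos):
    """Stand-in evaluator: "neg" consumes two items, anything else consumes one."""
    if tokens[pos] == "neg":
        return -tokens[pos + 1], pos + 2
    return tokens[pos], pos + 1

def run_some(tokens, step):
    """Driver loop: apply the combinator repeatedly, resuming at each remainder."""
    results, pos = [], 0
    while (outcome := evaluate_combinator(tokens, pos, step)) is not None:
        value, pos = outcome
        results.append(value)
    return results

results = run_some([5, "neg", 7, 2], toy_step)
```

The driver never needs to know how many items a step consumed; it just resumes from whatever remainder the combinator hands back.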

Let's make a goofy function that uses it, that either keeps or prints expressions it sees:

keeper-printer: func [block [block!] <local> mode value] [
    mode: #print
    uparse block [collect [
        some [
            mode: ['<K> (#keep) | '<P> (#print)]
            |
            [',]  ; skip over commas
            |
            [value: uparse-evaluate] [
                :(mode = #keep) keep ^ (value)
                |
                :(mode = #print) (if did value [print ["PRINTING:" value]])
            ]
        ]
    ]]
]

While this is goofy, it's nonetheless pretty impressive!

>> keeper-printer [
       1 + 2
       <K> (3 + 4) * 5 if true [6 + 7]
       <P> 7 + 8, if false [9 + 10] else ["print me!"] <K>, 11 + 12
  ]

PRINTING: 3  ; 1 + 2
PRINTING: 15  ; 7 + 8 
PRINTING: print me!
== [35 13 23]  ; (3 + 4 * 5) (6 + 7) (11 + 12)

The Power of the Evaluator, Inline In your Dialect!

It's just a test right now and not in UPARSE proper yet; I am doing some fiddling with the EVALUATE native, so it may change. But I'd say the odds are this is worthy of inclusion.

One thing I noticed, though, is that we don't have an easy way to parse INTO the result of an evaluate, e.g. this wouldn't work:

 uparse [1 + 2 reverse "abc" 3 * 4] [some into [evaluate] [integer! | text!]]

That's because INTO needs a series to descend into, and an evaluation step can produce a plain value such as an integer. This suggests a need for some kind of INTO variation that implicitly wraps the argument into a block. It ties into my remarks regarding the concept of UPARSE-VALUE implicitly wrapping things into a BLOCK!.

If it's a refinement to INTO (like INTO/ENBLOCK) it should be the same pattern as for the overall parse operation... INTO/VALUE and PARSE/VALUE don't seem quite right.
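As a very rough sketch of what such a wrapping INTO variant might do, here is the idea in Python. The names `into_enblock` and `int_or_text` are hypothetical, and this assumes the variant always wraps the value in a one-item block before the subrule sees it; no such feature exists yet.

```python
def into_enblock(value, rule):
    """Hypothetical INTO variant: wrap the value in a one-item list, then match."""
    return rule([value])

def int_or_text(block):
    """Toy subrule standing in for [integer! | text!]: exactly one int or str."""
    return len(block) == 1 and isinstance(block[0], (int, str))

# Assumed results of the three evaluation steps in the example above.
evaluated = [3, "cba", 12]
matches = [into_enblock(v, int_or_text) for v in evaluated]
```

With the wrapping in place, the subrule matches on the value as a whole instead of needing it to be a series it can descend into.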

Anyway, progress!

BlackATTR · May 23, 2022, 8:06pm

Very cool indeed! Great feature to add!