FURTHER as its own separate UPARSE combinator

So UPARSE design is really clicking along now, and here's a new idea that helps address the loss of the advancement behaviors traditionally built into old ANY/SOME when using the new SOME.

The idea is to break out a test for advancing as a rule of its own:

>> uparse "a" [further [opt "a" opt "b"] (<at least one!>)]
== <at least one!>

>> uparse "b" [further [opt "a" opt "b"] (<at least one!>)]
== <at least one!>

>> uparse "" [further [opt "a" opt "b"] (<at least one!>)]
; null
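
For readers who don't have a Ren-C build handy, here is a rough Python model of the idea. It is only a sketch, not how UPARSE implements combinators: a rule is assumed to be a function taking the input and a position and returning the new position (or None on failure), and the names `lit`, `opt`, `seq`, `further` and `matches` are made up for the illustration.

```python
from typing import Callable, Optional

# Assumed model: a rule maps (text, pos) to a new position, or None on failure.
Rule = Callable[[str, int], Optional[int]]

def lit(s: str) -> Rule:
    """Match the literal string s at the current position."""
    def rule(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

def opt(r: Rule) -> Rule:
    """Never fails: if r fails, succeed without advancing."""
    def rule(text: str, pos: int) -> Optional[int]:
        new = r(text, pos)
        return pos if new is None else new
    return rule

def seq(*rules: Rule) -> Rule:
    """Run the rules one after another; fail if any of them fails."""
    def rule(text: str, pos: int) -> Optional[int]:
        for r in rules:
            pos = r(text, pos)
            if pos is None:
                return None
        return pos
    return rule

def further(r: Rule) -> Rule:
    """Succeed only if r succeeds AND ends past where it started."""
    def rule(text: str, pos: int) -> Optional[int]:
        new = r(text, pos)
        if new is None or new <= pos:
            return None
        return new
    return rule

def matches(r: Rule, text: str) -> bool:
    """Treat a parse as successful only if it consumes the whole input."""
    return r(text, 0) == len(text)

ab = further(seq(opt(lit("a")), opt(lit("b"))))
```

In this model `matches(ab, "a")` and `matches(ab, "b")` hold, while the empty input is rejected because the two OPTs succeed without moving.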

(Note: I had originally called this ADVANCING, but FURTHER is less likely to be used as a variable name to track an advancing state, e.g. advancing: true. And ADVANCES sounds like it might actually be taking an action. FURTHER is a shorter word that is a strange part of speech that is unlikely to be a variable or misunderstood as taking action. Feedback welcome.)

You don't have to use it with loops. But combined with a loop, the intent is quite clear, and it stops you from looping forever when every alternative "opts out" yet the rule as a whole still succeeds:

>> uparse "abba" [
       some further [opt ["a" (print "A")] opt ["b" (print "B")]]
   ]
A
B
B
A
; null-2
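
The same loop can be modeled in Python to see why it terminates. Again, this is a sketch under assumptions rather than the real implementation: rules are functions from (text, position) to a new position or None, `some` is modeled without any built-in progress check, and `opt_lit`, `seq`, `further` and `some` are hypothetical helper names. The prints are replaced by appends to a list.

```python
from typing import Callable, List, Optional

# Assumed model: a rule maps (text, pos) to a new position, or None on failure.
Rule = Callable[[str, int], Optional[int]]

def opt_lit(s: str, label: str, trace: List[str]) -> Rule:
    """Model of opt [s (print label)]: never fails, logs only on a match."""
    def rule(text: str, pos: int) -> Optional[int]:
        if text.startswith(s, pos):
            trace.append(label)
            return pos + len(s)
        return pos  # opted out: success without advancing
    return rule

def seq(*rules: Rule) -> Rule:
    """Run the rules in order; fail if any of them fails."""
    def rule(text: str, pos: int) -> Optional[int]:
        for r in rules:
            pos = r(text, pos)
            if pos is None:
                return None
        return pos
    return rule

def further(r: Rule) -> Rule:
    """Turn 'succeeded without advancing' into a failure."""
    def rule(text: str, pos: int) -> Optional[int]:
        new = r(text, pos)
        return new if new is not None and new > pos else None
    return rule

def some(r: Rule) -> Rule:
    """One or more matches, with NO progress check of its own."""
    def rule(text: str, pos: int) -> Optional[int]:
        pos = r(text, pos)
        if pos is None:
            return None
        while True:
            new = r(text, pos)
            if new is None:
                return pos  # the body finally failed, so the loop ends here
            pos = new
    return rule

trace: List[str] = []
body = seq(opt_lit("a", "A", trace), opt_lit("b", "B", trace))
end = some(further(body))("abba", 0)
```

Without the `further` wrapper, `some(body)` would spin forever once the position stops moving, since `body` can never fail. With it, the fourth pass makes no progress, is reported as a failure, and the loop exits.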

OOO...So much beauty, I cannot take it.


PARSE was always a jewel, but you're really transforming it into a diamond cutter. :gem:


I managed to get the system to boot after removing the "must make progress" rule from ANY and SOME in historical PARSE.

Only one change was needed, to CLEAN-PATH. I've extracted the relevant bit so you can get a sense of what kinds of situations the old behavior was for:

parse reverse target [  ; actually processing file path string *backwards*
    some [
        "../" (handle-slash-dot-dot)  ; backwards, remember?
        |
        "./" (handle-slash-dot)
        |
        "/" (handle-slash)
        |
        copy part: [to "/" | to end] (handle-fragment part)
    ]
]

Once you reach END, this will loop forever.

The reason why is that in this list of alternates, the last alternate decays to [TO END]. And TO END will always succeed.

This could be changed to use SOME FURTHER and it would resolve the problem. But there are a couple of oddities in this code. It's using SOME and it doesn't actually check the parse result at all...so it could have used ANY and it would have made no difference.

But FURTHER kind of says more than it wants to say. It wants to say that if the loop reaches END then it should consider itself done. Why not just say that... using the cool new WHILE?

parse reverse target [  ; actually processing file path string *backwards*
    while [not <end>] [
        "../" (handle-slash-dot-dot)  ; backwards, remember?
        |
        "./" (handle-slash-dot)
        |
        "/" (handle-slash)
        |
        copy part: [to "/" | to <end>] (handle-fragment part)
    ]
]

For this case, I'd prefer WHILE over trying to find a way to use FURTHER, because the code really just wants to make sure the loop stops at the end. But if the condition you care about isn't end-specific, that guard wouldn't work.
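
To make the failure mode concrete, here is a loose Python model of the path scanner. It is a sketch, not CLEAN-PATH itself: `step` and `scan` are hypothetical names, the `handle-...` actions are replaced by labels appended to a list, and the input is assumed to be the already-reversed path string.

```python
from typing import List

def step(text: str, pos: int, out: List[str]) -> int:
    """One pass through the alternates; returns the position afterwards."""
    for token, label in (("../", "slash-dot-dot"),
                         ("./", "slash-dot"),
                         ("/", "slash")):
        if text.startswith(token, pos):
            out.append(label)
            return pos + len(token)
    # Last alternate, modeling [to "/" | to <end>]. It cannot fail: at the
    # end of input it "matches" an empty fragment and does not advance.
    stop = text.find("/", pos)
    if stop == -1:
        stop = len(text)
    out.append("fragment:" + text[pos:stop])
    return stop

def scan(text: str) -> List[str]:
    """Model of the guarded loop: keep going only while not at the end."""
    out: List[str] = []
    pos = 0
    while pos < len(text):  # plays the role of [not <end>]
        pos = step(text, pos, out)
    return out

# The scanner sees the path backwards, as in the original code.
tokens = scan("a/../b"[::-1])
```

Calling `step` at the end of the input returns the same position with an empty fragment, which is exactly the case an unguarded loop would repeat forever. The explicit end check in `scan` states the intended stopping condition directly, rather than inferring it from lack of progress.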

Anyway, hopefully this makes sense. I think it's a lot clearer than before!
