Portable Bridge Notation (PBN) Parser

I decided I would do a small task in the web Ren-C build, which is to take the "Portable Bridge Notation" for representing a deal of a hand of cards, and turn it into blocks of symbolic data representing each player's hand.

The notation is pretty straightforward, e.g.

N:QJ6.K652.J85.T98 873.J97.AT764.Q4 K5.T83.KQ9.A7652 AT942.AQ4.32.KJ3

Separated by spaces are the cards for each of 4 hands. The suits are separated by dots, and the order is clubs.diamonds.hearts.spades. T is used for 10, while J/Q/K/A are the typical Jack/Queen/King/Ace. The first letter is a direction (N=North, E=East, S=South, W=West) of which player the first hand represented is for.

This case decodes like so:

(Console escaping is now done by Rebol code, so this shows some colorization and a live hyperlink as an example of what we might do with that.)
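For readers who want to check the decoding outside of Ren-C, here is a minimal Python sketch of the notation as described above. It is not the Ren-C implementation: the names `pbn_to_hands`, `SUITS`, and `DIRECTIONS` are made up for illustration, the suit order is taken as this post states it, and the assumption is that each following hand goes to the next direction in N, E, S, W order, wrapping around.

```python
SUITS = ["♣", "♦", "♥", "♠"]        # suit order as stated in this post (assumption)
DIRECTIONS = ["N", "E", "S", "W"]   # rotation order, wrapping after W
RANKS = "AKQJT98765432"

def pbn_to_hands(pbn):
    """Decode a PBN deal string into {direction: [card, ...]} (illustrative only)."""
    start, sep, rest = pbn.strip().partition(":")
    if start not in DIRECTIONS or sep != ":":
        raise ValueError("PBN must start with N, E, S, or W followed by ':'")
    groups = rest.split()
    if len(groups) != 4:
        raise ValueError("PBN must have 4 hand definitions")

    start_index = DIRECTIONS.index(start)
    hands = {}
    for offset, group in enumerate(groups):
        holdings = group.split(".")
        if len(holdings) != 4:
            raise ValueError("Each hand needs 4 dot-separated suits")
        cards = []
        for suit, ranks in zip(SUITS, holdings):
            for rank in ranks:
                if rank not in RANKS:
                    raise ValueError("Invalid rank: " + rank)
                # T stands for 10; every other rank is written as-is
                cards.append(suit + ("10" if rank == "T" else rank))
        hands[DIRECTIONS[(start_index + offset) % 4]] = cards
    return hands

deal = "N:QJ6.K652.J85.T98 873.J97.AT764.Q4 K5.T83.KQ9.A7652 AT942.AQ4.32.KJ3"
hands = pbn_to_hands(deal)
print(" ".join(hands["N"]))  # ♣Q ♣J ♣6 ♦K ♦6 ♦5 ♦2 ♥J ♥8 ♥5 ♠10 ♠9 ♠8
```

The modulo step plays the same role as the `select direction-order direction` fallback to the first direction in the Ren-C version below.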

I wanted the PBN conversion to be accessible and demonstrate "best practices". Here's what I came up with:

pbn-to-hands: func [
    {Convert portable bridge notation to BLOCK!-structured hands}

    return: [object!]
    pbn [text!]
][
    let rank-rule: [
        'A | 'K | 'Q | 'J | 'T | '9 | '8 | '7 | '6 | '5 | '4 | '3 | '2
    ]

    let suit-order: [♣ ♦ ♥ ♠]
    let suit: void

    let one-hand: void
    let one-hand-rule: [
        (suit: '♣)
        [
            collect one-hand [4 [
                any [
                    set rank rank-rule
                    keep :[
                        as word! unspaced [suit either rank = #"T" [10] [rank]]
                    ]
                ]
                :(
                    suit: select suit-order suit
                    if suit ["."] else '[ahead [space | end]]
                )
            ]]
        ]
        |
        (fail "Invalid Hand information in PBN")
    ]

    let hands: make object! [N: E: S: W: void]
    let direction-order: [N E S W]

    let start: void
    let direction: void

    parse pbn [
        any space  ; We allow leading whitespace, good idea?

        [
            set start ['N | 'E | 'S | 'W] (
                start: to word! start
                direction: start
            )
            |
            (fail "PBN must start with N, E, S, or W")
        ]

        [":" | (fail "PBN second character must be `:`")]

        [
            [4 [
                one-hand-rule (  ; Should set `one-hand` if rule succeeds.
                    hands/(direction): one-hand
                    one-hand: void
                    direction: (select direction-order direction) else [
                        first direction-order
                    ]
                )
                any space  ; Should more than one space between hands be ok?
            ]]
            |
            (fail "PBN must have 4 hand definitions")
        ]
        end
    ]

    assert [direction = start]  ; skipping around should have cycled
    return hands
]

Conceptual Problems Encountered While Writing This

This is pretty straightforward as tasks go. And certainly the PARSE-based code is more pleasing to look at than many versions. But I'll list a few gripes.

  • The C code I looked at to model from used a variable called first for the starting "direction". I used the same name and obliviously overwrote FIRST the Rebol operation. This made later code not work. It's certainly a cautionary tale for the casual overwriting of standard library functions...which makes you wonder if there's anything we could do to warn you about doing this accidentally, and then have an override notation. Something in the function header? Something on the LET itself? What would cause these warnings and why?

  • I've explained my reluctance to embrace locals-gathering via SET-WORD! as a built-in idea. It may be good for some things, but I think it needs more control. The solution I've proposed which co-opts a single word vs. all possible set words is LET, but I'm still in a gray area about how LET is going to work. This continues to be a heavy concern, but I'm exploring what code in the LET-world would look like here.

  • Cards usually are written the other way, as 4♠ and not ♠4 (something about rendering in Discourse seems to think the latter notation is worthy of a much bigger spade...no idea why, might be worth finding out). But that would not be legal as a Rebol WORD!. The impedance mismatch of such things is not a unique problem of Rebol, and there are more options like <4♠> while most languages would just have one string type. But it was a bit disappointing nonetheless to have to make the concession.

  • Returning OBJECT! still makes me uncomfortable, as I feel there's a real inelegance to the single OBJECT! coming back as MAKE OBJECT! [...], and it goes out of the domain of concrete parts fast. I'm worried about adding methods and having that look ugly. I've written about wanting to maybe find some grand unifying theory where OBJECT! is just some constrained optimized BLOCK!. But that feels distant. In short, I'm just a bit torn over the return format here... I want to feel like we made it better than the PBN, not worse.

Technical Issues Encountered While Writing This

I deliberately chose to use the UTF-8 characters for card suits in the code. Partially to make it look "cool", but also to exercise the code paths.

One of the big problem areas with doing so was the Windows Console. If I tried to paste any code containing the card suits, they'd be invisible. This is because the console layer has no idea what a PASTE is, so what Windows does is it simulates key-down and key-up events for every character as if you were typing them. They must have gotten something wrong, because the card suits were not getting key downs... only key ups. Others have faced this problem and worked around it, so I incorporated their workaround.

Next is that I got it in my head that I wanted to use the lighter notation of 'N to match a letter in the input rather than "N" or #"N". It's 3 fewer apostrophe-style marks than a string, and it seems there's no harm in allowing you to match WORD!s against strings. After all, Rebol2/Red/R3-Alpha allow you to FIND that way:

>> find "abchellodef" 'hello
== "hellodef"

So I went ahead and added that ability, for both WORD!s and INTEGER!s in strings. This kind of opens a can of worms--as we might ask why looking for an INTEGER! in a string wouldn't be searching for the codepoint. But you can do that with find some-string to char! some-int. BINARY! is another story, and with doors open for searching binaries for strings it's the case that searching for integers finds the byte value and not the string-ized representation of that integer as ASCII. It's something to think about.
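The ambiguity is easy to see in any language that has both text and byte strings. The following is only a Python analogy, not a description of what Ren-C's FIND does: the same integer can be looked up as its written digits, as a codepoint, or as a raw byte value, and the answers differ.

```python
text = "abc65A"
number = 65

# Interpretation 1: search for the integer's written form, the characters "6" "5"
as_digits = text.find(str(number))

# Interpretation 2: search for the character whose codepoint is 65, i.e. "A"
as_codepoint = text.find(chr(number))

# For binary data the natural reading flips: an integer means a byte value
data = text.encode("utf-8")
as_byte = data.find(bytes([number]))

# ...while finding the ASCII digits in binary requires asking for them explicitly
as_ascii_digits = data.find(b"65")

print(as_digits, as_codepoint, as_byte, as_ascii_digits)  # 3 5 5 3
```

Either reading is defensible for strings, which is why the explicit `to char!` conversion mentioned above is a reasonable way to disambiguate.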

I decided to use COLLECT, but when I did I realized the thing I was collecting was not KEEPing material from the input, but a synthesized card symbol. The KEEP we had has the default interpretation of assuming you mean a pattern:

>> parse "aaa" [collect data [some [keep "a"]]]
== ""

>> data
== ["a" "a" "a"]

But what if each time I see an "a", I want to keep a "b"? That's not coming from the input.

@rgchris and I have been discussing the three things you might want to do with DO-style code embedded into a parse:

  • vaporize the result (currently ()) - this is usually the traditional behavior of GROUP! in PARSE. But when used as a parameter to a rule, this could vary. e.g. change [some parse rule] (some code) will use the code to generate what to change to.

  • treat the result as a rule (currently :()) - this has become a favorite of mine, as it frees us from the oddity of PARSE's IF and trying to map control structures into PARSE. You don't have to pre-COMPOSE a PARSE rule, but every time the code is visited it effectively re-runs a composition.

  • fabricate material unavailable in input (currently :[]) - when you look at things like CHANGE or the particular need to KEEP something that isn't in the input series, you have to be able to run code to make that new data.

So the answer with this setup of "match a but keep b" would look like:

>> parse "aaa" [collect data [some ["a" keep :["b"]]]]
== ""

>> data
== ["b" "b" "b"]

Whatever our beliefs about notation, these are desires you can have. In this case I made :[] work as described so I could keep my synthesized data.
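To make the distinction between keeping matched input and keeping fabricated material concrete outside of PARSE, here is a small Python sketch. The function `collect_some` and its `keep` parameter are hypothetical names invented for this illustration; they only imitate the two KEEP examples above and say nothing about how PARSE is implemented.

```python
def collect_some(text, literal, keep=None):
    """Match `literal` one or more times at the head of `text`.

    With keep=None the matched input itself is collected (like `keep "a"`).
    With a callable, its result is collected instead (like `keep :["b"]`).
    Returns (collected_items, remaining_text).
    """
    if not literal:
        raise ValueError("literal must be non-empty")
    collected = []
    pos = 0
    while text.startswith(literal, pos):
        pos += len(literal)
        # The callable runs once per match, so it can synthesize fresh data each time
        collected.append(literal if keep is None else keep())
    if not collected:
        raise ValueError("expected at least one match")
    return collected, text[pos:]

from_input, rest_a = collect_some("aaa", "a")
fabricated, rest_b = collect_some("aaa", "a", keep=lambda: "b")
print(from_input, fabricated)  # ['a', 'a', 'a'] ['b', 'b', 'b']
```

Running the callable on every match is the important part: in the PBN rule, the kept card symbol depends on whatever `suit` and `rank` hold at that moment.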

I wanted to make the KEEP rule look like:

keep :[join suit (either rank = 'T [10] [rank])]

Because I had the idea that if you matched a WORD! in your text input, that the SET would come back as the WORD! for the character... not the character itself. This is a concept that needs to be thought out more, because I'm not sure we completely understand the rationale discerning SET and COPY and what their behavior should be.


As usual, Rebol-ish code can be nice to look at once it's written...but the path to getting there can be pretty hard. And, as always, a debugger would really help.


Tag! doesn't seem all that bad an option, nor does reversing as ♠4 either (especially if as a word you set it to "4♠").

What about Map!? It seems to be positioned as the more structured Block! within data structures, whereas Object! is more for carrying class-like logic.