Time to Meet Your MATCH... (...dialect)

MATCH is a handy tool for testing a value against some basic rules, and passing it through if they match...or evaluating to null if they don't. The rules can be combined in some interesting ways that make them rather powerful!

It uses the "match dialect". This looks pretty simple on the surface, like what you put in a function spec block for the legal types:

>> match [integer! tag!] 1020
== 1020

>> match [integer! tag!] "this text value won't match"
;-- null

>> match [integer! tag!] <matches!>
== <matches!>

But with the new features of generalized quoting, now you can test for quotedness.

>> match ['word!] first [foo]
;-- null

>> match ['word!] first ['foo]
== 'foo

It actually dereferences what you give it and sums up the quote levels. So you can do things like:

 >> quoted-word!: quote word! ;-- Note: during transition, QUOTE is called UNEVAL

 >> match 'quoted-word! first [''foo]
 == ''foo

So since there is a quote level in the QUOTED-WORD! value, that gets added in with the quote on the match, so it looks for a doubly quoted value.
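To make the "sums up the quote levels" idea concrete, here is a rough model in Python. It is only a sketch of the behavior described above, not how MATCH is implemented: the `Quoted` wrapper, the `Word` class, the `env` lookup table, and the function names are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quoted:          # hypothetical stand-in for a quoted value
    depth: int
    item: object

@dataclass(frozen=True)
class Word:            # hypothetical stand-in for a WORD! value
    name: str

def quote(x):
    """Add one quote level, like QUOTE in the examples above."""
    if isinstance(x, Quoted):
        return Quoted(x.depth + 1, x.item)
    return Quoted(1, x)

def resolve(rule, env):
    """Follow names and strip quotes, summing the quote levels seen."""
    depth = 0
    while True:
        if isinstance(rule, Quoted):
            depth += rule.depth
            rule = rule.item
        elif isinstance(rule, str):      # a name bound to another rule
            rule = env[rule]
        else:
            return depth, rule           # (total quote level, type to test)

def match_quoted(rule, value, env):
    want_depth, want_type = resolve(rule, env)
    if isinstance(value, Quoted):
        have_depth, payload = value.depth, value.item
    else:
        have_depth, payload = 0, value
    ok = have_depth == want_depth and isinstance(payload, want_type)
    return value if ok else None

# quoted-word!: quote word!
env = {"quoted-word!": quote(Word)}
doubly_quoted_foo = quote(quote(Word("foo")))

# One quote on the rule plus one inside QUOTED-WORD! asks for two levels.
result = match_quoted(quote("quoted-word!"), doubly_quoted_foo, env)
```

The quote on the rule and the quote stored in the named rule add up to two, so only the doubly quoted word passes through.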

The premise is that each MATCH rule component is one item, and even INTEGER! values get put to use...to test a length:

 >> match 2 [a b]
 == [a b]

 >> match 2 [a b c]
 ;-- null

You can use single arity ACTION!s as well, if you use a GET-WORD! or GET-PATH! to indicate them:

 >> match :odd? 304
 ;-- null

 >> match :lib/even? 1020
 == 1020

BLOCK! will OR rules together, PATH! will AND them

So this is a cool little trick:

>> match block!/2 [a b]
== [a b]

>> match text!/2 [a b]
;-- null

>> match text!/2 "ab"
== "ab"

>> match [block! text!]/2 "ab"
== "ab"

>> match '''[block! text!]/2 lit '''[a b]
== '''[a b]

>> match [integer!/[:even?] block!/[:empty?]] []
== []

>> match [integer!/[:even?] block!/[:empty?]] 1020
== 1020

Pretty cool huh? And as I mentioned, you can factor these rules out like in PARSE... note also that instead of :empty? you can just use 0.

 >> even-int!: lit integer!/[:even?]
 >> empty-block!: lit block!/0

 >> match [even-int! empty-block!] []
 == []

 >> match [even-int! empty-block!] [a b]
 ;-- null 

MATCH has an automatic erroring form, called ENSURE

If you want a quick and dirty way to typecheck something and pass it through, but error otherwise, use ENSURE.

>> ensure [even-int! empty-block!] [a b]
** Script Error: ...

>> ensure [even-int! empty-block!] 1020
== 1020

MATCH is now built in as a PARSE keyword...

The quoting features of MATCH were important for PARSE to help pick up the slack after LIT-WORD!.

>> did parse ['a b 'c d] [some [match [word! 'word!]] end]
== #[true]

There was a little bit of an incongruity previously, which is that MATCH did not want to quote its first argument. So you couldn't say match 'word! lit ['foo] and have it match, because the evaluator would strip off the quote. When all things were considered, it seemed to make more sense to have MATCH soft-quote its first argument, so it doesn't throw away the quote marks...but uses them in the rule.

PARSE then has a compatible expression, without a block! needed:

>> did parse ['a 'b 'c 'd] [some [match 'word!] end]
== #[true]

<opt>, falsey values, and /ELSE

You might imagine a lot of code wants to say if match [...] whatever [...]. This could lead to unsatisfactory results if the thing you're matching is a null, blank, or logic false -- even though you matched, the falsey nature of the thing you were testing would foil your intent.

To help catch those errors, any falsey input that matches will be voidified. So at least you'll get a clear error if you use the result in a conditional. But since voids are values, you'll be okay if you use THEN or ELSE.

>> match [<opt> integer!] null then [print "matched, and void cued then!"]
matched, and void cued then!

There's also an /ELSE refinement, so you can provide a branch of code to run if there's no match...and it won't mutate the result at all:

>> match/else [<opt> integer!] null [print "didn't match" 100]
;-- null

>> match/else [<opt> integer!] #foo [print "didn't match" 100]
didn't match
== 100
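The falsey problem and the two ways around it can be modeled in a few lines of Python. Everything here is an assumption made for illustration (the `VOID` sentinel, `then`, `match_else`, and the predicate standing in for `[<opt> integer!]`); the point is just the shape of the behavior described above.

```python
class _Void:
    """Sentinel that refuses to be tested for truth, like a void in IF."""
    def __bool__(self):
        raise TypeError("void used in a conditional")
    def __repr__(self):
        return "VOID"

VOID = _Void()

def match(pred, value):
    if not pred(value):
        return None                 # no match
    if value is None or value is False:
        return VOID                 # matched, but the value itself is falsey
    return value

def then(result, branch):
    """Run BRANCH for anything that isn't None, including VOID."""
    return branch() if result is not None else None

def match_else(pred, value, branch):
    """Like MATCH/ELSE: the matched value comes back untouched."""
    return value if pred(value) else branch()

def opt_integer(v):                 # stands in for [<opt> integer!]
    return v is None or type(v) is int
```

With this model, `then(match(opt_integer, None), ...)` runs its branch, while writing `if match(opt_integer, None):` raises instead of silently taking the wrong path.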

Useful Dialect, Good Testbed for PATH!s

You can see a detail above of how I want to use things like :even? as a test, and then use PATH! for AND-ing tests together generically. But then, :even?/integer! is a GET-PATH!, while integer!/:even? would be an ordinary path. The meaning gets confuzzled... how would you specify a function with refinements, or otherwise get something out of a path?

obj: make object! [even-int: lit integer!/[:even?]]
match :obj/even-int 4

To get this distinction, we have to treat :[:obj]/even-int differently from :obj/even-int. And this really does suggest to me that the notion of allowing GET-WORD!s, SET-WORD!s, and LIT-WORD!s in PATH! is a mistake...it doesn't generalize and will fall down at the head and tail. Even when it works, it's ugly.

I think this is going in the direction of making PATH! a stronger dialecting part. And hopefully, with more good examples we can keep pushing on some of the other things (like "does Rebol need a date format with slashes in it, and if so can it be accomplished naturally as a PATH!")...


Very cool. Should be muy bueno for simple lexing and tokenizing values of a dialect, as well as the handling of syntax errors.

The real thing to look at here is the way that GET-WORD! and GET-PATH! are used, and why I'm prescribing that we disallow using GET-WORD!, SET-WORD!, and LIT-WORD! directly in PATH!, GET-PATH!, or SET-PATH!.

Now that there is a controlled point of path creation, and immutability after that, these rules are possible. I am looking at this idea of making a/(b): c as efficient as a/:b: c, and if I can do so, then I think the dialect design of things using paths will get good guidance and be more solid if we prohibit the latter. Because it's incoherent in the long run... if you allow a/:b: c, what's wrong with ::a/:b::c etc.

I kind of feel like this is one of the first attempts to get ambitious with PATH! in a dialect. It's been hard to do, because for reasons beyond my ability to understand, DocKimbel has defended (a + b)/c being interpreted as (a + b) /c, as a GROUP! and then a REFINEMENT! (further in the ANY-WORD! dept. vs. a PATH!). It's kind of a house of cards, IMO, and so anything done to tighten the whole thing up is good.

Clearly I'm angling for a very different idea, and the goal is specifically to enable dialect design. PATH! should not be just about what the evaluator does with it, but these kinds of usages and beyond. It may be different from BLOCK! and GROUP!, but it can come into its own, I think.

So MATCH has become an extremely useful tool, used all the time.

But some of the wilder things it did in trying to become a matching dialect turned out to be junk. In surveying how the type block acts I mentioned that the weirder features in MATCH were not things we were likely going to want to carry forward to build on.

Not only does the C code implementing it suck, the syntax is ugly:

I'd already dropped the idea of MATCH quoting its first argument. That means it doesn't see the number of quotes you put on its argument unless you put it inside a block:

>> match ['integer!] just '10
== #[true]

We can still talk about whether that's a great idea or not.

PATH!s For AND Was Ugly

The concept that each clause in a BLOCK! is an OR makes sense with the type dialect. But using PATH!s for AND is pretty hideous.

GROUP!s might be more palatable:

>> match [(integer! :even?) (block! :empty?)] 1020
== 1020

Should Functions Need The GET-WORD! ?

Is the GET-WORD! even necessary? Could we assume that any functions that test values have names that suggest they do so...and understand that we aren't actually calling any functions?

>> match [(integer! even?) (block! empty?)] 1020
== 1020

The reason it was done with a GET-WORD! initially was for consistency with when you didn't give the rule as a BLOCK!

>> match :even? 10
== #[true]

>> match [:even?] 10
== #[true]

But is that interesting? If consistency of that kind is so important, might it be better to say MATCH always takes a BLOCK! or... a GROUP!?

Another option would be to use predicate format for functions and preface them with a dot, which would help call out that they were functions but be a bit less jarring:

>> match [(integer! .even?) (block! .empty?)] 1020
== 1020

GROUP! Can Be Used As The Main Match

Since MATCH no longer quotes its first argument, a quote mark there just suppresses evaluation instead of becoming part of the rule. So you could use a quoted GROUP! as the main match:

>> match '(integer! even?) <not an integer>
; null

>> match '(integer! even?) 304
== 304

>> match '([block! text!] 2) "ab" 
== "ab"  ; acted like `parse try match [block! text!] [2 skip] "ab"`

Bear In Mind PARSE is now MATCH-ish

Before going too far in terms of the powers of MATCH... I should point out that now that PARSE returns its input on success and is back to requiring that it reach the end by default, it can be used for matching purposes...e.g. "tuple"-style matches

 >> parse [1020 "hello"] [integer! text!]
 == [1020 "hello"]

PARSE is looking at sequence by default, while MATCH is looking at alternates. MATCH does not "destructure" its input...all its tests are running on the same single value.
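A loose Python analogy of that difference, with made-up helper names and only plain type tests (real PARSE rules do far more than this):

```python
def match_any(types, value):
    """MATCH-style: one value, tested against alternatives."""
    return value if any(isinstance(value, t) for t in types) else None

def parse_seq(types, items):
    """PARSE-style: each item tested in order, and the end must be reached."""
    ok = len(items) == len(types) and all(
        isinstance(item, t) for item, t in zip(items, types)
    )
    return items if ok else None
```

Given the same `[int, str]` rule, `parse_seq` accepts the two-element list `[1020, "hello"]` as a whole, while `match_any` accepts either `1020` or `"hello"` on its own and rejects the list.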

Should (Cleaned Up) MATCH Be The Function Arg Dialect?

It was the intention that the parameter to MATCH would be the same format as the blocks used for type checking arguments.

But when you think about reading the HELP, it gets a bit verbose. It's as if anyone who comes up with a sufficiently complex parameter spec should probably name it and make a function for it.

I'm assuming no one used any of the weird MATCH features. But would you be more likely to use them if MATCH used GROUP!s and didn't need the GET-WORD!s on functions?