`greb` : Grep Using PARSE?

There was a Trello card with a remark from @johnk about an idea from Pekr:

Not sure how this would work, but it is an interesting exercise to think about. Something that bridges between unix and rebol syntax like:

ls | greb 'some alpha ".reb"'

(assuming a few more built in rules/charsets)

May need a collect/keep kind of wrapper to retrieve data cleanly.
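For comparison, here is roughly what that rule would look like as a plain grep call. This is only a sketch under two assumptions the quoted idea doesn't state: that "alpha" means ASCII letters, and that the rule has to match the whole line. The function name `greb_like` and the sample file names are made up for illustration.

```shell
# Stand-in for the quoted rule `some alpha ".reb"`, written with plain grep.
# Assumptions (not stated in the original idea): "alpha" means ASCII letters,
# and the rule has to match the entire line.
greb_like() {
    LC_ALL=C grep -E '^[A-Za-z]+\.reb$'
}

# Hypothetical sample input; these file names are made up for illustration.
printf '%s\n' parser.reb notes.txt r3.reb readme main.reb | greb_like
```

With that input, only `parser.reb` and `main.reb` get through; `r3.reb` is rejected because the digit isn't an ASCII letter. The PARSE version reads closer to what it means, which is the appeal of the idea.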

This would be a very small subset of what @BlackATTR's QUERY could do...but it seems like it wouldn't hurt if we went ahead and made this program as a test and put it in the ren-c-stdio repository.

I've already made things like TAC (backwards unix CAT), where the goal is to see how essential and correct we can make things:

I like the idea of making GREB similarly essential, and part of the samples. Worth doing.

(Note that grep itself started out as a single command in the ed editor, g/re/p: globally search for a regular expression and print the matching lines.)

I'll also throw in a little ping that it would be nice to have a simple web app that exposed PARSE functionality with some highlighting, like http://regexpal.com/. Seems sometimes we get such big ideas in our head we forget to do the small ones (!)


Greb Lives.

https://github.com/hostilefork/greb

It's of course catastrophically slow as such utilities go. But, design first, then optimize.

I've started it off on the right foot, with a test. Not a complicated one...it just filters a pre-made list of filenames, and makes sure it gets the expected results:

test-greb.yml (Line 90)

So that shows it working on Mac, Linux, and Windows.
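The shape of that kind of test can be sketched in plain shell. This is not the contents of test-greb.yml, just a minimal illustration of "filter a fixed list, compare against expected output". The helper name `check_filter` is hypothetical, and grep is used as a stand-in filter so the sketch runs without greb installed.

```shell
# check_filter EXPECTED CMD [ARGS...]
# Feeds this function's stdin to CMD and compares CMD's output to EXPECTED.
# Prints PASS or FAIL; returns non-zero on a mismatch.
check_filter() {
    expected=$1
    shift
    actual=$("$@")    # the command substitution inherits our stdin
    if [ "$actual" = "$expected" ]; then
        echo "PASS"
    else
        echo "FAIL"
        printf 'expected:\n%s\nactual:\n%s\n' "$expected" "$actual" >&2
        return 1
    fi
}

# Demo with grep as the stand-in filter (file names are made up).
printf '%s\n' a.reb b.txt c.reb |
    check_filter "$(printf '%s\n' a.reb c.reb)" grep '\.reb$'
```

Swapping the stand-in for the real thing would look something like `check_filter "$expected" greb 'some alpha ".reb"'`, assuming greb is on the PATH and takes the rule as a single argument the way the examples in this thread do.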

What's Next For Greb?

Rulesets coming from a file are clearly an important feature. So instead of:

cat data.txt | greb "your <rule> here"

You would want to say:

cat data.txt | greb %my-rules.greb

But what format would these rules files take? How do they separate defining rules from running rules? How do they do #include of other files for rules they use?

For instance, `some-a: [some "a"]` could mean "assign a block to represent the reusable SOME-A rule". Or it could mean "run the SOME combinator on the input looking for the letter a, and then store the result in the variable `some-a`".

When we write source code, we make that separation by assigning the rules in code outside of PARSE, and then running them inside the PARSE. But greb is hiding the call to PARSE and making it implicit. So what's the limit of what a greb file does?

Maybe you define rules, and it checks to see if there's one called main... if there's not, you still have to provide a string of rule code to get it kicked off?

:man_shrugging:

That's one question. But then, how would the binding work?

This first take on the greb implementation shows that I don't try to BIND things like DIGIT and ALPHA, but rather slip them into the combinator set. That's actually how DESTRUCTURE works as well.

This seemingly-"simple" task points out a lot of things we may not have elegant solutions for. So it's definitely worth looking at.

Anyway, it has been started!
