Binding Issues Raised by Chris's PARSE-MACHINE

hostilefork · January 26, 2021, 3:45am

(These posts come from a discussion of PARSE-MACHINE that ended up being more about binding than about PARSE-MACHINE itself. I've tried to cull some tangents and pick out what remains potentially relevant.)

It looks like a very good non-trivial use of binding to study and try to make better.

For instance, looking at:

set in system/codecs grammar make object! compose/only [  ; why I like SPREAD
    name: quote (grammar)  ; today, this could be done as name: '(grammar)
    suffixes: (suffixes)
    identify?: _
    rules: (bind rules parser)  ; v-- see remarks below
    options: (options)
    decode: _
    encode: _
]

Here's one of those situations where (bind rules parser) not only mutably binds, but it yields something that gets further bound into the make object!. So if the rules use the word "name" it would be redirected into this object. Which is presumably not what you wanted. It's an example of the need for the ideas in "Breaking MAKE OBJECT! into Multiple Operations".
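To make the double binding concrete, here is a minimal sketch. It assumes the historical deep-binding behavior of MAKE OBJECT! as found in Rebol 3 Alpha or Red (current Ren-C may behave differently), and the names `parser`, `codec`, and the sample values are stand-ins for illustration, not the actual PARSE-MACHINE definitions.

```rebol
; Stand-in for the real parser object (hypothetical contents)
parser: make object! [state: "parser state"]

name: "outer"
rules: [name state]     ; NAME is intended to mean the outer variable

codec: make object! compose/only [
    name: "codec"
    rules: (bind rules parser)   ; STATE now points into PARSER
]

; MAKE OBJECT! walks its body deeply, so the NAME inside the rules
; block has been redirected to the new object's field:
print get first codec/rules     ; codec (not outer)

; STATE is not a field of the new object, so the binding made by
; BIND RULES PARSER survives:
print get second codec/rules    ; parser state
```

Only the words that happen to match a field of the new object are redirected, which is what makes the problem easy to miss until a rule uses a word like `name` or `options`.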

There's also an aspect here of "Separating Parse Rules Across Contexts". Because if someone uses a component subrule that isn't a direct descendant of the rules, they presumably want access to the elements of parser as well.

The idea of having bindings that only apply to the GROUP! portions of code in a PARSE is a likely desired feature as well.

Retaking the term USE in DO-oriented code is one of those uncomfortable-yet-heretically-foundational ideas...and it makes you wonder exactly what the prescribed way of subverting that is. (Do you always go through LIB/XXX? How do people know what's bound out from under them and what isn't? I've wondered about this with things like adding keywords to PARSE...each new keyword you add runs the risk of breaking a use for another purpose...)

rgchris · January 26, 2021, 1:42pm

The convention I'm shooting for here is that you use the state map to track any variables. As state is a consistent value through all grammars, every grammar shares the same view on, well, state. There are a few aspects where adhering to conventions such as this offers some insight into what the system might be capable of.

Yes, though as parser only has one state value at a time, any rule 'outside' the binding just needs access to parser/state to access parsing details.

It's certainly something I've considered. Red doesn't have use and a common pattern is to create namespaces which are kind of what module! is for. Another unstated goal here is to make the code surrounding these grammars as basic as possible, to make them maintainable and optimize-able.

I'm a little less familiar with latest Ren-C than I was, especially the Parse changes. It's not a complex module, but it may take a bit of effort demonstrating something useful.

There are a few such things that I've come across. Working to get them into tests.

hostilefork · January 29, 2021, 8:09am

That avoids naming conflicts, though it's a bit heavy-handed. And even in this case it's not true of all variables you're using, e.g. mark has potential to collide. (BTW: At one time, Ren-C took MARK as a keyword, previous to the new design for pos: <here> as how to mark series positions.)

In any case...my point stands, that you likely only intended here to bind rules parser. Yet with the historic workings of MAKE OBJECT!, when you do this with a COMPOSE and the block becomes a child of the object body, you will get the additional bindings of name, suffixes, identify?, rules, options, decode, encode... etc. Imagine adding mark to this list in your example...the user code bound to it could wind up corrupting state it wasn't supposed to see.

When people make these little "parse machine" snippets, I'd imagine they'd like to be able to take for granted that the parse machine engine could interpret those snippets without having to be given all of them up front in a block. Especially if the snippet might be generated dynamically. It would be nice if you could be on more of an even turf with the keywords of PARSE in this respect.

rgchris · January 29, 2021, 5:08pm

Ah, I get what you mean. Yep, that did catch me out. My inclination though is to reach for map! instead as it has no context and potentially doesn't require a preset list of fields, or another way of designating the rule (e.g. [rule: bind 'rule 'some-other-word-in-the-parent-context-that-isnt-in-the-object]—not ideal and a more nuanced constructor would be better in this instance). This goes back once again to thinking about what objects are for—1) single use, code-grouping objects such as this vs. 2) prototypes representing things in the classical object-oriented sense—which I'd still prefer to see prioritized. Again, I'm sympathetic to the need for different constructors, but still have reservations about the implications of my/method as a solution to binding when cloning. Anyways, this is more a response to that topic than this one.

Specific usage of mark here is a behavioural legacy and a mistake; the convention I'm looking to use here would be state/mark. I'm not sure I see the state basket as being heavy-handed; I think it's a reasonable response to the complexity of what I'm trying to achieve. Could well be my familiarity bias too, but for now it's workable as a means to explore the Parse Machine concept to a conclusion.

rgchris · January 29, 2021, 10:47pm

If there's one advantage to using a MAP!*, it's that if you do use overlapping grammars**, you don't have to preset the object to use all the variables used by each grammar.

*but I suppose you can use a map within an object

**latest monstrosity encountered today comes from Google: a semantic information schema encoded as JSON contained within the HTML content of a MIME-based email all designed to make your email look better in the Promotions tab in Gmail—data format mad-libs!
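
The "no preset fields" point about MAP! can be sketched roughly as follows. This assumes Rebol 3 Alpha or Red path semantics; `state`, `fixed`, and the key names are illustrative only, not taken from the PARSE-MACHINE source.

```rebol
; A map accepts new keys on first assignment, so overlapping
; grammars can each add whatever variables they need.
state: make map! []
state/mark: 1
state/depth: 0
print state/mark

; An object's fields are fixed when it is made, so a grammar that
; needs an undeclared field fails instead of extending the object.
fixed: make object! [mark: 1]
print error? try [fixed/depth: 0]
```

A map also carries no context of its own, so placing it in an object body does not add anything for rule words to be accidentally rebound to.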