Custom Evaluators and Foreign Syntax

Brett · February 13, 2021, 7:52am

Excuse my random comment here. But I've often wondered whether support (beyond parse) for user-defined foreign syntaxes would be feasible, useful and not horribly painful. The idea comes from the number of syntaxes I've attempted to deconstruct using parse, manipulate, evaluate and sometimes re-form over my time with Rebol. It's using Rebol as a Rosetta Stone for playing with data/code stored in foreign syntax: not necessarily always validly (as in a parse tree), but perhaps usefully.

Perhaps I misunderstood and that's what you're saying with fused!.

The follow-on from that is whether custom evaluators are a possibility for the standard Rebol types in a dialect, say as an attribute of a function (frame or block?), as a way of making dialects a bit more first class rather than a fringe interest, and perhaps leveraging more of the lego box parts of evaluation in custom dialect evaluation.

Not thought through, just random musings from someone who is perhaps out-of-date, and certainly not requests for functionality since these days I'm playing with photos rather than code.

Brett · February 14, 2021, 1:11am

We have custom evaluators today in the form of functions we write. To process a dialect we must use Parse or specially craft our "dialect" to accommodate Ren-C's evaluation process. Parse is good at recognition but terrible at building structures. It's aimed at step-by-step evaluation, which is fine, but that leaves the heavy lifting to the dialect writer, who has to build context from scratch for every new dialect.

It would be good if user-written evaluators felt more first class. Operations like DO and REDUCE on a block are so nice, and I'd like that niceness to extend to dialects. My intuition is that by raising a usermode evaluator to plug in to the system, operations such as DO and REDUCE could apply to a dialect and produce useful outcomes and sane errors. Maybe there would be an opportunity to "compile" them upon load for performance.

Writing an evaluator is hard, as Hostilefork has so nicely documented, and he is striving to create orthogonal parts that can be stuck together by users for powerful expression. I'm wondering how that work can be leveraged even further for dialects.

Coming back to foreign syntax and your datatype question: if one has broken text into tokens with parse or whatever, I see a need to be able to distinguish/annotate them. Objects might be ok if they were performant and were not saddled with the terrible baggage of "make object" they have when saved out, but I suspect a more lightweight system would be better, something more equivalent to Rebol words, with spellings and bindings. I guess I'm suggesting supporting multiple syntaxes that map onto Ren-C datatypes and user datatypes, where the default syntax is Ren-C.

I think back to the C-lexicals of the build process. I feel these other elements could make that process so much better. We don't need to evaluate C as C, but being able to edit it or interpret it as symbols in linear or tree structures would be very useful.

All very intuitive and light on answers I know...But being able to talk pidgin with other languages might be a really useful trick.

BlackATTR · February 14, 2021, 1:35am

Thanks. I think concrete examples are helpful. Maybe a custom evaluators conversation can continue with @hostilefork in his Whitespace Interpreter forum topic.

hostilefork · February 14, 2021, 4:44am

Hey Brett, good to hear you're out there having fun somewhere. You're probably making a good choice avoiding code!

The fact that people are commenting on the FUSED! proposal--and even beaming messages from retirement--is indicative of...something. Probably that Rebol notation simply leaves a lot to be desired.

Rebol's PARSE isn't the only game in town for people looking for alternatives to RegEx, and I've talked about how parser combinators in languages like Haskell offer quite a lot...with tons of infrastructure to assist in real world tasks.

So before trying to extend the syntax of Rebol's medium, I'd side with the idea of shoring up PARSE for the cases where you're not trying to use it to mix full-on representations from different languages in the same source file. Until that really demonstrates "bringing the magic" reliably, getting entangled in syntax extensions for LOAD just adds other layers of complication.

@BlackATTR brings up the Whitespace Dialect and I think that really hitting home-runs on problems like that is a prerequisite to getting too much further out in ambition.

I think FUSED! might be able to be a relatively low-cost way to get a little more satisfaction for people. Little things like that and COMMA! may help get closer to a sweet spot where it feels "flexible enough".

The question of how to make the evaluator "hookable" is definitely on my mind. One big step of this has been trying to make FRAME! a very reusable part...and to unify services across PARSE and plain DO for things like single stepping and debug stacks.

And yes, all of this is hard...as my notes show. It's especially hard when it's all on the backdrop of a weird dependency game where you're writing it all yourself in C89.

hostilefork · September 1, 2024, 4:56pm

So this seems to tie into your other post about "Kinds" of Values:

Kinds of values?

For demonstration I thought Kind could be a word followed by a #, followed by the existing syntax. E.g:

vid#[button "push me"] ; A block of VID dialect.
vector#[1 2 3]
kind of markdown#{*This* could be interesting.} ; Would yield MARKDOWN

Kind would be optional so existing forms would be unaffected:

kind of {simple string} ; would return _
kind of [simple block] ; would return _

I've mentioned that we now have inert forms, and this is possible...and I think the idea seems sound.

>> @vid:[button "push me"]
== @vid:[button "push me"]

>> kind of @vid:[button "push me"]
== @vid  ; or just vid ?

The name "KIND" was taken for a while to be the underlying type of an item regardless of quoting:

>> type of first ['''10]
== &[quoted]

>> kind of first ['''10]
== &[integer]

But I am now calling that "HEART". So KIND is available for this purpose now!

DO Could Have a Registry To Dispatch These

(I don't think we want to dispatch by binding, because things like vector are too likely used by variable names.)

This is interesting in terms of the DO vs. EVAL distinction. DO would be able to handle this, but EVAL would expect a block of Rebol.

I like it. I also like the polymorphism of DO across things like different languages.

>> code: @javascript:-{function () { console.log "your JavaScript here" }}-

>> do code
; would be able to work in the Web Console

If you had your string in a variable, just JOIN it...

>> js-source: -{function () { console.log "your JavaScript here" }}-

>> do join @javascript: js-source

That's awesome. I look forward to wiping out CSS-DO and JS-DO, and letting DO be polymorphic!
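The registry idea above can be roughed out in a language-neutral way. What follows is only a Python sketch under stated assumptions, not how Ren-C implements or would implement it: a dialected value is modeled as a (name, payload) pair, and `register_dialect`, `do` and `join` are hypothetical helper names invented for illustration. The "javascript" handler is a stand-in that runs nothing.

```python
from typing import Callable, Dict, Tuple

# Hypothetical model: a dialected value such as @javascript:-{...}-
# is represented as a (dialect-name, payload) pair.
Dialected = Tuple[str, object]

# The registry is keyed by dialect name, not by variable binding.
_registry: Dict[str, Callable[[object], object]] = {}


def register_dialect(name: str, handler: Callable[[object], object]) -> None:
    """Associate a dialect name with the function that evaluates it."""
    if name in _registry:
        raise ValueError(f"dialect already registered: {name}")
    _registry[name] = handler


def join(name: str, payload: object) -> Dialected:
    """Attach a dialect name to a payload held in a variable."""
    return (name, payload)


def do(value: Dialected) -> object:
    """Dispatch on the dialect name; fail clearly if nothing is registered."""
    name, payload = value
    try:
        handler = _registry[name]
    except KeyError:
        raise LookupError(f"no evaluator registered for dialect: {name}") from None
    return handler(payload)


# Stand-in handler (assumption): a real one would pass the text to a JS engine.
register_dialect("javascript", lambda src: f"<would run {len(src)} chars of JS>")
```

With this in place, `do(join("javascript", js_source))` plays the role of `do join @javascript: js-source`, and an unregistered name produces an error instead of being evaluated as plain Rebol.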

Hopefully this will fulfill your wish!

hostilefork · September 19, 2024, 2:19pm

So I actually wonder if this should be DIALECT OF, along with a DIALECTED? test that asks whether a value is a CHAIN! with a WORD! at the head.

I don't know if we should restrict that to being length 2. But I do know that @css:".box {background-color: blue;}" would be a string and not a block.

If there's contention we might need qualified names, with a fully-qualified form to fall back on. So maybe @web.css, if you loaded two separate systems that each registered the name "css" into your project.
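To make the two ideas in this post a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: a CHAIN! is modeled as a tuple, a WORD! as a small `Word` class, and `is_dialected`, `dialect_of`, `register` and `resolve` are made-up names. It leaves the length-2 question open by accepting any chain with a word at the head, and it treats a short name as usable only while exactly one registered system claims it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """Stand-in for a WORD!; only the spelling matters here."""
    spelling: str


def is_dialected(value: object) -> bool:
    """True if value models a CHAIN! (a tuple here) with a WORD! at the head."""
    return isinstance(value, tuple) and len(value) >= 1 and isinstance(value[0], Word)


def dialect_of(value: object) -> Optional[str]:
    """Spelling of the head word, or None for non-dialected values."""
    return value[0].spelling if is_dialected(value) else None


# Fully-qualified dialect names known to the project, e.g. "web.css".
_qualified: List[str] = []


def register(qualified_name: str) -> None:
    _qualified.append(qualified_name)


def resolve(name: str) -> str:
    """Accept a fully-qualified name, or a short name if it is unambiguous."""
    if name in _qualified:
        return name
    matches = [q for q in _qualified if q.rsplit(".", 1)[-1] == name]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"unknown dialect: {name}")
    raise LookupError(f"ambiguous dialect {name!r}; candidates: {sorted(matches)}")
```

In this model `(Word("css"), ".box {background-color: blue;}")` is dialected with a string payload, and once a second system registers a "css" of its own, only the qualified spelling resolves.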

bradrn · September 20, 2024, 1:16am

I’m not sure I see the point of this proposal. What’s the practical advantage over a simple block [button "push me"]?

If I understand the situation correctly, I think it would be simplest to say that the thing after the @ must be an evaluation function. (Well, a value which is bound to an evaluation function, that is.)
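For comparison with a registry, the binding-based reading could look roughly like the Python sketch below. It is an illustration under assumptions only: the environment is a plain mapping from names to values, and `do_bound` and the `vid` stand-in are hypothetical names, not anything from Ren-C.

```python
from typing import Mapping


def do_bound(head: str, payload: object, env: Mapping[str, object]) -> object:
    """Evaluate payload with whatever function the head word is bound to in env."""
    evaluator = env.get(head)
    if not callable(evaluator):
        raise TypeError(f"{head!r} is not bound to an evaluation function")
    return evaluator(payload)


def vid(block: list) -> str:
    """Stand-in evaluator (assumption): a real one would build a layout."""
    return f"VID layout with {len(block)} items"
```

Here `do_bound("vid", ["button", "push me"], {"vid": vid})` calls the bound function, while an environment in which `vector` happens to name ordinary data raises an error. That second case is the name-collision concern raised earlier in the thread about dispatching by binding.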