DO vs. READ and URL translation

In trying to make the Web Console as easy to use as possible, we have some magic that translates URLs on GitLab and GitHub into CORS ("Cross-Origin Resource Sharing") API requests.

For instance, take an example @gchiu is working with currently (despite being a forum absentee :angry:):

https://gitlab.com/Zhaoshirong/docx-templating/-/blob/master/gmdocx.reb

That GitLab URL indicates a document. But what you get if you READ that URL is not the file contents of gmdocx.reb...because it is a web page with a lot of decorations. You get line numbers and all that other HTML all around it.

We don't want READ to do any magic that gives back something different from what the browser would see for that URL. If you READ it from the console, you should get all the HTML... line numbers and all.

But the DO function can have different rules. Since DO of a bunch of HTML would be meaningless, it seems that if you ask to DO that URL (which is convenient to exchange with people), it should be able to figure out how to extract the code and run it.

Now...we do that. When you run DO, it first translates that link into a GitLab raw link:

https://gitlab.com/Zhaoshirong/docx-templating/-/raw/master/gmdocx.reb

Then that link is passed to READ.
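To make the rewrite concrete, here is a minimal sketch of the blob-to-raw translation, written in Python purely for illustration. The function name `gitlab_blob_to_raw` is made up, and the rule it applies (swap `/-/blob/` for `/-/raw/` on gitlab.com) is inferred from the two URLs above; the console's actual logic may handle more cases.

```python
from urllib.parse import urlsplit, urlunsplit


def gitlab_blob_to_raw(url: str) -> str:
    """Turn a GitLab "blob" page URL into the matching "raw" URL.

    Hypothetical helper: URLs that don't look like gitlab.com blob
    pages are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "gitlab.com":
        return url

    marker = "/-/blob/"
    if marker not in parts.path:
        return url

    # Only the first occurrence is the page-type segment; anything
    # later in the path belongs to the file's own name.
    new_path = parts.path.replace(marker, "/-/raw/", 1)
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    print(gitlab_blob_to_raw(
        "https://gitlab.com/Zhaoshirong/docx-templating/-/blob/master/gmdocx.reb"
    ))
```

Leaving non-matching URLs untouched keeps the translation safe to apply to anything DO is handed.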

But we're not done with magic. When you're running in the browser, that link can't be fetched directly due to CORS rules. You have to go through the GitLab API.

So there's some fiddling inside READ to do that, and to pick apart the JSON to give you the data--even though you are in the browser.
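Roughly what that fiddling could look like, sketched in Python. This assumes the GitLab v4 "repository files" endpoint, which returns JSON with the file body base64-encoded in a `content` field. Whether the console uses exactly this endpoint is an assumption, and both helper names are invented for the example.

```python
import base64
import json
from urllib.parse import quote, urlsplit


def gitlab_raw_to_api(url: str) -> str:
    """Map a gitlab.com raw URL onto a repository-files API URL.

    Assumes the branch name contains no slash, since the raw URL
    alone can't say where the ref ends and the file path begins.
    """
    path = urlsplit(url).path.lstrip("/")
    project, rest = path.split("/-/raw/", 1)
    ref, file_path = rest.split("/", 1)

    # The API wants the project path and file path URL-encoded,
    # including their slashes (hence safe="").
    return (
        "https://gitlab.com/api/v4/projects/" + quote(project, safe="")
        + "/repository/files/" + quote(file_path, safe="")
        + "?ref=" + quote(ref, safe="")
    )


def extract_file_bytes(response_text: str) -> bytes:
    """Pull the file contents out of the API's JSON response."""
    payload = json.loads(response_text)
    if payload.get("encoding") != "base64":
        raise ValueError("expected a base64-encoded 'content' field")
    return base64.b64decode(payload["content"])


if __name__ == "__main__":
    print(gitlab_raw_to_api(
        "https://gitlab.com/Zhaoshirong/docx-templating/-/raw/master/gmdocx.reb"
    ))
```

The caller still just asks READ for the raw link; the detour through the API and the JSON unpacking stay hidden behind it.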

Are There Other Candidates for DO to do "magic"?

Any URL could be taken as an instruction. Is there any real reason to stop at code, or would data be acceptable too--if a URL format conveyed an obvious intention? There's BROWSE if you want to open the URL, and READ if you want the HTML...so should there be other behaviors?

Random whimsical example: A DO of a Google Search could "know" to return a list of the results of that search as a BLOCK!.

There are JavaScript and CSS code URLs (as well as ones that decorate them, like CodePen, where the decorations could be removed).

Right now, there's a separate JS-DO that you can use to run .js code.

There's also CSS-DO, which will "execute" .css by loading it up.

I didn't make these overload DO (in part because that's harder...there'd have to be some generic hooking mechanism to allow all these different "do codecs" to register themselves).
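One shape such a hooking mechanism could take is a registry of (matcher, handler) pairs that DO consults in order. The Python below is only a sketch of that idea. `register_do_codec` and `do_url` are hypothetical names, nothing like this exists in the console today, and the handlers return a description of what they would run instead of running anything.

```python
from typing import Callable, List, Tuple

Matcher = Callable[[str], bool]
Handler = Callable[[str], str]

# Hypothetical registry: each "do codec" says which URLs it claims
# and what to do with them.
_do_codecs: List[Tuple[Matcher, Handler]] = []


def register_do_codec(matches: Matcher, handler: Handler) -> None:
    """Add a codec; earlier registrations take priority."""
    _do_codecs.append((matches, handler))


def do_url(url: str) -> str:
    """Dispatch to the first codec that claims the URL."""
    for matches, handler in _do_codecs:
        if matches(url):
            return handler(url)
    raise ValueError("no do codec registered for " + url)


# Stand-ins for JS-DO and CSS-DO that only report what they would do.
register_do_codec(lambda u: u.endswith(".js"), lambda u: "js-do " + u)
register_do_codec(lambda u: u.endswith(".css"), lambda u: "css-do " + u)

if __name__ == "__main__":
    print(do_url("https://example.com/app.js"))
```

With this design, any registered codec can change what a plain DO does, which is exactly the invariant worry.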

But think about the invariants. If you're using DO inside an implementation where what you mean is "do block", is it appropriate for that DO to potentially have such wild behaviors as loading CSS code if you happen to pass it something that it interprets that way?

Maybe this should be a distinction like RUN (arbitrary tool) vs. DO (stay inside the language). Or maybe other words. Just something to think about.

(Thinking out loud...)
It would be cool to use the browser/JavaScript as a shell to perform text/character operations and have the results returned to our script. Like piping data to a *nix shell.

There is some formatting that might be more natural to execute in JavaScript as part of a READ/DO instead of recreating it in Ren-C:

  • Getting a DOM list
  • HTML comment stripping, tag stripping
  • Converting HTML character entities
  • Text normalization, e.g., "résumé" --> "resume"
  • Removing excess whitespace (similar to: sed -E "s/\s+/ /g" inputfile)
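For a sense of how small most of these operations are once a host language with the right libraries is available, here is a sketch of four of them (comment/tag stripping, entity conversion, accent removal, whitespace collapsing). It's in Python only to keep the example self-contained; in the console the work would be handed to the browser's JavaScript. The function names are made up, and the regex-based tag stripping is deliberately crude (a real DOM parser would be more robust).

```python
import html
import re
import unicodedata


def strip_comments_and_tags(markup: str) -> str:
    """Remove HTML comments first, then any remaining tags."""
    no_comments = re.sub(r"<!--.*?-->", "", markup, flags=re.DOTALL)
    return re.sub(r"<[^>]+>", "", no_comments)


def strip_accents(text: str) -> str:
    """Decompose characters, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collapse_whitespace(text: str) -> str:
    """Squeeze runs of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def clean(markup: str) -> str:
    """Chain the steps: tags, entities, accents, whitespace."""
    text = strip_comments_and_tags(markup)
    text = html.unescape(text)  # e.g. "&eacute;" becomes "é"
    text = strip_accents(text)
    return collapse_whitespace(text)


if __name__ == "__main__":
    print(clean("<p>My   r&eacute;sum&eacute; <!-- draft -->\n is <b>here</b></p>"))
```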