Carl's New Projects (?) AltScript, AltOS

This is apparently something Carl is working on...

ASON AltScript Format

Going through the bullet points:

Braces {} are used to denote objects. They are lexical and may be used directly without evaluation (the make constructor is not necessary).
Braces {} are not used for multi-line strings. A single+double quote format is used for multi-line strings.

At times I've certainly had my doubts about whether the language tradeoff of braces for strings is a good one. I go back and forth on this.

Today the biggest justification for it is that it mixes well with putting portions of code in quotes, for strings and characters and filenames:

```c
REBVAL *v = rebValue("... {a string} #{c} %{spaced filename.txt} ...");
```

The escaping you get if you don't have that is annoying. Single quotes aren't an option due to ambiguity with quoting. It's a distinguishing feature, and I'd hate to drop it.

All loaded values are constant by default. This includes strings, blocks, and objects. They are all protected from modification. However, they can be copied and modified. In addition, they can be loaded as modifiable by using a load option.

We tried this and I don't think it's as palatable or interesting as the current Ren-C behavior:

Mutability Manifesto

It's been working well enough that I don't think anyone's found much fault with it.

Also, if Carl is working from the R3-Alpha codebase's enforcement of mutability, there are countless bugs.
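As a rough analogy for the const-by-default idea (loaded values protected, but copyable into something mutable), here's an illustrative Python sketch; `MappingProxyType` is just a stand-in for a protected loaded value, not anything from either implementation:

```python
from types import MappingProxyType

# Stand-in for a value "loaded as constant": reads work, writes are refused.
loaded = MappingProxyType({"name": "Carl", "project": "AltScript"})

blocked = False
try:
    loaded["name"] = "Bob"  # mutating the protected value fails at runtime
except TypeError:
    blocked = True

# ...but a copy is an ordinary mutable object, as in the proposal.
copied = dict(loaded)
copied["name"] = "Bob"
```

The interesting design questions are in the details this sketch glosses over: whether protection is deep, and what "load as modifiable" implies for literals embedded in code.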

Zero-based indexing is used for blocks (arrays). This is consistent with many other languages.

I've gotten used to FIRST and 1 going together.

Dots are used for field selection. For example, “person.name” refers to the name field of the person object. Slashes are not used for selection. Use of a dot for selection is common in many other languages.

I favor dots: we have TUPLE!, and the option to say that PATH!s are only for refinements.

The at sign “@” denotes literal values for none, true, false, data-types, and special values like NAN (not a number).

This is interesting, because I've suggested @ may play a role in datatypes, with @ more broadly referring to a value category that is inert in the evaluator.

But there's still a lot to think about.

Short strings are stored directly in value cells, not in separate storage. This is a space-saving optimization.
Characters are expressed as strings. The above storage optimization makes this possible.

The TOKEN! unification (ISSUE! + CHAR!) does this.

Binary (byte arrays) use #"…" for hex and ##"…" for base-64.
Base-2 binary is not directly expressible in source format.

My current thinking is that ${...} and $"..." are used for binaries.

I don't know that I'm particularly concerned about special representations of base64 or base2. It seems to me that TOKEN! can serve well enough and then you convert it if you need to. The cases where you involve base64 binaries are few and far between...usually compressed payloads, and you know where they are so you don't need metadata saying "this information is binary, and this information is base64" because all you're going to do with it is decompress it.
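To illustrate the "just convert it when you need it" point: the typical compressed-payload round trip needs no in-band metadata, because the consumer already knows the text is base64-of-deflate. A standard-library Python sketch:

```python
import base64
import zlib

# A payload you control; you already know it's compressed and base64-encoded,
# so no source-level notation has to announce that.
original = b"script: [print {Hello}]" * 10

# Producing the base64 text (what would live in a TOKEN!-like value)...
encoded = base64.b64encode(zlib.compress(original)).decode("ascii")

# ...and converting it back only at the point of use.
decoded = zlib.decompress(base64.b64decode(encoded))
assert decoded == original
```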

Arithmetic operators can be used on series data-types (strings and blocks) for various useful results. For example “+” can be used to join strings and blocks.

:man_shrugging:

String character escapes use C notation. They use backslash notation, for example “\n” for newline and “\t” for tab.

Giulio has asked for this. It's a question worth revisiting.
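For reference, the C-notation semantics in question, shown via Python's escape decoding (used here only because its backslash rules match C's for these cases):

```python
# The raw string keeps the backslashes literal, as they'd appear in source.
raw = r"first line\nsecond line\ttabbed"

# Decoding applies C-style escape interpretation: \n -> newline, \t -> tab.
decoded = raw.encode("ascii").decode("unicode_escape")
assert decoded == "first line\nsecond line\ttabbed"
```

The tradeoff against Rebol's historical caret notation (`^/`, `^-`) is familiarity versus having backslash-heavy strings (e.g. Windows paths, regexes) stay readable.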

The #{ and #[ sequences are reserved for future use. Currently they are undefined.

I do like #{ as an alternative to #" for TOKEN!...it helps passing characters to APIs inside quoted strings.

For JSON compatibility:

  • Keys (word definitions) can be written with quotes ("field":)
  • A lone colon (:) will automatically associate to the word/string immediately before it.
  • Commas as element separators are allowed as long as they are not directly followed by a digit (to avoid confusion with comma-based decimal values).
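For comparison, strict JSON is less forgiving than the rules above: keys must be double-quoted strings, full stop. A quick check against Python's stdlib parser:

```python
import json

# Standard JSON: keys are double-quoted strings.
obj = json.loads('{"field": 1, "person": {"name": "Carl"}}')
assert obj["person"]["name"] == "Carl"

# An unquoted key is a syntax error in standard JSON.
unquoted_ok = True
try:
    json.loads('{field: 1}')
except json.JSONDecodeError:
    unquoted_ok = False
assert unquoted_ok is False
```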

COMMA! has become a favorite feature of mine, so I like how it's done.

I don't know particularly what to say about the broader question of JSON compatibility.

Seems to touch upon a few points...in particular embracing dots and leaning more toward immutability/constness.


I wrote up a summary of why I think braces as a lexical object! representation won't actually please anyone:

{ Rethinking Braces }... as an array type? - #25 by hostilefork

You get a very minor bit of JSON compatibility, but a mostly useless language feature. The FENCE! proposal is more promising, but has to be considered for what it is...and the consequences it has on strings.

I am a skeptic of accepting something that looks like a SET-TEXT! (set-string!) as a synonym for SET-WORD! for the purposes of this "JSON compatibility".

Reading around suggests the quotes are there because Crockford didn't want to put JS keywords in the JSON spec (like function), and prior to ES5 these could not be used as keys.

JavaScript itself has moved on: as of ES5, keys that name keywords don't need to be in quotes. I imagine that if JSON were created today, it might well not have the quotes, and would restrict keys to not having spaces or hyphens...using underscores only for word separators. The fact that spaces and dashes are allowed is more likely a by-product of not wanting to complicate the spec by figuring out how to prevent them.

I think this is where we start to get into the "saying no" territory I was talking about. As an example: how much trouble does having filenames with spaces in them cause? A lot. Questions arise like: "Do people really need to have spaces in filenames, or was this something that was pursued in spite of how boneheaded an idea it was?"

Embracing bracing for objects is a change that may have value. But I think chasing this JSON compatibility is a move in the wrong direction.
