Syntactically Significant Newlines

JSON came from a Rebol heritage, yet people found it weak in a couple of areas. So odd little YAML has come in to pick up some of that slack.

One of the key features is syntactically significant newlines. Rebol already ascribes syntactic significance to newlines in some constructs (comments, multi-line strings) so you can't just take an input Rebol file and smoosh all the newlines together and have it mean the same thing.

So why not add some way to get arbitrary string data based on a significant newline? What if # followed by a space didn't make an empty ISSUE!, but rather cued the issue to just read to the end of the line...uninterpreted?

>> issue: #   ))) Write 4nything you $want$ here  (((
>> as text! issue
== "   ))) Write 4nything you $want$ here  ((("

There'd be a light annoyance in that the leading space becomes part of the content, and it would have to be in order for the value to mold accurately. But with Rebol's model, you can always NEXT past it if you need to pass it to some routine.

>> issue: # foo bar
>> as text! next issue
== "foo bar"

What you'd use this for are things like UNIX command lines, as in .travis.yml:

# echo "Generating the cross-compiler"
# ${TOP_DIR}/external/tcc/configure --enable-cross --extra-cflags="-DEMBEDDED_IN_R3"
# make -j ${MAKE_JOBS}

// Could throw in comments, which wouldn't get tied up as part of the string content
// (they would be if you tried doing this with a multi-line string literal)
# mkdir bin
# cp *tcc bin #save cross-compilers
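
To make the rule concrete, here is a rough sketch in Python (used only because the proposed syntax doesn't load in any Rebol today). It is not how a real scanner would be written. The function name `scan_line_issues` is made up for illustration, and it assumes, as in the example above, that lines starting with `//` are comments and that a line-issue starts at the beginning of a line.

```python
def scan_line_issues(source):
    """Collect the raw text of every line-issue in `source`.

    Assumed rule (from the proposal, not an existing Rebol feature):
    a line starting with `#` followed by a space is read uninterpreted
    to the end of the line, and the leading space stays in the data.
    """
    issues = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if not stripped or stripped.startswith("//"):
            continue  # blank lines and comments never become string content
        if stripped.startswith("# "):
            issues.append(stripped[1:])  # keep everything after `#`, space included
    return issues


script = """\
# mkdir bin
// a comment between commands
# cp *tcc bin #save cross-compilers
"""

issues = scan_line_issues(script)
commands = [text[1:] for text in issues]  # like NEXT: skip the leading space
```

Note that the trailing `#save cross-compilers` is left alone. It is just part of the uninterpreted line, so it reaches the shell as an ordinary shell comment.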

FILE! could follow the same rule:

% echo "Generating the cross-compiler"
% ${TOP_DIR}/external/tcc/configure --enable-cross --extra-cflags="-DEMBEDDED_IN_R3"
% make -j ${MAKE_JOBS}

// This probably looks a bit better to the average UNIXer
% mkdir bin
% cp *tcc bin #save cross-compilers

But if you did read an issue and NEXT'd out its leading space, you'd have to mold it somehow. Not a new problem, as you can create issues with embedded spaces today.

It's a bit unfortunate that #{} is used for BINARY!, as #{} would be good for an empty issue (and would permit multi-line issues as well). Due to that, I might suggest ${...} for BINARY!, with #{...} gradually phased out and ultimately repurposed. (Curiously, any binary would be loadable as an ISSUE!, since an issue can hold anything, including hex characters. So old files would still LOAD and could be converted under this change, perhaps at LOAD time in the legacy emulation.)

Interestingly, you could kind of use # for comments

Today you can write:

sum: function [x y] [ #the-summation-function
    return x + y #return-them-added-together
]

It works because ISSUE!s are inert. But with this change, you'd be able to now say:

sum: function [x y] [ # the summation function
    return x + y # return them added together
]

Still only two ISSUE!s.

Because the values are actually there, it would screw up some things. For instance, RETURN normally detects when it has no argument and gives back void, but here it would pick up the ISSUE! as its argument and return that:

sum: function [x y] [ # the summation function
    return # now we're returning an ISSUE!
]

Of course, that consequence is small potatoes next to the main reason it wouldn't be a good general substitute for comments: the values aren't stripped out of your file, they're just left there.

But it could mean that dialects wanting to do something creative and comment-like with # could do so.

It could be used with the DUMP abbreviations

Instead of writing:

-- "got to part 3a"

You could write:

-- # got to part 3a

It's the same number of characters, but it stands out a little more while at the same time looking less cluttered. Plus it swaps out a shift-to-type quote for a space, so you only hit one shifted character.

I'm not sure I like it, especially changing the binary representation. (Well, I actually like that, just not in the realm of Rebol. It looks too much like a change just to be different, and it makes interacting with other Rebols more difficult.)

That said, how about using "# " (# plus space) as the delimiter? Then the space would not need to be part of the data for correct molding.

I don't know exactly what that means.

Another option could be to repurpose something else for a new type that would not have the space in the data. Perhaps $.

$ echo "Generating the cross-compiler"
$ ${TOP_DIR}/external/tcc/configure --enable-cross --extra-cflags="-DEMBEDDED_IN_R3"
$ make -j ${MAKE_JOBS}

// There are $ in shell commands, but that doesn't mean it's a bad idea.
$ mkdir bin
$ cp *tcc bin #save cross-compilers

We could even do something wacky with it and say that if it's in code and not quoted, the evaluator behavior is to call the shell. :-/

... # test
... ^ only # as delimiter, space part of data

... # test
... ^^ # and the following space together are the delimiter, so the space is not part of the data. When molding, the space is added back because it belongs to the delimiter, not the data.
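
The difference between the two readings shows up at mold time. Here is a rough Python sketch of it. The `load_a`/`mold_a` and `load_b`/`mold_b` helpers are hypothetical stand-ins for what the scanner and MOLD would do under each rule, not anything that exists today.

```python
# Reading A: `#` alone is the delimiter, so the space is part of the data.
def load_a(line):
    if not line.startswith("# "):
        raise ValueError("not a line-issue under reading A")
    return line[1:]

def mold_a(data):
    return "#" + data

# Reading B: `# ` (hash plus space) is the delimiter; the space is not data.
def load_b(line):
    if not line.startswith("# "):
        raise ValueError("not a line-issue under reading B")
    return line[2:]

def mold_b(data):
    return "# " + data


line = "# test"

a = load_a(line)  # the data keeps the leading space
b = load_b(line)  # the data is just the text

round_trip_a = mold_a(a)
round_trip_b = mold_b(b)

# Under reading A, NEXT-ing out the space changes what gets molded:
# the result no longer starts with `# `, so it is not a line-issue.
after_next_a = mold_a(a[1:])
```

Both readings round-trip an untouched value. Only reading A has the problem that the value with its space skipped molds as `#test`, which would load back as an ordinary issue instead of a read-to-end-of-line one.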

I feel like I had a post talking more about YAML but can't find it now...

I definitely think the things that have driven YAML's success are too important to overlook.

Right now when creating data-ish things I often have to write:

status: <foo>
description: {
    Here is some multi-line stuff.
    I can be free-form with the text.
}
widget: <bar>

But those braces are a tax. We could take backslash, for instance:

status: <foo>
description: \
    Here is some multi-line stuff.
    I can be free-form with the text.
widget: <bar>

Or something along those lines. When I use YAML, I often find myself thinking "this would be really ugly in Rebol".