MOLD and LOAD parity

It occurred to me in this post that MOLD/ONLY appears to work at odds with other /ONLY options:

>> append [] []
== []

>> append/only [] []
== [[]]

>> mold []
== "[]"

>> mold/only []
== ""

As my solution to /ONLY is to use an optimized equivalent of func [value][reduce [value]], this highlights the discrepancy:

>> append only [] []  ; same as append/only
== [[]]

>> mold only []  ; not same as mold/only
== "[[]]"

How then should MOLD handle blocks, and what are the implications for LOAD? Let's assume that current MOLD/ONLY is the correct behaviour for BLOCK!:

>> mold [1 2 3]
== "1 2 3"

>> load "1 2 3"
== [1 2 3]

This seems fine until you get to single-length blocks:

>> mold [1]
== "1"

>> load "1"
== 1

Clearly this is now an inconsistency in LOAD. LOAD/ALL would get you there:

>> load/all "1"
== [1]

Should LOAD/ALL then be the default? Probably. Proposition: MOLD/ONLY and LOAD/ALL become MOLD and LOAD. Yay! There is a parity where it matters most: (de)serialization of a block of values.
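To make the proposed parity concrete, here is a toy model in Python rather than Ren-C. The names `mold_block` and `load_all` are hypothetical stand-ins for the proposed MOLD and LOAD, and values are limited to integers to keep the sketch short. The point is only that a round trip holds for blocks of any length, including length one:

```python
def mold_block(values):
    """Toy stand-in for the proposed MOLD of a block: contents only, no brackets."""
    return " ".join(str(value) for value in values)


def load_all(text):
    """Toy stand-in for the proposed LOAD: always returns a list, even for one value."""
    return [int(token) for token in text.split()]


# The round trip holds for empty, single-item, and multi-item blocks alike.
for block in ([], [1], [1, 2, 3]):
    assert load_all(mold_block(block)) == block

print(mold_block([1, 2, 3]))  # prints 1 2 3
print(load_all("1"))          # prints [1]
```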

The wrinkle is: loading singular values is useful. Lost is the MOLD/LOAD parity when handling non-blocks:

>> mold 1
== "1"

>> load "1"
== [1]

Where to go with that? Lost is shorthand validation of snippets of text:

>> type-of load "10-Dec-2020"
== date!

>> type-of load ""
== url!

I'd argue that this is beyond the scope of LOAD: LOAD is an overloaded (:sunglasses:) function, and the more critical behaviour is to return a block. It would be more fitting to dedicate a separate wrapper for TRANSCODE, one that is less exposed to the polymorphic demands of loading from external sources and that craps out if, leading/trailing whitespace aside, the content doesn't conform to a single value (even if that value is a GROUP!). Spitballing: DETECT, DISCERN, whatever:

>> map-each value ["10-Dec-2020" ""] [type-of detect value]
== [date! url!]

>> detect "1 2 3"
== ERROR! #!*^!#%@$

Thus MOLD of BLOCK! holds parity with LOAD; MOLD of non-block holds parity with DETECT (or other name).


If this were to come to pass, one would need to learn the dissonance between using MOLD on a BLOCK! vs. using MOLD on other values. In short, there would not be parity between MOLD BLOCK and DETECT*. That dissonance exists everywhere in the post-/ONLY world anyway, though.

DETECT should be able to handle a stringified block, thus:

>> mold only [block of things]
== "[block of things]"

>> detect "[block of things]"
== [block of things]
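The DETECT behaviour described so far (exactly one value, whitespace aside, where a bracketed block counts as one value) can be sketched as a toy model in Python. This is not Ren-C: `scan` is a hypothetical, deliberately crude scanner that only understands whitespace and square brackets, and items are returned as raw text rather than typed values:

```python
def scan(text):
    """Split text into top-level items; a bracketed group counts as one item."""
    items, current, depth = [], [], 0
    for ch in text:
        if ch.isspace() and depth == 0:
            # Whitespace outside brackets ends the current item.
            if current:
                items.append("".join(current))
                current = []
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unexpected closing bracket")
        current.append(ch)
    if depth != 0:
        raise ValueError("missing closing bracket")
    if current:
        items.append("".join(current))
    return items


def detect(text):
    """Return the single item in text, or raise if there is not exactly one."""
    items = scan(text)
    if len(items) != 1:
        raise ValueError(f"expected exactly one value, found {len(items)}")
    return items[0]


print(detect("  10-Dec-2020 "))     # prints 10-Dec-2020
print(detect("[block of things]"))  # prints [block of things]
```

In this model `detect("1 2 3")` raises a `ValueError`, mirroring the error case above.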

I believe this proposal and my ONLY one address a couple of inconsistencies in the language.

*Not convinced re. the name

It feels consistent with the "container-as-primacy" for blocks, which I will link here:

"Every thought on array splicing has been had!" :exploding_head:

I definitely favor the idea that LOAD always returns a BLOCK!, and that if you think the load operation will return one-and-only-one item, there should be an extractor for that:

>> single of [1]
== 1

>> single of [1 2]
** Error: single used on block with more than one item

So you can put together a thought like:

>> something: "1"

>> ensure integer! single of load something
== 1
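The composition above can be modeled in Python to show how the pieces separate concerns. All three functions are hypothetical stand-ins, not the Ren-C implementations. One assumption beyond the thread: `single` here also rejects an empty block, whereas the thread only shows the error for more than one item.

```python
def single(items):
    """Return the only element of items; raise if there is not exactly one."""
    if len(items) != 1:
        raise ValueError(f"single used on block with {len(items)} items")
    return items[0]


def ensure(kind, value):
    """Pass value through unchanged if it has the expected type, else raise."""
    if not isinstance(value, kind):
        raise TypeError(f"expected {kind.__name__}, got {type(value).__name__}")
    return value


def load(text):
    """Toy LOAD that always returns a list (integers only, for brevity)."""
    return [int(token) for token in text.split()]


something = "1"
print(ensure(int, single(load(something))))  # prints 1
```

LOAD stays simple because it always returns a block; the "exactly one" check and the type check are layered on only by callers who want them.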

Or you could shorthand the single of load behavior as just LOAD-VALUE (I'm trying that out...)

Note: here is what my remark inside LOAD on the feature of single value extraction was:

; If appropriate and possible, return singular data value.
; !!! How good an idea is this, really?  People are used to saying
; load "10" and getting `10`, not `[10]`.  But it seems like this makes
; the process have some variability to it that makes it a poor default.
all .not [
    header  ; technically doesn't prevent returning single value (?)
    (length of data) <> 1
] then [
    data: first data
]

One aspect of this overloading is that LOAD is trying to handle arbitrary file formats and codecs as well, so it also has to cope with calls like:
load %whatever.png

TRANSCODE is now easy to use, with multiple returns cueing whether you wanted to transcode the whole thing or just one part. It handles TEXT! as well as BINARY! (as TEXT! is also UTF-8 underneath):

>> transcode "a <b> 10"
== [a <b> 10]

>> [value next]: transcode "a <b> 10"
== a

>> next
== " <b> 10"
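The "one value plus the remainder" shape of that second call can be modeled in Python as a function returning a pair. This is a toy, not the Ren-C scanner: `transcode_one` and `transcode_all` are hypothetical names, and items are simply whitespace-delimited chunks of text.

```python
def transcode_one(text):
    """Return (first_item, unconsumed_remainder); first_item is "" if none is left."""
    start = len(text) - len(text.lstrip())  # skip leading whitespace
    end = start
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[start:end], text[end:]


def transcode_all(text):
    """Scan everything by repeatedly taking one item off the front."""
    items = []
    value, rest = transcode_one(text)
    while value:
        items.append(value)
        value, rest = transcode_one(rest)
    return items


value, rest = transcode_one("a <b> 10")
print(repr(value))  # prints 'a'
print(repr(rest))   # prints ' <b> 10'
print(transcode_all("a <b> 10"))  # prints ['a', '<b>', '10']
```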

Binding while transcoding would be a performance advantage (I think Red does it). There is a facility in the scanner now that binds as it goes but this is for the libRebol API only, not exposed by TRANSCODE.

In any case, the point is that although I said LOAD should always return a BLOCK!, that should be understood here to mean: well, at least LOAD of 'rebol code should always return a BLOCK!.

But even if /ALL goes away, there's always going to be parameterization settings to the underlying "codec" that powers generic loading. Does the code-loading "codec" have options? I think it should...policies like allowing or disallowing CR/LF or tabs, etc. Compatibility options, like BINARY! as #{...} instead of &{...}.

So is there a hard and fast reason why there wouldn't be a "load just one value" option in the code-loading codec?

...I think it's something that is probably not worth it to include. But just wanted to point out that LOAD is theorized to do much more, and parameterization will be necessary.

A post was split to a new topic: Plugging The Script Header Hole

Historical MOLD/ONLY is indeed backwards...

...which motivated one exception to the rule that MOLD does not take antiforms: splices. MOLD of a splice will mold the contents with no delimiters:

>> mold [a b c]
== "[a b c]"

>> mold spread [a b c]
== "a b c"

Nice and neat.
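As a rough illustration of why this reads more naturally than the historical refinement, here is a toy Python model in which the "no delimiters" choice travels with the value instead of with a flag on MOLD. The `Splice` class and the `spread` and `mold` functions are hypothetical stand-ins, not the Ren-C implementation:

```python
class Splice:
    """Marks a list whose items should be treated as contents, not as a container."""

    def __init__(self, items):
        self.items = list(items)


def spread(block):
    """Wrap a list so that mold renders its contents without brackets."""
    return Splice(block)


def mold(value):
    """Render a value as text; lists get brackets, splices do not."""
    if isinstance(value, Splice):
        return " ".join(mold(item) for item in value.items)
    if isinstance(value, list):
        return "[" + " ".join(mold(item) for item in value) + "]"
    return str(value)


print(mold(["a", "b", "c"]))          # prints [a b c]
print(mold(spread(["a", "b", "c"])))  # prints a b c
```

Only the outermost level loses its delimiters: nested lists inside a splice still mold with brackets.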

I'm going with the concept that DO and LOAD require a script header.

Also: LOAD was previously needed to get an initial binding. But with pure virtual binding, unbound code is a more practical currency, and you should be getting an unbound value back from "LOAD-VALUE".

So that's two reasons why LOAD-VALUE is badly named in the current world. It's really TRANSCODE/ONE, plus ensuring that there's no leftover. Output lacks a binding and input lacks a header, so it's not LOAD-like.

Calling it transcode-value is kind of long. :frowning: It sort of seems like TO-VALUE might be a good name for it, but a string already is a value.

>> to-value "{abc}"
== {abc}  ; a text! value... but "{abc}" was a text! value too

If we can look past that imperfection, then TO-VALUE doesn't seem that bad. I think it's learnable that it really means (STRING-)TO(-TRANSCODED-SINGLE)-VALUE without having to write that all out.