Embracing A "Useless" Definition of TO

Many years ago, I thought about trying to untangle the TO and MAKE matrix. These operations are notoriously unpredictable in historical Redbol.

I tried putting some stakes in the ground about what I believed had to be true. It was difficult because I could not think of many.

One thing I said was that TO should always create a "new" value (if the value was not immediate). And I felt like this should be true:

to (type of value) value
; ...same as...
copy value

Grasping for any stake in the ground I could find, I was pretty sure that TO INTEGER! of a string representation of an integer should give you the integer:

>> to integer! "1020"
== 1020  ; we know this, at least? (one hopes...)

And the reverse, one would think, as well:

>> to text! 1020
== "1020"  ; what else *could* it be?  codepoint 1020? -> "ϼ"

But beyond that it was hard to think of the pattern.

Coming Back To An Old Idea: Reversibility

When I first encountered Rebol and was made aware of these problems, I suggested TO should be reversible:

value1 = to (type of value1) to type2 value1

However, this gave what appeared to be very "unexciting" options for behavior:

>> to block! 1
== [1]

>> to integer! [1]
== 1

>> to integer! [...anything else in a block that's not integer...]
** Error.  Always.

That particular idea was so long ago that I don't have direct quotes on hand of people saying "nah, that sucks, TO could barely do anything." But I'm pretty sure it was panned by basically everyone I suggested it to. I guess I agreed, because I dropped it.

Coming back to it now, and seeing it in a new light, I see this as much more useful than I used to. Especially when compared to the historical mess that makes TO nigh-unusable.

And in fact, it fits in with several Rebol2/Red behaviors that I'd thought were kind of pointless before:

>> to integer! <1>
== 1

>> to tag! 1
== <1>

The use I didn't see at the time was the frequent need when dialecting to push values out of band, into some other type, without losing their meaning. If you have a dialect in which integers already mean something, but you want a way of pushing some integer-oriented instruction in there...you can use these kinds of operations.

Definitional errors in Ren-C make this convenient, since checking whether something fits the pattern is quick. You just throw in a TRY, and the antiform error that TO returns will be suppressed, giving you a "falsey" null:

>> thing: <a b>

>> to integer! thing
** Error: Cannot TO convert <a b> to integer

>> try to integer! thing
== ~null~  ; anti
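
For readers who want the shape of this outside Ren-C, here is a minimal Python sketch of the same check. It is only a model, not how Ren-C implements anything: tags are represented by their inner text, `tag_to_integer` and `try_convert` are hypothetical names, and Python's `None` stands in for null.

```python
import re

def tag_to_integer(tag_text):
    """Model of TO INTEGER! on a tag, given the text between < and >."""
    # Accept only plain decimal digits with an optional leading minus sign.
    if re.fullmatch(r"-?[0-9]+", tag_text) is None:
        raise ValueError(f"Cannot TO convert <{tag_text}> to integer")
    return int(tag_text)

def try_convert(convert, value):
    """Model of TRY: a failed conversion becomes None instead of an error."""
    try:
        return convert(value)
    except ValueError:
        return None

print(try_convert(tag_to_integer, "1"))    # 1
print(try_convert(tag_to_integer, "a b"))  # None
```

The point of the pattern is that the caller decides whether a non-matching value is an error or just "not one of these", without the conversion itself needing a separate predicate.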

The more I look at it the more useful it appears. And it helps give clarity to the MAKE vs. TO division. If you have something that isn't shaped like this, then maybe MAKE is the right place to put it. For example:

>> to percent! 1
== 1%

>> to integer! 1%
== 1

>> make percent! 1
== 100%

(Note: I think ENCODE + DECODE is a better place for binary conversions, e.g. ENCODE 'IEEE-754 is better than MAKE BINARY! of a decimal because that could mean many things, and ENCODE can have more parameterization for single vs. double precision, etc. I don't know if TO BINARY! should work at all, but if it does, I'd probably agree with the Rebol2 choice to give the binary representation of the UTF-8 string... e.g. (to binary! 1020) as #{31303230} ... and use more explicit future-proof routines to encode with specified byte size and endianness.)

Reversibility Rules Out Rounding

In order to get losslessness in the representation, you can't throw out information.

So this works:

>> to integer! 1.0
== 1

>> to decimal! 1
== 1.0

But this does not:

 >> to integer! 1.5
 ** Error: Can't TO INTEGER! a DECIMAL! w/digits after decimal point
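
As a rough illustration of the "no rounding" rule, here is a small Python sketch. The function names are hypothetical, and Python's `float` and `int` stand in for DECIMAL! and INTEGER!; it only shows the idea of refusing any conversion that would discard digits.

```python
def to_integer(value: float) -> int:
    """Convert a decimal to an integer only when nothing is lost."""
    # float.is_integer() is False for 1.5, and also for inf and nan.
    if not value.is_integer():
        raise ValueError(f"cannot convert {value!r} to integer without rounding")
    return int(value)

def to_decimal(value: int) -> float:
    """The reverse direction in this model."""
    return float(value)

print(to_integer(1.0))  # 1
print(to_decimal(1))    # 1.0
```

Calling `to_integer(1.5)` raises instead of picking 1 or 2, which leaves the choice of rounding mode to an explicit rounding step.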

I don't think that's a problem, because that seems like a job for ROUND. Unfortunately, R3-Alpha and Red do something dumb:

rebol2>> round 1.5
== 2

red>> round 1.5
== 2.0

r3-alpha>> round 1.5
== 2.0

It seems this was part of a shift to try and preserve the input type, to facilitate things like rounding MONEY!:

r3-alpha>> round $1.50
== $2

So I guess the way the thinking went was that if you want to keep MONEY! as MONEY! when you round it, all types should act that way.

I think the relationship between INTEGER! and DECIMAL!, which lack decoration, suggests something more intimate where the type can be lost. If you don't want to lose it, use ROUND:TO with 1.0:

>> round 1.5
== 2

>> round:to 1.5 1
== 2

>> round:to 1.5 1.0
== 2.0

BLANK! Pretty Much Has To Mean Empty

If we're talking about equivalencies, we now know this:

>> for-each 'x _ [print "Doesn't run"]
== ~void~  ; anti

>> empty? _
== ~okay~  ; anti

And so really, it seems that the TO conversion of BLANK! has only one set of answers to fit into the family of reversibility:

>> to block! _
== []

>> to text! _
== ""

>> to blank! <>
== _

>> to blank! #{}
== _

And if you try to TO BLANK! anything that's not conceptually empty, you'd get an error.

>> to blank! <a>
** Error: ...

I don't know if there's a motivating case for saying to integer! _ should pick something like 0 as an answer, though Rebol2 did something of that sort:

rebol2>> to integer! none
== 0

Neither Red nor R3-Alpha carried that forward, though it was discussed.

Can Be Checked In The Implementation

I've started hacking this through, and it's gone relatively well. TO dispatches to the type it's converting from, with the type being converted to as the argument.

And it's nice in the sense that the TO native driving the process can also check the reversibility constraint in the debug build, to give it some teeth.
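
To make that concrete, here is a toy Python model of the arrangement, not the actual implementation. Everything in it is an assumption for illustration: the `DEBUG_BUILD` flag, the `HANDLERS` table, and the use of Python's `int`, `list`, and `str` in place of INTEGER!, BLOCK!, and TEXT!. Each handler belongs to the source type and receives the target type as its argument, and the driver converts back to verify the round trip.

```python
import re

DEBUG_BUILD = True  # hypothetical stand-in for a debug-build switch

def _from_int(value, target):
    if target is list:
        return [value]
    if target is str:
        return str(value)
    raise TypeError(f"cannot convert int to {target.__name__}")

def _from_list(value, target):
    # Only a block holding exactly one integer maps to an integer.
    if target is int and len(value) == 1 and type(value[0]) is int:
        return value[0]
    raise TypeError(f"cannot convert {value!r} to {target.__name__}")

def _from_str(value, target):
    if target is int and re.fullmatch(r"-?[0-9]+", value):
        return int(value)
    raise TypeError(f"cannot convert {value!r} to {target.__name__}")

# Dispatch is keyed on the type being converted FROM.
HANDLERS = {int: _from_int, list: _from_list, str: _from_str}

def to(target, value, _checking=False):
    result = HANDLERS[type(value)](value, target)
    if DEBUG_BUILD and not _checking:
        # Convert back and insist that we recover the original value.
        back = to(type(value), result, _checking=True)
        if back != value:
            raise AssertionError(f"{value!r} does not round-trip via {target.__name__}")
    return result

print(to(list, 1))    # [1]
print(to(int, [1]))   # 1
print(to(str, 1020))  # 1020
```

In this model, `to(int, "007")` trips the check, because converting 7 back to a string gives "7" rather than "007". Whether the real rule should be that strict is the kind of question the check forces into the open.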

(I've rigged up some interesting frame mechanics to enable doing this reversal efficiently, that have sped up other parts of the system (like CASCADE) with "downlevel shifting", that can bypass a trampoline bounce...)

Anyway, things are a mess right now, with a couple hundred broken tests to painfully sift through. But I think the reversibility rule is good. The biggest question is whether it relaxes in terms of spacing:

>> to block! "  1    2  "
== [1 2]  ; legal?

This would suggest you would have to compare with the trimmed/canonized version of your input.
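
A small Python sketch of what "compare with the canonized input" could look like, under the assumption that canonizing means trimming and collapsing runs of whitespace to single spaces. The names are hypothetical, and the toy `to_block` only understands integer tokens.

```python
import re

def to_block(text):
    """Toy TO BLOCK! for text: whitespace-separated integers only."""
    items = []
    for token in text.split():
        if re.fullmatch(r"-?[0-9]+", token) is None:
            raise ValueError(f"unsupported token {token!r}")
        items.append(int(token))
    return items

def to_text(block):
    """Toy TO TEXT! for a block: items joined by single spaces."""
    return " ".join(str(item) for item in block)

def canonize(text):
    """Trim the ends and collapse interior runs of whitespace."""
    return " ".join(text.split())

def reversible_modulo_spacing(text):
    # Compare the round trip against the canonized input, not the raw input.
    return to_text(to_block(text)) == canonize(text)

source = "  1    2  "
print(to_block(source))                      # [1, 2]
print(to_text(to_block(source)) == source)   # False
print(reversible_modulo_spacing(source))     # True
```

The strict comparison fails while the relaxed one passes, which is the trade-off in question: the rule stays checkable, but only against a normalized form of the input.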

There are also issues with string representations:

>> b: to block! "--{1}--"
== ["1"]  ; legal?

So there are some pain points, but I'm chipping away at them.


So as a first cut of getting a family of equivalencies and reversibility, I wound up with something like this:

>> to path! [a.b c.d]
== a.b/c.d  ; what we have "assumed" from history

>> to block! 'a.b/c.d
== [a.b c.d]  ; add in reversibility requirement

>> to text! [a.b c.d]
== "a.b c.d"

>> to path! "a.b c.d"
== a.b/c.d  ; being consistent within the equivalencies

This was making it a given that TO PATH! of a BLOCK! of items would item-wise join the things into a path.

When you turn the crank with the other rules about reversible transformations and equivalencies, that's just what you get.

BUT if all string classes act equivalently, then it breaks a behavior that was relied upon pretty heavily:

 >> to file! 'foo/baz.bar
 == %"foo baz.bar"  ; historical code was expecting %foo/baz.bar

It's a pretty uphill argument to say that's good.

In order to maintain the equivalencies I am promoting, and fix this, I think we have to push away from the idea that TO is the tool you reach for to make sequences from blocks.

It forces our hand to:

>> to block! 'a.b/c.d
== [a.b/c.d]

>> to path! [a.b/c.d]
== a.b/c.d

But as I've tried to explain, these are useful things--more than was realized.

And all is not lost...and arguably improved...because we have JOIN. Right now I'm suggesting it would come in two varieties, reducing and non-reducing, based on whether you use [...] or @[...]:

>> join path! ['a.b 'c.d]
== a.b/c.d

>> join path! @[a.b c.d]
== a.b/c.d

>> block: [a.b c.d]

>> inert block
== @[a.b c.d]

>> join path! inert block
== a.b/c.d

So there's some flexibility there.

This continues to shake the foundations of what we might have thought we knew about TO, but it feels like a path of improvement.

Honestly, isn't JOIN clearer than TO, in terms of telling you what's going on?
