Special Syntax for FOR-EACH/etc. to receive ACTION!s?

hostilefork · December 10, 2018, 6:33pm

Here is some pretty innocuous-looking code:

n: 1
for-each item block [
    print ["the" item "is" n "in the block"]
    n: n + 1
]

But because Rebol runs ACTION!s when they are bound to WORD!s, and you can put ACTION!s in BLOCK!s, you can get into trouble:

muhaha: func ['x 'y 'z] [print ["stealing your args!" x y z]]
block: compose [10 (:muhaha) 20]

Enumerating such a thing produces total garbage:

the 10 is 1 in the block
stealing your args! is n in the block
the #[void]
the 20 is 3 in the block
== 4

I'm not concerned about malicious cases--Rebol is fundamentally not secure about this kind of thing (see and discuss at "The Philosophy of Security in Rebol"). But you want to write code that gives reasonable error messages, especially when writing a mezzanine routine.

Unfortunately, doing this "the right way" is ugly (and it doesn't actually work in a general sense):

n: 1
for-each item block [
    print ["the" :item "is" n "in the block"]
    n: n + 1
]

That only has one reference to the item, but it's clearly much uglier in real cases. But beyond being ugly, tacking a : onto the front of everything doesn't have the same semantics. What if you think you're dealing with a block of objects, and want to call methods on them? item/some-method and :item/some-method aren't the same. (Note: perhaps (:item)/some-method should work?)

Point being: even if you think adding colons is mitigating the problems, you're not getting what you really want...which most of the time, is an error.
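
For reference, the error can be had today by checking explicitly at the top of the loop body. The following is only a sketch, assuming a Ren-C build from around the time of this post in which ACTION? and FAIL are available; the message text is invented for illustration. It shows the kind of boilerplate that a safer default would make unnecessary:

```rebol
; MUHAHA and BLOCK repeat the definitions from earlier in this post
muhaha: func ['x 'y 'z] [print ["stealing your args!" x y z]]
block: compose [10 (:muhaha) 20]

n: 1
for-each item block [
    ; :item fetches the value without running it, so the test itself is safe
    if action? :item [
        fail "ITEM is an ACTION!, but this loop only expects plain data"  ; assumed wording
    ]
    print ["the" item "is" n "in the block"]
    n: n + 1
]
```

With the guard in place, the loop should print the line for 10 and then stop with the error, instead of letting MUHAHA consume the rest of the PRINT block.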

Could we make the common case work better?

Something I've often wondered is whether the enumerators that take words and set them on each pass of a loop should only let you get at ACTION!s if you use some other ANY-WORD! type. That way they could raise an error otherwise...and common loops could feel safe.

For the sake of example, let's say it's a GET-WORD! passed to FOR-EACH if you actually are prepared to work with ACTION!s. The first example above, by using a plain WORD!, would output a clear error instead of gibberish:

the 10 is 1 in the block
** Error: Variable `item` must be GET-WORD! to hold ACTION! in FOR-EACH

If GET-WORD! were used, it would make some amount of sense--and line up with the fact that you would be wanting to use GET-WORD! in the body too. But it sacrifices the current feature for GET-WORD!, which is "soft quoting"...where the word specifies the word to look at to find the word to use:

 >> word: 'item
 >> for-each :word [1 2 3] [print [item]]
 1
 2
 3

However, there's another way of doing this, with GROUP!:

 >> word: 'item
 >> for-each (word) [1 2 3] [print [item]]
 1
 2
 3

Soft-quoting doesn't really come up all that terribly often. Still, it's a little annoying that this would throw a wrench into COMPOSE situations doing simple soft-quotes, but you could attack that multiple ways.

I think it would be much better if you could mark very clearly which enumerations were intentionally working with ACTION!s. The error messages would be better, and people who are savvy enough to know there could be a problem wouldn't need to be so paranoid in their basic enumerations--knowing the error will be delivered.

Should just loops be affected, or all soft quotes?

It could be loops for starters. They could be switched to hard quotes and do their own logic, only giving soft-quote semantics for GROUP!s.

Or maybe soft-quoting is too sacred as a mechanism in PATH! processing...and you don't want to have to type foo/(bar) instead of foo/:bar...so that translates to wanting to keep it in sync.

This might mean using another datatype, e.g. for-each @item block [...] to say "ACTION!s are okay".

Any thoughts? I know Rebol has some elements of "it's a fundamentally unsafe language", but I just feel there need to be some limits. But I don't want to bulletproof every FOR-EACH in the system against function injections--even if they are all just accidents, you want better feedback than having a mess be made.

hostilefork · July 16, 2020, 5:41am

I feel comfortable with the step ahead (well, technically step back to Rebol2 semantics) on GET-WORD! of unset variables raising errors. So GET-WORD! really is dedicated to the defusing of ACTION!s so that they do not run.

With that hardened association, it makes me wonder if a related concept might make sense...which is to cue people to notice when a parameter or loop variable is allowed to be an ACTION! by its declaration.

This would let people be less paranoid about writing their routines, keeping them from obfuscating it with protections in the common case. Let's take an example:

>> foo: func [] [print "Formatting hard drive..."]

; In today's world...

>> print-val: func [x] [print ["The value is" x]]
>> print-val :foo
The value is Formatting hard drive...

Imagine if plain WORD! arguments prohibited ACTION! (even if in the typeset, as per ANY-TYPE!), but GET-WORD! arguments allowed it:

; The proposed world...

>> print-val: func [x] [print ["The value is" x]]
>> print-val :foo 
** Error: Function assigned to X, to allow this name the token :X

>> print-val: func [:x] [print ["The value is" x]]
>> print-val :foo 
The value is Formatting hard drive...

Yet the real point is what you get when you can scan the second example and notice the asymmetry. You've been given a short visual clue that might hint you into the behavior of mirroring the spec. If you see print-val: func [:x] [print ["The value is" x]] and then notice the disparity of the parameter being named :x and invoked as x, you might question why that's not :x to match.
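
Without a spec notation like this, the same protection has to be written by hand in the function body. A minimal sketch, assuming ACTION? and FAIL as found in Ren-C builds of this era, with an invented error message:

```rebol
foo: func [] [print "Formatting hard drive..."]

print-val: func [x] [
    ; test with :x so a function stored in X is inspected, not called
    if action? :x [
        fail "Function assigned to X"  ; assumed wording
    ]
    print ["The value is" x]
]

print-val 10    ; plain data passes the check
print-val :foo  ; expected to raise the error rather than run FOO
```

The proposal amounts to having a plain x in the spec perform this check automatically, with :x opting out of it.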

It seems to me that every little bit of such cueing can help. I'm imagining this applying in loops as well:

>> block: compose [1 (:foo) 2]
>> for-each var block [print ["The value is" var]]
The value is 1
** Error: Function assigned to VAR, to allow this name the token :VAR

>> block: compose [1 (:foo) 2]
>> for-each :var block [print ["The value is" var]]
The value is 1
The value is Formatting hard drive...

The value is 2

Again, seeing the colon on the variable helps hint what's going on. And basic code doing loops that doesn't expect to be finding ACTION!s in a block will be somewhat protected.

This would need new notation for "hard literal parameters"

Today's GET-WORD! parameter is an "unescapable" literal argument. So even if the callsite is a GROUP!, the argument receives the group itself rather than the evaluation product.

I've in the past suggested func ['(arg)] [...] as a more semiotic way of denoting "escapable literal parameters"...and then func ['arg] [...] could be unescapable. It looks nice and kind of shows you the idea that "it's quoted, but a GROUP! can get you past that". Also you can mix it with the colon cases, for func [':arg] [...] or func ['(:arg)] [...] to say that even though you're quoting, you still consider ACTION! a candidate (if found literally in the block or evaluated to by a soft literal situation).

On the downside, you really want people to use hard literal parameters as a last resort. So this makes it easy to forget the more relaxed form as the better default choice.

Only GROUP! could do escaping for loop parameters

Historically for-each :named-var [...] [...] was a shorthand for for-each (named-var) [...] [...]. This would be reclaiming those GET-WORD! and GET-PATH! cases for the "I can process actions in the loop body" signal.

Honestly, needing to escape via an already named variable doesn't come up all that often. One case I mentioned was the idea of reusing a variable that already exists (to save on creating an OBJECT! for the iteration), but that could be done with @word or 'word perhaps.

What do people think?

I know this isn't the end-all and be-all of somehow making the language safe. Once you start picking things out of objects, anything you thought could have been just data could actually be active.

But I think it would ease the pain of perfectionists who want to write "correct" code, letting them take it easy and think "well I don't have to handle these dangerous cases, because the proper errors will happen for me just by not using GET-WORD!".

This has been on my mind a long time -- this thread began in Dec '18. I'm one of those people who tries to write generic code and frets about what would happen on the day that unset variables or ACTION!s come along...so I'm caught between feeling negligent for not peppering code with checks, and junking up otherwise elegant code for the sake of something that could happen.

I feel like it may be time for it, now that we're committing to the GET-WORD! really meaning only "defuse action" and not "get void" (disallowed) or "get null" (always allowed, even with WORD!). That shores up the bit about unsets not being able to run amok. This change would be a nice way of complementing that by stopping ACTION!s from throwing a wrench into things--while keeping the common case code clean.

hostilefork · July 16, 2020, 6:23am

Just a note that this has to also account for optional arguments ("refinements"), and there should be a canon representation.

How many ways might you say "literal escapable optional argument that can be an action"?

func ['(:/param)] [...]
func [':(/param)] [...]
func [:/('param)] [...]
func ['/(:param)] [...]

All are legal, but I feel like only one representation should be allowed. I like option 1 because it puts the escapability right inside the quote. I like option 4 because it puts the "actions are legal" directly on the arg. (Note that func ['(/:param)] [...] isn't legal because colons are not allowed interior to path elements.)

I think 1 wins, just because of preserving the junction of the quoting with the escapability.

For comparison, here's soft literal optional, default not permitting ACTION!

not-any-function!: make typeset! [... everything but any-function! ...]
func [/param 'arg [not-any-function!]] [...]  ; historical

func ['(/param)] [...]  ; proposed

Soft literal optional, permitting ACTION!

func [/param 'arg] [...]  ; historical

func ['(:/param)] [...]  ; proposed

Hard literal optional, default not permitting ACTION!

not-any-function!: make typeset! [... everything but any-function! ...]
func [/param :arg [not-any-function!]] [...]  ; historical

func ['/param] [...]  ; proposed

Hard literal optional, permitting ACTION!

func [/param :arg] [...]  ; historical

func [':/param] [...]  ; proposed

I know it's a little dense, but it seems to me that the semiotics are there.

Apostrophe means quote, don't evaluate
GROUP! immediately inside apostrophe means escapable quote, use GROUP! to subvert
Colon means ACTION!s allowed (you likely want to use GET-WORD!s inside the function with their own colons)
Slash means optional, like a refinement.

I haven't actually seen a whole lot of quoted refinements, ever. So the most common weird thing you'd see would be :/param, a refinement that accepts ACTION! (like a comparison function for sorts, or a function for filling in elements for ARRAY/INITIAL).

While this might seem overly symbolic, function specs are a bit tough since they try to be full spectrum in permitting WORD!s, which means they've taken all the words away. We can't do func [quoted optional escapable x] [...] because that would be 4 parameters. func ['/(x)] [...] is compact and--I think--learnable.

hostilefork · September 19, 2020, 3:52pm

Today I've conceived the idea that TUPLE!s could be used to indicate that the thing to the left of a dot should not be a function.

That decoration could be used as an alternative for functions...

>> print-val: func [x.] [print ["The value is" x]]
>> print-val :foo 
** Error: Function assigned to X, disallowed functions (spec'd as X.)

>> print-val: func [x] [print ["The value is" x]]
>> print-val :foo 
The value is Formatting hard drive...

Or for iteration:

>> block: compose [1 (:foo) 2]
>> for-each var. block [print ["The value is" var]]
The value is 1
** Error: Function assigned to VAR, disallowed (spec'd as VAR.)

>> block: compose [1 (:foo) 2]
>> for-each var block [print ["The value is" var]]
The value is 1
The value is Formatting hard drive...

The value is 2

This would give library authors peace of mind and save them the trouble of less efficient ways of writing bulletproof code... moving the burden of annotation to those who need it.

It makes me a bit uncomfortable as the default seems unsafe. But that unsafety can also be seen as flexibility for code golf.

Slashes could be used as well, but that wouldn't seem to combine well with refinements... as /a/ meaning "optional parameter that can be a function if it wants" seems less natural than /a. as "optional parameter that can't be a function". It also seems the cleanest expression...

foo: func [/a.] [...]
foo: func [/a/] [...]
foo: func [:/a] [...]

So all told, we could have something like foo: func ['(/a.)] [...] for meaning "quoted optional parameter that can escape the quoting via GROUP!, but can't be a function." In my ranking listed above (1-4) it brings the best of all worlds... since TUPLE! would be legal in PATH! while GET-WORD! wouldn't, it binds the non-executability to the parameter -and- puts the escapability right next to the quote.

Having a pretty good feeling about this...so maybe the lax default is okay. If anyone is bothered by it they can build their own wrapper that enforces the non-executability unless you annotate it somehow.

hostilefork · October 27, 2020, 6:40pm

With the recent changes, where :X, :Y/Y, and :(Z) are the quoting callsite escapes, subvertible quoting is tied directly to colons.

This strongly suggests an inversion of the current convention for parameters, where 'foo is the "quote regardless" and :foo is "subvertible quoting". I'm on the fence about an extra apostrophe for ':foo ... it helps convey that it is both quoted and subvertible, but then you're in a situation in which :foo might mean something else (which could be confusing) or where it doesn't (and you're just complicating the value typing for little benefit).

So the above is probably just:

func [/param 'arg] [...]  ; historical

func [:/param] [...]  ; proposed

I'm glad that it makes it possible to get a semiotically consistent spec that doesn't have to use GROUP!s, which makes it less of a hassle to COMPOSE specs.
It's going to be kind of a nuisance to flip it, but if bootstrap is any indication then quote subversion isn't terribly common.
Having generalized tuples and foo. for "no function invocations" seems like a pretty good way to put that guard on a parameter too, if you like. It's short, and completely optional...just to save you the time of writing more elaborate checks.