SET-WORD! To Initialize Locals In Function Specs?

(cc: @IngoHohmann as you have had opinions on these kinds of things.)

It would be nice to have the option of initializing your locals where you declare them:

foo: func [
    arg1 [integer!]
    arg2 [text!]
    <local>
    local1 local2
    local3: 10
    local4: (20 * 30)
][
    ...
]

As it so happens, there's potential to exploit this for efficiency. The frame mechanics have a slot for each local in the function archetype that currently just holds trash, and it could hold this default value / expression instead. So it wouldn't just save typing the local name in the spec and then typing the name again with the expression in the body: you'd also avoid performing the evaluation and the assignment on each call!
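
For comparison, here is a minimal sketch of how the same locals have to be set up without the feature, with the assignments written in the body (reusing the names from the example above):

foo: func [
    arg1 [integer!]
    arg2 [text!]
    <local> local1 local2 local3 local4
][
    local3: 10         ; assignment is re-executed on every call
    local4: 20 * 30    ; multiplication is re-evaluated on every call
    ...
]

Each call pays for both assignments here, whereas a value already sitting in the archetype's slot would presumably cost nothing extra at call time.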

There are a lot of questions to answer:

  • What binding rules is it using? Could you initialize local3 and then say local4: (local2 * arg1)?

    • Almost certainly not; it would just use the binding of the spec block.
  • Does the code run on each invocation, or is it run only once to calculate a fixed value? e.g. if it was local4: (global-var * 30) would each invocation of FOO recalculate what (global-var * 30) was at that moment?

    • It would almost certainly just calculate a fixed value and use that value on each call.
  • Do you need parentheses directly after the SET-WORD!?

    • If the expression were run on each invocation (which it probably shouldn't be) then it would be a requirement, because there'd be no way to find the start and the end of the right hand expression without evaluating it.

    • If plain words are being picked up as locals there's potential for error if you accidentally wrote an expression that didn't work, like:

      func [
         arg [integer!]
         <local>
         local1 local2
         local3: arity-2-but-I-think-it's-3 a b c
         local4: 10
      ][
          ...
      ]
      

      That could wind up making a local c that you didn't intend. But then again, sometimes it would be just a very obvious simple initialization, like local4: 10. Forcing people to use parentheses could do more harm than good, vs. trusting them to use the parentheses if they feel it's warranted.
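
To make the once-versus-each-call question concrete, here is a rough sketch that uses only ordinary definitions rather than the proposed notation. The names fixed, use-fixed, and use-fresh are made up for illustration:

global-var: 2

fixed: global-var * 30       ; evaluated once, right here

use-fixed: func [] [
    return fixed             ; same value on every call
]

use-fresh: func [] [
    return global-var * 30   ; recalculated on every call
]

global-var: 5
; use-fixed still gives 60, while use-fresh now gives 150

If the initializer is evaluated only once, as seems most likely above, a local written as local4: (global-var * 30) would behave like use-fixed, not like use-fresh.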

Compare to <static>: Not Initialized With SET-WORD! ATM

Right now the <static> feature lets you assign your variables, but it uses a non-Reboly-notation to do so:

accumulate: func [
    item [any-element?]
    <static>
    block ([])
][
    append block item
]

The parenthesized group is optional, and holds the initializer. But it seems much more normal to say:

accumulate: func [
    item [any-element?]
    <static>
    block: []
][
    append block item
]

One reason for the parentheses notation was to try and be consistent with the idea of defaulting refinements.

>> foo: func [/string [text!] ("default")] [print string]

>> foo/string "hello"
hello

>> foo
default

But that feature was removed.

There's another reason why just WORD! was used...

RETURN: Has "Owned" SET-WORD! In The Spec Dialect

We have a little bit of friction in that the dialect has been using RETURN: to indicate what a function returns. The choice doesn't have much to do with what comes after RETURN: being an assignment; it was picked for looks:

double-multiply: func [
    return: [integer!]
    value1 [integer!]
    value2 [integer!]
][
    return 2 * value1 * value2
]

The issue is that historical Rebol2 (and R3-Alpha, and Red) allow this:

rebol2>> print-sum: func [return break] [print ["Sum is" return + break]]

rebol2>> print-sum 10 20
Sum is 30

Ren-C only lets you do that in LAMBDA. FUNC prohibits it:

ren-c>> print-sum: func [return break] [print [return + break]]
** Error: Generator provides RETURN:, use LAMBDA if not desired

I think Red/System decided on RETURN: first. But they put it at the end of the spec. Red errors if you try to put the return elsewhere:

red>> stringy: func [a b return: [string!]] [a + b]
== func [a b return: [string!]][a + b]

red>> stringy: func [return: [string!] a  b] [a + b]
*** Script Error: invalid function definition: [return: [string!] a b]

But either way, it's not checked. On the 2012 announcement of function support in Red, DocKimbel says: "Note: argument and return value type checking have not been implemented yet, they need typeset! and error! datatypes to be implemented first." Parameter type checking works, but I guess return type checking was never added. It does show up in the HELP though.

red>> help stringy
USAGE:
     STRINGY a b

DESCRIPTION: 
     STRINGY is a function! value.

ARGUMENTS:
     a             
     b             

RETURNS:
     [string!]

Note that they put the RETURNS: at the end there, too. Most people would expect the return value of a function to be the first thing you put down.

I've Wondered If A Leading Block Would Suffice...

Off and on, I've been willing to consider the idea that return typing is just implicitly what you get if you have a leading block:

double-multiply: func [
    [integer!]
    value1 [integer!]
    value2 [integer!]
][
    return 2 * value1 * value2
]

Yet while it looks clean there, it causes some problems when you are filling in documentation strings.

I've become a pretty true believer in the idea that documentation strings for arguments come after the argument name (and that we may do a service to the userbase by standardizing this, rather than by letting it be done either way and have people fight about it):

 my-style: func [
     "Overall function description here"
     argument "Argument description here"
         [integer! text!]
     /refinement "Refinement description here"
 ][
    ...
 ]

The rationale is that any good function will put labels on all its arguments. But not all arguments are type-constrained; refinements in particular are not. So you wind up either being inconsistent:

 variation1: func [
     "Overall function description here"
     argument [integer! text!]
         "Argument description here"
     /refinement "Refinement description here"  ; this feels inconsistent
 ][
    ...
 ]

Or you're throwing in newlines for no reason:

 variation2: func [
     "Overall function description here"
     argument [integer! text!]
         "Argument description here"
     /refinement
         "Refinement description here"  ; consistent, but annoying
 ][
    ...
 ]

This is why I chose "MY-STYLE" above. But if return becomes implicit on a leading block, you wind up back in inconsistent land:

my-style-with-leading-block: func [
    "Overall function description here"
    [integer!] "Description here"
    argument "Argument description here"
        [integer! text!]
    /refinement "Refinement description here"
][
    ...
]

So one thing RETURN: has historically bought us is making that look better:

my-style-with-return: func [
    "Overall function description here"
    return: "Description here"
        [integer!]
    argument "Argument description here"
        [integer! text!]
    /refinement "Refinement description here"
][
    ...
]

And I think having the word RETURN in there makes it better. Note how, when the word isn't there, it's less obvious what that leading block is.

But SET-WORD!... is that best?

If we're going to be allowing SET-WORD! for locals and statics, does it make sense to have a stray SET-WORD! for RETURN?

And one outside-the-box thought... given that modern FUNC doesn't allow you to name parameters RETURN, why not just go with a plain WORD! ?

what-about-plain-word: func [
    "Overall function description here"
    return "Description here"
        [integer!]
    argument "Argument description here"
        [integer! text!]
    /refinement "Refinement description here"
][
    ...
]

If you try that with a LAMBDA you'll not get an error, and maybe suffer some confusion when the lambda gets its first argument as a variable named RETURN. You'll figure it out pretty quickly, though.

Though I have wondered about questions like "what if you want the behavior of a lambda with the bottom expression dropping out, and no RETURN declared, but you still want type checking?"

You might say "just use ENSURE":

my-lambda: lambda [
    "Overall function description here"
    argument "Argument description here"
        [integer! text!]
    /refinement "Refinement description here"
][
    ensure [integer!] [
        ...
    ]
]

The problem is that the return type and any description don't make it to the HELP. This is one reason that I pretty much always use FUNC.

This makes one want to lean back to the return type being something nameless, like just a leading block.

Not Sure On RETURN, But I Like SET-WORD! Locals

I definitely feel that finding a way to not be using SET-WORD! for RETURN: would be good. It's not like it has anything to do with assignment.

A plain-word RETURN in FUNC doesn't feel as crazy to me as it might sound.

I do think that I like the idea of SET-WORD! for local initialization... that runs the right hand side without required parentheses, and that only runs the evaluation once in the environment of the spec block... stowing that value in the currently-just-trash slots of the frame archetype for the local.

And I like the idea of bringing <static> on board with the same rules.