Function Spec Dialect: Who Should Analyze It, And How?

When you supplied a function spec block in R3-Alpha, it would preserve that spec as a copy of whatever was passed to the low-level MAKE FUNCTION! call. You could get it back with SPEC-OF. (If your code was abstract and the FUNC call was generated by code, this spec might not look familiar to you--since it wouldn't line up directly with what you'd written in source.)

This spec was preserved as commentary only: used by SOURCE and HELP. The MAKE FUNCTION! operation would create a second array as a cache--that was a summary of just what the evaluator needed to know: the order, types, and conventions of the parameters (normal argument, quoted argument, refinement). R3-Alpha called this the "args" array of the function, but Ren-C named the summary array the paramlist.
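To make the distinction concrete, here is an approximate sketch (the output renderings are from memory, not verbatim):

 r3-alpha>> foo: func [{Add one.} x [integer!] {value to add to}] [x + 1]

 r3-alpha>> spec-of :foo  ; the preserved commentary copy
 == [{Add one.} x [integer!] {value to add to}]

 r3-alpha>> words-of :foo  ; distilled from the cached parameter summary
 == [x]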

Redundant Interpretations of the Spec

If you wanted to know the order of parameters and their conventions in R3-Alpha, you could use WORDS-OF:

r3-alpha>> words-of :append
== [series value /part length /only /dup count]

This information was extracted from the cache (the "paramlist"), not from the spec block. The same went for getting the accepted types (which is not very readable when ANY-TYPE! renders expanded as a list of all the types, for each parameter...!)

r3-alpha>> types-of :append
== [make typeset! [binary! string! file! email! url! tag! bitset! image! vector!
block! paren! path! set-path! get-path! lit-path! map! object! port! gob!]
make typeset! [unset! none! logic! integer! decimal! percent! money! char!
pair! tuple! time! date! binary! string! file! email! url! tag! bitset! image!
vector! block! paren! path! set-path! get-path! lit-path! map! datatype!
typeset! word! set-word! get-word! lit-word! refinement! issue! native!
action! rebcode! command! op! closure! function!...

The description strings--however--were skipped over by MAKE FUNCTION!. It only looked at ANY-WORD!s for the parameter names, and BLOCK!s for the typeset definitions. So anyone who wanted to extract help would have to do it from the saved SPEC-OF block.

This was made more dubious by the lack of a formal specification for what exactly the legitimate formats were:

func [arg [type!] {doc string}] [...]
func [arg {doc string} [type!]] [...]
func [arg {doc string1} {doc string2} [type!]] [...]
func [arg {doc string1} [type!] {doc string2}] [...]

Ren-C had some ambitions to define this, and even to let help information that was broken up into multiple strings be merged together into a single string, so you could say things like:

func [
     {Some description you write}
     {that spans multiple lines}
     arg1 "description of arg1"
     ...
][...]

This led to the idea of centralizing the code for function spec analysis so that the block was turned into structured information in one place. It wouldn't just be the argument names and types that would be processed at function creation time. Each parameter would be mapped to a completed description string as well.

And then the spec block wouldn't be stored at all. All queries about the function would go through the structured information. If something wasn't part of what the evaluator needed (and hence didn't wind up in the paramlist), it would have to be stowed away in an object associated with the function, known as the META.
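To give a feel for it, querying that META might look something like this (a hedged sketch--the field names here are illustrative, not a guaranteed protocol):

 >> meta: meta-of :append
 >> meta/description
 == "Inserts element(s) at tail of a series."

 >> select meta/parameter-notes 'series
 == "The series to insert into"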

What's Been Good About It?

The best aspect I think has come out of this is the idea that the core focuses on function mechanics, and is relatively unconcerned with how much or how little information is maintained for HELP. This has enabled a large number of function derivation operations that are critical to the operation of the system today.

In fact, I think saying the core focuses on mechanics should be pushed even further, to establish that the core does not ever see a function SPEC BLOCK! at all. So no description strings ever get passed in. Instead, whatever we think of as MAKE ACTION! takes a distilled parameter order and convention definition...essentially a FRAME!.

I've written about an idea approaching this concept in "Seeing All ACTION!s as Variadic FRAME!-Makers". And I think something like this is basically the future.

So in this coming future, not only are FUNC and FUNCTION synonyms, but the logic which transforms BLOCK!s into parameter lists and conventions lives in those generators. If you want to write a generator that uses the spec block logic, you build it on top of FUNCTION. Because if you are going to make a raw ACTION! on your own, you will need to speak a more "core" protocol...and you'll also be responsible for managing the information tied to HELP, which the core does not care about at all.
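For instance, a generator that injects logging could be layered purely on FUNC, never touching the core protocol (a hedged sketch, relying on R3-Alpha-era COMPOSE splicing of the BODY block's contents):

 logging-func: func [spec body] [
     func spec compose [
         print "entering"  ; injected before the user's code
         (body)  ; COMPOSE splices the body block's contents here
     ]
 ]

 square: logging-func [x] [x * x]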

What's Been Bad About It?

I mentioned that there has always been a problem with using higher level generators that translate down to calls to FUNC, as R3-Alpha would only preserve what it got at the FUNC level:

Imagine you come up with a relatively simple function generator that takes a WORD! and gives you two variables corresponding to that one WORD!:

r3-alpha>> var-doubling-func: func ['var body] [
    func compose [(to word! join var 1) (to word! join var 2)] body
]

r3-alpha>> f: var-doubling-func 'x [
     print ["x1 is" x1]
     print ["x2 is" x2]
]

r3-alpha>> f 10 20
x1 is 10
x2 is 20

But SOURCE will not give you back your "source" as written with the VAR-DOUBLING-FUNC. It will just give you what MAKE FUNCTION! ultimately saw:

>> source f
f: make function! [
     [x1 x2]
     [print ["x1 is" x1] print ["x2 is" x2]]
]

That's the tip of the iceberg. But because Ren-C is factoring so much out into a generalized "language construction kit", the idea of being able to get at "source" is becoming more difficult. People who are doing what they used to feel were "simple" things are actually doing complex things.

There are some tools we can use here, one of the best ones is mapping back to real source in files via line numbers and file names. This would mean that in the f: var-doubling-func example above, the function that is generated would be able to locate that invocation and point you back at it. There might be dozens of functions generated by an abstraction--all pointing back to that origin--but this could still be very helpful if you are inspecting one of them to see where it all kicked off from.
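Nothing like this exists today, but a hypothetical rendering of such an output might be (invented purely for illustration):

 >> source f
 ; generated by an invocation at %my-script.reb, line 3:
 ;
 ;     f: var-doubling-func 'x [...]
 f: make function! [
      [x1 x2]
      [print ["x1 is" x1] print ["x2 is" x2]]
 ]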

Another problem has been that as Ren-C's sophistication as an evaluator has gone up...with things like <skip>-able parameters and variadics and such...the ability to detect and interpret that from the outside has gone down.

Even a simplifying change like the refinements-are-their-own-arguments now makes things a little harder for those doing spec analysis. Compare:

r3-alpha>> words-of :append
== [series value /part length /only /dup count]

ren-c>> parameters of :append
== [series value /part /only /dup /line]

There was a bit of effort involved in parsing out which refinements took arguments or not before. But at least you could tell, just from the word list, without consulting the types. Now there's no way of knowing.
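One possible way of knowing--if the frame rendering brainstormed later in this post were adopted--is that the typeset blocks visible on a FRAME! could distinguish them: a refinement with an empty type block takes no argument. Roughly (a hedged sketch, not current behavior):

 >> f: make frame! :append
 >> empty? select f 'only  ; /ONLY's type block is empty: no argument
 == #[true]

 >> empty? select f 'part  ; /PART lists types: it takes an argument
 == #[false]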

I stand by the change of unifying refinements and their names 100%. But maybe there are moves that should be made here to distinguish the argument-taking refinements? Random brainstorms:

>> parameters of :append
== [series value part/ /only dup/ /line]

>> parameters of :append
== [series value /part/ /only /dup/ /line]

>> parameters of :append
== [series value /[part] /only /[dup] /line]

I don't know. But maybe this is kind of a losing battle; and really the only meaningful piece of information you should be concerned about is words of make frame! :append, and the parameter conventions are really between the evaluator and HELP.

This is more thinking-out-loud than anything...

But the central point, I think, is that the interpreter core and "function spec blocks" are getting an increasingly distant relationship. I think that's a good thing, and I foresee something more like the post I made where parameter descriptions and HELP information are completely divorced from the language used to ask the core to make a function.

It's just going to require a much different way of thinking about SOURCE. It might mean more tricks like what @IngoHohmann was looking into, but do bear in mind my VAR-DOUBLING-FUNC above...which leads me to think that rather than snapshotting a certain level of abstraction, backtracking through generated code to the "actual source" might be the more useful direction long term. But you might also make requests, e.g.:

 >> snapshot f: 1 + 2
 == 3

 >> get-snapshot 'f
 == "1 + 2"  ; expression that made f

This is coming closer and closer to reality.

You've been able to go both ways for a while now: either MAKE FRAME! from an ACTION!, or MAKE ACTION! from a FRAME!. But I've managed to do some changes that make this process more fluid and less expensive.

How much less expensive? MAKE ACTION! from any FRAME! now creates an entity that takes up just 8 platform pointers. Before, creating actions always required creating a new parameter list to serve as the action's identity, so how much it cost would depend on how many arguments and locals you had. You'd need the same 8 platform pointers, plus 8 more platform pointers (archetype and end), plus 4 platform pointers per argument...or whatever that rounds up to for fitting in a memory pool.
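If I'm reading that accounting right, a function with (say) three arguments would previously have cost roughly:

 8 + 8 + (3 x 4) = 28 platform pointers  ; before rounding into a memory pool

...versus a flat 8 platform pointers now, regardless of arity.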

This does make it very appealing to start thinking of making functions in a fast and direct way that doesn't necessarily go through any "spec parsing"... and the spec language would no longer be built into the interpreter or evaluator.

Lots of Open Questions, Though.

Right now, you have a chicken-and-egg problem when it comes to making ACTION!s out of FRAME!s... because the only way to get a FRAME! is from an existing ACTION! (which has encoded knowledge of what's a refinement, argument, local, quoted argument, etc.)

What will it take to be able to MAKE FRAME! from scratch, with information beyond just the key names? And how do you give the frame behavior?

Brainstorming here: Perhaps it's a two step process, where making a frame from a BLOCK! sets up the parameterization with no initial behavior outside of type-checking. Then you adapt it:

 frame: make frame! [
     return: <return>
     arg1: integer!, arg2: integer!
     hide local1: ~undefined~
 ]

 add-nums: adapt (make action! frame) [
     local1: arg1 + arg2
     return local1
 ]
That's a rough idea with a lot of hand-waving, but maybe it's pointing in a useful direction. There's even a weird notion here that the non-hidden contents of frame values are their parameter-gathering type specs...so if you ever wanted to specialize something, you'd actually hide it.

>> f: make frame! :append
== make frame! [
    series: [any-series! port! map! object! module! bitset!]
    value: [<opt> any-value!]
    part: [any-number! any-series! pair!]
    only: []
    dup: [any-number! pair!]
    line: []
]

>> hide f/dup: 2  ; hiding means frame content is a value, no longer a typecheck
>> f/value: [integer!]  ; not hiding, so adjusting typecheck
>> ap2int: make action! f

>> ap2int [a b c] 10
== [a b c 10 10]

>> ap2int [a b c] "illegal"
** Error: ap2int doesn't allow TEXT! for its VALUE argument

Anyway, just trying to push a little of this thinking forward as to what it might look like. One way or another, I see FRAME! moving to a role of structural primacy in the evaluator...with spec blocks and their variations being designed to match the needs of their uses.

Some More Details on New Efficiencies

Without belaboring too many details of recent housecleaning...I've managed to align the workings of contexts and actions a bit better:

  • Since the beginning of FRAME!, parameter lists of actions have been able to serve as the "keylist" of contexts. But they were also acting as the identities of functions. This meant every time you would ADAPT or SPECIALIZE something with 13 arguments, you'd have to copy that 13-long array to get a new identity.

  • In the beginning, a frame was dependent on being able to glean an action's specific identity from the paramlist...because that's the only pointer it had. As the design of "phase" information in the FRAME! cell got further along, this was no longer needed; the function identity could be known at every step of the composition just by the phase.

  • Other changes established that the paramlist no longer needed to be the identity of an action, and could thus be shared between actions--with another array, holding instance information about the action (such as the body block to run), serving as the identity.

Not quite finished, but largely a clarifying change that reduces memory footprint. Some performance work will be needed, since it has to do calculations to find things that used to be closer at hand, but I don't think it should be too bad. The biggest impact is that some of the parameter hiding was based on having new paramlists at each level, so it remains to be seen whether frame phasing can achieve the same effect...or if it's an acceptable loss to not hide access to locals in some cases (like ADAPT).


Inch by inch...I think things are converging toward what I've been trying to do.

I think it can be much better than the above. e.g. AS FRAME! could give you a read-only alias of a function in this style without paying for any allocations. That would be great just for examining the interface quickly. But if you wanted to make a variation, you could COPY AS FRAME! to get a copy...and then AS ACTION! to reify that copy as an action without additional allocations. If all this can click together I think it will be pretty amazing (though I actually think some of what is happening already is very, very neat!)

Moving the parameter mode information onto the arguments makes this work; e.g. instead of:

 /part [any-number! any-series! pair!]

This representation would relay:

 part: /[any-number! any-series! pair!]

Obsoleting Old Attempts: e.g. RESKINNED

Reworking the code to some new ideas in this vein is breaking various hacks that were tried which approximated this before. One of those was called "RESKINNED". It scratched the surface of the concept. But being able to do surgery on a framelike thing and efficiently combine specializations and augmentations all into one operation without wasting memory is a much more appealing concept--and that's what I'm shooting for now.

Here were the fledgling tests that were started for RESKINNED, for reference...but I'm removing the code:

; RESKINNED is an early concept of a native that rewrites parameter
; conventions.  Would need a bigger plan to be integrated with REDESCRIBE,
; as a general mechanism for updating HELP.  (The native does not mess with
; the HELP, which is structured information that has been pushed out to a
; mostly userspace protocol.)

; Test return type expansion and contraction
    returns-int: func [return: [integer!] x] [x]

    returns-text: enclose :returns-int func [f] [f/x: me + 1 to text! do f]
    returns-text-check: reskinned [return: [integer!]] :returns-text

    skin: reskinned [return: [integer! text!]] :returns-text-check

    did all [
        "11" = returns-text 10
        (trap [returns-text-check 10])/id = 'bad-return-type
        "11" = skin 10

    no-decimal-add: reskinned [return: @remove [decimal!]] adapt :add []
    did all [
        10 = no-decimal-add 5 5
        (trap [no-decimal-add 5.0 5.0])/id = 'bad-return-type
    ]

; The @add instruction adds an accepted type, leaving the old types in place
    foo: func [x [integer!]] [x]
    skin: reskinned [x @add [text!]] (adapt :foo [x: to integer! x])

    did all [
        10 = skin "10"
        10 = skin 10
    ]

; No instruction overwrites completely with the new types
    foo: func [x [integer!]] [x]
    skin: reskinned [x [text!]] (adapt :foo [x: to integer! x])

    did all [
        10 = skin "10"
        (trap [skin 10])/id = 'expect-arg
    ]

; @remove takes away types; doesn't need to be ADAPT-ed or ENCLOSE'd to do so

    foo: func [x [integer! text!]] [x]
    skin: reskinned [x @remove [integer!]] :foo

    did all [
        "10" = skin "10"
        (trap [skin 10])/id = 'expect-arg
    ]

; You can change the conventions of a function from quoting to non, etc.

    skin: reskinned [@change value] :just
    3 = skin 1 + 2

    append-just: reskinned [@change :value] :append
    [a b c d] = append-just [a b c] d

; Ordinarily, when you ADAPT or ENCLOSE a function, the frame filling that is
; done for the adaptation or enclosure does good enough type checking for the
; inner function.  The only parameters it has to worry about rechecking are
; those changed by the code.  Such changes disrupt the ARG_MARKED_CHECKED bit
; that was put on the frame when it was filled.
; But if RESKINNED is used to expand the type conventions, then the type
; checking on the original frame filling is no longer trustable.  If it is
; not rechecked on a second pass, then the underlying function will get types
; it did not expect.  If that underlying action is native code, it will crash.
; This checks that the type expansion doesn't allow those stale types to
; sneak through unchecked.

    skin: reskinned [series @add [integer!]] (adapt :append [])

    e: trap [
        skin 10 "this would crash if there wasn't a recheck"
    e/id = 'phase-bad-arg-type