Function Composition via PATH!

hostilefork · August 29, 2024, 3:42am

Something to consider here is what to do if there's a slash after a function call

lib/append:dup.2/xxxx

To be coherent with "slash is tied to invokable functions", it seems the thing on the right should be specifying some sort of function invocation, I'd think?

It may not seem useful to say it was an invocation that would happen on the result:

append:dup.2/empty? [] if false [<nothing>]

Because you could have written that as:

empty? append:dup.2 [] if false [<nothing>]

BUT if you can do it all in one swoop, you can get a description for a cascade of functions passed as a single value:

odd?: get $even?/not

Though reading it backwards isn't as useful as reading it forwards:

odd?: get $not/even?

We wouldn't worry so much that the composition tool that was CHAIN has to be renamed CASCADE if we have another way of saying it that we can use most of the time.

I had a similar idea a long time ago. Anyway, put a pin in that. There's lots of ground to cover first.

bradrn · August 29, 2024, 5:06am

In other words, function composition: f/g ≡ func [x] [f g x]. An immensely useful tool in Haskell, and one which I wish more languages had.

Still, it feels awkward to combine this with function member access. Indeed, it’s ambiguous: x/y/z value could be x/y z value, or it could be x y/z value. My intuition is that this should instead be some variety of nested access, but I’m not sure how to reconcile that with function application.
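For anyone who wants to poke at the semantics outside Ren-C, here is a minimal Python sketch of the equivalence f/g ≡ func [x] [f g x]. The names (compose, is_even, logical_not, and so on) are illustrative stand-ins and not Ren-C APIs, and the sketch assumes that only the rightmost function receives the call's arguments.

```python
def compose(*funcs):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    if not funcs:
        raise ValueError("compose needs at least one function")
    *outer, innermost = funcs

    def composed(*args):
        # The rightmost function sees the call's arguments...
        result = innermost(*args)
        # ...and each function to its left wraps the previous result.
        for func in reversed(outer):
            result = func(result)
        return result

    return composed


def is_even(n):
    return n % 2 == 0


def logical_not(value):
    return not value


def is_positive(n):
    return n > 0


def negate(n):
    return -n


# Models `odd?: get $not/even?`
is_odd = compose(logical_not, is_even)

# Models `not/positive?/negate number`
not_positive_negated = compose(logical_not, is_positive, negate)

print(is_odd(3), is_odd(4))      # prints "True False"
print(not_positive_negated(5))   # prints "True"
```

When every element is a function, the grouping does not matter, which is why the three spellings of not/positive?/negate give the same answer.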

hostilefork · September 1, 2024, 9:57am

Isn't it the same?

not/positive?/negate number

not/positive? negate number

not positive?/negate number

The ambiguity concern I have here is that with x/y/z doing composition, you're losing your assurances that X is not a function. e.g. how do you know NOT isn't an object? (If there were no composition, we could be assured NOT wasn't a function.)

I think this suggests you need to write /x/y/z if X is a function.

You would usually only be doing this to pass to GET anyway. Otherwise you'd just write x y z.

Though there's some niceness in being able to use the same number of characters and point out the function calls.

x y z
x /y /z   ; let's say we don't decorate x even though it invokes an action
x/y/z   ; same # of chars as `x y z` but more information

/x/y/z  ; complete information, clearest if required when X is a function

bradrn · September 1, 2024, 10:33am

hostilefork:

bradrn:

Indeed, it’s ambiguous: x/y/z value could be x/y z value, or it could be x y/z value

Isn't it the same?

not/positive?/negate number

not/positive? negate number

not positive?/negate number

If they’re all functions, yes. But not if x or y is an object containing a function…

hostilefork · September 1, 2024, 2:05pm

Ah, so like what I was saying about get $not/even? being such that you can't tell if NOT is an object.

But it's ambiguous to the reader, not to the system, right?

My policy of requiring the leading slash when the first element is a function helps a little bit.

In fact, if you use a leading slash, you don't need slashes to pick the functions you want to run out of objects (I imagine you couldn't use them for that at all!)

/lib.append  ; clearly lib.append has to be FRAME! (or antiform FRAME!)
vs
/lib/append  ; makes it look like two function applications, illegal

So that would fix it up, I think... once you are into function composition mode, your picks from objects would be done with tuples, because the previous slash internal to the path would imply the execution.

Ren-C already has function composition, just not in a notation this brief. It's called CHAIN (to be renamed, likely to CASCADE). From the tests that make sure function derivations combine correctly:

add-one: func [x] [return x + 1]
mp-ad-ad: chain [:multiply, :add-one, :add-one]
assert [202 = (mp-ad-ad 10 20)]

sub-one: specialize :subtract [value2: 1]
mp-normal: chain [:mp-ad-ad, :sub-one, :sub-one]
assert [200 = (mp-normal 10 20)]
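The same two tests can be modelled in Python to make the ordering concrete. This is only a sketch of the pipeline behavior described here, not Ren-C's implementation: chain, multiply, add_one and the rest are hypothetical Python names, and functools.partial stands in for SPECIALIZE.

```python
from functools import partial


def chain(*funcs):
    """Pipeline: the first function gets the call's arguments,
    each later function gets the previous function's result."""
    if not funcs:
        raise ValueError("chain needs at least one function")
    first, *rest = funcs

    def chained(*args):
        result = first(*args)
        for func in rest:
            result = func(result)
        return result

    return chained


def multiply(value1, value2):
    return value1 * value2


def subtract(value1, value2):
    return value1 - value2


def add_one(x):
    return x + 1


# Models `chain [:multiply, :add-one, :add-one]`
mp_ad_ad = chain(multiply, add_one, add_one)

# Models `specialize :subtract [value2: 1]`
sub_one = partial(subtract, value2=1)

# A chained function can itself be a step in another chain.
mp_normal = chain(mp_ad_ad, sub_one, sub_one)

print(mp_ad_ad(10, 20))   # prints 202
print(mp_normal(10, 20))  # prints 200
```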

So it will be nifty to have the shorter way to write it. But you'll still want the longer version if you're defining your functions inline (using the new slash rules and string delimiter here, just to start trying it out...)

/collect-text: redescribe [
   -{Evaluate body, and return block of values collected via KEEP function.
     Returns all values as a single spaced TEXT!
     Individual KEEPed blocks get UNSPACED.}-
] cascade [
    adapt get $collect [
        body: compose [
            /keep: adapt specialize get $keep [
                line: null
                part: null
            ][
                value: maybe unspaced value
            ]
            (as group! body)
        ]
    ]

    get $spaced

    specialize get $else [branch: [copy ""]]
]

CHAIN has always put the function that runs first at the front of its list, and people have asked for the opposite order, where the last function in the pipeline comes first in the list. Maybe this new short notation will make them happy enough. I like "first called function first" in the longer form.

While I expect foo. to act like get $foo, I find that most of the time I prefer writing the GET out in long form.

We could make the function generators allow taking @foo and doing the lookup themselves:

    specialize @else [branch: [copy ""]]

I don't hate that, though it makes their interface less orthogonal just to avoid writing:

    specialize else. [branch: [copy ""]]

Though we could have different meanings. else. gives you a FRAME! antiform and could mean "treat function anonymously, don't put a symbol in it" and @else could mean "I'm passing in a symbol for you to look up yourself, so the symbol is meaningful, embed it in the result".

Ah, so many degrees of freedom.

bradrn · September 2, 2024, 1:26am

This design makes much more sense to me. A slight pity, because I like the aesthetic of object/function, but /object.function would be more consistent and less confusing, so it wins out easily.

So, this would mean that our three levels of sequencing each perform a single dedicated task in the evaluator:

/PATH/ for calling functions (one or many)
:CHAIN: for giving functions refinements
.TUPLE. for specifying field names or refinement arguments

And the natural order of nesting corresponds to the interpretation you would expect for these things, such that /f/obj.g:refine/h does precisely what it looks like. I like this.

hostilefork · September 2, 2024, 1:56am

That's still how it will work in ordinary calls...

lib/append [a b c] [d e]

...and that should work with GET too.

get $lib/append

But with that syntax the system (and reader) knows LIB is an object.

If the first thing is a function and you're trying to do a composition, that's when you need the leading slash, and the implied rules after that of using tuples:

get $reverse/lib/append  ; error
get $reverse/lib.append  ; error

get $/reverse/lib.append  ; works

As I said... one slash per function call (with the only exception being the reduced case of using a WORD! looking up to antiform frame...and you can prepend a slash if you're in a situation where you feel that would make things clearer.)

bradrn · September 2, 2024, 8:44am

hostilefork:

That's still how it will work in ordinary calls...

lib/append [a b c] [d e]

...and that should work with GET too.

get $lib/append

But with that syntax the system (and reader) knows LIB is an object.

Hmm, I suppose I can accept it as a piece of syntactic shorthand. Just so long as /lib.append remains available too.

A thought on this: what if the first thing is a library? I personally think that lib/append/reverse would be very confusing, and that leading-slash syntax should be required in this case too (so /lib.append/reverse).

Another thought: what are the semantics when we throw multivalent functions into the mix? If we let F and G be two-argument functions and f be a one-argument function, there are a number of possibilities (most of which exist in at least one language already):

; for 2/1 composition:
/F/f x    ≡  F x (f x)      ; J ‘monadic hook’, Haskell ‘Reader applicative’
/F/f x y  ≡  F x (f y)      ; J ‘dyadic hook’
/F/f x y  ≡  F (f x) (f y)  ; J ‘dyadic compose’

; for 1/2 composition:
/f/F x y  ≡  f (F x y)      ; J ‘dyadic atop’, Haskell ‘owl operator’

; for 2/2 composition:
/F/G x y  ≡  F x (G x y)
/F/G x y  ≡  F (G x y) y
/F/G x y  ≡  F (G x y) (G x y)
; not aware of any language with these combinators

And naturally the possibilities explode once you begin to consider functions with even more arguments. I’m not sure how important the precise choice of rule is: just as long as there is some reasonably intuitive rule.

(Personally, I tend towards a rule which states that all functions except for the last must take one argument only, such that the rule is /f/g/…/h x y z … ≡ f g … h x y z.)
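The candidate rules for 2/1 and 1/2 composition are easier to compare when written out as ordinary higher-order functions. The Python sketch below does that; the combinator names are borrowed from the J terminology in the table, and F and f are arbitrary examples chosen so that each rule gives a different answer.

```python
def monadic_hook(F, f):
    """/F/f x  ≡  F x (f x)"""
    return lambda x: F(x, f(x))


def dyadic_hook(F, f):
    """/F/f x y  ≡  F x (f y)"""
    return lambda x, y: F(x, f(y))


def dyadic_compose(F, f):
    """/F/f x y  ≡  F (f x) (f y)"""
    return lambda x, y: F(f(x), f(y))


def dyadic_atop(f, F):
    """/f/F x y  ≡  f (F x y)"""
    return lambda x, y: f(F(x, y))


def F(a, b):
    # Example two-argument function: subtraction.
    return a - b


def f(v):
    # Example one-argument function: squaring.
    return v * v


print(monadic_hook(F, f)(3))       # 3 - 3*3     -> prints -6
print(dyadic_hook(F, f)(3, 4))     # 3 - 4*4     -> prints -13
print(dyadic_compose(F, f)(3, 4))  # 3*3 - 4*4   -> prints -7
print(dyadic_atop(f, F)(3, 4))     # (3 - 4)**2  -> prints 1
```

Only the last of these matches the rule suggested above, where every function except the last takes a single argument.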

hostilefork · September 2, 2024, 8:59am

As long as it's unambiguous I don't have a problem with it. There are two slashes, so two function invocations, and each function's name (well, variable holding the function, might be a tuple) appears after the slash. So LIB is a module or object or whatever, because it would error otherwise.

If you want to prefix that with a slash, it has to lose a slash elsewhere to keep with the rule, and that makes sense to me.

Today's CHAIN is permissive and just picks up more arguments.

>> mp-ad: chain [get $multiply, get $add]   
== ~#[frame! {mp-ad} [value1 value2]]~  ; anti

>> mp-ad 10 20 104
== 304

It follows the intuitive rule (each step takes the previous step's product as its first argument, then consumes any remaining arguments from the callsite). The interface doesn't convey its arity at the moment, so it's effectively variadic... but that's something that could be worked on.
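That permissive rule can be sketched in Python as follows. This is a model of the described behavior only: permissive_chain and positional_arity are hypothetical names, arity is read with inspect.signature, and leftover arguments raise an error here, which is a simplification chosen for the sketch.

```python
import inspect


def positional_arity(func):
    """Count the positional parameters a function declares."""
    kinds = (inspect.Parameter.POSITIONAL_ONLY,
             inspect.Parameter.POSITIONAL_OR_KEYWORD)
    params = inspect.signature(func).parameters.values()
    return sum(1 for p in params if p.kind in kinds)


def permissive_chain(*funcs):
    """Each step takes the previous result as its first argument,
    then consumes as many further callsite arguments as it needs."""
    if not funcs:
        raise ValueError("permissive_chain needs at least one function")
    first, *rest = funcs

    def chained(*args):
        remaining = list(args)

        def take(count):
            if len(remaining) < count:
                raise TypeError("not enough arguments at the callsite")
            taken = remaining[:count]
            del remaining[:count]
            return taken

        result = first(*take(positional_arity(first)))
        for func in rest:
            # One slot is filled by the previous result.
            result = func(result, *take(positional_arity(func) - 1))
        if remaining:
            raise TypeError("too many arguments at the callsite")
        return result

    return chained


def multiply(value1, value2):
    return value1 * value2


def add(value1, value2):
    return value1 + value2


# Models `mp-ad: chain [get $multiply, get $add]`
mp_ad = permissive_chain(multiply, add)
print(mp_ad(10, 20, 104))  # (10 * 20) + 104 -> prints 304
```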

Especially since the interface isn't lined up yet... perhaps clearest if you had to use CASCADE with a refinement to get that (CASCADE/MULTI, or whatever). Then the path-based shorthand would follow the more restrictive default.

With such things it doesn't hurt to be conservative to start with and then be more permissive later if it turns out to be oppressive.