FIND treats TYPESET!s specially...why not functions?

hostilefork · July 14, 2022, 10:36am

In Rebol2, R3-Alpha, and Red, a FIND for a TYPESET! gives you the position of the first value in the block whose type is in that set:

>> find [1 2 "abc" 3 4] any-string!
== ["abc" 3 4]

>> find [1 2 <abc> 3 4] any-string!
== [<abc> 3 4]

One would think that you'd be able to search for the typeset literally by using /ONLY. But that doesn't work (though Red says they addressed this recently).

Why Wasn't This Taken Further?

Trying to FIND a function is pretty rare. So why didn't they make it so that passing a function to FIND uses it as a predicate, returning the first position where the function gives back a truthy result?

>> find [1 2 3 4] func [x] [x > 2]
== [3 4]

If a function took multiple arguments, that could effectively imply a /SKIP, so that items are tested in groups of that size:

>> find [1 2 4 3 5 6] func [a b] [a > b]
== [4 3 5 6]
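
The arity-driven grouping can be modeled outside of Rebol. What follows is a rough Python sketch, not Ren-C code: `find_where` is a hypothetical name, and taking the group width from the predicate's parameter count is only an assumption about how such a FIND might behave.

```python
import inspect

def find_where(block, predicate):
    """Return the tail of `block` starting at the first group the predicate accepts.

    The group width comes from the predicate's parameter count, imitating an
    implied /SKIP. Returns None when no group matches.
    """
    width = len(inspect.signature(predicate).parameters)
    if width < 1:
        raise ValueError("predicate must take at least one argument")
    # Step through the block one group at a time; a short trailing group is ignored.
    for index in range(0, len(block) - width + 1, width):
        if predicate(*block[index:index + width]):
            return block[index:]
    return None

print(find_where([1, 2, 3, 4], lambda x: x > 2))           # [3, 4]
print(find_where([1, 2, 4, 3, 5, 6], lambda a, b: a > b))  # [4, 3, 5, 6]
```

Whether a trailing partial group should be skipped or raise an error is a design choice this sketch makes arbitrarily.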

/ONLY could have worked for finding a function literally:

>> find/only reduce [:positive? :zero? :negative?] :zero?
== [#[native! zero?...] #[native! negative?...]]

Ren-C goes with QUOTED! vs. /ONLY, but it's the same basic premise:

>> find [1 0 2 0] :zero?
== [0 2 0]

>> find reduce [:positive? :zero? :negative?] quote :zero?
== [#[native! zero?...] #[native! negative?...]]

Though It Seems Easy To Make Mistakes...

People are invariably going to write find data value...think it works for a few values they try...and assume it works for others. Redbols are notorious for pulling the rug out from under you with such things.

But if you're willing to do this for typesets, I don't see why doing it for functions is that much worse.

Just something to think about.

hostilefork · July 20, 2022, 3:56am

I feel like we're coming to a solution here in general, with the idea of everything having isotopic forms, specifically the idea that ACTION! isotopes are the only kind of action that executes implicitly.

It would then be isotopic ACTION! and isotopic TYPESET! that would have the weird behavior.

How would you make an isotopic action to pass to something like FIND? Well, the proposal is that action isotopes are the only things that execute. Hence:

foo: func [x] [print [x]]  ; FUNC would return an ACTION! isotope
bar: x -> [print [x]]  ; LAMBDA would return an ACTION! isotope

This means that simply defining a function in the slot where isotopic things are taken would fulfill the purpose.

>> find [1 2 3 4] func [x] [x > 2]  ; pass isotopic function, so not using literally
== [3 4]  ; one interpretation of the result (could also pass function a position)

>> find [1 2 3 4] meta func [x] [x > 2]  ; search for the function *literally*
== [1 2 3 4]  ; impossible to find that function in the block, you just created it

We could say that there is a term MATCHES which creates isotopic typesets from non-isotopic ones:

>> find [1 2 "abc" 3 4] matches any-string!
== ["abc" 3 4]

Alternately, we could instead solve this by passing functions, e.g. generated by POINTFREE:

>> find [1 2 "abc" 3 4] (<- match any-string!)
== ["abc" 3 4]

hostilefork · July 28, 2022, 6:38am

This gets a little complex to do with existing functions when GET-WORD! doesn't act in the traditional way, because function accessors like ^integer? give you a non-isotopic action. Hence this won't work if you wanted it to return [2 3 4]:

>> find [1 2 3 4] ^even?  ; non-isotopic ACTION! for even?
; null

You'd have to unmeta the result, and the function I have for doing that right now specific to actions is called RUNS:

>> find [1 2 3 4] runs ^even?
== [2 3 4]
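
The split between "a function as a plain value" and "a function that should run" can be imitated in Python with an explicit wrapper. This is only a model of the idea: the `Runs` class and this `find` are hypothetical stand-ins, not how Ren-C implements isotopes.

```python
class Runs:
    """Marker meaning 'call this as a predicate' (stand-in for an isotopic action)."""
    def __init__(self, function):
        self.function = function

def find(block, target):
    """Return the tail of `block` at the first match, or None."""
    for index, item in enumerate(block):
        if isinstance(target, Runs):
            hit = bool(target.function(item))   # active search: run the function
        else:
            # literal search: same object, or same type and equal value
            hit = item is target or (type(item) is type(target) and item == target)
        if hit:
            return block[index:]
    return None

def is_even(n):
    return n % 2 == 0

def is_zero(n):
    return n == 0

print(find([1, 2, 3, 4], is_even))         # None: the function itself is not in the block
print(find([1, 2, 3, 4], Runs(is_even)))   # [2, 3, 4]
print(find([is_even, is_zero], is_zero) == [is_zero])  # True: found literally
```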

That's not awful... if you think about it as an analogue to SPREAD.

Right now there's GET/ANY which will give you an isotope, but I think that looks worse:

>> find [1 2 3 4] get/any 'even?
== [2 3 4]

But right now, with the idea that the function generators all return isotopes, you'd still be able to use a generator in that spot:

>> find [1 2 3 4] func [x] [x > 2]  ; FUNC does make isotopes in pending code
== [3 4]

hostilefork · August 19, 2022, 12:49am

PRESTO!

>> matches integer!
== ~#[datatype! integer!]~ ; isotope

>> find [a b c 1 2 3] matches integer!
== [1 2 3]

Remember that isotopes cannot be put into blocks. Only quasi-forms can (they are real, non-isotopic values that evaluate to isotopes). So there's no gray area here--it means what it means--match instances of the type.

We can see this is different from the literal non-isotopic search:

>> block: reduce [1 2 3 integer! text!]
== [1 2 3 #[datatype! integer!] #[datatype! text!]]

>> find block integer!
== [#[datatype! integer!] #[datatype! text!]]
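
The same two searches can be mimicked in Python to show why they never collide. This is a sketch under assumptions: `Matches` is a hypothetical wrapper playing the role of the isotopic type, and Python's `int` and `str` stand in for INTEGER! and TEXT!.

```python
class Matches:
    """Marker meaning 'match instances of this type' (stand-in for an isotopic type)."""
    def __init__(self, kind):
        self.kind = kind

def find(block, target):
    """Return the tail of `block` at the first match, or None."""
    for index, item in enumerate(block):
        if isinstance(target, Matches):
            hit = isinstance(item, target.kind)            # instance search
        else:
            hit = type(item) is type(target) and item == target  # literal search
        if hit:
            return block[index:]
    return None

print(find(["a", "b", "c", 1, 2, 3], Matches(int)))  # [1, 2, 3]

block = [1, 2, 3, int, str]
print(find(block, int) == [int, str])                # True: finds the type value itself
print(find(block, Matches(int)) == block)            # True: finds the first integer
```

Because the wrapper can only exist as the search argument and never as an element of the list, a literal search and an instance search cannot be confused with each other.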

Things that are built on top of FIND, like REPLACE, get to use it for free... just like what happened with SPREAD!

>> replace/all [1 a 2 b 3] matches integer! <int>
== [<int> a <int> b <int>]

>> replace/all [1 a 2 b 3] matches integer! (pos -> [negate pos.1])
== [-1 a -2 b -3]

>> replace/all [1 a 2 b 3] matches integer! (pos -> [reduce [pos.1 pos.1]])
== [[1 1] a [2 2] b [3 3]]

>> replace/all [1 a 2 b 3] matches integer! (pos -> [spread reduce [pos.1 pos.1]])
== [1 1 a 2 2 b 3 3]
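
For readers who want the four cases side by side, here is a loose Python model. `replace_all` and `Spread` are hypothetical names, and unlike REPLACE this sketch builds a new list instead of modifying the block in place.

```python
class Spread:
    """Marker asking for the wrapped items to be spliced in, not inserted as one value."""
    def __init__(self, items):
        self.items = list(items)

def replace_all(block, matcher, replacement):
    """Return a new list with every item accepted by `matcher` replaced.

    `replacement` is either a plain value or a callable that receives the
    tail of the block at the match position (like `pos` above).
    """
    result = []
    for index, item in enumerate(block):
        if not matcher(item):
            result.append(item)
            continue
        value = replacement(block[index:]) if callable(replacement) else replacement
        if isinstance(value, Spread):
            result.extend(value.items)   # splice the items in
        else:
            result.append(value)         # insert as a single value
    return result

def is_int(item):
    return isinstance(item, int)

data = [1, "a", 2, "b", 3]
print(replace_all(data, is_int, "<int>"))                               # ['<int>', 'a', '<int>', 'b', '<int>']
print(replace_all(data, is_int, lambda pos: -pos[0]))                   # [-1, 'a', -2, 'b', -3]
print(replace_all(data, is_int, lambda pos: [pos[0], pos[0]]))          # [[1, 1], 'a', [2, 2], 'b', [3, 3]]
print(replace_all(data, is_int, lambda pos: Spread([pos[0], pos[0]])))  # [1, 1, 'a', 2, 2, 'b', 3, 3]
```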

Powerful and precise. Are we having fun yet?

I Remember First Seeing The R3-Alpha Code For Find...

It was a semantic nightmare, and I remember thinking "is the code really just full of issues like this?" (yes, it was, and Ren-C has been stamping them out)

Here you see it's a complete crapshoot as to what you're going to get back. Maybe it'll point at a datatype. Maybe a typeset. Maybe an instance...it didn't discriminate.

// Find a datatype in block:
else if (IS_DATATYPE(target) || IS_TYPESET(target)) {
	for (; index >= start && index < end; index += skip) {
		value = BLK_SKIP(series, index);
		// Used if's so we can trace it...
		if (IS_DATATYPE(target)) {
			if ((REBINT)VAL_TYPE(value) == VAL_DATATYPE(target)) return index;
			if (IS_DATATYPE(value) && VAL_DATATYPE(value) == VAL_DATATYPE(target)) return index;
		}
		if (IS_TYPESET(target)) {
			if (TYPE_CHECK(target, VAL_TYPE(value))) return index;
			if (IS_DATATYPE(value) && TYPE_CHECK(target, VAL_DATATYPE(value))) return index;
			if (IS_TYPESET(value) && EQUAL_TYPESET(value, target)) return index;
		}
		if (flags & AM_FIND_MATCH) break;
	}
	return NOT_FOUND;
}

I suppose the thought process behind this would be that if you were looking in a block for a datatype or typeset, it was probably a block that didn't contain other things. So it assumes you wouldn't be mixing datatypes and integers in the same block.
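
To make the ambiguity concrete, here is a small Python imitation of just the datatype branch of that C code (the typeset branch is left out). The function name is made up for illustration, and Python's `int` stands in for a Rebol datatype.

```python
def r3alpha_style_find(block, target_type):
    """Return the index of the first item that is EITHER an instance of
    `target_type` OR the type value itself, as the quoted branch does."""
    for index, item in enumerate(block):
        if type(item) is target_type:   # an instance of the type
            return index
        if item is target_type:         # the datatype value itself
            return index
    return None

print(r3alpha_style_find(["x", 1, int], int))  # 1: stopped at an instance
print(r3alpha_style_find(["x", int, 1], int))  # 1: stopped at the type value
```

Both calls report index 1, so the caller has to inspect the item at that position to learn which kind of match it got.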

But if you're using a block as a kind of mapping structure, you may well be mixing keys that are sometimes datatypes and sometimes instances. UPARSE does this... it doesn't have separate mapping tables for finding the combinator registered for the general datatype of WORD! vs. a specific WORD! that acts as a keyword. It just has one big table (implemented with map, but you could imagine it being done differently if it were working in a row-oriented manner).

So I'm glad this is sorted out! Many things shaping up the last couple of days--almost exciting.

hostilefork · August 19, 2022, 1:52am

Added Bonus... SWITCH Works With MATCHES Too!

>> switch <asdf> [matches any-string! ["COOL!"]]
== "COOL!"

>> switch "asdf" [matches any-string! ["COOL!"]]
== "COOL!"

>> switch 1020 [matches any-string! [fail ~unreachable~]]
; void (decays to none)
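
A comparable dispatch can be sketched in Python. Everything here is illustrative: `switch`, `Matches`, and `Tag` are hypothetical names, `Tag` is modeled as a `str` subclass so that `Matches(str)` can play the role of ANY-STRING!, and a Python `None` stands in for the void result.

```python
class Matches:
    """Marker meaning 'match instances of these types' instead of comparing literally."""
    def __init__(self, *kinds):
        self.kinds = kinds

class Tag(str):
    """Stand-in for a TAG! value such as <asdf>."""

def switch(value, cases):
    """Run the branch of the first case whose key accepts `value`; None if no case does."""
    for key, branch in cases:
        if isinstance(key, Matches):
            hit = isinstance(value, key.kinds)              # type-based match
        else:
            hit = type(key) is type(value) and key == value  # literal match
        if hit:
            return branch()
    return None

def unreachable():
    raise AssertionError("unreachable")

any_string = Matches(str)

print(switch(Tag("asdf"), [(any_string, lambda: "COOL!")]))  # COOL!
print(switch("asdf", [(any_string, lambda: "COOL!")]))       # COOL!
print(switch(1020, [(any_string, unreachable)]))             # None
```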

This might be a good bridge to wean us off the idea of multiple switch values per branch... at least without having commas.