Since I'm having a spitballing vibe, here's another proposal for a NODE! datatype.
In developing my Markup Parser, I settled on a method of tree-building in which I used linked lists with nodes having five relationships: parent, previous/next sibling and first/last child. For this, I used blocks with bi-directional links for each relationship.
<html> <head> <body> </html>
In this example (ignoring whitespace/text), there are three elements (I'll call nodes). I'll use blocks to represent each node:
; prototype node: [parent _ back _ next _ first _ last _ name _] ; nodes html: copy node head: copy node body: copy node ; names html/name: <html> head/name: <head> body/name: <body> ; relationships html/first: body/back: head html/last: head/next: body head/parent: body/parent: html
The advantage to this is in being able to reference nodes without losing that reference if other nodes are added/removed post-facto—quite common even when building a tree from HTML tokens:
; remove <head> html/first: body body/back: _ head/parent: head/back: head/next: _ ; add <foot> foot: copy node foot/name: <foot> head/last: body/next: foot ; body is still <body> body = <body> ; => true
With the myriad circular references, this becomes unlike any other Rebol data structure—it's difficult to get a helicopter view of the relationships on account of the number of circular references. Then the other issues inherent in Linked Lists, such as having to walk the list to get the size, etc.
In the current BLOCK! based system, it's not clear how efficient all these circular references are or even just the cost of using names in the name-value pairings over potentially hundreds/thousands/more nodes.
I'd like to float the idea of a NODE! datatype, which in memory is a six value-slot object with parent/back/next/first/last and value slots. It's only purpose would be to optimise for this case. With it would come a few efficiencies.
Management could come in the form of a small object with helper functions that managed the relationships:
; creation html: make node! <html> head: tree/append html <head> body: tree/append html <body> ; reflection value of html/last ; => <body> value of head/parent ; => <html> ; manipulation tree/insert-after <body> <foot> tree/remove head ; reflection ii value of body ; => <body> body/back ; => _ html/first ; => <body> html/last ; => <foot> value of head ; => <head> ; still exists head/parent ; => _ head/next ; => _
Coupled with my Class proposal, this would (I'd suppose) represent the lightest weight, yet versatile, representation of trees.
html: make html-element <html> head: tree/append html make html-element <head> body: tree/insert-after head make html-element <body>
A display of the node would use shorthand for the value of each relationship in place of Rebol's usual
[...] circular replacement.
>> probe body == make node! [ parent: <html> back: _ next: <foot> first: _ last: _ value: <body> ]
I don't think it'd get around the inconvenience or inefficiency of, say, counting the number of direct kids, but the other benefits would make up for it.
NODE! could similarly be used in conventional linked lists or doubly linked lists by just leaving the other four/three relationships respectively blank.
A correction: not every relationship is bi-directional. Every node (except the root) connects to its parent, the parent only connects to the first and last of its kids