First things first: If you want to work on network protocols at all--in the Redbolverse or elsewhere--please heed this warning!
There's a really big potential misunderstanding about TCP when you offer someone a plain READ operation.
You may get the impression that the other side of the connection is sending "messages" with specific lengths. This is not the case!
Let's imagine you could say:
>> read tcp-port
== #{1FFEC02A} ; 4 bytes
>> read tcp-port
== #{E0C1} ; 2 bytes
Just because you got two chunks of information from that connection does not mean that there were 2 sends from the other side. The other side might have sent 100 bytes, and this is just how you happen to be getting the first 6. Or it could have done 6 individual 1-byte writes, along with any number of 0-byte writes (which are legal).
This means all understanding of the data you receive has to be in terms of a protocol. The only number that matters is how much data the protocol says you can expect... never the particular length of any chunk you happen to be handed.
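To make that concrete, here is a rough sketch in Python (not Rebol) of reading in terms of a protocol rather than in terms of chunks. The framing is an assumption made up for the example: a 4-byte big-endian length followed by the payload. The names `recv_exactly` and `recv_message` are hypothetical helpers, not part of any library.

```python
import socket
import struct

def recv_exactly(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))  # may return fewer bytes than asked
        if not chunk:  # an empty result means the peer closed the connection
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return bytes(buf)

def recv_message(sock):
    """Assumed framing: 4-byte big-endian length, then that many payload bytes."""
    (length,) = struct.unpack(">I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# Demo: the sender splits ONE message across three writes of 3, 4 and 3 bytes.
a, b = socket.socketpair()
frame = struct.pack(">I", 6) + bytes.fromhex("1FFEC02AE0C1")
a.sendall(frame[:3])
a.sendall(frame[3:7])
a.sendall(frame[7:])
message = recv_message(b)  # one logical message, regardless of the write pattern
a.close()
b.close()
```

The receiver never looks at how many bytes a single `recv()` gave it, except to count toward the number the protocol told it to expect.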
Note: Today's READ On TCP:// Doesn't Work Like That
Rebol hasn't historically let you READ a TCP connection in the way shown above.
( Though honestly, the code for everything would have been more understandable if you could have just done that! All the convoluted WAITs and AWAKE handlers were in service of some asynchronous nirvana that never materialized. So what ended up emerging was the most convoluted and poorly engineered way to write ultimately synchronous protocols the world has likely ever seen. )
What happens instead is that READ basically just makes a request, and returns nothing. If you want anything back, you have to poke an AWAKE handler function onto the port to receive the data. If you WAIT on the port...then eventually during the course of that wait you should get an EVENT! passed to the AWAKE handler you specified.
The only thing that event holds is the word "READ", so the place you need to look for the data would be in the port's DATA member. That data member would just get bigger with each READ. So it was the responsibility of the user of the port to clear that data out, or it would just accrue indefinitely.
If a READ didn't come back with the number of bytes you wanted, you'd have to call READ again...and then return FALSE from your port's AWAKE handler to say you were not done yet. Returning TRUE from the port's AWAKE handler would indicate that enough progress had been made that something which was performing a WAIT on that port should be unblocked from the WAIT.
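For anyone who hasn't suffered through it, the shape of that dance can be mimicked in a few lines of Python. This is only a simulation of the control flow described above: `FakePort`, `wait`, and the list of pre-canned chunks are all hypothetical stand-ins, not the R3-Alpha API.

```python
class FakePort:
    """Hypothetical stand-in for an R3-Alpha style port (not the real API)."""
    def __init__(self, incoming, awake):
        self.incoming = list(incoming)  # chunks the "network" will deliver
        self.data = bytearray()         # accrues until the user clears it
        self.awake = awake              # handler called as awake(port, event)
        self.read_requested = False

    def read(self):
        # Returns nothing: it only registers a request for more data.
        self.read_requested = True

def wait(port):
    """Pump "read" events until the AWAKE handler returns True."""
    while port.read_requested and port.incoming:
        port.read_requested = False
        port.data += port.incoming.pop(0)
        if port.awake(port, "read"):
            return True   # handler says: unblock the WAIT
    return False          # ran out of data before the handler was satisfied

WANTED = 6
received = []

def awake(port, event):
    if event == "read" and len(port.data) < WANTED:
        port.read()         # not enough yet: request again...
        return False        # ...and keep the WAIT blocked
    received.append(bytes(port.data[:WANTED]))
    del port.data[:WANTED]  # the user must clear out what it consumed
    return True

port = FakePort([bytes.fromhex("1FFEC02A"), bytes.fromhex("E0C1")], awake)
port.read()
done = wait(port)
```

Even in this toy form, getting 6 bytes takes a handler, a flag, a re-request, and manual buffer cleanup.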
Can We Improve This? (rhetorical question)
If we were only judging R3-Alpha PORT! vs. the low-level Unix recv() function, we might say that it's an improvement. It spares you the dynamic memory management of passing small fixed-size buffers to recv() and stitching them together into a large blob. The BINARY! grows on its own, and you take what you want from it.
The truth is somewhat more complex. The BINARY! lives in the port and the semantics of how you interact with it are traditionally not clear. What if you increment the index?
tcp-port.data: skip tcp-port.data 100
The next time data is added to the buffer, should its contents slide forward to reuse the now-unused space at the head? Or should those 100 bytes at the beginning be preserved indefinitely?
The way most languages would resolve such questions about "buffered IO" would be to narrow the interface through something like Go's buffered reader (bufio.Reader). You are given specific APIs to "peek" at the data without removing it. Or if you do remove it, then what remains is always slid forward to the front of the buffer.
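As a minimal sketch of what "narrowing the interface" could look like, here is a tiny Python class. `PeekBuffer` and its method names are invented for illustration; the point is only that the user never gets a raw series position to skip around with, so the question of what to do with skipped-over bytes never comes up.

```python
class PeekBuffer:
    """Hypothetical narrow interface over an accumulating byte buffer."""
    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Called by the transport when new bytes arrive."""
        self._buf += chunk

    def peek(self, count):
        """Look at up to `count` bytes without consuming them."""
        return bytes(self._buf[:count])

    def take(self, count):
        """Consume up to `count` bytes; the remainder moves to the front."""
        taken = bytes(self._buf[:count])
        del self._buf[:count]
        return taken

    def __len__(self):
        return len(self._buf)

buf = PeekBuffer()
buf.feed(bytes.fromhex("1FFEC02A"))
buf.feed(bytes.fromhex("E0C1"))
header = buf.peek(2)   # look without consuming
first = buf.take(4)    # consume; what remains now starts at the front
rest = buf.peek(10)    # asking for more than is buffered returns what exists
```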
Attacking Asynchronousness In A Modern Fashion
Empirically people must have noticed that R3-Alpha never delivered the goods on its promised asynchronousness. I've pointed out some of the reasons, and how "WAIT" has lacked a semantic definition of what it is you're "waiting for" on ports.
By building on libuv, the nuts and bolts at the systems/C level of being able to make requests and then get a callback are now available...with reasonable handling of errors. But it would be a big mistake to expose that mechanism by replacing the PORT!/AWAKE mire with some kind of READ/CALLBACK situation where you pass a function that gets called with either the data you asked for or an error. No one wants to code like that... which is why Node.js callbacks are all being replaced by async/await patterns.
My feeling is that if you want to disentangle users of a scripting-class language from the problems that come with threads and mutexes and the like, there have emerged modern answers. And Go is one of the better examples of this:
"Do not communicate by sharing memory. Share memory by communicating."
Code would get clearer if we rolled it back to where you write things as if they are synchronous. And often that's probably going to be fine for people. But if it's not, then you use channels to split off what you are doing.
So I think the likely right answer is just to push forward on stacklessness as the basis for green threading, used to implement asynchronousness as it is needed.
This would mean that I think all the asynchronous port stuff that exists so far should just be scrapped. #andnothingofvaluewaslost
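To illustrate the style being argued for, here is a loose analogy in Python. It uses asyncio coroutines as a stand-in for green threads and an `asyncio.Queue` as a stand-in for a channel; neither is how a stackless Ren-C would actually be implemented, and the `producer`/`consumer` names are made up for the example. Each side reads top to bottom as if it were synchronous, and the scheduler does the interleaving.

```python
import asyncio

async def producer(channel):
    # Stand-in for a network source delivering chunks of arbitrary size.
    for chunk in (bytes.fromhex("1FFEC02A"), bytes.fromhex("E0C1")):
        await channel.put(chunk)
    await channel.put(None)  # sentinel used here to signal EOF

async def consumer(channel):
    # Written as a plain loop: no callbacks, no AWAKE handler.
    collected = bytearray()
    while True:
        chunk = await channel.get()
        if chunk is None:
            return bytes(collected)
        collected += chunk

async def main():
    channel = asyncio.Queue(maxsize=1)  # small capacity forces interleaving
    _, result = await asyncio.gather(producer(channel), consumer(channel))
    return result

result = asyncio.run(main())
```

No memory is shared between the two tasks except what is handed over through the channel.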
So... READ on a TCP PORT! Should Give Back BINARY!, then?
Okay we're back to this:
>> read tcp-port
== #{1FFEC02A} ; 4 bytes
>> read tcp-port
== #{E0C1} ; 2 bytes
As I just said, synchronous reading like this is more in line with how we express ourselves...and we get asynchronousness by virtue of some scheduler that can rearrange things as a master of stack-time-and-stack-space. (I will point out that pretty well-developed experiments with that scheduler have been done in the past, and that point can be reached again...with more insight now.)
But I think we like the concept of READ defaulting to "give me all you've got until the EOF".
Multi-returns can help us here. Remember that a function knows how many returns you requested, so it can selectively invoke a behavior when you do so.
>> [data eof]: read tcp-port ; asking for EOF means don't force read to EOF
== #{1FFEC02A}
>> eof
== false
>> [data @]: read tcp-port ; remember "circling" and other neat tricks..
== #[true] ; asked for eof to be the main return result
>> data
== #{E0C1}
-- or --
>> read tcp-port ; don't ask for EOF means read until EOF
== #{1FFEC02AE0C1}
This feels much more solid.
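The two behaviors can be modeled roughly in Python. Python functions can't tell how many results the caller asked for, so this sketch substitutes an explicit `want_eof` flag for the multi-return detection; `ChunkSource` is a hypothetical class, and it cheats by knowing all the chunks up front (a real connection only learns about EOF when the peer closes).

```python
class ChunkSource:
    """Hypothetical port that hands out data in whatever chunks arrived."""
    def __init__(self, chunks):
        self._chunks = list(chunks)

    def read(self, want_eof=False):
        if want_eof:
            # Caller asked about EOF: return one chunk plus the EOF flag.
            data = self._chunks.pop(0) if self._chunks else b""
            return data, not self._chunks
        # Caller did not ask about EOF: read everything up to EOF.
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data

chunks = [bytes.fromhex("1FFEC02A"), bytes.fromhex("E0C1")]

port = ChunkSource(chunks)
data1, eof1 = port.read(want_eof=True)   # first chunk, more to come
data2, eof2 = port.read(want_eof=True)   # second chunk, nothing left after it

whole = ChunkSource(chunks).read()       # default: everything until EOF
```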
Weird Concept Idea: Buffer As BLOCK! ?
Cleansing ourselves of the dead-end of R3-Alpha's asynchronous plan, there are some areas we might look to in order to play to the language's strengths.
I've mentioned the importance of being able to "push things back" into the buffer after having read them...and that it's likely the best way of doing PARSE on streams.
So I began to wonder if the thing that ports would accrue might be a BLOCK! instead of a BINARY!?
Imagine a TCP port would be feeding in little blobs of BINARY! at the tail. But when you got the chance to process it, you could make the decision to fold that into some kind of structure. Then you could emit this higher level processed structure to something that listens down the "pipe".
These are the kinds of novel directions I'd like to see...where we can do streamed block PARSE on a PORT! that feeds arbitrary values, which were decoded from binary, which was decompressed by a streaming codec on top of a TLS decoder...
So this might lead to some weird stuff. Like if you start asking to look into the buffer you'd see that it's a block and see the blobs it plans on giving you in the next READs:
>> peek/part tcp-port 2
== [#{1FFEC02A} #{E0C1}]
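Here is a speculative Python model of that idea, with a deque standing in for the BLOCK!. `BlockBuffer` and its methods are invented names, and folding a blob into an integer is just a placeholder for whatever higher-level decoding a real layer would do.

```python
from collections import deque

class BlockBuffer:
    """Hypothetical port buffer holding a sequence of values, not raw bytes."""
    def __init__(self):
        self._items = deque()

    def feed(self, item):
        """The transport appends each arriving blob at the tail."""
        self._items.append(item)

    def peek(self, part):
        """Show up to `part` pending items without consuming them."""
        return list(self._items)[:part]

    def read(self):
        """Consume and return the item at the head."""
        return self._items.popleft()

    def push_back(self, item):
        """Put a (possibly transformed) item back at the head."""
        self._items.appendleft(item)

buffer = BlockBuffer()
buffer.feed(bytes.fromhex("1FFEC02A"))
buffer.feed(bytes.fromhex("E0C1"))
preview = buffer.peek(2)  # the blobs the next two reads would return

blob = buffer.read()
buffer.push_back(int.from_bytes(blob, "big"))  # fold the blob into a value
after = buffer.peek(2)    # head is now a decoded value, tail is still a blob
```

The buffer ends up holding a mix of decoded values and raw blobs, which is the kind of thing a block-oriented PARSE could then work over.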
Anyway, long post...but I feel slightly optimistic that it points toward some of how to dig out of the R3-Alpha port debacle.