What is a Port?

From the Rebol 3 Project Wiki:


Basic Concepts

As stated above, a port is used to transfer data. However, the basic port definition is a bit more general than that. A port is actually more like a stream of data that undergoes some type of exchange, transformation, or effect.

For example, a port is often used for I/O functions such as:

  • console input and output
  • file reading and writing
  • directories of files
  • network transferring of data
  • event handling, such as mouse clicks or keyboard input
  • database access

But, a port can also be used for other types of functions:

  • image conversion - such as encoding or decoding a JPEG file.
  • sound conversion - such as encoding or decoding an audio file
  • checksum computation - keeping a running checksum
  • compression and decompression of data
  • encryption and decryption of data
  • other codecs for encoding and decoding data formats

Related to Series

As you know, Rebol is built on the concept of a Series.

A series is a set of values arranged in a specific order. It is a sequence.

A port is a special type of series. Not only is it a sequence, but it can also hold state information like an object, and access external devices for I/O or other high-speed operations, such as image conversion or encryption.

In Rebol 2, ports were built on a pure series model. However, we found this approach to be problematic because ports are not pure series. They also embody state (information).

For example, a file can be thought of as a stream of bytes. But a file also has other important attributes, such as a file name, a location within a directory, creation and modification dates, permissions like read-only or allow execution, and ownership information. These attributes fall outside of a pure series model.

New Definition

Rebol 3 moves away from the pure series model of Rebol 2 and more toward an I/O stream model. Now it is closer to the concept found in other programming environments and languages.

So, a port can be defined as:

  • a series of values - such as a sequence of bytes
  • holds state information - such as file attributes
  • can access the external world - network communication, for example
  • can have side effects - internal changes, such as compression

The pure series model is gone. Ports are more pragmatic now, and this has resulted in a port system that is cleaner, smaller, faster, and more extensible than ever before.

Main Components

A port consists of three main components:

  1. A name that specifies the general type of port (scheme)
  2. An object that holds information (state of port)
  3. A set of functions that are applied to that object (actions)

The name of a port is called a 'scheme'. Example schemes are:

  • console
  • file
  • dir - file directory
  • event - gui events (mainly)
  • TCP - networking
  • HTTP - web connections
  • clipboard - cut and paste
  • sound - for audio output
  • system - system state changes

Many other types of schemes can exist, and they are often built on top of lower level schemes. For example, FTP for file transfer is built on the TCP networking scheme.

Here is an example. In these lines:

port: open tcp://www.rebol.net
data: read http://www.rebol.com

the first scheme is TCP; the second is HTTP. (Note that this is consistent with the definition of a URL.)

The object holds information such as:

  • the type of the port (file, network, database, etc.)
  • the name and location (path) of a file
  • the URI of a network connection
  • a network host name and port number
  • a buffer of data being transferred
  • date and time info
  • structures used by external devices

This object is of a specific Rebol datatype, called a PORT!

Specific action functions can be applied to a port. Some common actions are:

  • make - create a new port
  • open - initialize the port
  • close - finalize the port
  • read - read data from port
  • write - write data to port
  • query - get other information from port
  • update - detect external changes to the port

But, there are many other actions as well, as generally defined by Rebol datatypes.
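The three components can be seen by inspecting a port directly. The following is a minimal sketch, assuming an R3-Alpha style interpreter in which a port exposes scheme and spec fields. Those field names are an assumption and may differ in other builds, and %file.txt is only a placeholder name.

```rebol
; Sketch only: field names (scheme, spec) assume an R3-Alpha style port layout.
port: make port! %file.txt   ; make the port object without opening the file

print port/scheme/name       ; 1. the scheme the port is based on
probe port/spec              ; 2. part of the state held by the port object
print port? port             ; 3. actions such as OPEN and READ accept this PORT! value
```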

Using Ports

Two Basic Methods

There are two basic methods to use a port: implicit and explicit.

When you write code such as:

write %index.html read http://www.rebol.net

you are using implicit ports. This is a shortcut notation to keep simple code simple. You are only using a single port action, such as read or write, and all the other details are hidden behind those functions.

However, if you write code such as:

file: open %data.dat
write file data1
write file data2
...
close file

then you are using explicit ports. Here you specify each action separately. You open the port, then read and write to the port, and then close the port. Each action must be specified.

Fast and Easy

Implicit ports are the fast and easy way to perform various I/O actions in Rebol.

A few examples are:

data: read %todo.dat
write %plans.r data
query %docs.txt
page: read http://www.rebol.net
result: write http://rebol.net/cgi/act.r data
data: read ftp://www.rebol.net/projects.dat
host: read dns://www.rebol.net

This type of usage depends on the type of port (the scheme). The examples above use the file, http, ftp, and dns schemes. Those schemes have been designed to support implicit actions.

Notice that for local files, the file datatype is used to indicate usage of the file scheme. The line:

data: read file://todo.dat

is equally valid. Think of the file datatype as an abbreviation for that. Both methods use the same file scheme to perform the I/O.

Other schemes do not support implicit usage. For example:

>> data: read tcp://www.rebol.com
** Access error: Port is not open: tcp://www.rebol.com
** Where: read
** Near: read tcp://www.rebol.com

This error occurs because TCP does not support an implicit read action. That's because TCP is a lower level scheme that requires a higher level protocol in order to be useful.

Full Control

Explicit ports give you full control over each I/O action.
For example, let's say you want to read a large file in small 20000-byte chunks. You might use these steps:

file: open %bigdata.dat
; read returns an empty binary once the end of the file is reached
while [not empty? data: read/part file 20000] [
    process data
]
close file

This common method will be familiar to most programmers. The file is opened, reads are done, and the file is closed. Each action is done separately.

This type of explicit I/O is common for large files that would consume a lot of memory if you read them with implicit I/O. For example, if the bigdata.dat file is 10 GB, you would not be able to read it all into memory at one time.
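The chunked pattern can be wrapped in a small function. This is a sketch under stated assumptions rather than a definitive implementation: count-bytes is a hypothetical helper name, and the loop tests each chunk with empty? on the assumption that a read at the end of the file returns an empty binary.

```rebol
; Hypothetical helper: totals the size of a file without loading it all at once.
count-bytes: func [
    "Return the number of bytes in a file, reading it in fixed-size chunks."
    path [file!]
    /local file total data
][
    total: 0
    file: open path
    ; Each READ/PART continues from where the previous one stopped.
    while [not empty? data: read/part file 20000] [
        total: total + length? data
    ]
    close file
    total
]

print count-bytes %bigdata.dat
```

Only one 20000-byte chunk is held in memory at a time, which is the point of using an explicit port here.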

Explicit I/O is also used when you need strict control over each action. This is often done if you need to seek to different locations within a file or write your own network protocol.

For example, let's say you need to read data from three different parts of a large file. In that case you would use read with the /seek refinement to move to each part of the file before reading:

file: open %bigdata.dat
da-head: read/part file 4000
da-body: read/seek/part file 12000 10000
da-tail: read/seek/part file 56000 4000
close file

Port Details

This section describes some of the important concepts you need to know about ports.

Port Datatype

A port is a Rebol datatype. If you use explicit ports, you will need to use the port datatype as a type of handle to access the port. If you've used handles before in other languages, that concept is probably familiar to you already.

In Rebol a port is very similar to an object because it stores information in named fields. We often call these fields the state of the port. When various actions are performed, the state will change, depending on the action. A port differs from an object in that it responds in a special way to specific datatype actions such as open, read, write, and several others.

Port Schemes

A scheme is a type of port.

You will use schemes to identify the type of port access you need as well as the protocol to use.

For example, when you access a local file, you are using the file scheme. When you read a web page, you use the http scheme, which is a higher level protocol built on top of the tcp scheme.

Each scheme has a unique name that is used to identify it. For example, file, http, and tcp are the scheme names shown above. A scheme name can be used as part of a URL, or separately, depending on requirements.

The Rebol system manages a list of available schemes. These schemes can be built-in, can be loaded separately, or can even be user defined within a script.
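A quick way to see that list is to ask the system object for it. This sketch assumes system/schemes is an object keyed by scheme name, as in R3-Alpha; other interpreters may store the list differently.

```rebol
; Assumes system/schemes is an object whose words are the scheme names.
probe words-of system/schemes

; Check whether one particular scheme is available before relying on it.
either in system/schemes 'tcp [
    print "tcp scheme is available"
][
    print "tcp scheme is not loaded"
]
```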

A lot more about schemes can be found in the Port Implementation section.

Making Ports

All ports are made from a spec -- a specification of the port's attributes. As you have seen above, the spec can be something quite simple, such as a file name or URL. But, a port spec can also be a block that includes many fields to indicate various options for the port.

All of these can be used as port specs:

%file.txt  ; a file name
tcp://www.rebol.com ; a URL
[scheme: 'tcp host: "www.rebol.net"] ; a block
'tcp  ; just the port's scheme name
object ; an object that specifies the port
port ; an existing port

There are a couple ways to make a port, depending on your required level of control.

One method is to use the make action, as you would for any datatype. The general form is:

port: make port! spec

Where port! is the port datatype itself, and spec is the specification as described above.

Here are some examples:

port1: make port! %file.txt
port2: make port! tcp://www.rebol.net
port3: make port! [scheme: 'tcp host: "www.rebol.net"]

These examples will create a port object and initialize its various fields.

One of the most common methods to create a port is with the open function. Unlike make, the open function does not require a port! datatype. It knows that it is being provided with a spec. For example:

port: open tcp://www.rebol.net

will create a new port and also perform initializations associated with the open action.

More details about open are discussed later.

Port Actions

Port actions can be thought of as functions that act on ports.

More precisely, port actions are polymorphic datatype actions similar to those used on all other datatypes. If you're not sure what that means, don't worry about it here. Just think of ports like objects that have a well-defined set of methods that act on them.

The actions defined for ports are:

  • make - make a new port object
  • to - special (convert an object to a port)
  • open - initialize external operations
  • close - conclude external operations
  • write - transfer data to the port
  • read - transfer data from the port
  • query - get information about the port
  • update - update the port's state
  • create - create an external object of port type
  • delete - delete an external object of port type
  • rename - rename an external object of port type

Note that not all port actions can be used on all port schemes. For example, the rename action has no purpose when used on a sound port scheme.

Ports also allow basic equality comparisons:

  • equal? - ports are the same object
  • not-equal? - ports are not the same object

For the exact usage of each action use Rebol's built-in help function. (In this way the action information is always accurate.)

>> ? open
USAGE:
        OPEN spec /new /read /write /seek /allow access

DESCRIPTION:
        Opens a port. Makes a new port from a specification, if necessary.
        OPEN is an action value.

ARGUMENTS:
        spec (port! file! url! block!)

REFINEMENTS:
        /new -- Create new file - if it exists, reset it (truncate)
        /read -- Open for read access
        /write -- Open for write access
        /seek -- Optimize for random access
        /allow -- Specifies protection attributes
                access (block!)

All of the port actions are provided with the port (or spec in the case of implicit port usage) as their first argument.

See the Port Examples section for various examples of how to use port actions.

Asynchronous Usage

Actions on a port can be synchronous or asynchronous.

In general:

  • Synchronous actions will not return until the requested function has completed.
  • Asynchronous actions will return as soon as possible, even if the function is still being performed.

Both modes are useful, depending on program requirements.

Whether an action happens asynchronously depends on a few things. Here are the basic rules:

  • Implicit port usage is synchronous. This provides ease-of-use.
  • Explicit port usage is asynchronous, but only if the port scheme supports it and an awake handler has been provided.
  • Some actions are synchronous, even when applied to an asynchronous port.

Note that it is possible to operate an asynchronous port in a synchronous manner, when so desired. But, it is not possible to operate a synchronous port in an asynchronous manner. (To do so, you must use a Rebol task as an asynchronous thread.)

See the Port Examples page for examples of both modes of operation.
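As a rough illustration of the asynchronous rules above, here is a sketch of an explicit TCP port with an awake handler. It assumes the R3-Alpha event names (lookup, connect, wrote, read) and needs network access; the host, port number, and request text are placeholders, not a recommended client.

```rebol
; Sketch of asynchronous use. Event names assume the R3-Alpha TCP scheme.
port: open tcp://www.rebol.net:80

port/awake: func [event] [
    switch event/type [
        lookup  [open event/port]     ; host name resolved, now connect
        connect [write event/port to-binary "GET / HTTP/1.0^M^J^M^J"]
        wrote   [read event/port]     ; request sent, ask for the reply
        read    [
            print ["Received" length? event/port/data "bytes"]
            close event/port
            return true               ; returning true ends the WAIT below
        ]
    ]
    false                             ; keep waiting for more events
]

wait [port 10]   ; stop waiting after 10 seconds if nothing arrives
```

Each action returns immediately, and the handler is called later as events arrive. Without the awake handler, the same port would have to be driven step by step.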

Error Handling

In general, port actions can generate errors in the same way as other Rebol functions, and you can catch and process these error exceptions in the same way.

For example, if you want to handle an error during an open action, you can wrap the code with an error handling function such as try:

if error? err: try [port: open spec] [
    handle-error err
]
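A handler such as the one above can inspect the error value to decide what to do. This is a small sketch that assumes the standard type and id fields of a Rebol error; %no-such-file.dat is a placeholder chosen so that the read fails.

```rebol
; %no-such-file.dat is a placeholder; it is expected not to exist.
if error? err: try [read %no-such-file.dat] [
    ; TYPE and ID identify the category and the specific error
    print ["Error type:" err/type "id:" err/id]
]
```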

A shorthand method is to use attempt:

either port: attempt [open spec] [
    perform-io...
][
    print ["Cannot open" spec]
]

So, you can wrap each separate port action in an error handler, or you can wrap all of your port actions together in a single error handler:

err: try [
    file: open %bigdata.dat
    da-head: read/part file 4000
    da-body: read/seek/part file 12000 10000
    da-tail: read/seek/part file 56000 4000
]
close file

if error? err [
    print ["Port error:" form err]
]

For asynchronous port operation, error handling can be a bit more complicated. (And more work is needed here.)


This doesn't quite parse, to me it should read: The name of a port corresponds to the name of the scheme on which it is based.

One slight complication here is where READ/WRITE is handled for both implicit and explicit modes of a scheme:

read %a-file.txt

port: open %a-file.txt
read port
close port

As I understand it, the scheme author is responsible for monitoring whether an instance is implicit or explicit. Related: OPEN?

From a user's point of view, the simplest way to understand the difference between implicit and explicit may be whether one passes a FILE! or URL! value to READ or passes a PORT! value.

I think this is wrong: a scheme is the prototype for a port. All ports inherit the actions/properties of their parent scheme.

I don't ever recall seeing examples of how these are supposed to work.

The following is a spitball projection of, say, compression:

big-file: open %a-big-file.bin

compressor: open [scheme: zip target: %a-big-file.zip]
insert compressor big-file
close compressor

Instead of:

err: try [
    file: open %bigdata.dat
    da-head: read/part file 4000
    da-body: read/seek/part file 12000 10000
    da-tail: read/seek/part file 56000 4000
]
close file

if error? err [
    print ["Port error:" form err]
]

there should be:

err: try [
    file: open %bigdata.dat
    da-head: read/part file 4000
    da-body: read/seek/part file 12000 10000
    da-tail: read/seek/part file 56000 4000
    close file
]

if error? err [
    print ["Port error:" form err]
]

Because if something is going to fail in the try block, it is most likely the file opening. If it fails, file would be none, and one would get an uncaught error, because close does not handle a none value.

Or even better:

file: none  ; make sure file has a value even if open fails
err: try [
    file: open %bigdata.dat
    da-head: read/part file 4000
    da-body: read/seek/part file 12000 10000
    da-tail: read/seek/part file 56000 4000
]
if file [close file]

if error? err [
    print ["Port error:" form err]
]

This will close the file even when a read fails inside the try block above.


So, a port can be defined as:

  • a series of values - such as a sequence of bytes

This "such as" bothers me, because there's a fair amount of magic assumed here. How do I know if it's a sequence of bytes, or a pizza?

This puts a lot of pressure on READ and WRITE:

data: read %todo.dat
write %plans.r data

What does this mean? So it knows from the .DAT extension what to do (a table that tells it how to decode things that end in the .DAT extension?) And it gets Rebol-compatible records it can write out to %plans.r?

Let's look at the refinements on READ and WRITE in R3-Alpha:

READ source /part length /seek index /string /lines
WRITE destination data /part length /seek index /append /allow access /lines

Given the way R3-Alpha's not-very-fancy Multiple Dispatch model worked, those are the only refinements you will ever be able to pass to READ and WRITE. And there was no guarantee a port would pay attention to them. Try read/lines http://example.com, for instance. You get back an unprocessed BINARY!.

There are two basic methods to use a port: implicit and explicit.

If I feel there's anything to the Rebol I/O model, it is mostly centering around being able to write one kind of PORT! object for the explicit behavior, and then get--somewhat "for free"--the implicit.

So that seems to be the thing to focus on...defining it, defining its limits, and showing realistic scenarios of what it might be used to accomplish--in a way that adds benefit over just making a bunch of disparate functions which can have their own pertinent refinements, like READ-CLIPBOARD, READ-HTTP, etc.

This is scheme-dependent. The only place where the FILE scheme would vary on READ (processes a stream of bytes) is if a file is a directory (returns block of contained files) or non-existent (returns error).

READ is just a conduit for the READ actor within the scheme.

That overlooks the efficiency that can be gained by the explicit model. Take HTTP as an example: you can process multiple requests with one port (and thus a single persistent TCP connection).

rebol-site: port: open http://www.rebol.com/
result: read port

result2: write port [get %file1.html]
result3: write port [get %file2.html]

result4: write port [post %target-file [Header: "Value"] {Request contents}]
close port

I'm not overlooking any efficiency. I'm just saying that if I am a port author, the concept is that I could theoretically implement only the explicit version of the port. The shorthand would then be available to users, because it is a feature of the port model that I get by following the rules of implementing ports.

And thus far, it's the only "feature" that I see involved. Everything else is something I could do more conveniently and clearly by making an OBJECT! with methods called "read" and "write" and "open"...or whatever methods I wanted.

My bad, I see what you mean.

Perhaps this is the case (effectively this is what you're doing anyway), however it does offer some consistency and access to native verbs.

for-each resource [
    %a-file.dat
    http://some.place/foo
    ftp://some.place/bar.r
][
    probe read resource
]

It also gives you a framework for building in related state, metadata and documentation. Also auto-breakdown of URLs.
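The URL breakdown mentioned here can be tried with decode-url. This sketch assumes that helper is present, as it is in R3-Alpha, and the exact shape of its result may vary between interpreters. The URL is the placeholder used in the loop above.

```rebol
; Assumes DECODE-URL exists, as in R3-Alpha. The URL is a placeholder.
parts: decode-url http://some.place/foo
probe parts   ; look for entries such as the scheme, host and path
```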

It also offers a best-practice model for implementing such things. There are a few examples (from memory) in Rebol 2 where someone has gone the alternate verb route or the object route. What you end up with is an interface that is less than easy or intuitive to use or maintain.

Er, isn't that because someone removed the lines processing in the scheme?

I guess it's up to you to find a place where it ever was for R3-Alpha. As far as I can tell it was never implemented.

It's difficult to implement in a generic way. In Ren-C, just because someone asked about it (I think), I added a bit of a hack in READ that if it sees you get a /LINES refinement and have a port produce a BINARY!, it converts it to text and then turns it into lines. Or if it's a TEXT! then it will break it into lines. But this is relatively inefficient when compared with the idea of a port that did the conversion to lines as it went.

My point is just about the very complex set of concerns. What qualifies /LINES as a refinement in the finite universe of choices that READ has? And if it qualified, how was it justified that it was skipped over?