`tac` : Implementation of UNIX's line reverser

I mentioned I wanted to study some basic utilities...even more basic than grep. Small codebases can put a focus on big design points.

Here was my off-the-cuff line-reverser (TAC ... reverse CAT, where CAT is like Windows Command Shell's TYPE). I wrote it a year ago, when I was focused solely on getting READ-LINE to work cross-platform from piped input...so almost no thought went into it:

; %tac.r, version 0.0.1
;
; * COLLECT the lines
; * REVERSE the collection
; * output the result DELIMIT-ed with newline
;
write-stdout try delimit/tail newline reverse collect [
    until [not keep line: try read-line]
]

Right off the bat, you can see it's using twice the memory it needs to. It's collecting a block of strings, and then while that whole block is in memory it merges it into one giant string before output. At minimum, this should loop and write the strings out of the block one at a time. (Though doing it this way does draw attention to an interesting point about DELIMIT, which I'll get to later.)
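For contrast, here's a quick Python sketch (an analogue, not Ren-C) of the "write one at a time" shape: the lines still have to be buffered, but no second giant merged string is ever built:

```python
import io

def tac_lines(src, dst):
    # The whole input must be buffered (a non-seekable pipe leaves no
    # choice), but each line is then written out individually -- there's
    # never a second copy of everything joined into one big string.
    lines = src.readlines()          # each element keeps its trailing "\n"
    for line in reversed(lines):
        dst.write(line)

tac_lines(io.StringIO("a\nb\nc\n"), out := io.StringIO())
# out.getvalue() == "c\nb\na\n"
```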

Note: This line-reversing task is one of those pathological cases that can't be done in a "streaming" way. You can't start writing anything to the output until you've read the input to the end. (Doing better needs a random-access I/O PORT! that can SEEK the end of the file and go backwards...but the standard input device can't do this.)
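To make the random-access idea concrete, here's a hedged Python sketch (purely illustrative, nothing from the post's codebase) that seeks to the end of a seekable binary stream and scans backwards in fixed-size chunks, so memory use is bounded by the chunk size plus one line:

```python
import io

def tac_seek(f, chunk_size=4096):
    """Yield the lines of a seekable binary stream last-line-first by
    scanning backwards from the end (assumes "\n" line endings)."""
    size = f.seek(0, io.SEEK_END)
    if size == 0:
        return                        # 0-byte input: no lines at all
    pos = size
    f.seek(pos - 1)
    if f.read(1) == b"\n":
        pos -= 1                      # ignore the final newline
    tail = b""                        # partial line carried across chunks
    while pos > 0:
        step = min(chunk_size, pos)
        pos -= step
        f.seek(pos)
        parts = (f.read(step) + tail).split(b"\n")
        tail = parts[0]               # may still be incomplete; carry it back
        for line in reversed(parts[1:]):
            yield line
    yield tail

list(tac_seek(io.BytesIO(b"one\ntwo\nthree\n")))
# == [b"three", b"two", b"one"]
```

This is exactly the trick that's unavailable on standard input, which is why the piped case has to buffer everything.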

But First Things First: The Impact of the New TRY Rules

Recall that the old behavior was that TRY would turn NULL into BLANK! in order to stop functions from raising an error on those NULLs. So here:

  • we'd stop KEEP from failing when there are no more lines to READ-LINE from, so it acts as a no-op instead

  • we'd stop WRITE-STDOUT from failing when an empty block is collected and DELIMIT returns NULL:

    write-stdout try delimit/tail newline reverse collect [  ; TRY DELIMIT => BLANK!
      until [not keep line: try read-line]  ; TRY READ-LINE => BLANK!
    ]
    

The new way :sparkles: is that you go ahead and pass the NULL, and the function raises a very narrow error. That error is caught by a TRY on the function call itself.

So applying that transformation:

try write-stdout delimit/tail newline reverse collect [  ; TRY WRITE-STDOUT => NULL
    until [not try keep line: read-line]  ; TRY KEEP => NULL
]

Why Does DELIMIT Ever Return NULL?

One might ask whether it should ever be able to return NULL when you use /TAIL. At the moment, it can:

>> delimit/tail "," ["a" "b"]
== "a,b,"

>> delimit []
; null

>> delimit/tail "," []
; null

Maybe that last one should be "," ? Perhaps when you have /HEAD or /TAIL, you never get null back.

But... let's stick to looking at the use cases.

What Does %tac.r Want From DELIMIT/TAIL Here?

If we look at the edge case here, there is a difference between these two situations:

  • The first call to READ-LINE returns an empty string, and the second call returns NULL

    • This happens when you pipe in a 1-byte file containing a single line feed, i.e. a file containing one line that's empty.

    • With the code above, COLLECT produces the block [""] for this case

  • The first call to READ-LINE returns NULL

    • This happens when you pipe in a 0-byte file, i.e. a file containing no lines at all

    • With the code above, COLLECT produces the block [] for this case.
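The same two edge cases can be checked in Python (just an analogue of what COLLECT ends up with, not Ren-C):

```python
# A 1-byte file holding just "\n" has exactly one line, which is empty;
# a 0-byte file has no lines at all:
assert b"\n".splitlines() == [b""]    # the [""] case
assert b"".splitlines() == []         # the [] case

# Writing the reversed lines back out (newline after each) round-trips both:
assert b"".join(l + b"\n" for l in reversed([b""])) == b"\n"   # 1-byte file
assert b"".join(l + b"\n" for l in reversed([])) == b""        # nothing written
```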

So perhaps you see why DELIMIT chooses to react with some kind of signal when the block contents vaporize. It's precisely because cases like this tend to need some kind of special handling, and it's not good to gloss over that.

In this case, the empty block (which corresponds to the 0-byte file input, i.e. 0 lines) should result in there being no write to the output. So the default behavior of TRY WRITE-STDOUT NULL is the right answer.

More to Study, I Just Thought That Bit Was Interesting...


"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult." — C.A.R. Hoare
