ZIP dialect - How to specify alternate names or structure?

hostilefork · June 3, 2019, 7:46pm

The ZIP dialect documentation outlines the following options, where a FILE! signifies the source of data you want to compress. But it doesn't seem to have a way to specify an alternative naming or structure to your files. Whatever you are compressing has to already be laid out with the names and directories that you want the archive to compress as:

    you can zip a single file:
        zip %new-zip.zip %my-file

    a block of files:
        zip %new-zip.zip [%file-1.txt %file-2.exe]

    a block of data (binary!/text!) and files:
        zip %new-zip.zip [%my-file "my data"]

    a entire directory:
        zip/deep %new-zip.zip %my-directory/

    from a url:
        zip %new-zip.zip ftp://192.168.1.10/my-file.txt

    any combination of these:
        zip/deep %new-zip.zip  [
            %readme.txt "An example"
            ftp://192.168.1.10/my-file.txt
            %my-directory
        ]

(I'll also note that the /DEEP option seems it should apply to individual directories; you may want one directory to be zipped deeply and not others.)

One way of working around the naming problem is to read the data yourself; which has a downside of making you read all the files you're working with into memory at the start of the zip (vs letting it read them one at a time as it goes along, possibly even doing streaming compression):

 zip compose [%new-name.dat (read %old-name.dat)]

Something seems a little tenuous to me about the pattern %filename "data" meaning the file is just a name, while %filename means the file is actually read. You could use another datatype like ISSUE! for naming, which could perhaps be seen as like a "label":

 zip [#new-name.dat %old-name.dat]

But that would be awkward on the unzip side, where FILE! is a pretty clean label there:

    you can uncompress to a directory (created if it does not exist):
        unzip %my-new-dir %my-zip-file.zip

    or a block:
        unzip my-block %my-zip-file.zip

        my-block == [%file-1.txt #{...} %file-2.exe #{...}]

I think it should be the case that if you zip something up, then unzip it to a block...you should be able to use the unzipped block to re-generate the same archive again with zip. That seems a pretty good design goal.

So perhaps FILE! is always a label, and @ could be used for specifying contents other than the file itself?

data: #{AEBDAE}
zip [%new-name.dat @(%old.dat) %ae.dat @data %README.md @({Hello}) %as-is.txt]

Athough plain GROUP! would be cleaner...

data: #{AEBDAE}
zip [%new-name.dat (%old.dat) %ae.dat (data) %README.md ({Hello}) %as-is.txt]

That could get confusing if using COMPOSE to put the file names in non-literally, however. If not for the limitations of ANY-WORD!, it seems this would be better as SET-WORD!

data: #{AEBDAE}
zip [new-name.dat: %old.dat ae.dat: data README.md: {Hello} as-is.txt: auto]

Anyway, the point is that today the only way to do it is if you put the filename and then COMPOSE in. I'm going to do that for the problem I'm looking at right now, but wanted to raise the question of how this might improve.

hostilefork · June 4, 2019, 5:27am

I noticed another things about the current state of the ZIP dialect, which is that if you use a full path it actually stores the path. For a single file you can again work around that with the compose [%new-name.dat (read %some/location/you/want/to/forget.dat)] strategy. But if you want ZIP to be recursing deeply into a directory, then that's not an option.

The easiest thing to do is to go ahead and organize a directory structure how you want it, and then say "zip that whole directory, as is". Though that's inefficient. But perhaps there's a way to virtually create an organization through imperative steps vs. having a specification which declaratively lays out the entire mapping...

In any case, there were some bugs in the ZIP as it was which are now fixed. But it does seem to raise a number of questions. Maybe there's some other declarative ZIP tool to look at for inspiration.