Kaitai Struct Declarative Language for Binary Formats

Kaitai Struct is an interesting declarative language, written in YAML, designed to generate parser classes for binary formats. Many formats are already defined:

https://formats.kaitai.io/

That's a lot of cases...but as one instructive example, you can look at how it describes a ZIP file--with little snippets of code in it for extracting the values:

ZIP archive file format spec for Kaitai Struct

The C++ code generated is like this:

ZIP archive file: C++11/STL parsing library

But it can also be used to generate code for C#, JavaScript, Python, Ruby, Nim, PHP, Lua, Perl... (though we'd assume that if you escape into target-language code, that part will only work for that language)

The regimentation of YAML brings the typical repetition into the "dialect". This is the same as the Rebol complaint about JSON--not really leveraging "parts of speech", but repeating tags like id: and type: over and over:

  - id: version
    type: u2
  - id: flags
    type: gp_flags
    size: 2
  - id: compression_method
    type: u2
    enum: compression
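
For comparison, here is a rough sketch of what reading those three fields by hand might look like in Python. This is not what the Kaitai compiler emits; the names `parse_header_fields` and `Compression` are made up for illustration, the enum lists only two of the ZIP method codes, and the sample bytes are invented. It assumes the fields are little-endian, as they are in ZIP.

```python
import struct
from enum import IntEnum


class Compression(IntEnum):
    # Hypothetical, partial enum: only two well-known ZIP method codes.
    NONE = 0
    DEFLATED = 8


def parse_header_fields(buf, offset=0):
    """Read three consecutive little-endian u2 fields starting at offset."""
    version, flags, method = struct.unpack_from("<HHH", buf, offset)
    return {
        "version": version,
        "flags": flags,  # left as a raw integer; no bit-level breakdown here
        "compression_method": Compression(method),  # ValueError if unlisted
    }


# Invented sample bytes for the three fields.
sample = bytes([0x14, 0x00, 0x00, 0x08, 0x08, 0x00])
fields = parse_header_fields(sample)
```

Even in this tiny case the field names, widths, and byte order end up spread between a format string and a dictionary, which is the bookkeeping the Kaitai spec keeps in one place.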

But Kaitai is still pretty hard to compete with, especially when you consider that it gives you a compilable specification...so the performance is going to be much better.

I always thought BINARY! parse was something that Rebol would have a unique story for, and Ren-C's UPARSE makes that a stronger story (by allowing rules to synthesize arbitrary results via extraction)... but seeing this kind of stuff reminds me that there are diminishing returns.
