The Robustness Principle Is Not Robust

Something floating around that I've pretty much always hated--but experience has made me hate more--is the "Robustness Principle":

  "Be conservative in what you do, be liberal in what you accept from others."

I understand the motivation. If you're importing a file into a vector graphics program and not all the Bézier curves have endpoints... then, sure. Someone is going to appreciate that you were forgiving and that they got some kind of maybe-a-bit-broken picture instead of an "Invalid File" error.

Pushing this further, for those of us who remember the DOS days: we had those experiences where we accidentally deleted files, but the filesystem didn't wipe the data--it just overwrote the first byte of the filename in the directory entry (or somesuch). Norton Utilities let us catch our mistake and get our file back. In some ways, a lax and forgiving attitude is a beautiful thing...

This so-called "robustness" cannot (and should not) be subconscious. It needs to be mediated by an in-your-face consciousness shift! (And I think Norton Utilities is an apt example, because if you were firing up that program, you knew you were in the Danger Zone.)

There must be a really jarring speedbump. When a data file doesn't fit the standard, you have to go through a procedure that transforms the broken file into a conforming one. You have to be aware that you are dealing with a dirty file. And the experience of that transformation informs you that the person who gave you the dirty file isn't playing with all their marbles, so you correct them before the next transmission.

Real programs should demand a standard form. And when they aren't getting the standard form, they should speak up, march you out the door, and force you to fix your input. A "Real program" should never pretend the byte sequence of bad input is okay--there should only be separate "Cleanup/Recovery programs" that fix the sequence and write a correct one. And that recovery program should have terminated before the "Real program" runs.
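As a minimal sketch of that two-program split (my own illustration, not anyone's shipping code, using UTF-8 validity as a stand-in for "the standard form"; realLoad and cleanupTool are hypothetical names):

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)
import System.Exit (die)

-- The "Real program": demands the standard form and refuses anything else.
realLoad :: FilePath -> IO T.Text
realLoad path = do
    bytes <- BS.readFile path
    case decodeUtf8' bytes of
        Left err   -> die ("Invalid UTF-8; run the cleanup tool first: " ++ show err)
        Right text -> pure text

-- The separate "Cleanup/Recovery program": knowingly repairs a dirty file,
-- writes a corrected copy, and then gets out of the way.
cleanupTool :: FilePath -> FilePath -> IO ()
cleanupTool dirtyPath cleanPath = do
    bytes <- BS.readFile dirtyPath
    let repaired = decodeUtf8With lenientDecode bytes   -- the conscious repair step
    BS.writeFile cleanPath (encodeUtf8 repaired)
```

The point is that the lenient decode only ever happens in the tool you ran on purpose, never silently inside the program doing the real work.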


I don't need to restate the critiques of the "robustness principle" cited on Wikipedia. But it is nonsense, and the opposite of robust. It's a garbage idea that anyone with a whit of sense about security can see right through immediately.

There is a better way: consciousness about what you are working with, and rejection of any sort of malignant "middleman" acting on your data without your knowledge.

  • "Normalization" you did not ask for is an attack on your information.

  • "Glossing over or fixing invalid sequences" you did not ask for is an attack on your information

  • Anything which makes a simple load of a file and save back of the same file not idempotent is an attack on your information
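That last point can be stated as a property. This is just an illustrative helper (the name roundTripsExactly and the decode/encode pair are placeholders for whatever load/save machinery you actually have):

```haskell
import qualified Data.ByteString as BS

-- A load/save pair is only trustworthy if saving what you just loaded
-- reproduces the exact same bytes.
roundTripsExactly :: (BS.ByteString -> Maybe a)   -- decode (load)
                  -> (a -> BS.ByteString)         -- encode (save)
                  -> BS.ByteString                -- original file contents
                  -> Bool
roundTripsExactly decode encode original =
    case decode original of
        Nothing  -> False                   -- refuse rather than "fix"
        Just doc -> encode doc == original  -- byte-for-byte identical
```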

Perhaps I've become even more religious than Carl in some of these ways. But I certainly see that things won't get better without pushing back.

I think you’d very much enjoy this article, if you haven’t seen it already: Parse, don’t validate.

(That being said, I think there’s still a place for a weak version of the Robustness Principle. ‘Be conservative in what you send’ is always a good guideline, of course. And it seems reasonable to accept some amount of small variation in input, when it’s unambiguous.)


Hadn't seen this...

It starts from the premise of "you can't write [a] -> a", which is where I had wanted to start my own Intro-to-Haskell essay: grasping how nothing is taken for granted (no default constructor, etc.), and how the type signature practically tells you what the function does.

Now I don't have to write my own less-informed version of that. :relieved:

(Although it does expect you to know what head :: [a] -> a means, and I'd thought of starting from no knowledge besides familiarity with some imperative language, for contrast.)
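To make that premise concrete (my own sketch, not code lifted from the article): you can't write a total head :: [a] -> a because of the empty list, and the "parse, don't validate" move is to convert the weaker type into a stronger one once, at the boundary, so the guarantee lives in the type from then on:

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- A total version of head: the type guarantees at least one element.
safeHead :: NonEmpty a -> a
safeHead (x :| _) = x

-- The "parse" step, done once at the boundary. (This mirrors
-- Data.List.NonEmpty.nonEmpty; the name here is just illustrative.)
parseNonEmpty :: [a] -> Maybe (NonEmpty a)
parseNonEmpty []       = Nothing
parseNonEmpty (x : xs) = Just (x :| xs)
```

After the parse step, no downstream code ever has to re-check for emptiness.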


I don't see anything to disagree with (a tiny sketch of the first point follows the list):

  • "Use a data structure that makes illegal states unrepresentable."

  • "Get your data into the most precise representation you need as quickly as you can. Ideally, this should happen at the boundary of your system, before any of the data is acted upon."

  • "Avoid denormalized representations of data, especially if it’s mutable."

  • "Keep denormalized representations of data behind abstraction boundaries."

This kind of thinking is what made me want TEXT! to work the way it does, enforcing valid UTF-8, and it's guiding my thinking on pushing that further to NFC normalization.
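In that spirit, here's a hedged sketch of what "pushing further to NFC" could mean at a boundary--checking rather than silently normalizing--assuming the unicode-transforms package's Data.Text.Normalize (requireNFC is a hypothetical name):

```haskell
import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (NFC), normalize)

-- Accept text only if it is already NFC-normalized; otherwise report it,
-- rather than quietly rewriting the caller's data behind their back.
requireNFC :: T.Text -> Either String T.Text
requireNFC t
    | normalize NFC t == t = Right t
    | otherwise            = Left "Input is not NFC-normalized; fix it upstream."
```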