Links to make you go "Aaah!"

This is a set of fairly random links that I've noticed over the past few months. They don't really have anything in common with each other, except perhaps as catalysts to get me thinking about the whole software development process.

Ever since I discovered that making mistakes as a chess professional meant that I didn't eat that night, I've been fascinated by how and why systems fail. In fact, my book has whole chunks devoted to this subject, especially when it comes to debugging distributed systems. Here are some of the articles on this subject that have piqued my interest recently.

  • Learning from accidents: Dan Bricklin, co-creator of VisiCalc, has studied some major accidents as a way of learning how to deal with situations that stress a system. This includes a fascinating introduction to the unlikely chain of events that led to the Three Mile Island nuclear accident. One of his conclusions is that you don't just want to prevent an exact duplicate of a failure in the future; you want to prevent the entire class of failures that it represents.
  • SQL Server error handling, part 1: A deep investigation into the quirks of SQL Server error handling. After you've read this, you'll never do SQL error handling in the same way again.
  • SQL Server error handling, part 2: A follow-up to the article above that goes even deeper into SQL Server error handling.

What can happen when you stray onto that thar intraweb thingie without protection?