The biggest blog software mistake is…

…storing the main content and comments the same way in a relational database.

I have been secretly bemoaning the transition from static sites such as the Linux Documentation Project to blogs. Static sites were very effective, because you could shape the text any way needed, without relying on a complex content type system such as Drupal’s. You just wrote in the way that best suited your content each time.

In blogs, you are constrained by the post format. Some try to build structure by breaking up a tutorial, for instance, over a series of posts, but this makes navigation annoying.

On the other hand, the great strength of blogs is comments: they are great at adding insights, corrections or suggestions, harnessing collective intelligence and the power of communities, yada, yada, yada (it has been repeated to death, but it’s still true). So we should really have the best of both; a model could be the MySQL documentation.

Luckily, the coupling between posts and comments is not that tight: a simple many-to-one relationship from a bunch of comments to the post they refer to, and that’s it. Relationships between posts, such as categories or tags, have no bearing whatsoever on comments, and posts need not have access to comments in any way.
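To make that loose coupling concrete, here is a minimal sketch in Python. The names (`Post`, `Comment`, `post_slug`, `comments_for`) are made up for illustration; the point is only that a comment carries a key to its post, while a post knows nothing about comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    slug: str  # hypothetical identifier; any stable key would do
    body: str  # free-form content; note there is no field for comments


@dataclass(frozen=True)
class Comment:
    post_slug: str  # the only link between the two collections
    author: str
    text: str


def comments_for(slug: str, comments: list[Comment]) -> list[Comment]:
    """Return the comments that point at the given post, in stored order."""
    return [c for c in comments if c.post_slug == slug]


posts = [Post("intro", "First tutorial page"), Post("setup", "Second page")]
comments = [
    Comment("intro", "ann", "Typo in step 2"),
    Comment("setup", "bob", "Works on Debian too"),
    Comment("intro", "cy", "Thanks!"),
]
```

Tags or categories could be added to `Post` without touching `Comment` at all, which is the whole point.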

In addition, the performance characteristics for posts and comments need not be similar. Posts rarely change, while if there are enough visitors, comments will be written much more frequently. Posts can do with relatively slow writes if that helps speed up the reads. The comment system is the only one needing relatively efficient writes.

You can think of this situation as two parallel collections: a collection of comment documents, and a collection of post documents. In theory, the connection between the two could simply be made at the template level. That’s exactly how services such as Disqus or IntenseDebate are implemented, by inserting a JavaScript tag in your templates.
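A server-side version of the same idea might look like the sketch below. It is only an illustration under a few assumptions: `render_page`, `load_post` and `load_comments` are hypothetical names, the post is assumed to be already-rendered HTML, and comments are assumed to be `(author, text)` pairs. The two collections are looked up independently by the same key and only meet inside the template.

```python
from html import escape
from string import Template

# The template is the only place where posts and comments meet.
PAGE = Template(
    '<article>$post</article>\n<section id="comments">$comments</section>'
)


def render_page(slug, load_post, load_comments):
    """Join a post and its comments at render time.

    load_post(slug) returns trusted, already-rendered HTML.
    load_comments(slug) returns an iterable of (author, text) pairs,
    which are untrusted and therefore escaped here.
    """
    post_html = load_post(slug)
    items = "".join(
        f"<p><b>{escape(author)}</b>: {escape(text)}</p>"
        for author, text in load_comments(slug)
    )
    return PAGE.substitute(post=post_html, comments=items)
```

Because the two loaders are passed in as plain callables, either one can be swapped for a different backend without the other noticing.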

The big takeaway is that you can use completely different storage systems for comments and the main content. If you store the posts in an unstructured datastore (be it NoSQL or simply the filesystem), this frees you from having to worry about content types and defining fields each time you want to structure your content differently. Comments can be stored in whatever system you prefer.
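As a rough sketch of what such a split could look like, the Python below keeps posts as plain files and comments in SQLite. This is one possible pairing, not a recommendation: the class names, the `.txt` extension and the table layout are all assumptions made for the example, and slugs are assumed to come from a trusted source since they are used directly in file names.

```python
import sqlite3
from pathlib import Path


class FilePostStore:
    """Posts as plain files: one file per post, shaped however you like."""

    def __init__(self, root: Path):
        self.root = root

    def save(self, slug: str, body: str) -> None:
        # Slugs are assumed trusted; a real system would validate them.
        (self.root / f"{slug}.txt").write_text(body, encoding="utf-8")

    def load(self, slug: str) -> str:
        return (self.root / f"{slug}.txt").read_text(encoding="utf-8")


class SqliteCommentStore:
    """Comments in a small table, indexed by the post they refer to."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS comments ("
            "id INTEGER PRIMARY KEY, post_slug TEXT NOT NULL, "
            "author TEXT NOT NULL, body TEXT NOT NULL)"
        )
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS by_post ON comments (post_slug)"
        )

    def add(self, post_slug: str, author: str, body: str) -> None:
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO comments (post_slug, author, body) "
                "VALUES (?, ?, ?)",
                (post_slug, author, body),
            )

    def for_post(self, post_slug: str) -> list:
        rows = self.db.execute(
            "SELECT author, body FROM comments "
            "WHERE post_slug = ? ORDER BY id",
            (post_slug,),
        )
        return rows.fetchall()
```

Neither class imports or references the other; the slug string is the only thing they share.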

I guess that’s all at the moment. I just may build a system to put these ideas into practice, once I have looked into the various unstructured storage solutions.