Migrating from WordPress to Django
This is the system I devised to move my blog posts from WordPress to a custom Django application. It looks a bit intricate, but it did not take more than an evening of coding.
Start from the obvious: WordPress exported XML
The native WordPress export gives you a custom XML file that mixes your post data with a lot of information you do not necessarily want to keep, because it is needed only by WordPress internals (for example, post ids).
Instead of writing one program that parses the XML and creates content for the new system directly, I thought it would be simpler, and more convenient for future use, to have two: a first program that outputs a subset of the XML data as JSON, and a second program that interfaces with the Django database API and loads the required data from the JSON.
This way you obtain a clean JSON file, without WordPress-specific cruft, which can also work as a sort of intermediate backup.
Python standard lib all the way
I decided the Python standard lib had everything I needed: etree for XML and the json module. etree is quite an interesting API, though like most XML APIs it seems geared towards extracting a simple piece of information from the document, not transforming the whole document into something else. I think SAX is still the best for that, once you get the hang of it.
Anyway, this time I had conveniently reduced the task to what the XML library was actually good at, so I only had to target the tag list, the post title and the post slug, create a Python dictionary for each post with this data, and dump that to JSON. The code was probably not very efficient, but there was not a lot of data either.
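To give an idea of the first program, here is a minimal sketch, not my original code. It assumes the usual WordPress export layout (one item element per entry, with wp:post_name holding the slug and category elements marked domain="post_tag" holding the tags). The namespace URI, the function names and the choice to key the output by slug are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Assumed namespace URI: check the xmlns:wp attribute at the top of your
# own export file, since it changes between export format versions.
WP_NS = {"wp": "http://wordpress.org/export/1.2/"}


def extract_posts(root):
    """Return {slug: {"title": ..., "tags": [...]}} for every post item."""
    posts = {}
    for item in root.iter("item"):
        # Skip pages, attachments and other non-post entries.
        post_type = item.findtext("wp:post_type", default="", namespaces=WP_NS)
        if post_type != "post":
            continue
        slug = item.findtext("wp:post_name", default="", namespaces=WP_NS)
        # Keep only real tags, not categories.
        tags = [
            cat.text
            for cat in item.findall("category")
            if cat.get("domain") == "post_tag" and cat.text
        ]
        posts[slug] = {"title": item.findtext("title", default=""), "tags": tags}
    return posts


def dump_posts(xml_path, json_path):
    """Parse the export file and write the reduced data as JSON."""
    root = ET.parse(xml_path).getroot()
    with open(json_path, "w", encoding="utf-8") as target:
        json.dump(extract_posts(root), target, indent=2, ensure_ascii=False)
```

Calling dump_posts("export.xml", "posts.json") (both file names are placeholders) would leave a JSON file containing nothing but slugs, titles and tags.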
Then it is only a matter of turning the JSON back into Python with json.load, running through the dictionary it returns, and saving objects through the Django database API.
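The loading side could look roughly like the sketch below. It assumes the JSON file is an object keyed by slug, with a title and a list of tags for each post. The blog app and the Post and Tag models (Post having a slug, a title and a many-to-many tags field) are hypothetical stand-ins for whatever your Django application defines. The saver is passed in as a function, so the loop can be tried without touching the database.

```python
import json


def load_posts(json_path):
    """Read the intermediate JSON file back into a Python dictionary."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


def import_posts(posts, save_post):
    """Call save_post(slug, title, tags) for each post; return the count."""
    count = 0
    for slug, post in posts.items():
        save_post(slug, post["title"], post["tags"])
        count += 1
    return count


def save_with_django(slug, title, tags):
    """Hypothetical saver: assumes a 'blog' app with Post and Tag models."""
    # Imported here so the rest of the module works without Django set up.
    from blog.models import Post, Tag  # assumed app and model names

    # get_or_create makes it safe to run the import more than once.
    post, _created = Post.objects.get_or_create(slug=slug, defaults={"title": title})
    for name in tags:
        tag, _created = Tag.objects.get_or_create(name=name)
        post.tags.add(tag)
```

From a Django shell or a management command, where settings are already configured, import_posts(load_posts("posts.json"), save_with_django) would run the whole import. Passing a function that just prints its arguments instead gives you a dry run.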