Kevin Quillen

Laughing. Coding. Rocking. Ranting.

Goodbye Wordpress, Hello Octopress


Two nights ago, I was reading some developer blogs when I stumbled across one that was put together really well. When I investigated further, I discovered it was running on a platform called Octopress. I was immediately intrigued.

Not only have I wanted to do something new with my site for quite a while, but I made it a goal of mine to convert it to Drupal this year. Well, I could do it in Drupal. Easily. But there would be two problems with that:

  1. Drupal is a bit heavy for just blogging purposes.
  2. I wouldn’t learn a thing.

I read up on Octopress, liked what I saw, and started migrating my Wordpress blog to it.

I only needed a handful of things to do the migration:

  1. chitsaou’s branch of exitwp
  2. Ruby 1.9.2
  3. coffee
  4. Mad Men in the background, on Netflix

There were a couple of hurdles along the way, though.

Malformed Start Tag

The first issue with exitwp was that the BeautifulSoup parser was breaking pretty easily. I found two suggestions: one was to try BeautifulSoup version 3.0.7a, and the other was to add debugging to the html2text.py script so I could see where it was breaking:

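html2text.py around line 289

def feed(self, data):
    data = data.replace("</' + 'script>", "</ignore>")
    # Original call, now wrapped so the failing input gets printed:
    # HTMLParser.HTMLParser.feed(self, data)
    try:
        HTMLParser.HTMLParser.feed(self, data)
    except Exception:
        # Python 2 print statement, matching the html2text.py of the time.
        print 'malformed data: %r' % data
        raise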

With the above in place, the parser will at least show you where it stopped.

What I found was that it stopped on a couple of posts that contained HTML in the content. In some posts the HTML was broken, presumably as a result of WYSIWYG generation. Other posts only had a break tag or an anchor in them that looked perfectly fine. Once I removed those, along with the newlines, the post would parse. However, I didn’t want to just remove all markup.

So the next big decision I made was to import only the best posts from January 2009 onward, and do the cleanup from there. Starting at that point, the parse errors were far, far fewer. I assume this is due to the Wordpress WYSIWYG getting better over time (from 1.x to 3.x). Ah well.
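
For anyone who wants to automate that cutoff, here is a minimal sketch in Python 3. It assumes the standard Wordpress export format, where each post is an `<item>` element with a `<pubDate>`. The function name `recent_items` and the `CUTOFF` constant are made up for illustration, and this is not how exitwp works internally.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical cutoff matching "January 2009 onward".
CUTOFF = datetime(2009, 1, 1, tzinfo=timezone.utc)


def recent_items(wxr_xml, cutoff=CUTOFF):
    """Return (title, published) pairs for items dated on or after cutoff."""
    root = ET.fromstring(wxr_xml)
    kept = []
    for item in root.iter("item"):
        raw_date = item.findtext("pubDate")
        if not raw_date:
            continue  # skip items without a publication date
        try:
            published = parsedate_to_datetime(raw_date)
        except (TypeError, ValueError):
            continue  # skip dates that are not valid RFC 2822
        if published.tzinfo is None:
            # Treat dates without a usable offset as UTC (an assumption).
            published = published.replace(tzinfo=timezone.utc)
        if published >= cutoff:
            kept.append((item.findtext("title", default=""), published))
    return kept
```

Feeding it the contents of the export file gives a list of the posts worth converting; picking the "best" ones is still a manual job.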

Comments

Octopress comes with Disqus support out of the box, the de facto go-to platform for comments. We’ve been using Disqus since about 2009, and they’ve really done a lot of good stuff. It takes the burden of commenting off your hands, plus they have an excellent import/export platform and support for most major blogging platforms and CMSs.

I used their custom Wordpress plugin to export the comments from my Wordpress site, and imported them into Disqus. This process took roughly 24 hours to complete, but as I understand it there was a large backlog of import jobs to be processed. Normally this does not take very long at all.

After that, I had to create a CSV mapping the old URLs to the new ones. With it, Disqus updates existing comment threads to point to the new URLs.
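
Building that CSV by hand gets old quickly, so here is a rough Python 3 sketch of how it could be generated. The hostname and both permalink patterns are assumptions for illustration (a year/month/slug layout on the Wordpress side and a /blog/year/month/day/slug/ layout on the Octopress side), and the function names are hypothetical. Adjust them to whatever your two sites actually use.

```python
import csv
import io

# Hypothetical hostname; not taken from the post.
SITE = "http://example.com"


def old_url(year, month, slug):
    # Assumed Wordpress permalink structure: /YYYY/MM/slug/
    return "%s/%04d/%02d/%s/" % (SITE, year, month, slug)


def new_url(year, month, day, slug):
    # Assumed Octopress permalink structure: /blog/YYYY/MM/DD/slug/
    return "%s/blog/%04d/%02d/%02d/%s/" % (SITE, year, month, day, slug)


def url_map_csv(posts):
    """Build CSV text with one 'old URL, new URL' row per post.

    posts is an iterable of (year, month, day, slug) tuples.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for year, month, day, slug in posts:
        writer.writerow([old_url(year, month, slug),
                         new_url(year, month, day, slug)])
    return buffer.getvalue()
```

Write the returned text to a file and upload it through the Disqus URL mapping tool.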

Redirects

Since Octopress uses different URLs than the ones I had in Wordpress, I needed to set up 301 redirects for search engines. Nothing out of the ordinary here. I don’t mind the ranking impact; after all, this is just a small blog.
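
As a sketch of what generating those redirects could look like, the Python 3 snippet below turns pairs of old and new URLs into `Redirect 301` lines. It assumes Apache with mod_alias, which is only one option among several web servers, and the function name `redirect_rules` is hypothetical.

```python
from urllib.parse import urlsplit


def redirect_rules(url_pairs):
    """Turn (old_url, new_url) pairs into Apache 'Redirect 301' lines."""
    rules = []
    for old, new in url_pairs:
        old_path = urlsplit(old).path
        if old_path == urlsplit(new).path:
            continue  # path did not change, so nothing to redirect
        # mod_alias syntax: Redirect <status> <old URL-path> <new URL>
        rules.append("Redirect 301 %s %s" % (old_path, new))
    return rules
```

The resulting lines can be pasted into a virtual host or .htaccess file. Keep in mind that `Redirect` matches by path prefix, so anything under the old path is redirected too.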

Hosting

With my Wordpress blog hosted on our dedicated server, I already had a spot to host. Octopress has built-in support for deploying with rsync. Setting the SSH user and document root in the Rakefile makes this a breeze.

End Result

All in all, a little bit of work and a decent learning experience and here we are. There were a couple of huge benefits for me to switch:

No database.

Since Octopress bakes the blog, the result is all static files, which means a much smaller footprint on the server and much faster loading times. No database also means no chance of SQL injection, no database server to fail, and no database maintenance.

No security patches.

I know I tend to exaggerate but seriously, Wordpress seems to have security patches or upgrades at least 3 times a month. While the process of updating the codebase has gotten easier since 1.x, it’s still a pain in the ass to keep up with. There have been a couple of times that letting Wordpress updates slack bit me in the ass, despite only using a couple of plugins.

Speed.

Static files = lightyears faster than waiting for a database response.

Out of the box support.

All I have to do is alter the _config.yml file and I have immediate support for Google Analytics, Twitter, Facebook, Disqus, and Google+. No plugins needed.

Opportunity to learn.

Doing all of this gave me a good opportunity to learn some new things, since it’s all right in front of me now.

  1. Python
  2. Ruby
  3. Markdown format
  4. Sass

A good change of pace from the norm. I really like Octopress.
