How I migrated from Posterous to Octopress while keeping page rank
–
Mike Bourgeous
When Posterous announced their pending closure (which takes place tomorrow,
Apr. 30, 2013), I immediately began searching for a new blogging platform. I
wanted a self-hosted setup on my own domain, so I would never again be subject
to another blog service’s site closure. I settled on Octopress (based on
Jekyll), due to its hacker-friendly command line nature, integration with
Git, ability to publish to Amazon S3, and easily customizable default
theme.
Migration from Posterous required three major steps:
Exporting posts from Posterous
Importing posts into Octopress
Setting up redirection from Posterous to my new blog
Follow along below to back up your own Posterous blog.
Export posts from Posterous
This part was quite straightforward. I used the Backup tool in Posterous’s
control panel to create a .zip archive of my blog’s data. There’s a delay
of a few minutes to hours between requesting a backup and the backup becoming
available, so check back periodically. Although Posterous shuts down tomorrow,
it’s not too late to request a backup of your old Posterous space. According to
the Posterous shutdown announcement, the backup feature will be available
until May 31, 2013. If you haven’t done so already, back up your old space
ASAP.
Once you have your backup, a bit of postprocessing is needed, because the XML
that Posterous generates in wordpress_export_1.xml appears to strip every
character outside of 7-bit ASCII. Even simple things like em and en dashes
might be replaced with “???”.
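To get a feel for how much of an export is affected, a quick heuristic scan can help. The sketch below is only an illustration and not part of the original tooling: the `suspect_lines` helper is a hypothetical name, and treating any run of two or more question marks as damage is an assumption that will also flag text where someone really did type "???".

```ruby
# Heuristic: a run of two or more question marks often marks a spot where
# a non-ASCII character was lost. This is an assumption, not documented
# Posterous behavior, so expect some false positives.
SUSPECT_RUN = /\?{2,}/

# Returns [line_number, stripped_line] pairs for every line in +text+
# that contains a suspicious run of question marks.
def suspect_lines(text)
  text.each_line.with_index(1)
      .select { |line, _number| line.match?(SUSPECT_RUN) }
      .map { |line, number| [number, line.strip] }
end
```

Reading the file with something like `suspect_lines(File.read(path, encoding: "UTF-8").scrub)` and printing each pair gives a list of places to check by hand; `scrub` replaces invalid bytes so the regular expression does not raise on a damaged file.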
Import posts into Octopress
Importing, editing, and fixing links within dozens of posts by hand seemed
rather tedious and boring, so I wrote a script in Ruby to do the job for me.
This script loads the fixed-up RSS feed generated in the previous step, then
generates files under source/_posts. Images within posts are downloaded into
post-specific directories, organized hierarchically by date. Encoded versions
of videos are downloaded from Posterous (this is likely to stop working on Apr.
30!). Links to other posts on the same blog will be adjusted to point to their
new location on the new blog.
To use the script, save it as posterous_import.rb in your Octopress or Jekyll
blog’s base directory, then run the script with the path to fixed_export.xml
generated in the last step:
cd /path/to/new/blog
./posterous_import.rb /path/to/space-[numbers, name, etc.]/fixed_export.xml
Complete usage information is included in the script.
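For readers curious what the file-generation step involves, here is a minimal sketch of naming a post file and writing its front matter. It is not an excerpt from posterous_import.rb. The helper names (`slugify`, `post_path`, `front_matter`), the .markdown extension, and the chosen front-matter keys are assumptions based on the usual Jekyll and Octopress conventions.

```ruby
require "yaml"

# Turn a title into a URL slug: lowercase, with every run of
# non-alphanumeric characters collapsed to a single "-".
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

# Path of the post file under source/_posts, using Jekyll's
# YYYY-MM-DD-slug naming convention (extension is an assumption).
def post_path(title, published_at)
  date = published_at.strftime("%Y-%m-%d")
  File.join("source", "_posts", "#{date}-#{slugify(title)}.markdown")
end

# YAML front matter block for the post. to_yaml emits the opening
# "---" line; the closing one is appended here.
def front_matter(title, published_at)
  {
    "layout"   => "post",
    "title"    => title,
    "date"     => published_at.strftime("%Y-%m-%d %H:%M"),
    "comments" => true
  }.to_yaml + "---\n"
end
```

Keeping the date in the filename matters because Octopress's default permalinks are built from it, which is also why the old top-level Posterous URLs need redirectors later on.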
Set up redirection from Posterous
Posterous doesn’t explicitly support redirection to a new blog. It does,
however, allow one to use a custom domain with a Posterous blog. Once you set
up a custom domain with Posterous, all of your old XYZ.posterous.com URLs will
redirect to the custom domain, which is expected to point to Posterous’s own
servers. We can cleverly exploit this behavior to try to transfer search engine
rankings to our new custom blog. Unfortunately there may not be enough time
left for Google to crawl your old blog and find the redirections before
Posterous shuts down.
Set up permalink redirectors
Since Octopress defaults to permalinks with dates, it’s necessary to redirect
the top-level Posterous shortlinks to the correct location on the new blog. I
added an option to posterous_import.rb to do just this (--links must be
the first argument to the script):
cd /path/to/new/blog
./posterous_import.rb --links /path/to/space-[numbers, name, etc.]/fixed_export.xml
This will create a directory under source/ for each post in the Posterous
backup. Within each of those directories an index.html file will be generated
that contains a redirection to the post’s new location.
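As an illustration of what such a redirector might look like, the sketch below writes a stub index.html that uses a meta refresh plus a canonical link. This is an assumption about one reasonable way to do it, not a description of what posterous_import.rb actually emits; `redirect_page` and `write_redirect` are hypothetical names.

```ruby
require "cgi"
require "fileutils"

# HTML for a stub page that sends browsers to +target+ immediately and
# tells crawlers which URL is the canonical one.
def redirect_page(target)
  url = CGI.escapeHTML(target)
  <<~HTML
    <!DOCTYPE html>
    <html>
    <head>
    <meta charset="utf-8">
    <link rel="canonical" href="#{url}">
    <meta http-equiv="refresh" content="0; url=#{url}">
    <title>Moved</title>
    </head>
    <body><p>This post has moved to <a href="#{url}">#{url}</a>.</p></body>
    </html>
  HTML
end

# Creates <source_dir>/<slug>/index.html containing the redirect stub
# and returns the path that was written.
def write_redirect(source_dir, slug, target)
  dir = File.join(source_dir, slug)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "index.html")
  File.write(path, redirect_page(target))
  path
end
```

A static stub like this works on any host that serves index.html for a directory URL, which is useful on S3 where there is no server-side rewrite engine to lean on.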
Redirect rss.xml to atom.xml
I recommend setting up a 301 redirection from /rss.xml to /atom.xml. Octopress
generates /atom.xml, while Posterous used /rss.xml. The file formats may not be
the same, but most, if not all, feed readers can handle all the major feed
formats.
This will allow subscribed RSS readers to find your new blog’s feed (until
Posterous stops redirecting tomorrow). I used the S3 management console to
create the redirection for my blog; your own hosting solution will have its own
method.
Point Posterous at the custom domain
Here’s where the magic happens. In Posterous’s control panel, click Spaces at
the top, click the gear icon next to your blog’s entry under Your Spaces, then
click Space Settings in the popup menu. The first entry on the
settings page is “Name Your Space”. Click the big Edit button to the right of
your blog’s name and URL. At the bottom of the new page, there is a “Custom
Domains” section. Here you can tell Posterous your blog is hosted at the domain
of your choosing.
Thankfully, the Posterous engineers decided not to verify that the domain we
enter points to Posterous. Instead of entering a domain that points back to
Posterous, we’ll enter the domain that points to our new blog. At this point,
all requests to XYZ.posterous.com/url will be redirected to our new blog.
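A quick way to confirm the redirect is in place is to request an old URL without following redirects and look at the status code and Location header. The sketch below is one way to do that with Ruby's standard library. The helper names are hypothetical, and expecting a 301 specifically is an assumption based on the 301 results mentioned elsewhere in this post.

```ruby
require "net/http"
require "uri"

# Issue a HEAD request without following redirects.
# Returns [status_code, location_header_or_nil].
def fetch_redirect(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.request_uri)
  end
  [response.code.to_i, response["location"]]
end

# True when the response is a permanent redirect whose Location
# header points at the expected host.
def permanent_redirect_to?(status, location, expected_host)
  return false unless status == 301 && location
  URI(location).host == expected_host
end
```

For example, `permanent_redirect_to?(*fetch_redirect("http://XYZ.posterous.com/some-post"), "blog.example.com")` should return true once the custom domain is set, where the post path and blog.example.com are placeholders for your own values.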
Verify it works
If you’re just completing this process now, it’s likely that Google and other
search engines won’t crawl the redirections in time. However, since I migrated
my blog several weeks ago, I can check to see whether my new blog has replaced
the old one in search results. I’m hoping that Googlebot keeps honoring the 301
redirects it has already crawled after Posterous shuts down, for the sake of all
those links on news sites to my old blog.
My old blog used to rank on the first page of results for some searches related
to home automation. Sure enough, when I try these queries now, the new blog
shows up instead, and my new blog’s traffic has risen to match the old blog’s
levels. At this point the new blog is running smoothly.