When migrating, say, an old WordPress blog to another platform such as Ghost, you may worry about breaking existing links to your content on the old site. While this can also affect search engine optimization, perhaps the most irritating side effect is your audience hitting 404 errors whenever they follow bookmarks to your site.
While redirecting the whole domain is simple enough (there is a Redirects option in cPanel for exactly that), there could still be folks with links to your RSS feed in their feed readers who wouldn't get the update and would be quite lost.
Fortunately, we can address this by writing a few lines in an .htaccess file to dynamically redirect to specific locations based on where a person entered the old site.
What is .htaccess?
We run Apache on Reclaim Hosting. When an account is provisioned, the server creates a directive telling Apache what your domain is and where the files for that domain live on the server, along with a handful of other things. This is how multiple sites can be hosted from a single server: Apache reads the URL, looks up the matching domain in its records, and serves the contents of the right folder.
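For illustration, a per-domain directive like the one described above usually takes the form of a virtual host block in Apache's main configuration. This is only a generic sketch, not Reclaim Hosting's actual configuration; the account name and folder path are made up for the example.

```apache
# Hypothetical virtual host: maps a domain name to a folder on the server.
# "exampleuser" and the path are placeholders, not real account details.
<VirtualHost *:80>
    ServerName timowens.io
    DocumentRoot /home/exampleuser/public_html
</VirtualHost>
```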
An .htaccess file allows you to write commands that override the rules that already exist in the Apache configuration. So instead of someone having to type in the full address of index.php, they can just type http://timowens.io and a rule in .htaccess will tell the server that index.php is what it should display. WordPress includes an .htaccess by default that makes those pretty permalinks you’re used to. We can use the .htaccess to get pretty specific about how we want to rewrite URLs, though.
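As a small illustration of that kind of rule, Apache's DirectoryIndex directive names the file to serve when a visitor requests a folder rather than a specific file. This is a sketch of the idea, not necessarily what is actually in place on timowens.io:

```apache
# Serve index.php when a request points at a directory;
# fall back to index.html if index.php is not present.
DirectoryIndex index.php index.html
```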
Let's take the following site as an example: http://archive.timmmmyboy.com. All of the posts have the following structure:
http://archive.timmmmyboy.com/2013/11/writing-collaborative-documentation-with-dokuwiki-and-github/
So the base domain, followed by the year, followed by the month, followed by the name of the post. Now let's say our Ghost blog has this format for posts:
http://blog.timowens.io/writing-collaborative-documentation-with-dokuwiki-and-github/
No year or month, but otherwise we have something to work with.
Adding the Redirect
Basically what we want to do is have Apache grab the post name from the end of the URL and append it to the domain of the new blog. There are also other links to handle, like the RSS feed at /feed. Here’s what the full .htaccess file looks like with the redirects in place:
RewriteEngine On
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=301,L]
RewriteRule ^feed$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^feed/$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^(.*)$ http://blog.timowens.io/ [R=301,L]
That’s it! Let’s go line by line and see what it’s doing.
RewriteEngine On
We need to tell Apache we’re going to be rewriting some URLs, so this has to come before any RewriteRule directives.
Adding the Rewrite Rule
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=301,L]
- RewriteRule is the directive to create the rewrite.
- The ^ is an anchor that marks the start of the path being matched. In an .htaccess file that is the part of the URL after the domain, without the leading slash.
- ([0-9]+)/([0-9]+) are regular expressions that mean “one or more digits between 0 and 9, then a slash, then one or more digits again”. These match the year and the month.
- (.*) is our final wildcard, “anything coming after that pattern”, and the $ is an anchor that marks the end of the path. The parentheses are what store each matched piece in a variable: $1, $2 and $3, in order.
The second part of that rule grabs one of those variables (in this case $3, the third one, which holds our post name) and appends it to the end of our new URL. [R=301,L] tells the server this is a 301 Permanent Redirect, which browsers and search engines like Google will see when they visit; the L marks this as the last rule to apply when it matches. It’s highly recommended that you use 302 Temporary Redirects until you’re confident you’ve got it right. Browsers tend to cache permanent redirects, so a mistake can stick around, while temporary redirects generally aren’t cached and don’t tell search engines the move is permanent while you’re testing.
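To make that concrete, here is how the rule would break down the example post URL from earlier. The comments trace the match by hand, and the rule uses 302 on the assumption that you are still testing:

```apache
# Path matched in .htaccess (domain and leading slash already removed):
#   2013/11/writing-collaborative-documentation-with-dokuwiki-and-github/
#
#   $1 = 2013
#   $2 = 11
#   $3 = /writing-collaborative-documentation-with-dokuwiki-and-github/
#
# Redirect target:
#   http://blog.timowens.io/writing-collaborative-documentation-with-dokuwiki-and-github/
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=302,L]
```

Note that $3 already starts with a slash, which is why there is no slash between the domain and $3 in the target.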
RewriteRule ^feed$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^feed/$ http://blog.timowens.io/rss [R=301,L]
The link to my RSS feed is different, so this redirect says “If someone visits /feed, redirect here”. There are two lines because I assume sometimes people put the slash on the end and sometimes they don’t. That could probably be consolidated if I knew regex a bit more.
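For what it’s worth, one way to consolidate the two feed rules is an optional trailing slash: a ? after the slash lets it appear zero or one times. A minimal sketch, using the same target as the two rules above:

```apache
# /? makes the trailing slash optional, so this matches both "feed" and "feed/"
RewriteRule ^feed/?$ http://blog.timowens.io/rss [R=301,L]
```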
RewriteRule ^(.*)$ http://blog.timowens.io/ [R=301,L]
Finally, a catchall: “Whatever URL they visited, if there isn’t a post or directive to send them to a specific location, redirect to the homepage.”
Once you’ve tested it and know it’s working, switch from 302 Temporary Redirects to 301 Permanent Redirects and you’re all set!
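As a sketch of what the file might look like during that testing phase, assuming the same four rules with only the status code changed:

```apache
RewriteEngine On
# Temporary (302) redirects while testing; change each R=302 to R=301 when done
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=302,L]
RewriteRule ^feed$ http://blog.timowens.io/rss [R=302,L]
RewriteRule ^feed/$ http://blog.timowens.io/rss [R=302,L]
RewriteRule ^(.*)$ http://blog.timowens.io/ [R=302,L]
```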
Quick note about .htaccess files: because of the dot at the beginning of the filename, some programs may not show the file by default, since a leading dot is the usual convention for “hidden files”. In the File Manager in cPanel you can check the option to show hidden files before entering.