The SmartLogic Blog

Rack::Rewrite + Google Analytics Makes Site Transitions Seamless

November 24th, 2009 by John Trupiano

At SmartLogic we recently rebuilt our website in Rails. The previous version was a MediaWiki installation with a ton of content that had earned a decent bit of Google juice that we did not want to lose. By setting up 301 (permanent) redirects for the old URLs, we can hold onto that juice.

Google Analytics Navigation

Enter Rack::Rewrite. Rack::Rewrite is a Rack middleware for defining and applying rewrite rules. Though it’s not a full replacement for Apache’s mod_rewrite, many of the rules I’ve previously written in Apache config files can be replaced by Rack::Rewrite. Run gem install rack-rewrite to install the gem.

To determine which URLs most needed 301s, we turned to Google Analytics. By reviewing our site’s most popular landing pages from the past two months, we were able to methodically write our redirect rules.

Google Analytics | Top Landing Pages

Here is a subset of the associated Rack::Rewrite rules.

  config.middleware.insert_before(Rack::Lock, Rack::Rewrite) do
    # original wiki website
    r301 '/wiki/Main_Page', '/'
    r301 '/wiki/John_Trupiano', '/john'
    r301 %r{^/wiki/(Charity|Charities|Local_Connection)$}, '/gratis-work-and-charities'
    r301 '/wiki/Category:Portfolio', '/portfolio'
    r301 '/wiki/ExxonMobil_-_Brand_Asset_Center', '/portfolio/exxonmobil-brand-asset-center'
    r301 '/wiki/In_the_News', '/in-the-news'
    r301 '/wiki/Getting_to_our_Office', '/driving-directions'
    r301 '/wiki/Category:Employees', '/our-team'
    r301 '/wiki/SimNet_for_Office_2007', '/portfolio/simnet-for-office-2007-and-vista'
    r301 '/wiki/VNC_Collaboration_Application', '/portfolio/shared-desktop'
    r301 '/wiki/Contact_Information', '/contact-us'
  end
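
To make it concrete what a 301 rule does at the Rack level, here is a minimal plain-Ruby sketch of a redirecting middleware. It is not Rack::Rewrite's implementation: the class name TinyRedirects, the [pattern, target] rule format, and the first-match-wins lookup are all assumptions made for illustration, and only two of the rules above are included.

```ruby
# A hypothetical, stripped-down stand-in for what a 301 rule does.
# This is NOT Rack::Rewrite's code; it only illustrates the idea.
class TinyRedirects
  # rules: array of [pattern, target] pairs; pattern is a String or a Regexp.
  def initialize(app, rules)
    @app = app
    @rules = rules
  end

  # Rack calling convention: takes an env hash, returns [status, headers, body].
  def call(env)
    path = env['PATH_INFO']
    # String#=== is equality and Regexp#=== is a match test,
    # so one lookup covers both kinds of rule.
    _, target = @rules.find { |pattern, _| pattern === path }
    return @app.call(env) unless target

    [301, { 'location' => target }, []]
  end
end

# Two of the rules from the post, in the assumed [pattern, target] format.
RULES = [
  ['/wiki/Main_Page', '/'],
  [%r{^/wiki/(Charity|Charities|Local_Connection)$}, '/gratis-work-and-charities']
].freeze

# The inner app handles anything that no rule matches.
app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['ok']] }
stack = TinyRedirects.new(app, RULES)

status, headers, = stack.call({ 'PATH_INFO' => '/wiki/Charities' })
puts "#{status} #{headers['location']}"
```

Running this prints 301 /gratis-work-and-charities. A path that matches no rule falls through to the inner app untouched, which is why the real middleware has to sit in front of the rest of the stack.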

This scheme is great for landing pages, but what if we had querystring information that we wanted to keep around? This is common for tracking codes: many marketing platforms generate URLs that embed data in the querystring for recording and tracking purposes. We can leverage the following trick to maintain the querystring across a rewrite.

  r301 %r{^/wiki/Main_Page(\?.*)?$}, '/$1' 

Note the following:

  • We are conditionally matching a querystring so that the rule continues to match in the absence of a querystring.
  • We are leveraging substitution patterns to reconstitute the querystring in the rewritten URL.
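
The two points above can be checked with the regex alone, without loading the middleware. The sketch below is plain Ruby and only mimics the substitution step: the helper name redirect_location and the utm_source example value are hypothetical, and the way it expands $1 is an assumption about the behavior, not Rack::Rewrite's own code.

```ruby
# The pattern and target from the rule above.
RULE_PATTERN = %r{^/wiki/Main_Page(\?.*)?$}
RULE_TARGET  = '/$1'

# Hypothetical helper: applies one regex rule to a request URI and expands
# $1, $2, ... in the target from the rule's capture groups.
# Returns nil when the rule does not match.
def redirect_location(request_uri, pattern = RULE_PATTERN, target = RULE_TARGET)
  match = pattern.match(request_uri)
  return nil unless match

  # An unmatched optional group is nil; to_s turns it into an empty string,
  # so the rule still produces a clean target when there is no querystring.
  target.gsub(/\$(\d+)/) { match[Regexp.last_match(1).to_i].to_s }
end

puts redirect_location('/wiki/Main_Page?utm_source=newsletter')
puts redirect_location('/wiki/Main_Page')
p redirect_location('/wiki/Other_Page')
```

The first call prints /?utm_source=newsletter, the second prints /, and the third prints nil because the pattern does not match at all.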

Many more great use cases for Rack::Rewrite are covered in the project’s README.

  • John Trupiano

    This excellent post by Hugo Frappier explains how to use Anemone to crawl your site to find URLs that might require rewrite rules.

    This can be particularly useful if you store full path self-referencing links in your database. For our site in particular, it was really only important to find the landing pages.

  • Albert

    Hello, Rack::Rewrite is much appreciated, and I thought you might find my use of it interesting. I’m using Sinatra as a web dev framework for Regdel. For some URLs it simply serves static files that get transformed by Rack XSLView. To maintain logical URL structures, I’m using Rack::Rewrite to map them to the static file path before Sinatra gets the request.

    Now instead of something like “/s/xhtml/account_form.html”, I can use “/account_new”! Works like a charm. Only suggestion I have at the moment is to support the different HTTP methods, or at least GET and POST.

  • John Trupiano

    Very cool Albert, thanks for sharing.

    Regarding the feature suggestion, it would be cool if you added an Issue with a proposal for the feature on GitHub.

John Trupiano co-founded SmartLogic with Yair Flicker in May 2005 and was co-president through 2011. Check out his GitHub Projects or follow @jtrupiano on Twitter.
