The SmartLogic Blog

Ruby on Rails Polymorphic Association Benchmarks

June 13th, 2008

Polymorphic relationships in Ruby on Rails are great. If you don’t know what they are, check them out here:

Understanding Polymorphic Associations

John and I were curious about the speed of these relations, since the link between objects is looked up using both the ID of the foreign object and a string holding the model name. So if you have two tables, ChildA and ChildB, your parent has a reference to child which is actually the combination of child_id (the ID in the ChildA or ChildB table) and child_type (equal to “ChildA” or “ChildB”).
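To make the two-part reference concrete, here is a minimal plain-Ruby sketch of resolving a (child_type, child_id) pair. It is not how Rails implements polymorphism internally; the in-memory ROWS hashes, the Parent struct, and the row contents are all stand-ins invented for illustration.

```ruby
# Plain-Ruby stand-ins for the two child tables. Each maps an id to a "row".
# The row contents are made up for illustration.
class ChildA
  ROWS = { 1 => "ChildA row 1", 2 => "ChildA row 2" }.freeze

  def self.find(id)
    ROWS.fetch(id)
  end
end

class ChildB
  ROWS = { 1 => "ChildB row 1" }.freeze

  def self.find(id)
    ROWS.fetch(id)
  end
end

# A parent stores the pair (child_type, child_id) instead of one foreign key.
Parent = Struct.new(:child_type, :child_id) do
  # Turn the type string into a class, then look up the id within that class.
  def child
    Object.const_get(child_type).find(child_id)
  end
end

parent = Parent.new("ChildB", 1)
puts parent.child
```

Note that ID 1 exists in both stand-in tables, so the ID alone is ambiguous. That is why the lookup has to match on the type string as well, and why its comparison cost is the thing worth measuring.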

The old-school way of doing this involves creating a lookup table and using integer IDs for type, instead of strings. So you’d have another table mapping “ChildA” to “1” and “ChildB” to “2”, then when you do your query, you are matching against the number “1” and not the string “ChildA”.
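A rough sketch of the lookup-table idea, again in plain Ruby with in-memory data. TYPE_IDS, PARENT_ROWS, and find_parent are hypothetical names chosen for this example, not part of Rails or of the benchmark code.

```ruby
# Hypothetical lookup table: model name => small integer type id.
TYPE_IDS = { "ChildA" => 1, "ChildB" => 2 }.freeze

# Rows of a hypothetical parent table that stores an integer type
# instead of the model name string.
PARENT_ROWS = [
  { child_type_id: 1, child_id: 1 },
  { child_type_id: 2, child_id: 1 },
  { child_type_id: 1, child_id: 2 }
].freeze

# Translate the name to its integer once, then match rows on integers only.
# Raises KeyError if the type name is not in the lookup table.
def find_parent(type_name, child_id)
  type_id = TYPE_IDS.fetch(type_name)
  PARENT_ROWS.find do |row|
    row[:child_type_id] == type_id && row[:child_id] == child_id
  end
end

p find_parent("ChildB", 1)
```

The string comparison happens once, in the lookup table, rather than on every row examined.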

The downside of doing it that way is that you don’t get to use Rails’ snazzy polymorphism, which makes life a lot easier. So we decided to run some tests to see how much faster it would be, and therefore whether it was worth it.

I created a Rails application that sets up four tables in the database:

  • A table with an Integer ID and a String Type
  • A table with an Integer ID and a String Type and an index on the type
  • A table with an Integer ID and an Integer Type
  • A table with an Integer ID and an Integer Type and an index on the type

My benchmarking procedure is as follows:

  1. Set the number of records, N
  2. Set the number of types, T
  3. Create N records, such that for each type there is an entry with ID 1, 2, 3 and so on. So you have
    Type 1 : ID 1
    Type 2 : ID 1
    Type 1 : ID 2
    Type 2 : ID 2
    etc
  4. Insert all of these records in a random order
  5. Retrieve the records in a different random order via
    Model.find(:first, :conditions=> {:id => id, :type => type})
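The steps above can be sketched in plain Ruby with the standard Benchmark library. This is not the code from the repository: ScanStore is a made-up in-memory stand-in for an unindexed table, build_records and run_benchmark are hypothetical helper names, and timing the inserts together with the lookups is an assumption about what "this process" covers.

```ruby
require "benchmark"

# Step 3: build every (id, type) pair, so each of the T types gets ids 1..N/T.
def build_records(n, t)
  (1..(n / t)).flat_map do |id|
    (1..t).map { |type| { id: id, type: type } }
  end
end

# Steps 4 and 5: insert in one random order, look up in another, and time it.
# `store` must respond to insert(record) and find(id, type).
# Returns [elapsed_seconds, number_of_successful_lookups].
def run_benchmark(store, n, t, rng = Random.new(42))
  records = build_records(n, t)
  inserts = records.shuffle(random: rng) # random insert order
  lookups = records.shuffle(random: rng) # a second, independent shuffle
  hits = 0
  seconds = Benchmark.realtime do
    inserts.each { |r| store.insert(r) }
    lookups.each { |r| hits += 1 if store.find(r[:id], r[:type]) }
  end
  [seconds, hits]
end

# In-memory stand-in for a table with no index: every find is a linear scan.
class ScanStore
  def initialize
    @rows = []
  end

  def insert(record)
    @rows << record
  end

  def find(id, type)
    @rows.find { |r| r[:id] == id && r[:type] == type }
  end
end

seconds, hits = run_benchmark(ScanStore.new, 1_000, 2)
puts format("%d lookups in %.4fs", hits, seconds)
```

Swapping ScanStore for an object that wraps a real model and calls its finder would turn this into a database benchmark. The in-memory version only illustrates the shape of the procedure and says nothing about MySQL timings.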

I made use of the Ruby benchmarking library to time this process for each table.

I have some preliminary results from a quick run of the test, which show a 500% speedup from using an index and a 5% speedup from using integers instead of strings. In my opinion, polymorphism in Rails is worth 5%. And adding an index is definitely worth 500%. I mean, what isn’t worth 500%?

That was based on just 1,000 records. I am running it on 100,000 records now, but each table takes 1-2 hours to run.

If you’d like to play with my code, here is a link to it on GitHub:

ruby-on-rails-polymorphism-benchmarks

  • http://www.last100meters.com Eric Allam

    Usually you’d want to create an index that indexes on both the type and the id columns.

  • http://smartlogicsolutions.com Nick Gauthier

    Hey Eric,

    I was thinking about that too. When the current benchmark finishes I’ll add in two more tables with full indexing (unless you want to do it :-D).

    -Nick

  • Dan Kubb

    Another suggestion: you should benchmark this approach against the old-school approach of using an explicit join table between the two tables, since it’s the classic alternative to polymorphic associations and would provide a nice baseline.

    I personally am still inclined to use a join table rather than polymorphic associations. It’s very rare that I have something that needs to be “promiscuous” enough that I won’t be able to know up-front what needs to be joined to what. I tend to prefer it because I can add constraints to the join table to ensure one side is unique, for example. It also leaves room open in case I need to add other information to the relationship (as I often do) such as the time the relationship was created/updated, etc.

  • David Dai

    I ran it on 10,000 rows on my MacBook Pro (2.6GHz Core 2 Duo, 4GB RAM) with MySQL:

    http://pastie.org/215147

    I don’t see much of a difference.

  • David Dai

    To clarify, I mean Integer key VS string key. Not indexed VS non-indexed.

  • http://smartlogicsolutions.com Nick Gauthier

    @David: Right, the int key vs string key is a small percentage. 57 vs 51 in your case.

    @Dan (and everyone): Please add tests to the code base, we’ll all benefit from more benchmarks!

    -Nick

  • http://rhnh.net Xavier Shay

    You likely can use the SET datatype – it basically gives you strings but stores them internally as integers. We’ve hacked the enum-column plugin to support it for mysql – haven’t tried with polymorphics yet, but it works fine with STI, which I presume is similar code. It’s too ugly for public consumption at the moment but if you’re interested drop me a line.

  • name_tax

    You should also consider index size.