Archive for June, 2008

Rails 2.1 broke my mysql foreign keys!

Tuesday, June 24th, 2008

Rails 2.1 introduced in the MySQL Adapter “smart integer columns.” The idea was to use the :limit option to determine whether a smallint, int, or bigint should be used. This is something that the Postgres adapter had already previously implemented. The relevant code in activerecord/lib/active_record/connection_adapters/mysql_adapter.rb is:

  # Maps logical Rails types to MySQL-specific data types.
  def type_to_sql(type, limit = nil, precision = nil, scale = nil)
    return super unless type.to_s == 'integer'

    case limit
    when 0..3
      "smallint(#{limit})"
    when 4..8
      "int(#{limit})"
    when 9..20
      "bigint(#{limit})"
    else
      'int(11)'
    end
  end

Mirko Froehlich suggests monkey patching this function. Timothy Jones blogged about it.

To monkey-patch this, just drop a file (fix_mysql_adapter.rb) into your initializers/ directory, as such:

(more…)

Don’t Abuse the Session

Monday, June 23rd, 2008

Never, ever, ever, ever, ever store an ActiveRecord model in the session. Just store the id and load it into an instance variable from the database on every request. Why? A couple reasons…

First, you’re susceptible to staleness. Consider this. User A logs in, and you store their user object in the session. Administrator X logs in and deactivates User A’s account. User A can still muck around your site because you’re reading the user data from the session, which has stale data.

Second, the default in Rails these days is to store your session data in cookies (honestly, I don’t know why…..it only clutters up your requests, forcing the session to be passed back and forth on _every_ request, and opening up the possibility that the encryption key could be brute-forced……this is a rant for another day). You just don’t want to be storing whole ActiveRecord objects in the session. They’re big and clunky. The extra database call to reload the object in a before_filter on every request is practically trivial, and you’ll keep the “tubes” less clogged.

This practice is certainly not rails-specific, and should be adopted no matter the server-side technology.

Google AJAX Libraries on Rails

Friday, June 20th, 2008

If you’re reading this blog you’ve probably already heard about Google’s AJAX Library API on many other news sites like Slashdot.

I’m going to describe my simple process for setting up a RoR app to use Google to pull the APIs in a rails friendly way, throughout layouts, views, and helpers.

(more…)

Automating Flex Compilation Using ANT

Sunday, June 15th, 2008

When we first started developing Flex applications for clients when the time would come to send the SWF over, I would build the application in Flex Builder and send off the generated SWF. This got the job done, but it imposed a few limitations since I was the only Flex developer in our office:

  • I was the only one that knew how to compile the application
  • If someone else wanted to try to compile the application, they’d have to install Flex Builder

After reading a blog post by Marc Hughes I realized it was time we put in place a more versatile environment for building Flex applications.

(more…)

Ruby on Rails Polymorphic Association Benchmarks

Friday, June 13th, 2008

Polymorphic relationships in Ruby on Rails are great. If you don’t know what they are, check them out here:

Understanding Polymorphic Associations

John and I were curious about the speed of these relations, since the linking between objects searches on both the ID of the foreign object, and a string which is the model name. So if you have two tables, ChildA and ChildB, your parent has a reference to child which is acutally the combination of child_id (the ID in the ChildA or ChildB table) and child_type (equal to “ChildA” or “ChildB”).

The old-school way of doing this involves creating a lookup table and using integer IDs for type, instead of strings. So you’d have another table mapping “ChildA” to “1″ and “ChildB” to “2″, then when you do your query, you are matching against the number “1″ and not the string “ChildA”.

The down side of doing it that way is that you don’t get to use Rails’ snazzy polymorphism, which makes life a lot easier. So we decided to run some tests to see how much faster it would be, and therefore, if it was worth it.

(more…)

Multithreading in Ruby on Rails

Wednesday, June 11th, 2008

Don’t you hate it when sites say “Please Wait” when you’d rather just come back later? I am always worried my browser will close and it won’t work. Or maybe I want to shut my computer down but I have to leave my task running. Read on!

(more…)

Subversion Timestamps + Capistrano finalize_update

Saturday, June 7th, 2008

Update 2008/06/13: Jamis released Capistrano 2.4.0, and it includes the :normalize_asset_timestamps patch that I submitted!

Update 2008/06/11: Here’s a link back to the Google Groups Discussion regarding this topic.

Subversion has a lesser-known feature that allows you to specify that checkouts/exports/switches/reverts should timestamp files with the last committed timestamp. By default, this setting is turned off. As such, when you checkout a repository, every file is timestamped with the current time on your local machine.

To be honest, I’m not quite sure why this is the default. The pertinent section of the Subversion manual (you have to scroll to the bottom) describes the setting as such:

———-
use-commit-times

Normally your working copy files have timestamps that reflect the last time they were touched by any process, whether that be your own editor or by some svn subcommand. This is generally convenient for people developing software, because build systems often look at timestamps as a way of deciding which files need to be recompiled.

In other situations, however, it’s sometimes nice for the working copy files to have timestamps that reflect the last time they were changed in the repository. The svn export command always places these “last-commit timestamps” on trees that it produces. By setting this config variable to yes, the svn checkout, svn update, svn switch, and svn revert commands will also set last-commit timestamps on files that they touch.

———-

It’s not clear to me why the default aids in Makefiles. What is clear to me though is that Jamis Buck has taken this default behavior into account in Capistrano, the wonderful deployment tool we use at SLS.

The following code snippets will require a bit of understanding of the built-in Capistrano deployment recipes. Let’s take a look at the code for the :finalize_update task. This task is invoked after the code has been updated on the server (for Subversion, either by an export or update).

  task :finalize_update, :except => { :no_release => true } do
    # ... other details omitted

    stamp = Time.now.utc.strftime("%Y%m%d%H%M.%S")
    asset_paths = %w(images stylesheets javascripts).map { |p| "#{latest_release}/public/#{p}" }.join(" ")
    run "find #{asset_paths} -exec touch -t #{stamp} {} ';'; true", :env => { "TZ" => "UTC" }
  end

What this snippet of code does is compute a timestamp, and then touch each asset file on the server with that timestamp (-t #{stamp}). The intention for doing this was to handle the scenario where you have multiple asset servers. Since an export/checkout updates the timestamp with the local machine’s current time, it’s possible for the same asset to have different timestamps on separate asset servers.

So what’s the big deal? First, rails serves up images (when using the image_tag helper) with a querystring appended to it. This querystring is simply a timestamp. The reason for this is to support client-side caching (you’re doing this, right?). This basically allows you to set the “expires” attribute of that file several years (decades or millenniums, in fact) into the future. The reason this is so is because if that file ever changes, it’s last modified attribute will also change, effectively changing the querystring rails appends automatically, and causing your browser to download a ‘new asset.’ So, when the finalize_update task is invoked (which happens every time you re-deploy), all of these last-modified timestamps are reset, causing any repeat visitors to re-download these very same assets again.

I have submitted a patch to Jamis (which I hope he’ll apply soon!) that exposes an extra Capistrano parameter (:normalize_asset_timestamps), which would be set to true by default, leaving the original behavior in tact. The new :finalize_update task looks like:

  task :finalize_update, :except => { :no_release => true } do
    # ... other details omitted

    if fetch(:normalize_asset_timestamps, true)
      stamp = Time.now.utc.strftime("%Y%m%d%H%M.%S")
      asset_paths = %w(images stylesheets javascripts).map { |p| "#{latest_release}/public/#{p}" }.join(" ")
      run "find #{asset_paths} -exec touch -t #{stamp} {} ';'; true", :env => { "TZ" => "UTC" }
    end
  end

I’ll follow up when/if Jamis accepts the patch. Hopefully it can make it into version 2.4!

Deploying Rails Apps with Capistrano without root or sudo Privileges

Friday, June 6th, 2008

In an effort to prepare for my presentation on Rails Deployment with Capistrano and Phusion Passenger at the next Bmore on Rails ruby users meetup, I’m writing a series of blog posts to help illustrate some concepts. This represents the second installment of the series. Better setup for environments in rails addressed the set of structural changes that I make to any fresh Rails app. This post will focus on some general principles and useful security considerations to take into account when deploying Rails apps with Capistrano.

The primary point of this post is this: You don’t need to deploy using root. And you don’t need to grant sudo access to the user used for deployment.

Our primary deployment setup is either a single or two-box solution (web server, asset server, database server spread across two). We generally use MySQL for the backend, and Phusion Passenger to serve Rails. We deploy to either Ubuntu Server (Hardy 7.10) or CentOS 5. We also generally disallow root ssh access.

First of all, it’s important to categorize tasks into two types: privileged tasks and unprivileged tasks. The nice part about a rails app is that, for the most part, it’s pretty self-sufficient, and rarely ever needs to venture out of its own tree in the filesystem. This means that we can get away with deploying with an unprivileged account. There are, however, certain ’setup’ tasks that likely need to be executed with root/privileged access. Fortunately, all of our privileged tasks can be performed before we ever deploy the app!

Privileged Tasks
For our baseline deployment, this includes the following:
1) Install any necessary software (Ruby, RubyGems, ImageMagick, MySQL, etc.)

2) Create the Rails app structure. For us, we create the following structure:

/var/vhosts/myapp
  /releases
  /shared
    /content

The default Capistrano setup task performs similar functionality. The only additional folder here is /shared/content. We use this folder to hold all of our uploaded assets (mostly via the File Column plugin). Then, on successive deployments, we set up symbolic links from the public directory up to this shared folder. This allows these assets to live outside of the context of a specific release.

3) Create a log directory at the system level: /var/log/myapp.

4) Create a symlink from the apache config directory (generally /etc/apache2/sites-enabled on Ubuntu, /etc/httpd/conf.d on CentOS) to /var/vhosts/myapp/shared/passenger.conf. Note that at this stage, passenger.conf does not exist. This is ok, as our cold deployment will address this, and each successive deployment will exploit this. This passenger.conf file will actually just be another symlink out to the current release. What this allows us is the ability to alter our apache/passenger configuration for this app on successive deployments. The apache config directories will not be visible to non-privileged users, and thus, we won’t be able to update those symlinks using a nonprivileged account.

5) Create (if it doesn’t already exist) a deploy user, and assign it to the same group that apache runs (www-data on Ubuntu, apache on CentOS).

$> adduser --group www-data deploy

6) chown the app root (/var/vhosts/myapp) and log directory (/var/log/myapp), and give both user/group write permissions (775, 774, or 770). Apache’s user will be able to write to these directories by virtue of them being owned by the group.

$> chown 775 deploy:www-data -R /var/vhosts/myapp /var/log/myapp

7) Create your database, and grant all privileges to a non-root user.

mysql> CREATE DATABASE myapp;
mysql> GRANT ALL PRIVILEGES ON myapp.* TO 'myappuser'@localhost IDENTIFIED BY 'asecretpassword';

8) Install Passenger and the GemInstaller gem (as long as we keep our geminstaller.yml file up to date, we can ensure that our server will use those exact gem versions).

In order to execute these tasks, you’re going to need root access. Note, however, that it is ill-advised to ever include your root password in your deploy script (lest someone accidentally commits it to the repo). One way to handle this is to implement some/all of this functionality in cap deploy:setup. You can then temporarily run this with the root user, but no password (this only works if root can ssh in), which will force you to enter your password. I like this approach, because at this point, we’ll never need the root password again. Put it in a lockbox and leave it alone! Your other option (if root can’t login), is just to login with the unprivileged account, su -, then perform these steps manually. Similarly, you can temporarily grant your deploy user unrestricted sudo access temporarily (but don’t forget to undo this!) in the sudoers file. Either way, these are one time steps, and it’s really not much of a hassle if you know what you’re doing.

Unprivileged Tasks
Everything else you’ll ever have to do (unless you’re adding a new feature that requires installing other software, etc.) with Capistrano can now be completed with the unprivileged deploy user that we created in step 5 above.

At this stage, the only difference between a cold deployment and a successive deployment is the fact that your app was never running in the first place. In essence, it’s the difference between a hard restart of apache and a soft restart. All other steps are the same:

1) create a new release (with the updated code)

2) update the current and filecolumn symlinks (these were the symlinks I mentioned above that point out to the shared/content directory)

3) ensure that all of our necessary gems are installed (reading geminstaller.yml and executing geminstaller if necessary) — more on this in the next post

4) run any pending migrations

5) update /var/vhosts/myapp/shared/passenger.conf to point to the config snippet in the latest release

6) restart apache (hard if cold deploy or if the apache config is updated, soft otherwise)

We overwrite more or less all of the default Capistrano recipes. Actual code will be released probably just after the presentation (Tuesday, Jun 10, 2008, 7:00 PM at Medical Decision Logic, 1216 E. Baltimore Street, Baltimore, MD 21202), as I’m still tweaking things here and there.

The important take home note is this: yeah, it’s nice to be able to just pull out your Capistrano recipes and build an app on a brand new server from scratch with a few command line calls. However, it is borderline impossible to securely pull this off. The line between server setup and application deployment becomes blurred when you try to put this together. The server setup process, by nature, requires root/privileged access. Incremental deployment, however, does not require this level of privilege. Capistrano was not designed to build a server from scratch. Rather, a better approach is to develop a server image that you will use for all of your client app servers.

My next post will elaborate (with more code samples, particularly recipe snippets and some capistrano/rails extensions I’ve been working on) on much of what was covered here. Additionally, I’ll go into further detail regarding other ways to make maintaining your production apps easier with Capistrano.

map.resources and custom nested routes

Thursday, June 5th, 2008

I encountered an error in rails trying to create a nested route in rails 2.x

map.import_time_cards 'users/:user_id/time_cards/import',
:controller => 'time_cards',
:action => 'import'

Wasn’t setting up a route for users because this route was being setup automatically and overwritten by:

map.resources :users,
:has_many => [:notes, :addresses, :expenses, :time_cards] ,
:collection => [:login, :logout, :disable, :enable]

So after digging around on the rails api I discovered that map.resources takes a block so my solution to this problem was :

map.resources(:users,
:has_many => [:notes, :addresses, :expenses] ,
:collection => [:login, :logout, :disable, :enable]) do |user|
user.resources :time_cards, :collection => [:import]
end

By using a block this tells rails to include route to ‘users/1/time_cards/import’ instead of appending import as the id for the show route.

Sexier Migrations in Rails 2.1

Thursday, June 5th, 2008

If you thought migrations couldn’t get any sexier, then clearly you haven’t been following the mudslide of updates going on in Rails this year. As of late I’ve been updating tables in almost every way possible in several applications. From simply renaming columns to renaming entire tables and their dozens of loyal has_many’s, I’ve gotten used to the old way to change tables, but I certainly haven’t enjoyed it. Behold, migrations in 2.1 now allow you to do this:

screenshot shamelessly stolen from railscast

I am definitely excited about that change and look forward to updating tables with the utmost ease. Along with this great update, migrations are now automatically generated with a timestamp as their version number instead of simply an integer. What this means to the solo developer is more information given at a glance than just the migration order (not too useful). What it means as a team developer is little to no worries of colliding migration versions with others on your team. Additionally, a schema_migrations table is created in your database which tells you what migrations have been run so far, which is more useful than the schema_info table which is just the latest migration version. Those of you who hate reconciling migrations with other developers rejoice.

For more on these updates, click on over to Railscasts, since I shamelessly stole an image of their screencast anyway.

What might go wrong when you update your own app, you might wonder?

In most cases there are no complications, unless you already have a migration up to version 20080605122345 [YMMV on the timestamp]. This is NOT ok if you’re setting the time and date of “right now” in the application. For example, the date of present time is set to midnight, May 15th, 2007 in an app I’m currently working on for development and testing purposes. This means my new migrations (ALL of them) are given the same version of 20070515000000_…. and this is bad news. I am hunting around for a way to tell the generator to use simple version numbers instead and I will report back when said way is found.