
The SmartLogic Blog


Testing PDF Content with Capybara

August 20th, 2012 by

PDF documents can be challenging to test, and even more so when using your typical Rails testing tools. So what do you do when you have a great new feature that is almost entirely based on the dynamic generation of PDF documents? You want your tests to provide you with an assurance that your code works as expected and puts the right content into your PDFs. Have no fear! There is a simple way to make assertions on the text content inside a PDF.

A feature we built recently uses fillable PDFs with Pdftk and Prince-generated PDF files based on user data. Testing that the content is correct hinges on the test's ability to read the PDF content. The simple way to do this is with the `pdftotext` command that ships with the Xpdf PDF viewer. The main limitation is that `pdftotext` extracts plain text only, so you have to keep the content of your PDFs simple, but in our case that was easy to do.

We added this helper method:

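require 'tempfile'

# Asserts that the PDF returned by the current Capybara request contains text.
def pdf_response_contains(text)
  temp_pdf = Tempfile.new('pdf')
  temp_pdf.binmode # PDFs are binary data
  # The JavaScript driver (capybara-webkit) and rack-test expose the raw response differently
  if Capybara.current_driver == Capybara.javascript_driver
    temp_pdf << page.driver.source
  else
    temp_pdf << page.driver.response.body
  end
  temp_pdf.close
  temp_txt = Tempfile.new('txt')
  temp_txt.close
  # pdftotext works on files; passing the arguments as a list bypasses the shell
  system('pdftotext', '-q', temp_pdf.path, temp_txt.path)
  # Form feeds mark page breaks; treat them as newlines
  body = File.read(temp_txt.path).gsub("\f", "\n")
  body.should =~ /#{text}/ # text is interpolated as a regex pattern
end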
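
Because the helper interpolates `text` straight into a regular expression, characters such as `$`, `.`, or parentheses are treated as pattern syntax rather than literal text. If you want a literal match, one option is to escape the string first. The sketch below is only an illustration: `literal_pattern` is a hypothetical name, not part of our helper, and the sample string stands in for text extracted from a PDF.

```ruby
# Hypothetical helper (not from the original helper method): build a regex
# that matches text literally, even if it contains regex metacharacters.
def literal_pattern(text)
  Regexp.new(Regexp.escape(text))
end

# Stand-in for text extracted from a PDF.
extracted = "Invoice total: $1,000.00 (USD)\n"
amount = '$1,000.00 (USD)'

# Interpolated directly, "$" is an end-of-line anchor and "(USD)" is a group,
# so the pattern does not match the literal amount.
naive_match = (/#{amount}/ =~ extracted)

# Escaped first, the same string matches as plain text.
literal_match = (literal_pattern(amount) =~ extracted)

puts "naive match found:   #{!naive_match.nil?}"
puts "literal match found: #{!literal_match.nil?}"
```

Whether you escape depends on how you call the helper: leaving `text` unescaped lets callers pass pattern fragments on purpose.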

There are a few caveats you may notice in that method:

  • pdftotext acts on files, so we need to squirrel away the response body into a temporary file.
  • We use capybara-webkit whenever possible, but we noticed that when accessing the response body, it would wrap the content in basic HTML tags. A simple check on the current driver lets us access the response content the correct way when using either capybara-webkit or rack-test.
  • If you are using capybara-webkit 0.11 or earlier, a null byte in your PDF will truncate the response that capybara-webkit provides. Newer versions contain a patch that fixes this issue.
  • The last little tweak is that the text extracted from the PDF will contain form feeds, which you will probably want to replace with newlines.
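
The form feeds can also be put to use: `pdftotext` separates pages with them, so splitting on `"\f"` gives you the text of each page. Here is a minimal sketch, assuming you want page-level assertions. `pdf_pages` is a hypothetical helper name, and the sample string stands in for real `pdftotext` output.

```ruby
# Hypothetical helper (not from the original helper method): split text
# extracted by pdftotext into one string per page, using form feeds as
# the page separators.
def pdf_pages(extracted_text)
  extracted_text.split("\f").map(&:strip)
end

# Stand-in for real pdftotext output: two pages, each followed by a form feed.
sample = "Name: Jane Doe\n\fTotal: 42\n\f"

pages = pdf_pages(sample)
puts pages.length # number of pages found
puts pages.last   # text of the last page
```

With the pages separated, a test can check that a given piece of text appears on the page where you expect it instead of anywhere in the document.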

Now you can cover your PDF feature with complete end-to-end testing to ensure your PDF generation code is correctly integrated into your application.
