Database is considered an expensive resource, and while developing our application we should be particular about the number of DB queries made, in general fewer the DB queries executed, the better the performance.

One way is to use eager loading which help us to decrease the number of DB queries performed, by preloading the associated content from database.

Lets assume we have two models Car and Owner with one-to-many relationship between them - i.e Owner has many Cars

An instance variable @cars is assigned a bunch of cars in the controller and used in views as follows:

<% @cars.each do |car| %>
  <%= car.mod_name %>
  <%= car.owner.name %>
<% end %>

DB queries executed in log for the above request.

SELECT "cars".* FROM "cars"  ORDER BY "cars"."created_at" ASC

SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 1]]
SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 2]]
SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 3]]

First query is executed to fetch all the Cars related data, and then again one DB query is executed each time, whenever car.owner.name was called in our view, which required the associated Owner data, making a total of N+1 query where N is the number of cars. This is a common performance antipattern and it is termed infamously as N+1 query problem.

Eager Loading to Rescue!

In Rails ActiveRecord, we have different approaches to fetch data from an association of models. Rails provides us with four methods, namely preload(), eager_load(), includes() and joins() which we can use judiciously to tackle the N+1 query problem.

preload()

-- Car.order(:created_at).preload(:owner)

SELECT "cars".* FROM "cars"  ORDER BY "cars"."created_at" ASC
SELECT "owners".* FROM "owners" WHERE "owners"."id" IN (1, 2, 3)

It makes exact two DB queries where the first query is used to fetch the Car related data and second query is used to fetch all the associated Owner data, using the ids from the first query.

We have reduced the number of queries successfully from 4(3 + 1) to 2 query. It may not seem to be drastic improvement in this case, but it make picture more clear if suppose you had 5000 cars listed in your DB, then in case of N+1 query problem, (5000 + 1) query will be executed instead of 2 query using preload().

The only disadvantage using preload is that we cannot have any data filters on Owners table. It will throw us an exception, as two separate queries are executed in preload() and we haven’t joined our Car and Owner table anywhere.

-- Car.order(:created_at).preload(:owner).where(owners: {name: 'Jack'})

SELECT "cars".* FROM "cars" WHERE "owners"."name" = ?  ORDER BY "cars"."created_at" ASC  [["name", "Jack"]]
ActionView::Template::Error (SQLite3::SQLException: no such column: owners.name: SELECT "cars".* FROM "cars" WHERE "owners"."name" = ?  ORDER BY "cars"."created_at" ASC):

eager_load()

It makes a single DB query and uses LEFT OUTER JOIN to club together the two tables.

-- Car.order(:created_at).eager_load(:owner)

SELECT "cars"."id" AS t0_r0, "cars"."mod_name" AS t0_r1,
       "cars"."owner_id" AS t0_r2, "cars"."created_at" AS t0_r3,
       "cars"."updated_at" AS t0_r4, "owners"."id" AS t1_r0,
       "owners"."name" AS t1_r1, "owners"."created_at" AS t1_r2,
       "owners"."updated_at" AS t1_r3
FROM "cars" LEFT OUTER JOIN "owners" ON "owners"."id" = "cars"."owner_id" ORDER BY "cars"."created_at" ASC

It has an upper hand over preload() in the terms when we have to add data filter or order(WHERE and ORDER BY clause), on the second table, or Owner in this case.

includes()

Sometimes it may become tough to decide which one to opt for between preload() and eager_load() for better performance. Rails has includes() for the very same purpose, and does this task for you. It decides on itself which approach to take so that you can sit back and relax.

-- Car.order(:created_at).includes(:owner)

SELECT "cars".* FROM "cars"  ORDER BY "cars"."created_at" ASC
SELECT "owners".* FROM "owners" WHERE "owners"."id" IN (1, 2, 3)

includes() smartly switches to making a single DB query using LEFT OUTER JOIN, when ordering or filtering is required on the associated data.

-- Car.order(:created_at).includes(:owner).where(owners: {name: 'Jack'})

SELECT "cars"."id" AS t0_r0, "cars"."mod_name" AS t0_r1,
       "cars"."owner_id" AS t0_r2, "cars"."created_at" AS t0_r3,
       "cars"."updated_at" AS t0_r4, "owners"."id" AS t1_r0,
       "owners"."name" AS t1_r1, "owners"."created_at" AS t1_r2,
       "owners"."updated_at" AS t1_r3
FROM "cars" LEFT OUTER JOIN "owners" ON "owners"."id" = "cars"."owner_id" WHERE "owners"."name" = ?
ORDER BY "cars"."created_at" ASC  [["name", "Jack"]]

joins()

joins() uses INNER JOIN to fetch the associated data from tables.

--  Car.order(:created_at).joins(:owner)

SELECT "cars".* FROM "cars" INNER JOIN "owners" ON "owners"."id" = "cars"."owner_id"  ORDER BY "cars"."created_at" ASC

SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 1]]
SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 2]]
SELECT  "owners".* FROM "owners" WHERE "owners"."id" = ? LIMIT 1  [["id", 3]]

It seems like we are encountering again our N+1 query problem, and this happens because by default joins() does not include the owners column in result, and you have to specify explicitly all columns required using a select() clause on joins().

-- Car.order(:created_at).joins(:owner).select('cars.*, owners.name as owner_name')

SELECT cars.*, owners.name as owner_name FROM "cars" INNER JOIN "owners" ON "owners"."id" = "cars"."owner_id"  ORDER BY "
cars"."created_at" ASC

You just have to change car.owner.name to car.owner_name in your view as now owner_name is an attribute on Car model.

<% @cars.each do |car| %>
  <%= car.mod_name %>
  <%= car.owner_name %>
<% end %>