Rails can't find all your IDs when using a HATBM relationship?

October 30, 2012

In all of the examples below, 123 is a valid provider and 99999 is an invalid one. Also Company HATBM Categories (and vice versa).

Rails has a variety of ways of finding records and setting collections for associations. Depending on how you do it, you might get back objects, nil, or raise an error. For example:

Company.find(123)
 Company Load (0.6ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` = 123 LIMIT 1
  => #<ClaimedCompany id: 123...

Company.find(99999)
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` = 99999 LIMIT 1
 ActiveRecord::RecordNotFound: Couldn't find Company with id=99999

Company.find_by_id(99999)
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` = 99999 LIMIT 1
  => nil  

You can do similar things when looking for multiple IDs at once:

Company.find(123,99999)
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` IN (123, 99999)
ActiveRecord::RecordNotFound: Couldn't find all Companies with IDs (123, 99999) (found 1 results, but was looking for 2)

Company.find_all_by_id([123,99999]).to_a.size
 Company Load (0.4ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` IN (123, 99999)
=> 1 

Company.where(:id => [123,99999]).to_a.size
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` IN (123, 99999) 
=> 1 

It’s interesting that the SQL is the same for each of the three queries above, but the results are different. What I’ve “learned” over time is that if you don’t know if the ID is going to exist and you don’t want to raise an error, the “_by_id” methods are the way to go.

Further reinforcing this belief is the ability to set the _id field of a belongs_to association so that you don’t have to look up the associated object itself. Note in the following example I can set category.parent_id (which is part of an association) to any value I want and Rails doesn’t do any sanity checks whatsoever.

category
=> #<Category id: 3, name: "Other", parent_id: nil, created_at: "2012-08-04 18:36:37", updated_at: "2012-08-04 18:36:37", orig_id: 8> 
category.parent_id = 99999
=> 99999 
category.save
  (0.1ms)  BEGIN
  (0.4ms)  UPDATE `categories` SET `parent_id` = 99999, `updated_at` = '2012-10-18 16:54:37' WHERE `categories`.`id` = 3
  (0.3ms)  COMMIT
=> true 

Given all of this, I had assumed that in a HABTM relationship if I set “association_ids” to an array of integers, Rails would simply insert the values without checking. My thinking was this saved me the effort of loading up all those associated models when I already had their IDs which is all I really needed.

This turns out to be wrong. Note in the next example where the first thing Rails does is load the Companies for the given IDs. Only then does it insert them into the join table.

category.provider_ids = [123]
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` = 123 LIMIT 1
  (0.1ms)  BEGIN
  (0.4ms)  DELETE FROM `categories_companies` WHERE `categories_companies`.`category_id` = 3 AND `categories_companies`.`provider_id` IN (2
  (0.2ms)  INSERT INTO `categories_companies` (`provider_id`, `category_id`) VALUES (123, 3)
  (1.8ms)  COMMIT
=> [123] 

As an aside, if we had already loaded up the provider objects elsewhere and assign them, Rails doesn’t redo this lookup.

p = Company.find(123)
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` = 123 LIMIT 1

category.companies = [p]
  (0.2ms)  BEGIN
  (0.4ms)  DELETE FROM `categories_companies` WHERE `categories_companies`.`category_id` = 3 AND `categories_companies`.`provider_id` IN (2
  (0.3ms)  INSERT INTO `categories_companies` (`provider_id`, `category_id`) VALUES (123, 3)
  (1.9ms)  COMMIT

Where this gets us in trouble is when we try to set “provider_ids” with an array of integers that don’t exist in the database. Like this:

category.provider_ids = [123,99999]
 Company Load (0.2ms)  SELECT `companies`.* FROM `companies` WHERE `companies`.`id` IN (123, 99999)
ActiveRecord::RecordNotFound: Couldn't find all Companies with IDs (123, 99999) (found 1 results, but was looking for 2)

An error is occurring because under the scenes, Rails is using the ‘find(x,x,x)’ method to load up the IDs passed and as we saw above, this will raise an error when it can’t find them all. I find the error message to be odd since in the code itself there is no indication that you are loading Companies at all.

Here’s the method itself.

Why Rails doesn’t use one of the other methods instead I’m not sure. It seems to me that if you intentionally use the “xxxx_ids=” setter then you should also take the liability for screwing it up.

Anyway, maybe this will stop someone else from tripping up on it.