ActiveRecord: Migrating habtm to model table suitable for has_many :through

31
Oct/07
2

[Ha, ha, ActiveRecord has the last laugh, and this is much easier than I thought. The Rails Wiki must have been wrong, or it must have been that you couldn't add a primary key in an earlier version of AR.]

ActiveRecord will facilitate many:many relationships across a plain join table that doesn’t have a primary key. This is called “has and belongs to many.” It turns out that it is hard easy to convert this into a join table where the join table itself represents something.

The example here is a bookmarking application like del.icio.us: you have many users with many links (and vice versa), and model those things as a User class and a Link class. Each User has and belongs to many Links; each Link has and belongs to many Users. As a first cut, you might put the title of the URL in the Link class. So John and Amy might each have a reference to the same link. Note in this model, a catch is that since they share that same Link, it is problematic that the title is on the Link class. Because now they must share that title. Our model prevents one of them from saving his or her link for the NY Times with a personal title such as “The Yankee-Loving New York Times,” or some other appropriate moniker. So with that issue in mind, we would like to migrate our schema so that there is a Bookmark class in between User and Link. Now we will put the title in Bookmark, and let the User edit it. If we do this, Amy and John can have bookmarks for the same link (the URL being represented in the Link class) but different titles (in the Bookmark class).

Let’s take it from the top, shall we? In the original model, if you have a model User, and a model Link, and you want to have a many:many relationship, you can define an association for each one of the form has_and_belongs_to_many :links (and the reverse). Then you create a join table like the following (note the suppression of the primary key):


class CreateLinksUsers < ActiveRecord::Migration
  def self.up
    create_table :links_users, :id => false do |t|
      t.column :link_id,    :integer
      t.column :user_id,    :integer
    end
  end
  def self.down
    drop_table :links_users
  end
end

As time proceeds, we discover that we are interested in adding some extra data that should go on the join table. For instance, we might want the time/date when it was added, or notes specific to the relationship between the link and the user… Or a title, as above. At this point we would need to model it as a first-class (so to speak) ActiveRecord class and use the associations has_many :through for the “endpoints” and belongs_to for the in-between class.

Sadly, if you are like me, you might read the Rails Wiki pages before you do anything, and you would read the following which is all wrong:

Unfortunately, this is tricky because ActiveRecord does not model the original links_users table: It is purely a vehicle for the habtm association. We might think we could rename the table to, say, :bookmarks, and then add a primary key . . . But ActiveRecord does not allow one to add a primary key column to an already-existing table:

Note: The API doc on add_column refers to column_types which say that it can be of type :primary_key. This is not true. add_column cannot use :primary_key as a type_ (see Rails Wiki, Using Migrations)

So . . . What to do? One way The answer is to create a new bookmarks table, and then model it inside the migration. Loop over your users and links to extract the ids, and add them to the bookmarks. The reverse is much easier: Now you can just drop the primary key and rename the table back to links_users:


class CreateBookmarks < ActiveRecord::Migration

  class Bookmark < ActiveRecord::Base; end

  def self.up

    create_table :bookmarks do |t|
      t.column :link_id,     :integer
      t.column :user_id,     :integer
    end

    User.find(:all).each do |u|
      u.links.each do |l|
        Bookmark.create!( :link_id => l.id, :user_id => u.id )
      end
    end    

    drop_table :links_users
    print "Change your associations for User and Link to has_many :through => :bookmarks"

  end

  def self.down
    remove_column :bookmarks, :id
    rename_table :bookmarks, :links_users
    print "Change your associations for User and Link to has_and_belongs_to_many"
  end

end

But it can be even easier than that: You might also just add the new primary key going up, and drop it going down:


class CreateBookmarks < ActiveRecord::Migration

  class Bookmark < ActiveRecord::Base; end

  def self.up

    rename_table :links_users, :bookmarks
    add_column :bookmarks, :id, :primary_key

    print "Change your associations for User and Link to has_many :through => :bookmarks"

  end

  def self.down
    remove_column :bookmarks, :id
    rename_table :bookmarks, :links_users
    print "Change your associations for User and Link to has_and_belongs_to_many"
  end

end

Yet that’s not all. If you want to preserve data across this migration, you will likely find that information will be lost when you move from the newly-modeled table to the table without the primary key. This is because it is very common to enforce uniqueness on one of the keys in the habtm join table. But when you model a classic many:many join, you probably won’t do that.
You may find that going “down” that you have added data to the has_many :through version that is incompatible with what you had in your habtm model; in which case you are going to have to define a rule to re-organize the data.

Filed under: Rails