Faster Net::HTTP for Ruby 1.8.6
Dec/08
We’ve been a bit frustrated at work with Net::HTTP performance (as have so many others), so here’s a monkeypatch for 1.8.6 that combines the buffer size increase in 1.8.7 with Aaron Patterson’s recent tweak to use non-blocking IO. Unfortunately, the non-blocking IO patch doesn’t work with HTTPS, which is why we fall back to the buffer size tweak when the @io variable suggests that HTTPS is happening. No guarantee implied, etc.
require 'net/http'

class Net::BufferedIO #:nodoc:
  alias :old_rbuf_fill :rbuf_fill

  # Defined at class level: Ruby rejects constant assignment inside a method.
  BUFSIZE = 1024 * 16

  def rbuf_fill
    # HTTPS can't use the non-blocking strategy below in 1.8.6; so at least
    # increase buffer size over 1.8.6 default of 1024
    if !@io.respond_to? :read_nonblock
      timeout(@read_timeout) {
        @rbuf << @io.sysread(BUFSIZE)
      }
      return
    end

    # non-blocking
    begin
      @rbuf << @io.read_nonblock(BUFSIZE)
    rescue Errno::EWOULDBLOCK
      # Nothing to read yet: wait until the socket is readable or we time out.
      if IO.select([@io], nil, nil, @read_timeout)
        @rbuf << @io.read_nonblock(BUFSIZE)
      else
        raise Timeout::TimeoutError
      end
    end
  end
end
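To see the non-blocking strategy in isolation, here is a minimal sketch that uses a plain pipe instead of a socket. It assumes a modern Ruby, where an empty non-blocking read raises IO::WaitReadable; the helper name read_chunk is made up for illustration and is not part of Net::HTTP.

```ruby
require 'timeout'

# Hypothetical helper (not part of Net::HTTP): read up to +bufsize+ bytes
# from +io+, waiting at most +read_timeout+ seconds for data to arrive.
def read_chunk(io, bufsize, read_timeout)
  io.read_nonblock(bufsize)
rescue IO::WaitReadable
  # Nothing buffered yet: wait in select until readable or timed out.
  if IO.select([io], nil, nil, read_timeout)
    retry
  else
    raise Timeout::Error, "no data within #{read_timeout}s"
  end
end

reader, writer = IO.pipe
writer.write("hello")

# Data is already waiting, so this returns without blocking.
first = read_chunk(reader, 16 * 1024, 1)

# The pipe is empty now, so the second read should time out.
timed_out = begin
  read_chunk(reader, 16 * 1024, 0.1)
  false
rescue Timeout::Error
  true
end
```

The point of the pattern is that the read never sits inside a timeout block; the only wait is the select call, which carries its own deadline.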
Atrocious Rails API doc of the week
Nov/08
Here’s an example of some atrocious Rails API documentation.
(I say “of the week,” which raises the question whether I could really find an atrocious Rails doc every week . . . but I probably can.)
Here’s the doc for the link_to helper:
http://api.rubyonrails.org/classes/ActionView/Helpers/UrlHelper.html#M001378
Notice that there is a heading “Options” describing keys such as :confirm, :popup, etc.
Well guess what? Those Options are for the second Hash, i.e., html_options. How a newbie would ever figure this out is beyond me. I suppose you’re supposed to follow the link over to url_for and figure out that those options are for the first Hash . . . and then somehow that the “Options” given for link_to are for the 2nd Hash. One of my students in my Harvard Ruby/Rails course stumbled across this.
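Why the distinction between the two Hashes matters can be shown in plain Ruby. The method below is a hypothetical stand-in with the parameter shape described above (a name, a first Hash, a second Hash); it is not the Rails helper, and it only reports where each key ended up.

```ruby
# Hypothetical stand-in, NOT Rails code: a name, then a first Hash
# (URL options), then a second Hash (HTML options).
def link_to_args(name, options = {}, html_options = nil)
  { :url => options, :html => html_options }
end

# Without braces, Ruby gathers every key into the first Hash,
# so :confirm ends up alongside the URL options.
merged = link_to_args("Delete", :action => "destroy", :confirm => "Sure?")

# Braces around the first Hash push :confirm into the second one.
split = link_to_args("Delete", { :action => "destroy" }, :confirm => "Sure?")
```

So a newbie who copies :confirm from the “Options” heading into a brace-less call would, under these assumptions, be passing it to the wrong Hash without any error.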
Teaching Ruby and Ruby on Rails again at Harvard
Aug/08
Once again, I’m pleased to be offering a course on Ruby and Ruby on Rails at Harvard: course; course site.
We’ll try to avoid this anti-pattern:
Capistrano logging
May/08
The logging in Capistrano could be so much better. Just for example, when a task is initiated, that log item should really pop out. At present, the log line starts with “executing” for both the indication that a task has been started, and for running a remote command. Because the same word is used, it is hard to see at a glance where you are in a long task. So at present it is:
* executing task reload_apache
* executing "sudo /etc/init.d/apache2 reload"
This is lame. Task initiation is important, and should be called out in the first word. Similarly, when a command is run with sudo, the fact of “sudo” probably has priority over “executing.” So for my eyes, at least, something like this would be far more communicative:
*** Task: reload_apache
* sudo: "/etc/init.d/apache2 reload"
And output and errors might be flagged with two stars.
I bet changing the logging for the task, at least, would be a quick monkeypatch.
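As a rough sketch of the relabeling proposed above, here is a plain-Ruby function that rewrites log lines after the fact. The name relabel is hypothetical and nothing here touches Capistrano’s own logger; the patterns assume the two line shapes quoted earlier.

```ruby
# Hypothetical helper, not part of Capistrano: rewrites one log line.
def relabel(line)
  case line
  when /\A\s*\* executing task (\S+)/
    # Task initiation gets three stars and leads with "Task".
    "*** Task: #{$1}"
  when /\A\s*\* executing "sudo (.*)"/
    # Commands run with sudo lead with "sudo" instead of "executing".
    %(* sudo: "#{$1}")
  else
    line
  end
end

log = [
  '* executing task reload_apache',
  '* executing "sudo /etc/init.d/apache2 reload"'
]
relabeled = log.map { |l| relabel(l) }
```

A real monkeypatch would have to hook wherever Capistrano emits these lines, which this sketch does not attempt.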
Pro Active Record (Book Review)
Nov/07
Kevin Marshall, Chad Pytel, and Jon Yurek, Pro Active Record: Databases with Ruby and Rails (2007). $39.99. [Amazon]
Pro Active Record is all about ActiveRecord, the object-relational mapping layer that comes with Rails. In this review I’ll call ActiveRecord AR. The book has chapters on SQL, setting up your database, the core features of AR, extra AR goodies, testing and debugging, working with legacy schema, AR and the real world, and, finally, a summary of the AR API (interestingly, the API summary is not a copy of the official docs, but has additional comments from the authors, so there is some real added value in this “back of the book” reference material).
Before I begin I want to note that these kinds of framework books and chapters are really hard to write (I know: I contributed to the O’Reilly JavaEE book). The reason is that a lot of readers come to a subject such as AR without understanding the problem for which AR is the solution. I.e., if you don’t already have some understanding of relational databases and SQL, and maybe have some experience with different RDBMS from more than one vendor, some of what’s going on in AR and in this book can be lost.
Generally I would say that if you do any non-vanilla work with AR, you should own this book as a supplement to the account of AR in Agile Web Development with Rails. There is genuinely useful information throughout the text. Here are a few examples that got a check or exclamation point in the margin while I was reading: executing migrations outside of Rails (p. 49); dealing with migrations in source control (p. 53); a nice discussion of writing good tests (pp. 127ff); more detail on the more obscure test assertions than I’ve seen elsewhere (pp. 129-139); a discussion of transactions and fixtures (p. 141); how to do .csv fixtures (pp. 142-143 — annoying, by the way; this could be a lot better in AR); good stuff on AR exceptions (pp. 144ff); random notes on legacy schema integration (chapter 7), internationalization (p. 204), use of UUIDs for PKs (p. 205), and some canned associations for typical use cases (pp. 208-209).
The book also handles well the true core of AR, setting up associations between models, and validating models (chapter 4, “Core Features of ActiveRecord”). The example “domain” is about managing Farmers, Cows, and Milk, which I found just weird and kinky enough that I learned something. A peculiarity of this chapter is that it kicks off with Callbacks rather than Associations and Validations, so a beginning reader is going to have to wait before getting to the core of the core. But I digress.
A brief tour de force is a detailed example of extending AR with Ruby meta-programming (pp. 109-123). This is one of the longer discussions I’ve seen anywhere regarding AR that provides insight into how it actually works, and how you create your own convenience apparatus for easy-to-write queries.
Now some less good news. The book’s title makes it sound as though it is going to be complete and exhaustive, but it is not. There are lots of little things that are missing. Just for example, I couldn’t find a discussion of :source for the has_many :through association, which is necessary in some cases and has vexed my students. In fact, the chapter that gives the API doesn’t even document ActiveRecord::Associations, which is a major gap (to be sure, there is much information here on associations, but if you’re going to detail the API for ActiveRecord::Base, why leave out stuff as important as ::Associations?). Another aspect that is missing is dealing with some of the cases where you need to have multiple belongs_to foreign keys: AR can get confused, and it would be helpful if someone played this out at some length; this book doesn’t do that. Elsewhere, mention is made of a discussion of SQL injection (p. 150), but that discussion never appears.
Had the title been something like “ActiveRecord Techniques” I’d be less concerned about these gaps. I hope they’ll be addressed in a second edition that is larger and more capacious: We need a single-volume “big book” on AR, and this could be it with some additional work.
In sum: Good book! Great bits! 2nd edition has potential to be awesome.
Sigh: Converting from HABTM to has_many :through is easier than I thought . . .
Nov/07
So the Ministry of Truth has had to intervene on my earlier post.
ActiveRecord: Migrating habtm to model table suitable for has_many :through
Oct/07
[Ha, ha, ActiveRecord has the last laugh, and this is much easier than I thought. The Rails Wiki must have been wrong, or it must have been that you couldn't add a primary key in an earlier version of AR.]
ActiveRecord will facilitate many:many relationships across a plain join table that doesn’t have a primary key. This is called “has and belongs to many.” It turns out that it is easy to convert this into a join table where the join table itself represents something.
The example here is a bookmarking application like del.icio.us: you have many users with many links (and vice versa), and model those things as a User class and a Link class. Each User has and belongs to many Links; each Link has and belongs to many Users. As a first cut, you might put the title of the URL in the Link class. So John and Amy might each have a reference to the same link. The catch in this model is that, because they share the same Link, they must also share its title. Our model prevents one of them from saving his or her link for the NY Times with a personal title such as “The Yankee-Loving New York Times,” or some other appropriate moniker. So with that issue in mind, we would like to migrate our schema so that there is a Bookmark class in between User and Link. Now we will put the title in Bookmark, and let the User edit it. If we do this, Amy and John can have bookmarks for the same link (the URL being represented in the Link class) but different titles (in the Bookmark class).
Let’s take it from the top, shall we? In the original model, if you have a model User, and a model Link, and you want to have a many:many relationship, you can define an association for each one of the form has_and_belongs_to_many :links (and the reverse). Then you create a join table like the following (note the suppression of the primary key):
class CreateLinksUsers < ActiveRecord::Migration
  def self.up
    create_table :links_users, :id => false do |t|
      t.column :link_id, :integer
      t.column :user_id, :integer
    end
  end

  def self.down
    drop_table :links_users
  end
end
As time proceeds, we discover that we are interested in adding some extra data that should go on the join table. For instance, we might want the time/date when it was added, or notes specific to the relationship between the link and the user… Or a title, as above. At this point we would need to model it as a first-class (so to speak) ActiveRecord class and use the associations has_many :through for the “endpoints” and belongs_to for the in-between class.
Sadly, if you are like me, you might read the Rails Wiki pages before you do anything, and you would read the following which is all wrong:
Unfortunately, this is tricky because ActiveRecord does not model the original links_users table: It is purely a vehicle for the habtm association. We might think we could rename the table to, say, :bookmarks, and then add a primary key . . . But ActiveRecord does not allow one to add a primary key column to an already-existing table:
Note: The API doc on add_column refers to column_types which say that it can be of type :primary_key. This is not true. add_column cannot use :primary_key as a type (see Rails Wiki, Using Migrations).
So . . . What to do? One way is to create a new bookmarks table, and then model it inside the migration. Loop over your users and links to extract the ids, and add them to the bookmarks. The reverse is much easier: you can just drop the primary key and rename the table back to links_users:
class CreateBookmarks < ActiveRecord::Migration
  class Bookmark < ActiveRecord::Base; end

  def self.up
    create_table :bookmarks do |t|
      t.column :link_id, :integer
      t.column :user_id, :integer
    end
    User.find(:all).each do |u|
      u.links.each do |l|
        Bookmark.create!( :link_id => l.id, :user_id => u.id )
      end
    end
    drop_table :links_users
    print "Change your associations for User and Link to has_many :through => :bookmarks"
  end

  def self.down
    remove_column :bookmarks, :id
    rename_table :bookmarks, :links_users
    print "Change your associations for User and Link to has_and_belongs_to_many"
  end
end
But it can be even easier than that: You might also just add the new primary key going up, and drop it going down:
class CreateBookmarks < ActiveRecord::Migration
  class Bookmark < ActiveRecord::Base; end

  def self.up
    rename_table :links_users, :bookmarks
    add_column :bookmarks, :id, :primary_key
    print "Change your associations for User and Link to has_many :through => :bookmarks"
  end

  def self.down
    remove_column :bookmarks, :id
    rename_table :bookmarks, :links_users
    print "Change your associations for User and Link to has_and_belongs_to_many"
  end
end
Yet that’s not all. If you want to preserve data across this migration, you will likely find that information is lost when you move from the newly-modeled table back to the table without the primary key. This is because it is very common to enforce uniqueness on one of the keys in the habtm join table, whereas when you model a classic many:many join you probably won’t do that. Going “down,” you may find that you have added data to the has_many :through version that is incompatible with what you had in your habtm model, in which case you are going to have to define a rule to re-organize the data.
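What such a rule might look like depends entirely on your data. As one illustration only, the plain-Ruby sketch below assumes the habtm table allowed each user/link pair once, and that the oldest bookmark should win; the rows, the titles, and the method name to_join_rows are all made up for the example.

```ruby
# Hypothetical rows as they might come out of the bookmarks table.
bookmarks = [
  { :id => 1, :user_id => 1, :link_id => 10, :title => "NY Times" },
  { :id => 2, :user_id => 2, :link_id => 10, :title => "The Yankee-Loving New York Times" },
  { :id => 3, :user_id => 1, :link_id => 10, :title => "Times again" }
]

# Assumed rule: keep the oldest row (lowest id) per (user_id, link_id) pair,
# then drop the columns that links_users cannot hold.
def to_join_rows(rows)
  rows.sort_by { |r| r[:id] }
      .uniq { |r| [r[:user_id], r[:link_id]] }
      .map { |r| { :user_id => r[:user_id], :link_id => r[:link_id] } }
end

join_rows = to_join_rows(bookmarks)
```

Note what the rule throws away: the third bookmark and every title are gone, which is exactly the kind of loss to decide on deliberately before running the “down” migration.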
Tweaks to scaffolding for adding “child” rows
Aug/07
There are times when you need to use a bit of Rails scaffolding to get up and running and for various reasons you don’t want to resort to ActiveScaffold or some other medium-weight strategy. All you want to do is provide the user a means to see a “master” row, and then allow for adding a row to a dependent (“child”) table. Here we’re talking about “the simplest thing that could possibly work.” This example is for Rails 1.2.3.
Typically the recipe is going to be like so:
Add your scaffolds:
script/generate scaffold master Admin::Master
script/generate scaffold child Admin::Child
Edit your migrations to get your data model where you want it.
In master/show.rhtml, add
<%= link_to 'Add child', :controller => '/admin/child', :action => 'new', :id => @master.id %> <br/><br/>
In child/_form.rhtml, add
<%= hidden_field 'child', 'master_id', :value => params[:id] %>
That’s essentially it. Now after adding a Master record, you click the scaffold’s “show” link, and that view will give you a link to add a Child record with its foreign key set to the id of the Master for which you are adding the Child.
As soon as it gets complicated, though, go use ActiveScaffold.

