Retrieving the last N ordered records with ActiveRecord

This post was originally published on the Elabs blog, before Elabs and Varvet joined forces.

Here’s a database of chat messages.

|  id  | created_at |
|——————|————————————|
|   1  |   13:20    |
|   2  |   13:21    |
|   3  |   13:22    |
|   …  |   ……………    |
|  121 |   14:27    |
|  122 |   14:29    |
|  123 |   14:32    |
|——————|————————————|

So, we’re building a chat, and what we’d like to see is the 10 most recent messages, with the oldest of the bunch at the top, and the newest message at the bottom, something like: [<Message 113>, <Message 114>, <Message 115>, …, <Message 123>].

At first, you might think "I’ll just sort all messages by created_at in ascending order, and take the last 10". OK, here’s what that looks like.

Message.order(created_at: :asc).last(10) # => [<Message 113>, <Message 114>, …, <Message 123>]

Looks good? Yes? No. Have a look at the SQL:

 SELECT "messages".* FROM "messages"  ORDER BY "messages"."created_at" ASC

   |  id  | created_at |
   |——————|————————————|
-> |   1  |   13:20    |
-> |   2  |   13:21    |
-> |   3  |   13:22    |
-> |   …  |   ……………    |
-> |  121 |   14:27    |
-> |  122 |   14:29    |
-> |  123 |   14:32    |
   |——————|————————————|

What, no mention of 10 in our SQL query?! .last is not so clever. We end up loading all messages in our database to Ruby, maybe a few hundred thousand, and then we throw away all messages except for the last 10, what a waste.

Okay, how about using OFFSET? Let’s try.

Message.order(created_at: :asc).offset(Message.count - 10).limit(10) # => [<Message 113>, <Message 114>, …, <Message 123>]        

Looks good? Yes? No. Let’s look at the SQL:

 SELECT "messages".* FROM "messages"  ORDER BY "messages"."created_at" ASC OFFSET 112 LIMIT 10

     |  id  | created_at |
     |——————|————————————|
SKIP |   1  |   13:20    |
SKIP |   2  |   13:21    |
SKIP |   …  |   ……………    |
  -> |  113 |   14:13    |
  -> |   …  |   ……………    |
  -> |  123 |   14:32    |
     |——————|————————————|

A few important notes about this.

  1. OFFSET in SQL must be a positive number so we can’t simply use -10 as our offset.
  2. ActiveRecord will try to coerce our offset to an integer using #to_i, so we can’t pass a subquery as our offset, which leaves us with precomputed positive numbers only.
  3. What is our offset? It must be calculated ahead of time using Message.count, this is slow, and prone to race conditions if we get more messages in between our count and select.

Even if we disregard all the above points, have a look at the documentation for LIMIT/OFFSET in PostgreSQL: "The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient."

Oh, OK, so offset is off.

One more try. Let’s sort this out with a new mindset. We change the order of our SQL, and reverse the thing in Ruby. That must work!

Message.order(created_at: :asc).reverse_order.limit(10).reverse # => [<Message 113>, <Message 114>, …, <Message 123>]

Looks good? Yes? Let’s look at the SQL.

 SELECT "messages".* FROM "messages"  ORDER BY "messages"."created_at" DESC LIMIT 10

   |  id  | created_at |
   |——————|————————————|
-> |  123 |   14:32    |
-> |   …  |   ……………    |
-> |  113 |   14:13    |
   |——————|————————————|

Well, this is actually quite OK. We only retrieve 10 records from the database, which is what we want. Sure, the order of the messages is wrong which is kind of sad, but we can fix that later. Let’s make a scope of this and call it a good day!

class << Message
  def in_order
    order(created_at: :asc)
  end

  def recent(n)
    in_order.reverse_order.limit(n).reverse
  end
end

There’s a few downsides with this, can you spot it?

  1. We have to reverse it in Ruby.
  2. The return value is an Array, and not an ActiveRecord::Relation, since we force it with reverse.
  3. We can’t merge an Array with other scopes.
  4. We can’t chain additional SQL conditions to the end of our array to further filter the 10 results, e.g. Message.recent(5).where(…).

We can still do better!

What if I told you there is a way to reverse the result in SQL, and that there’s also an OK way to do so with ActiveRecord? I’m sure you’d believe me after all of this, anything else would be cruel.

class Message
  class << self
    def in_order
      order(created_at: :asc)
    end

    def recent(n)
      in_order.endmost(n)
    end

    def endmost(n)
      all.only(:order).from(all.reverse_order.limit(n), table_name)
    end
  end
end

Looks good? Yes! Let’s use this and have a look at the SQL, Message.recent(10):

SELECT "messages".* FROM (
  SELECT  "messages".* FROM "messages"  ORDER BY "messages"."created_at" DESC LIMIT 10
) messages  ORDER BY "messages"."created_at" ASC

   |  id  | created_at |
   |——————|————————————|
-> |  113 |   14:13    |
-> |   …  |   ……………    |
-> |  123 |   14:32    |
   |——————|————————————|

This is exactly what we want, and it has none of the downsides of reversing the results in Ruby. The final .recent method is our final implementation, and it works as expected.

The keen eye will notice that I extracted part of the logic into a general-purpose .endmost method. .endmost is what you want when you call .last: the last N records in the result set without having to retrieve all records from the database, and it works with any ordering. You can impose filtering before, Message.where(…).in_order.endmost(10), and afterwards to filter your final results as well Message.in_order.endmost(10).where(…).

Thanks for reading! I hope you found it as useful as I did!