Applying a different order to postgres "DISTINCT ON" using rails

12,255

Solution 1

You're trying to get a set of projects but you're starting with current_user.tasks.

Why not start with current_user.projects, which guarantees distinct projects?

@projects = current_user.projects.includes(:tasks).order("projects.name, tasks.milestone")

Alternative Answer

@projects = current_user.projects.joins(:tasks).select('projects.*, min(tasks.milestone) as next_milestone').order('projects.name').group('projects.id')
@projects.each{|p| puts "#{p.name} #{p.next_milestone}"}

That'll give you one row for each project, with a calculated minimum tasks.milestone value, accessible on the project row result via next_milestone. No extra tasks record, just the next milestone date.

Solution 2

In the user controller:

inner_query = current_user.tasks.next.to_sql
@projects = Task.paginate_by_sql("select * from (#{inner_query}) as user_projects order by user_projects.#{sort_column} #{sort_direction}", :page => params[:page])

And in the task model:

scope :next, select("distinct on (projects.id) projects.*, tasks.*").reorder("projects.id, tasks.milestone ASC")

This way uses the power of postgres to only return the necessary records making the recordset smaller and easier to work with but the trade-off is that the RoR code doesn't look as inviting or as readable as it does with Carlos Drew's suggestion.

Solution 3

To answer this question:

I suppose my question is simply how do I wrap an activerecord query in a surrounding SELECT and ORDER B

Since ActiveRecord 4.0.2 there is now <model>.from.

An example using your models:

inner_query = Project.joins(:tasks).select("DISTINCT ON (projects.id), *") // SELECT DISTINCT ON (projects.id) FROM projects INNER JOIN tasks ON tasks.project_id = projects.id;

You can wrap it in a from:

sorted_query = Project.from(inner_query, :projects).order(:name)
Share:
12,255
user1116573
Author by

user1116573

Updated on June 06, 2022

Comments

  • user1116573
    user1116573 almost 2 years

    I have a rails app:

    user has_many :projects

    user has_many :tasks, :through => :projects

    project has_many :tasks

    Each task has a milestone date.

    To show a table of project details with next milestone date I am using:

    @projects = current_user.tasks.joins(:project).select("distinct on (projects.id) projects.*, tasks.*").reorder("projects.id, tasks.milestone ASC")
    

    This works fine.

    I now want to be able to sort the table columns.

    According to Postgres DISTINCT ON is not sortable, you have to wrap it in another select statement, ie SELECT * FROM (SELECT DISTINCT ON....) ORDER BY column_3

    I did think that the column being ordered could just be worked into the SQL as required, ie (to order by project name DESC):

    @projects = current_user.tasks.joins(:project).select("distinct on (projects.name) projects.*, tasks.*").reorder("projects.name DESC, tasks.milestone ASC")
    

    which works but I also want to be able to order by milestone and that doesn't work that way.

    Can someone tell me how to convert my rails query so that it can be ordered by any of the columns?

    UPDATE


    I suppose my question is simply how do I wrap an activerecord query in a surrounding SELECT and ORDER BY?

    I think I've managed to achieve it using:

    inner_query = current_user.tasks.select("distinct on (projects.id) projects.*, tasks.*").reorder("projects.id, tasks.milestone ASC").to_sql
    @projects = Task.paginate_by_sql("select * from (#{inner_query}) as user_projects order by user_projects.name", :page => params[:page])
    

    Is that the best way or can someone think of a better way? - find/paginate_by_sql seems like a workaround and I would have preferred to stay within the realms of activerecord query.

    Thanks

  • user1116573
    user1116573 almost 11 years
    This probably would've worked except that each project has multiple tasks and each task has a milestone date. I need the table to display the next milestone date and I think your suggestion doesn't allow for it without doing a separate db lookup for each row/record.
  • Carlos Drew
    Carlos Drew almost 11 years
    No, you're wrong about the separate database lookup for each row/record. The includes(:tasks) guarantees that you get all the tasks for all the projects within one query. But, I see what you're saying with regards to needing the next milestone date... Let me think about that for a minute. You want to avoid any expensive ruby (or another query) to figure out which task has the next milestone date...
  • Carlos Drew
    Carlos Drew almost 11 years
    I just tested my method and it works. The combination of includes(:tasks) and order("tasks.milestone") means that you get the project.tasks loaded into memory in order by milestone. That means that you can reliably ask for project.tasks.first to get the task with the earliest milestone or project.tasks.last to get the tasks with the latest milestone.
  • user1116573
    user1116573 almost 11 years
    Hi Carlos Drew, thanks for this. I've just given this a go but then I realised I also need the ability to reorder the table. Can you advise how your .order() will change if I want to reoder the table using the following params #{sort_column} #{sort_direction} where sort_column could be projects.name, tasks.milestone or projects.description (another column being displayed). Thanks.
  • Carlos Drew
    Carlos Drew almost 11 years
    What I gave you is a valid ActiveRecord relation, which means you can attach any sort of further ActiveRecord pagination logic to it, such as will_paginate: github.com/mislav/will_paginate
  • Carlos Drew
    Carlos Drew almost 11 years
    With regards to ordering: you can order however you want, by whatever columns requested. But, project.tasks.first will only be the earliest milestone if tasks.milestone is the first tasks order argument.
  • Carlos Drew
    Carlos Drew almost 11 years
    Have I answered your original question sufficiently?
  • user1116573
    user1116573 almost 11 years
    Hi Carlos Drew, I've posted my own alternative answer. Can you see what you think. I'd appreciate your opinion on whether you think the trade-off between readable RoR code is worth the extra records being returned etc (I don't expect there to be too many records per project but I just think it might be easier to work with the data knowing that only the records I need are in the recordset).
  • Carlos Drew
    Carlos Drew almost 11 years
    Honestly, I would not want that code in my project, as it is not readable or maintainable to me. I also work with other coders, and I would not want to burden them with understanding it or relying on it. I find the @projects = Task.etc. to be especially bad code. Projects are not tasks. If this were the only method I could find to do what I wanted, I would doubt that my approach to the problem was a good one and try to rethink the issue.
  • Carlos Drew
    Carlos Drew almost 11 years
    For example, if what you're figuring out is that projects really care about knowing the next milestone date, I would probably find a way to cache that value directly on the project object, instead of relying on a complex, multi-part, unreadable query.
  • Carlos Drew
    Carlos Drew almost 11 years
    But, finally, I don't find the issue of loading many tasks on a project but not using them to be a real issue.
  • Carlos Drew
    Carlos Drew almost 11 years
    I added an alternate solution that doesn't return any tasks at all, but just the next task milestone.
  • user1116573
    user1116573 almost 11 years
    Went with your alternative answer. Thanks for all your help. I had to apply DateTime.parse(next_milestone) so that I could format it.
  • user1116573
    user1116573 almost 11 years
    Hi Carlos Drew. Any ideas how to allow .count on the alternative version? It keeps returning {} instead of 0.
  • user1116573
    user1116573 almost 11 years
    Nevermind, solved it by using .count(:all, :distinct => true) instead of just .count
  • Ken Ratanachai S.
    Ken Ratanachai S. over 2 years
    This answer is the winner! It keeps ActiveRecord::Relation alive, so my pagination is still working with this solution. Thanks!