SQL query that joins two tables and returns multiple matches from one column?

13,074

The group_concat() function does exactly what you need:

SELECT 
  blog_posts.post_id, 
  blog_posts.post_content, 
  blog_posts.post_author, 
  group_concat(blog_categories.category_id)
FROM blog_posts 
JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
GROUP BY 1, 2, 3
Share:
13,074

Related videos on Youtube

sporker
Author by

sporker

Updated on June 07, 2022

Comments

  • sporker
    sporker almost 2 years

    My title is terrible, and that's probably why I'm not finding what I want on Google.

    What I'm trying to do is export some data from an old in-house blog so I can import it into something else. My issue is that while I can kind of create the sort of JOIN I'm looking for, the match in the second table can contain multiple rows, so I end up with tons of duplicate data. I need to take the results from the second table and concat those (if there are multiple matches) into a single field in the query result. There is no need for a WHERE constraint on the query, I'm trying to retrieve the entire blog_posts table.

    Hopefully this abbreviated layout of the table structure will help illustrate:

    blog_posts              blog_categories
    ---------------------------------------
    post_id                 post_id
    post_content            category_id
    post_author
    

    And here's some sample data.

    blog_posts table data:

    post_id  post_content  post_author
    ----------------------------------
    1        foo1          bob
    2        foo2          bob
    3        foo3          fred
    

    blog_categories table data:

    post_id  category_id
    --------------------
    1        1
    1        2
    1        6
    2        1
    3        2
    3        4
    

    And what my ideal results would look like would be this:

    post_id  post_content  post_author  category_ids
    ------------------------------------------------
    1        foo1          bob          1,2,6
    2        foo2          bob          1
    3        foo3          fred         2,4
    

    The closest I could get was a simple join like this:

    SELECT 
        blog_posts.post_id, 
        blog_posts.post_content, 
        blog_posts.post_author, 
        blog_categories.category_id 
    FROM blog_posts 
        INNER JOIN blog_categories 
            ON blog_posts.post_id = blog_categories.post_id
    

    But that returns matches in the blog_posts table multiple times (one time for each category_id that matches).

    Is there any way to accomplish what I want using just SQL? I'm thinking some sort of sub-select would work, but what I can't wrap my head around how that would work - I know I'd essentially want to do a select in my "loop" for the category ids using the current post id, but the syntax for that escapes me. It need not be efficient, this is a one-time operation.

  • sporker
    sporker about 11 years
    Just for the record, this is MySQL.
  • Bohemian
    Bohemian about 11 years
    @BenjaminM YEs - it's a mysql only function, but the question is mysql
  • Benjamin M
    Benjamin M about 11 years
    the question is mysql since i made it mysql ;)
  • sporker
    sporker about 11 years
    Sorry. On the upside though, I work with PostgreSQL more often than MySQL, so I've added your answer to my local list of notes.
  • sporker
    sporker about 11 years
    I'm looking at this, and while the BLOB output is giving me issues with the export, it seems to work. I don't understand how it works in the slightest though, particularly the "GROUP BY" at the end. Are those category_id's or post_id's? I have around 420 post_id rows and 40 category_id rows - I'm hoping my "GROUP BY" doesn't actually need to list every one of those.
  • sporker
    sporker about 11 years
    Further, if I use GROUP BY blog_posts.post_id, blog_posts.post_content, blog_posts.post_author at the end of the query, it certainly seems to work. I spot-checked some data and it looks good. Even added another JOIN so that I can pull category names from yet another table.
  • Bohemian
    Bohemian about 11 years
    @sporker the SQL standard allows grouped columns to be referenced by their position instead of their expession. This is particularly handy when the column us a long calculation, but I find the brevity pleasing. Done shun this syntax, but I embrace it.