SQL query that joins two tables and returns multiple matches from one column?
The group_concat() function does exactly what you need:
    SELECT
        blog_posts.post_id,
        blog_posts.post_content,
        blog_posts.post_author,
        group_concat(blog_categories.category_id)
    FROM blog_posts
    JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
    GROUP BY 1, 2, 3
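To see the grouping in action, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same query. SQLite is an assumption made only to keep the example self-contained (it also provides a group_concat() aggregate); the question itself targets MySQL. The category_ids alias, the ORDER BY, and the helper names build_sample_db and posts_with_categories are illustrative additions, not part of the original answer.

```python
import sqlite3

# Same query as in the answer, plus an alias and an ORDER BY for stable output.
QUERY = """
SELECT
    blog_posts.post_id,
    blog_posts.post_content,
    blog_posts.post_author,
    group_concat(blog_categories.category_id) AS category_ids
FROM blog_posts
JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
GROUP BY 1, 2, 3
ORDER BY 1
"""


def build_sample_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE blog_posts (post_id INTEGER, post_content TEXT, post_author TEXT);
        CREATE TABLE blog_categories (post_id INTEGER, category_id INTEGER);
        INSERT INTO blog_posts VALUES
            (1, 'foo1', 'bob'), (2, 'foo2', 'bob'), (3, 'foo3', 'fred');
        INSERT INTO blog_categories VALUES
            (1, 1), (1, 2), (1, 6), (2, 1), (3, 2), (3, 4);
    """)
    return conn


def posts_with_categories(conn):
    """Return one row per post, with its category ids as a sorted list of ints."""
    rows = conn.execute(QUERY).fetchall()
    # The order of values inside group_concat() is not guaranteed, so sort them.
    return [
        (post_id, content, author, sorted(int(c) for c in ids.split(",")))
        for post_id, content, author, ids in rows
    ]


if __name__ == "__main__":
    for row in posts_with_categories(build_sample_db()):
        print(row)
```

Note that a plain JOIN is an inner join, so a post with no rows in blog_categories would be missing from the result. If every post has to be exported, a LEFT JOIN is probably the safer choice.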
sporker
Updated on June 07, 2022
Comments
-
sporker almost 2 years
My title is terrible, and that's probably why I'm not finding what I want on Google.
What I'm trying to do is export some data from an old in-house blog so I can import it into something else. My issue is that while I can kind of create the sort of JOIN I'm looking for, the match in the second table can contain multiple rows, so I end up with tons of duplicate data. I need to take the results from the second table and concatenate them (if there are multiple matches) into a single field in the query result. There is no need for a WHERE constraint on the query; I'm trying to retrieve the entire blog_posts table.
Hopefully this abbreviated layout of the table structure will help illustrate:
    blog_posts        blog_categories
    ---------------------------------------
    post_id           post_id
    post_content      category_id
    post_author
And here's some sample data.
blog_posts table data:
    post_id  post_content  post_author
    ----------------------------------
    1        foo1          bob
    2        foo2          bob
    3        foo3          fred
blog_categories table data:
    post_id  category_id
    --------------------
    1        1
    1        2
    1        6
    2        1
    3        2
    3        4
And what my ideal results would look like would be this:
    post_id  post_content  post_author  category_ids
    ------------------------------------------------
    1        foo1          bob          1,2,6
    2        foo2          bob          1
    3        foo3          fred         2,4
The closest I could get was a simple join like this:
    SELECT
        blog_posts.post_id,
        blog_posts.post_content,
        blog_posts.post_author,
        blog_categories.category_id
    FROM blog_posts
    INNER JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
But that returns matches in the blog_posts table multiple times (one time for each category_id that matches).
Is there any way to accomplish what I want using just SQL? I'm thinking some sort of sub-select would work, but I can't wrap my head around how. I know I'd essentially want to do a select in my "loop" for the category ids using the current post id, but the syntax for that escapes me. It need not be efficient; this is a one-time operation.
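For reference, the sub-select idea described above could look roughly like the sketch below: a correlated subquery runs once per post and collects that post's category ids. This is only one possible reading of that idea. It is run here against an in-memory SQLite database through Python's sqlite3 module, which is an assumption made to keep the example self-contained; the aliases p and c and the helper names are illustrative.

```python
import sqlite3

# Correlated subquery: for each post row, gather the ids of its categories.
SUBSELECT_QUERY = """
SELECT
    p.post_id,
    p.post_content,
    p.post_author,
    (SELECT group_concat(c.category_id)
       FROM blog_categories AS c
      WHERE c.post_id = p.post_id) AS category_ids
FROM blog_posts AS p
ORDER BY p.post_id
"""


def build_sample_db():
    """Create an in-memory database holding the sample rows shown above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE blog_posts (post_id INTEGER, post_content TEXT, post_author TEXT);
        CREATE TABLE blog_categories (post_id INTEGER, category_id INTEGER);
        INSERT INTO blog_posts VALUES
            (1, 'foo1', 'bob'), (2, 'foo2', 'bob'), (3, 'foo3', 'fred');
        INSERT INTO blog_categories VALUES
            (1, 1), (1, 2), (1, 6), (2, 1), (3, 2), (3, 4);
    """)
    return conn


def posts_via_subselect(conn):
    """Return one row per post; category ids become a sorted list of ints."""
    rows = conn.execute(SUBSELECT_QUERY).fetchall()
    result = []
    for post_id, content, author, ids in rows:
        # A post without categories yields NULL (None) rather than a string.
        categories = sorted(int(c) for c in ids.split(",")) if ids else []
        result.append((post_id, content, author, categories))
    return result


if __name__ == "__main__":
    for row in posts_via_subselect(build_sample_db()):
        print(row)
```

Because the outer query reads blog_posts directly, every post appears exactly once, including any post that has no categories at all.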
-
sporker about 11 years
Just for the record, this is MySQL.
-
Bohemian about 11 years
@BenjaminM Yes, it's a MySQL-only function, but the question is MySQL.
-
Benjamin M about 11 years
The question is MySQL since I made it MySQL ;)
-
sporker about 11 years
Sorry. On the upside though, I work with PostgreSQL more often than MySQL, so I've added your answer to my local list of notes.
-
sporker about 11 years
I'm looking at this, and while the BLOB output is giving me issues with the export, it seems to work. I don't understand how it works in the slightest though, particularly the "GROUP BY" at the end. Are those category_id's or post_id's? I have around 420 post_id rows and 40 category_id rows - I'm hoping my "GROUP BY" doesn't actually need to list every one of those.
-
sporker about 11 years
Further, if I use
    GROUP BY blog_posts.post_id, blog_posts.post_content, blog_posts.post_author
at the end of the query, it certainly seems to work. I spot-checked some data and it looks good. I even added another JOIN so that I can pull category names from yet another table.
-
Bohemian about 11 years
@sporker the SQL standard allows grouped columns to be referenced by their position instead of their expression, so the 1, 2, 3 refer to the first three columns of the SELECT list, not to any id values. This is particularly handy when the column is a long calculation, and I find the brevity pleasing. Some shun this syntax, but I embrace it.
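As a rough illustration of the point about positions, the sketch below runs the grouped query twice, once with GROUP BY 1, 2, 3 and once with the three column names spelled out, and checks that both return the same rows. It uses an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for a self-contained example (SQLite accepts positional GROUP BY much as MySQL does). count() is swapped in for group_concat() here only so the comparison is deterministic; the helper names are illustrative.

```python
import sqlite3

# Shared part of both queries; only the GROUP BY clause differs.
SELECT_PART = """
SELECT
    blog_posts.post_id,
    blog_posts.post_content,
    blog_posts.post_author,
    count(blog_categories.category_id) AS category_count
FROM blog_posts
JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
"""

# Positions 1, 2, 3 point at the first three columns of the SELECT list.
POSITIONAL = SELECT_PART + "GROUP BY 1, 2, 3 ORDER BY 1"

# The same grouping with the column names written out.
EXPLICIT = SELECT_PART + (
    "GROUP BY blog_posts.post_id, blog_posts.post_content, blog_posts.post_author "
    "ORDER BY blog_posts.post_id"
)


def build_sample_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE blog_posts (post_id INTEGER, post_content TEXT, post_author TEXT);
        CREATE TABLE blog_categories (post_id INTEGER, category_id INTEGER);
        INSERT INTO blog_posts VALUES
            (1, 'foo1', 'bob'), (2, 'foo2', 'bob'), (3, 'foo3', 'fred');
        INSERT INTO blog_categories VALUES
            (1, 1), (1, 2), (1, 6), (2, 1), (3, 2), (3, 4);
    """)
    return conn


def run_both(conn):
    """Run the positional and the explicit variant and return both row lists."""
    positional_rows = conn.execute(POSITIONAL).fetchall()
    explicit_rows = conn.execute(EXPLICIT).fetchall()
    return positional_rows, explicit_rows


if __name__ == "__main__":
    by_position, by_name = run_both(build_sample_db())
    print(by_position == by_name)
    for row in by_position:
        print(row)
```

Either way, the GROUP BY lists columns, not values, so it stays three items long no matter how many posts or categories the tables hold.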