MySQL subquery - Find only first record in a LEFT JOIN
This is the greatest-n-per-group
problem, which is asked frequently on Stack Overflow.
Here's how I would solve it in your scenario:
SELECT m.memberid, m.membername, m.gender, mp.phone, mh.loggedtime, mh.ipaddy
FROM tbl_members m
INNER JOIN tbl_members_phones mp ON m.defaultphoneid = mp.phoneid
INNER JOIN tbl_members_addresses ma ON m.defaultaddressid = ma.addressid
LEFT OUTER JOIN tbl_members_login_history mh ON m.memberid = mh.memberid
LEFT OUTER JOIN tbl_members_login_history mh2 ON m.memberid = mh2.memberid
AND mh.pk < mh2.pk
WHERE mh2.pk IS NULL;
That is, we want mh
to be the most recent row in tbl_member_login_history for the given memberid. So we search for another row mh2
that is even more recent. If none more recent than the mh
row is found, then mh2.*
will be NULL, so mh
must be the most recent.
I'm assuming this table has a primary key column that contains increasing values. For this example, I assume the column name is pk
.
Using LEFT OUTER JOIN for both references to the login history table means that the m
row will be reported even if there is no matching row.
coffeemonitor
Updated on July 29, 2022Comments
-
coffeemonitor almost 2 years
I'm trying to display a list of member records, and I have a few tables I'm using to display what I need.
That's the easy part. The part I need help with is with a table that has many records to each member record: Login history
I want to display only the first row for each member record, that exists in the Login History table. Alternatively, I may want to flip flop and display the last record in the Login History table, as well.
here's what I've got so far:
SELECT m.memberid, m.membername, m.gender, mp.phone FROM tbl_members m, tbl_members_phones mp, tbl_members_addresses ma WHERE m.defaultphoneid = mp.phoneid AND m.defaultaddressid = ma.addressid
So that returns what's expected.
The 2 columns from
tbl_members_login_history
I'd like to add to the returned result are: mh.loggedtime
, mh.ipaddy
I know adding the
tbl_members_login_history
as a LEFT JOIN would return duplicates, so I'm thinking there must be a Subquery necessity here, in order to return just the 1st record for thatmemberid
that exists intbl_members_login_history
.What I'm worried about is if no record in the history table exists, I still want to display that member info, but leave the history columns as NULL.
Would this be a subquery incident? and if so, how does one add that type of LIMIT?
-
steffen almost 10 yearsI just tried your solution and compared it to a query that contains a subquery using
GROUP BY
withMAX(...)
, joiningON date = maxdate
. Your solution with the<
operator took 2s (!), while the subquery solution took 0.01s (same result table of course). TheEXPLAIN
of your query looks much better than the one with the subquery. I don't know why I get this huge performance difference, but I guess it's because the<
produces a huge intermediate result. -
Bill Karwin almost 10 years@steffen: Yes, since I answered this question, I've seen a lot of other cases, and concluded that either the solution above or a solution like you describe can be faster, depending on how many distinct groups and how many rows per group the data has. So it's best to test both methods with your data and then decide.
-
steffen almost 10 yearsTrue. What confused me was the huge performance difference. I found your solution a couple of times here on SO. And the
EXPLAIN
output look great so I expected that query to be very fast. But instead it was so much slower. Weird. -
Bill Karwin almost 10 years@steffen, keep in mind MySQL does nested-loop joins. So you can get a rough estimate of the work it has to do to perform joins by multiplying the
rows
information from each table joined together. So even if it looks like it's using indexes and everything, if you see a few thousand rows in one table joined to a few thousand rows in another table, it's doing millions of row comparisons.