LEFT JOIN saves day

tim_m

i'm so nice
Joined
Feb 10, 2003
Messages
5,539
being that i don't use terribly complicated mysql relations, i don't often need to use joins (though i'm sure there are some places i use 'WHERE a.id = b.id' when i could use joins). well here's my newest join statement
Code:
SELECT a.sid,a.name,count(b.iid) AS number
FROM sections AS a
LEFT JOIN images AS b
ON a.sid=b.sid
GROUP BY a.name;
basically the sections table is a small list of the sections of the site. images is a table (for an image management system i'm developing) which associates various images that are used in the different sections. in the admin section, this query is used to get each section's name, section id, and number of images 'attached' to that section.

i know i don't really have a question or anything but meh. maybe discuss joins or something ;)

actually i do have a question. i can't do count(b.*), it gives me an error but count(b.iid) works. does it matter, i thought i read that count(*) was optimized but can't use just '*' since it's over 2 tables, maybe it's only optimized for within one table, which kinda makes sense.
 
Depends what you want to do, really. For starters, I'm sure that's supposed to say "LEFT OUTER JOIN", but hey, who's countin' ? :p

Anyway, back to the topic. What exactly are you trying to count? All the records in table b that match? If that's the case, use COUNT(b.sid) and use "WHERE NOT ISNULL(b.sid). That'll show you all the ones which match, although it's defeating the point because you could do that with an INNER JOIN and save yourself considerable processor time. The true power of LEFT and RIGHT OUTER JOINs is in finding records which don't match (in which case, you'd use that query with a "WHERE ISNULL(b.sid)" and COUNT(a.sid) to find all the records in table a which don't have a match in table b. Use the same with a RIGHT OUTER JOIN and WHERE ISNULL(a.sid) to get all the ones in table b without a match.

'Course, I haven't actually followed that link, so it's entirely possible that I've just completely paraphrased ;)
 
Just re-read...you can't use count(*) when you have multiple tables and at least one identical fieldname between them - how would it know what to call the fields in the resultset? I know, I know, but it's part of the SQL spec (I think) that any given resultset cannot have two identical fieldnames. You may not see the resultset, but the database provider does, and gets terribly confused.
 
ok, sections looks like
Code:
+-----+-----------------+
| sid | name            |
+-----+-----------------+
|   1 | images          |
|   2 | activities      |
|   3 | courses         |
|   4 | error           |
|   5 | faculty         |
|   6 | main            |
|   7 | links           |
|   8 | major           |
|   9 | map             |
|  10 | research        |
|  11 | search          |
|  12 | link_categories |
|  13 | link_links      |
+-----+-----------------+
and images looks like
Code:
+-----+-----+------------------+
| iid | sid | file             |
+-----+-----+------------------+
|   9 |   5 | pdunham3.jpg     |
|  10 |   5 | tbowers.jpg      |
|  11 |   5 | wdunham3.jpg     |
|  12 |   5 | euler.jpg        |
|  13 |   5 | journey.jpg      |
|  14 |   5 | universe.jpg     |
|  15 |   5 | faculty_03.jpg   |
|  16 |   5 | faculty_04.jpg   |
|  17 |  10 | prog_contest.jpg |
|  18 |  10 | student2.jpg     |
|  19 |   2 | pimueps4.jpg     |
|  20 |   2 | newmembers2.jpg  |
|  21 |   6 | trumb.jpg        |
|  22 |   4 | 404.gif          |
+-----+-----+------------------+
there are more columns but this gives the gist.

in the admin section for the images, i have the first thing the user sees as a list of the different sections and how many images each section has. that means that it has to show 0 if there aren't any images. this is currently working via the sql in my first post. basically, is that the best way or how else could you do it?
 
OK, something like this ought to work :

Code:
 SELECT s.sid, s.name, count(a.iid)
 FROM tblSection s LEFT OUTER JOIN tblmages i ON s.sid = i.sid
 GROUP BY s.sid, s.name

That oughta do it. It's pretty much what you've already got, but with the OUTER in there ;) As long as your indexes are set up right, you should be fine with that - the count(*) optimisation (which at the moment seems to be folklore) wouldn't do any more than that anyway, and if it did, I can't imagine that it would be optimised for anything other than the entire tablespace, whereas we're searching on the primary key with the JOIN.
 
if i understand the mysql man page correctly
http://dev.mysql.com/doc/mysql/en/JOIN.html
INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both will produce a Cartesian product between the specified tables (that is, each and every row in the first table will be joined onto all rows in the second table).
it doesn't matter if i have INNER there

also, could you explain why you added s.sid to the GROUP BY clause? i don't really understand the whole GROUP BY thing anyway
 
tim_m said:
if i understand the mysql man page correctly
http://dev.mysql.com/doc/mysql/en/JOIN.html
it doesn't matter if i have INNER there

also, could you explain why you added s.sid to the GROUP BY clause? i don't really understand the whole GROUP BY thing anyway
I don't quite follow you...if you used an INNER JOIN, then you wouldn't get your zero-counts.

The GROUP BY thing is force of habit, really - in the old days, you had to specify every field which wasn't a summary function (ie count(*), sum(x) etc) in the GROUP BY clause. These days, it's OK if you just cover it with a set of fields which uniquely identify the grouping to the same (or higher) level of granularity as the fields you've specified in the desired resultset - in this case, you want each unique row of the section table, so just grouping by the id or the name (assuming the name is unique) is enough. In your query, you would hit a problem if you had two sections with the same name - the count for each of the namesakes would be amalgamated.
 
hehehe :eek:

when i was reading the man page, i forgot whether you said INNER or OUTER, i thought you said inner for some reason

my quote from the manual has no relevence at all now that i read it again, it means it acts as an inner join if i didn't use the condition "ON a.sid = b.sid" and i believe that it's an outer join by defualt the way i wrote it.
 
You needn't specify the "OUTER" when doing LEFT / RIGHT JOINS when using SQL Server / MSDE (perhaps not true for versions more than a few years old... my experience doesn't date far enough back to say for certain).

However, I believe the "OUTER" is required for a FULL JOIN, IIRC...
 
Back
Top