Please help my create an SQL query for this...

mindstormsguy · Aug 16, 2009

I'm working on a project and I have a query i want to do on the database. I know how I could accomplish what i want using a couple queries and some php, but I much rather accomplish this task in one query.

I have a table like so:

Code:

mysql> select * from ingredients_link;
+-----------+---------------+
| recipe_id | ingredient_id |
+-----------+---------------+
|         1 |             2 | 
|         1 |             8 | 
|         1 |             9 | 
|         2 |            18 | 
|         2 |             1 | 
|         3 |             7 | 
|         3 |             6 | 
|         7 |             2 | 
|         7 |            19 | 
|         7 |            22 | 
|         7 |            26 | 
|         7 |            12 | 
|         7 |            23 | 
|         7 |            20 | 
|         7 |            21 | 
+-----------+---------------+

Quite simply, this is what i want. Given a number of "ingredient_id"s, I want to return every unique "recipe_id" where every single row with that recipe_id has a matching ingredient_id with what was provided. In other words, I specify the ingredients I have, and i want to know which recipes i can make (without needing more ingredients).

The method I can think of that uses two queries and php is as follows:

Code:

mysql> select recipe_id,count(recipe_id) from ingredients_link group by recipe_id;
+-----------+------------------+
| recipe_id | count(recipe_id) |
+-----------+------------------+
|         1 |                3 | 
|         2 |                2 | 
|         3 |                2 | 
|         7 |                8 | 
+-----------+------------------+

which returns the number of total ingredients in each recipe

Code:

mysql> select recipe_id,count(recipe_id) from ingredients_link where ingredient_id = 2 or ingredient_id = 8 or ingredient_id = 9 group by recipe_id;
+-----------+------------------+
| recipe_id | count(recipe_id) |
+-----------+------------------+
|         1 |                3 | 
|         7 |                1 | 
+-----------+------------------+

Which returns the number of ingredients I actually have for each recipe.

Now I could obviously process these two tables in php, to return each recipe_id where the counts are the same for both tables, but there must be a simple wya to get mysql to just return me that result directly, no?

PrincessFrosty · Aug 17, 2009

You could write a stored proceedure on the SQL server to deal with this, in that you can select into temporary tables and do some work on those and return the result set. Then you can call that stored proceedure from the web page I execpt (dont know php specifically)

I'm trying to think if theres an sql query that will pull out what you want in one go, but it's making my head hurt, get back to you on that

PTNL · Aug 17, 2009

You were just about there, you want an "IN" statement like such:

Code:

SELECT a.recipe_id, COUNT(a.recipe_id) AS [COUNT(recipe_id)]
FROM ingredients_link a 
WHERE a.ingredient_id IN (2,8,9)
GROUP BY a.recipe_id

amromousa · Aug 17, 2009

One should be able to use a sub-select, although they're generally not thought of as speedy.

Code:

select something, count(something) from blah where somethingelse in (select somethingelse from blah) ... group by something;

PrincessFrosty · Aug 17, 2009

I think thats finding recipes if one of the ingredients is in that list, we need to select recpies only when all ingredients are in the list.

PTNL · Aug 17, 2009

PrincessFrosty said:
I think thats finding recipes if one of the ingredients is in that list, we need to select recpies only when all ingredients are in the list.

Ah. I misread what the OP was looking for.

To do this entirely database side, it will require temp tables and a cursor. A couple of tables will be needed:
1 - Table to hold unique recipe_id's
2 - Table to hold ingredients tied to a specific recipe id
3 - Output table of recipe_id's that pass the business rule checks

Loop through the first table, and re-loading the second table on each iteration of the first. Check business rules (ie: find any rows/ingredients that are NOT IN (value1, value2, value3, etc.) ), and add any acceptable matches to the third table which gets returned from the DB.

Stored procs could help separate things into more manageable pieces if this call would get made regularly. Or you could write a SQL statement to return a list of potential matches, and do some additional business rule checks within the application.

PrincessFrosty · Aug 17, 2009

I'm pretty sure this will do it

Code:

select s1.recipe1 from
(select recipe_id as recipe1, count(recipe_id) as recipecounted1 from ingredients_link group by recipe1) s1
 left join
(select recipe_id as recipe2, count(recipe_id) as recipecounted2 from ingredients_link where ingredient_id in (2,8,9,1,18) group by recipe2) s2
on (s1.recipecounted1 = recipecounted2)
where s1.recipe1 = s2.recipe2

I've just created a database in mysql and loaded in your test data, gave this a whirl and it seems to work, change the in (2,8,9,1,18) part to the ingredients you want and you should get only the recipe out, this example gives 1,2 as the results, if you try just (2,8,9) you get just 1 as the result, and if you try (2,8) you dont get anything.

The trick here is the "table" you're selecting from can be another select statement in itself, using this you can get both the tables you're after in your original idea by using the correct selects. Then you join them tables together to get one large table, joined only on common recipie counts, and then just find where your recipe_ids match

I've not done a very thorough test of this, it might warrent some more test values because it's a bit of a mind bender

amromousa · Aug 17, 2009

PrincessFrosty said:
I think thats finding recipes if one of the ingredients is in that list, we need to select recpies only when all ingredients are in the list.

You're right, apologies. I looked at his code block and didn't read the text above it.

mikeblas · Aug 17, 2009

PTNL said:
To do this entirely database side, it will require temp tables and a cursor. A couple of tables will be needed:

Actually, I think it can be done in SQL without a cursor or procedural code. You can go against the same table with the same list, but do NOT IN to make sure no rows exist for that given recepie.

I don't use MySQL (at home, anyway), so I've converted everything to SQL Server's T-SQL code. If you're using one of the newer versions of MySQL, hopefully it's good enough to implement these subselects and operators.

First, let's create the table and insert the sample data:

Code:

CREATE TABLE  ingredients_link (recipe_id INT NOT NULL, ingredient_id INT NOT NULL)
GO

INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (1, 2 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (1, 8 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (1, 9 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (2, 18 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (2, 1 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (3, 7 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (3, 6 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 2 )
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 19)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 22)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 26)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 12)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 23)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7,20)
INSERT INTO ingredients_link (recipe_id, ingredient_id) VALUES (7, 21) 
GO

Then, let's build the query. We can get a list of the recipes which require ingridents which we don't have using the NOT IN clause, like this:

Code:

SELECT DISTINCT Recipe_ID
FROM ingredients_link
WHERE ingredient_id NOT IN (2,8,9)

Given that list, all we need to do is find all the other recipe IDs. I think this query answers the requirement mindstormsguy has:

Code:

SELECT DISTINCT RECIPE_ID
FROM ingredients_link
WHERE recipe_id NOT IN 
(
SELECT DISTINCT Recipe_ID
FROM ingredients_link
WHERE ingredient_id NOT IN (2,8,9)
)

with the provided sample data, it returns only one row:

Code:

RECIPE_ID
-----------
1

because that's the only recipe_ID that identifies a recipe which doesn't reference an ingredient which we don't have.

PTNL · Aug 18, 2009

Very slick, mikeblas.

BillParrish · Aug 18, 2009

OMFG.

I spent all weekend devising what I thought was a fairly clever algorithm with arrays and simple SQL (I am not a SQL guy) so that given a sheetmetal part (recipe), made on a CNC turret punch press loaded with a set of tools (ingredients), find what other parts ( recipes) that can be made without changing the tool load (ingredients) and and I just got schooled - big time.

Thank you mikeblas. !!!!!!

In a month or so I will let you know how it works with 11,000 recipes and a total of 83,000 ingredients.

The really weird thing is that I almost never drop down here and stay up in the hardware and just for no damn reason at all clicked on this thread not even thinking about my problem.

mikeblas · Aug 19, 2009

BillParrish said:
In a month or so I will let you know how it works with 11,000 recipes and a total of 83,000 ingredients.

It'll work fine on such small tables. The nature of the query requires table scans, but there's not a lot of data there. Even if you had 83,000 recipes, each row is only 12 or 16 bytes or so -- 1.3 million bytes. That fits into memory on the crappiest of servers, and is read into memory in less than a couple seconds even on the saddest of disk subsystems.

PrincessFrosty · Aug 19, 2009

Yeah you're welcome guys

mikeblas · Aug 19, 2009

PrincessFrosty said:
Yeah you're welcome guys

I don't understand your post, Princess. Are you pouting, or something?

william_fontaine · Aug 19, 2009

Princess did it first.

mikeblas · Aug 19, 2009

In which post?

tgabe213 · Aug 19, 2009

#7?

That's the only post of his I see with a query.

mikeblas said:
I don't understand your post, Princess. Are you pouting, or something?

Irony. I love it.

mikeblas · Aug 19, 2009

Oh, I see. Yes, that'll work.

BillParrish · Aug 19, 2009

Ok well Mikes looked simplier so it grabbed my eye but after I posted I did think that I should have thanked everyone who contributied and so here it is.

Thank you everyone, Princess too.

Apreciate the advice on the tables Mike. Will not bore you with details - but back to the drawing board to modify my text processing to create a table for each part number.

You comment about the crappiest of servers makes me wonder how you got into my clients server room !.

PrincessFrosty · Aug 20, 2009

Sorry, was in a bad mood the other day, spent all day fighting IT problems

Please help my create an SQL query for this...

Gawd

Supreme [H]ardness

Supreme [H]ardness

Gawd

Supreme [H]ardness

Supreme [H]ardness

Supreme [H]ardness

Gawd

[H]ard|DCer of the Month - May 2006

Supreme [H]ardness

Supreme [H]ardness

[H]ard|DCer of the Month - May 2006

Supreme [H]ardness

[H]ard|DCer of the Month - May 2006

Gawd

[H]ard|DCer of the Month - May 2006

2[H]4U

[H]ard|DCer of the Month - May 2006

Supreme [H]ardness

Supreme [H]ardness