Robizzle01
Limp Gawd
- Joined
- Sep 30, 2005
- Messages
- 159
I made a recommender system for a website and I use a stored procedures to get predictions. I use another stored procedure to get recommendations (basically the highest predictions.) However, I can't think of a way to implement the recommendations correctly.
First, let me explain how a prediction is generated. Each user is given 8 preferences. A preference is a float and may be positive or negative. Each drink (like a beer or wine) also has 8 preferences associated with it.
The prediction is 1.0 + userFeature1*drinkFeature1 + userFeature2*drinkFeature2 + ... + userFeature8*drinkFeature8. However, there is one exception. The predictions have to be within the range 1.0 and 5.0 since it doesn't make sense to predict that a user would give a drink more than a 5 star rating, or less than a 1 star rating.
Lastly, through testing it has also been shown that instead of calculating the prediction then using "if (prediction < 1) prediction=1; if (prediction > 5) prediction = 5;", predictions are actually more accurate if we perform this sort of rounding after each addition. Basically, this prevents any single feature from throwing the prediction so far in any direction that the rest of the features can't recover.
Here is the code for the predict rating. It is working as expected, I would love any feedback about a better way to accomplish the same results if you have any, but it seems to have fast enough performance as is.
If you would rather see this syntax highlighted, code is here: http://rafb.net/p/k6tztd63.html
Now, we also have a page that we use to give a user a list of the top recommendations. These are just the highest predictions. We also have code in here that makes sure not to predict something that the user has already rated, and also to not predict something unless it has at least 3 ratings.
There is a bug in this code that I simply don't know how to fix. Due to having to calculate the prediction in the query itself, I do not know how I can do the "if (prediction < 1) prediction=1; if (prediction > 5) prediction = 5;" sort of thing. As a result, some predictions are higher than 5.0 or less than 1.0 and they are not the same as the results from uspPredictRating (posted above.)
syntax highlighted version here: http://rafb.net/p/RARQbx36.html
Sorry for the lengthy post and I hope I didn't leave out some information you need. Just let me know if you need to see the table layouts or need any further explination.
First, let me explain how a prediction is generated. Each user is given 8 preferences. A preference is a float and may be positive or negative. Each drink (like a beer or wine) also has 8 preferences associated with it.
The prediction is 1.0 + userFeature1*drinkFeature1 + userFeature2*drinkFeature2 + ... + userFeature8*drinkFeature8. However, there is one exception. The predictions have to be within the range 1.0 and 5.0 since it doesn't make sense to predict that a user would give a drink more than a 5 star rating, or less than a 1 star rating.
Lastly, through testing it has also been shown that instead of calculating the prediction then using "if (prediction < 1) prediction=1; if (prediction > 5) prediction = 5;", predictions are actually more accurate if we perform this sort of rounding after each addition. Basically, this prevents any single feature from throwing the prediction so far in any direction that the rest of the features can't recover.
Here is the code for the predict rating. It is working as expected, I would love any feedback about a better way to accomplish the same results if you have any, but it seems to have fast enough performance as is.
If you would rather see this syntax highlighted, code is here: http://rafb.net/p/k6tztd63.html
Code:
USE [DrinkFish]
GO
/****** Object: StoredProcedure [dbo].[uspPredictRating] Script Date: 01/21/2008 16:50:11 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: <Author,,Name>
-- Create date: <Create Date,,>
-- Description: <Description,,>
-- =============================================
CREATE PROCEDURE [dbo].[uspPredictRating]
@UserID uniqueidentifier,
@DrinkID uniqueidentifier
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @DrinkType smallint
SET @DrinkType = (SELECT DrinkType FROM Drinks WHERE DrinkID = @DrinkID)
DECLARE @Total float
SET @Total = 1.0
DECLARE @DF1 float, @DF2 float, @DF3 float, @DF4 float, @DF5 float, @DF6 float, @DF7 float, @DF8 float
DECLARE @UF1 float, @UF2 float, @UF3 float, @UF4 float, @UF5 float, @UF6 float, @UF7 float, @UF8 float
SELECT @DF1 = (SELECT Feature1 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF2 = (SELECT Feature2 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF3 = (SELECT Feature3 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF4 = (SELECT Feature4 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF5 = (SELECT Feature5 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF6 = (SELECT Feature6 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF7 = (SELECT Feature7 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @DF8 = (SELECT Feature8 FROM DrinkFeatures WHERE DrinkID = @DrinkID)
SELECT @UF1 = (SELECT Feature1 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF2 = (SELECT Feature2 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF3 = (SELECT Feature3 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF4 = (SELECT Feature4 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF5 = (SELECT Feature5 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF6 = (SELECT Feature6 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF7 = (SELECT Feature7 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT @UF8 = (SELECT Feature8 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
-- Feature 1
SET @Total = @Total + @DF1*@UF1
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 2
SET @Total = @Total + @DF2*@UF2
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 3
SET @Total = @Total + @DF3*@UF3
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 4
SET @Total = @Total + @DF4*@UF4
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 5
SET @Total = @Total + @DF5*@UF5
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 6
SET @Total = @Total + @DF6*@UF6
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 7
SET @Total = @Total + @DF7*@UF7
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
-- Feature 8
SET @Total = @Total + @DF8*@UF8
IF @Total > 5
BEGIN
SET @Total = 5
END
IF @Total < 1
BEGIN
SET @Total = 1
END
SELECT @Total AS Prediction
END
Now, we also have a page that we use to give a user a list of the top recommendations. These are just the highest predictions. We also have code in here that makes sure not to predict something that the user has already rated, and also to not predict something unless it has at least 3 ratings.
There is a bug in this code that I simply don't know how to fix. Due to having to calculate the prediction in the query itself, I do not know how I can do the "if (prediction < 1) prediction=1; if (prediction > 5) prediction = 5;" sort of thing. As a result, some predictions are higher than 5.0 or less than 1.0 and they are not the same as the results from uspPredictRating (posted above.)
syntax highlighted version here: http://rafb.net/p/RARQbx36.html
Code:
USE [DrinkFish]
GO
/****** Object: StoredProcedure [dbo].[uspGetRecommendedDrinks] Script Date: 01/21/2008 16:55:23 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: <Author,,Name>
-- Create date: <Create Date,,>
-- Description: <Description,,>
-- =============================================
CREATE PROCEDURE [dbo].[uspGetRecommendedDrinks]
-- Add the parameters for the stored procedure here
@UserID uniqueidentifier,
@DrinkType smallint,
@NumRows smallint
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @UF1 float
DECLARE @UF2 float
DECLARE @UF3 float
DECLARE @UF4 float
DECLARE @UF5 float
DECLARE @UF6 float
DECLARE @UF7 float
DECLARE @UF8 float
SET @UF1 = (SELECT Feature1 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF2 = (SELECT Feature2 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF3 = (SELECT Feature3 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF4 = (SELECT Feature4 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF5 = (SELECT Feature5 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF6 = (SELECT Feature6 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF7 = (SELECT Feature7 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SET @UF8 = (SELECT Feature8 FROM UserFeatures WHERE UserID = @UserID AND DrinkType = @DrinkType)
SELECT TOP(@NumRows) Creators.Name as Drinkcreator, Drinks.Name as Drinkname, Drinks.Picture as Drinkpicture, Creators.CreatorID, Drinks.DrinkID,
(@UF1*DrinkFeatures.Feature1 + @UF2*DrinkFeatures.Feature2 + @UF3*DrinkFeatures.Feature3 + @UF4*DrinkFeatures.Feature4 + @UF5*DrinkFeatures.Feature5 + @UF6*DrinkFeatures.Feature6 + @UF7*DrinkFeatures.Feature7 + @UF8*DrinkFeatures.Feature8) as Prediction
FROM Drinks
JOIN DrinkFeatures ON DrinkFeatures.DrinkID = Drinks.DrinkID
JOIN Creators ON Drinks.CreatorID = Creators.CreatorID
WHERE Drinks.DrinkType = @DrinkType
AND Drinks.DrinkID NOT IN (SELECT DrinkID From Ratings WHERE UserID = @UserID)
AND (SELECT COUNT(RatingID) FROM Ratings WHERE DrinkID = Drinks.DrinkID) >= 3
ORDER BY Prediction DESC
END
Sorry for the lengthy post and I hope I didn't leave out some information you need. Just let me know if you need to see the table layouts or need any further explination.