At least that's what seems reasonable to me at the moment.
The table is under my control so I could change it if needed.
Now from several tenthousend or maybe hundreds of thousends of entries I need to find those with the closest match. Of course, I need all of the entries that have the exact same answers and this is no problem. But - at least if there are not enough full matches - then I need all records that have maybe 16,15,14... matches out of the 17 answers.
I have not yet the idea on how to handle this without quering 17*16 different answer schemes.
Hi,
I wouldn′t store the data denormalized. The better way to store it IMHO is to normalize the data, if you want closer machtes you can also setup a score for each answered question: Here is an extract of a possible solution:
CREATE TABLE Question
(
QuestionId INT
QuestionText VARCHAR(100)
)
CREATE TABLE Anwers
(
QuestionId INT
AnswerId INT
AnswerValue VARCHAR(10) --indicates the right answer
Score INT
)
CREATE TABLE AnweredQuestions
(
QuestionId INT
AnswerId INT
ParticipantId INT
AnswerValue
)
Thats just a quick one, could be sure more normalized, but these tables could be easy joined and scored as well in one query rather than using the different columns and its even more extensible than your current one.
HTH, Jens Suessmeyer.
http://www.sqlserver2005.de
|||I actually have set up the tables more close to what you recommended, like this and then of course the user table. I first have to do some frontenmd stuff and then will come back to see how to handle this.Actually there are no right or wrong answers since questions have the form like: Iwrite down what I need to buy before I go shopping. true/false.
CREATE TABLE [dbo].[tCmsElementCustomPartnermatchQuestion] (
[question_id] [int] IDENTITY (1, 1) NOT NULL ,
[question_categoryID] [int] NOT NULL ,
[question_weight] [int] NULL ,
[question_text_DE] [nvarchar] (255) ,
[question_text_FR] [nvarchar] (255) ,
[question_text_IT] [nvarchar] (255) ,
[question_active] [tinyint] NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[tCmsElementCustomPartnermatchQuestionUser] (
[question_user_id] [int] IDENTITY (1, 1) NOT NULL ,
[question_user_userID] [int] NOT NULL ,
[question_user_questionID] [int] NOT NULL ,
[question_user_questionanswer] [int] NOT NULL
) ON [PRIMARY]
GO|||
Hi,
glad to hear that you picked up some ideas. There sure could be more normalized (but I do not want to exaggerate :-) ). Come back if you have any more questions, you are welcome :-)
I keep an eye on the post I answered, anyway if I overlook your answer or rerequest, feel free to contact me through my website which is mentioned below.
HTH, Jens Suessmeyer.
http://www.sqlserver2005.de
|||thanks, very friendly.I actually I became totally lost.
I received a hint from my boss but this one needs a totally denormalized table and only allows answers as yes or no.
Something like this:
From this kind of table
CREATE TABLE #test (
[person_id] [smallint] IDENTITY(1,1),
[person_name] [varchar] (20),
[answer001] [tinyint],
[answer002] [tinyint],
[answer003] [tinyint],
[answer004] [tinyint],
[answer005] [tinyint],
[answer006] [tinyint],
[answer007] [tinyint],
[answer008] [tinyint],
[answer009] [tinyint]
)
I should do a query like this:
SELECT t1.person_name AS p1_name, t2.person_name AS p2_name
, '''' AS person_rank
, t1.answer009*1+t1.answer008*2+t1.answer007*4+t1.answer006*8+t1.answer005*16+t1.answer004*32+t1.answer003*64+t1.answer002*128+t1.answer001*256 AS p1_value
, t2.answer009*1+t2.answer008*2+t2.answer007*4+t2.answer006*8+t2.answer005*16+t2.answer004*32+t2.answer003*64+t2.answer002*128+t2.answer001*256 AS p2_value
FROM #test t1, #test t2
WHERE t1.person_id <> t2.person_id
AND t1.person_name = 'Fritz' AND t2.person_name <> t1.person_name
and then find matches with something like this (in ColdFusion since we couldn't find the respectively SQL functions)
<cfoutput query="qgetmatchesprodandtype">
<cfset tmp = QuerySetCell(qTest, 'person_rank', 9- Len(Replace(FormatBaseN(BitXor(qTest.p1_value,qTest.p2_value),2), "0", "", "ALL")), qTest.CurrentRow)>
</cfoutput>
So, I could really need some enlightening examples
I'd prefer to match answers by counting the number of matches for each question and user undependently if the answer can only be 0/1 or any value between 0-9 (ore anything else)|||I might look at doing it this way:
Let's assume that you have a table of traits of people and you have a candidate and want to find the closest matches.... Kind of like a dating service.
If I take the absolute value of (person1.trait1 - person2.trait1) then if that is 0 they are a match on that trait, if it is 1 then they are not a match....
It follows that if I sum up the abs values of the subtracted trait pairs then the lower the overall sum the more "compatible" the two individual are:
That would lead me to look at:
Select
p1.name,
p2.name,
Sum(
abs(p1.trait1 - p2.trait1) +
abs(p1.trait2 - p2.trait2) +
abs(p1.trait3 - p2.trait3) +
.....
abs(p1.trait17 - p2.trait17)
) As matchfactor
From members as p1, members.p2
Where p1.id <> p2.id And p1.name = 'smith'
Order by matchfactor
|||
Jens Sü?meyers answer below exactly answered my problem if I added a
ORDER BY Numberofmatches DESC
the following query will solve your described problem:
SELECT
t2.[question_user_UserId],COUNT(*) AS Numberofmatches
FROM [dbo].[tCmsElementCustomPartnermatchQuestionUser] t1
INNER JOIN
tCmsElementCustomPartnermatchQuestionUser t2
ON t1.[question_user_questionID] = t2.[question_user_questionID] AND
t1.[question_user_questionanswer] = t2.[question_user_questionanswer] AND NOT
t1.[question_user_userid] = t2.[question_user_userid] --to eliminate the actual user which has the best match with himself :-)
Where T1.[question_user_userid] = 15
Group by t2.[question_user_UserId]
Let me know if that worked for you.
-Jens.
No comments:
Post a Comment