The evaluation results below use the official INEX 2014 SBS topic set, which is drawn from the LibraryThing discussion groups, together with the user profiles and catalogues of the topic creators.
These are the official Qrels:
The official evaluation measure is nDCG@10.
Run | nDCG@10 | MRR | MAP | R@1000 |
---|---|---|---|---|
USTB - run6.SimQuery1000.rerank_all.L2R_RandomForest | 0.303 | 0.464 | 0.232 | 0.390 |
USTB - run4.newXml.rerank_all.L2R_RandomForest | 0.142 | 0.258 | 0.102 | 0.390 |
HAFSI - 326 | 0.142 | 0.275 | 0.107 | 0.426 |
USTB - run3.newXml.rerank_all.L2R_Coordinate | 0.138 | 0.256 | 0.101 | 0.390 |
USTB - run5.newXml.rerank_all.L2R_RankNet | 0.133 | 0.246 | 0.098 | 0.390 |
USTB - run2.newXml.rerank_T | 0.131 | 0.246 | 0.096 | 0.390 |
USTB - run1.newXml.feedback | 0.128 | 0.246 | 0.095 | 0.390 |
LSIS - InL2 | 0.128 | 0.236 | 0.101 | 0.441 |
AAU - run1.all-plus-query.all-doc-fields | 0.127 | 0.239 | 0.097 | 0.444 |
AAU - run3.all-plus-query.all-doc-fields | 0.120 | 0.227 | 0.090 | 0.425 |
CYUT - Type2QTGN | 0.119 | 0.246 | 0.086 | 0.340 |
CYUT - 0.95AverageType2QTGN | 0.119 | 0.243 | 0.085 | 0.332 |
HAFSI - 328 | 0.117 | 0.226 | 0.088 | 0.392 |
HAFSI - 329 | 0.116 | 0.217 | 0.087 | 0.392 |
HAFSI - 325 | 0.115 | 0.214 | 0.087 | 0.392 |
LSIS - InL2Feedback | 0.114 | 0.230 | 0.094 | 0.434 |
HAFSI - 324 | 0.112 | 0.214 | 0.086 | 0.392 |
LSIS - InL2tagFeedback | 0.102 | 0.212 | 0.075 | 0.388 |
UvA - inex14.ti_qu.fb.10.50.5000 | 0.097 | 0.179 | 0.073 | 0.421 |
UMD - Full_TQG_fb.10.50_0.0000227_50.trec | 0.097 | 0.188 | 0.069 | 0.328 |
UMD - Social_TQG_fb.10.50_0.0000222_50.trec | 0.096 | 0.184 | 0.067 | 0.327 |
UMD - Full_TQG_fb.10.50_0.0000255_100.trec | 0.096 | 0.188 | 0.068 | 0.328 |
UvA - inex14.ti_qu_gr.fb.10.50.5000 | 0.095 | 0.162 | 0.074 | 0.436 |
UvA - inex14.ti_qu.5000 | 0.095 | 0.173 | 0.073 | 0.412 |
UMD - Full_TQG_fb.10.50_traditional.trec | 0.095 | 0.185 | 0.068 | 0.328 |
UvA - inex14.ti_qu_gr.5000 | 0.094 | 0.163 | 0.074 | 0.418 |
UMD - Full_TQ_fb.10.50_0.0000247_100.trec | 0.092 | 0.176 | 0.064 | 0.321 |
UMD - Full_T_fb.10.50_0.0000260_100.trec | 0.070 | 0.139 | 0.047 | 0.253 |
ISMD - 354 | 0.067 | 0.123 | 0.049 | 0.285 |
LSIS - sdm_Rating | 0.062 | 0.120 | 0.047 | 0.314 |
LSIS - sdm_concept | 0.056 | 0.118 | 0.039 | 0.253 |
ISMD - 341 | 0.056 | 0.106 | 0.042 | 0.236 |
LSIS - sdm_tag_feedback | 0.055 | 0.112 | 0.040 | 0.267 |
HAFSI - 345 | 0.052 | 0.113 | 0.037 | 0.383 |
ISMD - 350 | 0.048 | 0.090 | 0.036 | 0.211 |
AAU - run2.query.all-doc-fields | 0.047 | 0.090 | 0.035 | 0.304 |
ISMD - 355 | 0.038 | 0.089 | 0.026 | 0.124 |
CYUT - 0.95RatingType2QTGN | 0.034 | 0.101 | 0.021 | 0.200 |
CYUT - 0.95WRType2QTGN | 0.028 | 0.084 | 0.018 | 0.213 |
ISMD - 342 | 0.010 | 0.018 | 0.007 | 0.081 |
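
For readers who want to reproduce the headline column of the table above, the following is a minimal sketch of nDCG@10 over graded relevance values. It assumes a linear gain (the relevance value itself) and a log2 rank discount; the exact gain and discount used by the official evaluation are not stated on this page, so treat both as assumptions. The names `dcg`, `ndcg_at_k`, `ranking` and `qrels` are illustrative only.

```python
import math
from typing import Dict, List


def dcg(gains: List[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k gains (assumed log2 discount)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranking: List[str], qrels: Dict[str, float], k: int = 10) -> float:
    """nDCG@k for one topic.

    ranking: work IDs in the order returned by a run.
    qrels:   graded relevance value per judged work ID (unjudged works count as 0).
    """
    gains = [qrels.get(work_id, 0) for work_id in ranking]
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal = dcg(ideal_gains, k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0
```

The score for a run would then be the mean of `ndcg_at_k` over all topics in the topic set.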
Students from Aalborg University Copenhagen, the Royal School of Library and Information Science (Copenhagen) and Oslo and Akershus University College have labelled the LibraryThing forum topic threads and the suggestions in those threads.
Forum members can mention books for many different reasons. We want the relevance values to distinguish between books that were mentioned as positive recommendations, negative recommendations (books to avoid), neutral suggestions (mentioned as possibly relevant but not necessarily recommended) and books mentioned for some other reason (not relevant at all).
Furthermore, we want to differentiate between recommendations from members who have read the book they recommend and members who haven't. We assume the recommendation to be of more value to the searcher if it comes from someone who has actually read the book.
Finally, we distinguish between suggestions of books that the user already had in their catalogue versus books that the user added after getting a suggestion from others.
- 1 - Work mentioned once -> there is only one judgement, use that
- 2 - Work mentioned multiple times
  - 2.1 - topic creator mentions work
    - 2.1.1 - topic creator *suggests* neutral -> use replies (go to 2.2)
    - 2.1.2 - topic creator *suggests* pos/neg -> use creator judgement
    - 2.1.3 - topic creator *replies* -> use creator judgement only
  - 2.2 - topic creator doesn't mention work
    - 2.2.1 - there are some has_read suggestions/replies -> use has_read judgements
    - 2.2.2 - there are no has_read suggestions/replies -> use all judgements
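
The selection rules above can be sketched in code. This is one possible reading, not the organisers' implementation: the `Judgement` fields and the function name are hypothetical, and two cases the rules leave open are resolved by assumption (both are marked in the comments).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Judgement:
    by_creator: bool  # was the mention posted by the topic creator?
    is_reply: bool    # True for a reply, False for an original suggestion
    polarity: str     # "pos", "neg" or "neu"
    has_read: bool    # has the poster read the work?


def select_judgements(judgements: List[Judgement]) -> List[Judgement]:
    """Pick the judgements that determine a work's relevance (rules 1 to 2.2.2)."""
    # Rule 1: a single mention has a single judgement.
    if len(judgements) == 1:
        return judgements

    creator = [j for j in judgements if j.by_creator]
    others = [j for j in judgements if not j.by_creator]

    # Rules 2.1.2 and 2.1.3: a creator reply, or a positive/negative creator
    # suggestion, overrides everyone else.
    # Assumption: only these "decisive" creator judgements are returned.
    decisive = [j for j in creator if j.is_reply or j.polarity in ("pos", "neg")]
    if decisive:
        return decisive

    # Rule 2.1.1 (creator only suggested neutrally) falls through to rule 2.2.
    # Assumption: if nobody else mentioned the work, keep the creator's judgements.
    if not others:
        return creator

    # Rules 2.2.1 and 2.2.2: prefer judgements from members who read the work.
    read = [j for j in others if j.has_read]
    return read if read else others
```

For example, a work that the topic creator suggests neutrally and that two other members reply to, only one of whom has read it, is judged on that one has_read reply.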
- 1 - catalogued by topic creator
  - 1.1 - post-catalogued -> rv=8
  - 1.2 - pre-catalogued -> rv=0
- 2 - single judgement
  - 2.1 - creator has_read judgement
    - 2.1.1 - creator pos/neg/neu -> rv=0
  - 2.2 - creator not_read judgement
    - 2.2.1 - creator positive -> rv=8
    - 2.2.2 - creator neutral -> rv=2
    - 2.2.3 - creator negative -> rv=0
  - 2.3 - other has_read judgement
    - 2.3.1 - has_read positive -> rv=4
    - 2.3.2 - has_read neutral -> rv=2
    - 2.3.3 - has_read negative -> rv=0
  - 2.4 - other not_read judgement
    - 2.4.1 - not_read positive -> rv=3
    - 2.4.2 - not_read neutral -> rv=2
    - 2.4.3 - not_read negative -> rv=0
- 3 - multiple judgements
  - 3.1 - multi has_read judgements
    - 3.1.1 - some positive, no negative -> rv=6
    - 3.1.2 - #positive > #negative -> rv=4
    - 3.1.3 - #positive == #negative -> rv=2
    - 3.1.4 - all neutral -> rv=2
    - 3.1.5 - #positive < #negative -> rv=1
    - 3.1.6 - no positive, some negative -> rv=0
  - 3.2 - multi not_read judgements
    - 3.2.1 - some positive, no negative -> rv=4
    - 3.2.2 - #positive > #negative -> rv=3
    - 3.2.3 - #positive == #negative -> rv=2
    - 3.2.4 - all neutral -> rv=2
    - 3.2.5 - #positive < #negative -> rv=1
    - 3.2.6 - no positive, some negative -> rv=0
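
The mapping to relevance values (rv) above can likewise be written as a small function. This is a hedged sketch rather than the official script: `Judgement`, `relevance_value` and the `catalogued` argument (`"post"`, `"pre"` or `None`) are hypothetical names. Two further assumptions are made: a set of multiple judgements counts as has_read only if every judgement in it is has_read, and rules 3.x.1 and 3.x.6 are checked before the count comparisons, since "some positive, no negative" would otherwise also match "#positive > #negative".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Judgement:
    by_creator: bool  # was the mention posted by the topic creator?
    polarity: str     # "pos", "neg" or "neu"
    has_read: bool    # has the poster read the work?


# Rule 2: single judgement, keyed by (by_creator, has_read).
SINGLE_RV = {
    (True, True): {"pos": 0, "neu": 0, "neg": 0},    # 2.1
    (True, False): {"pos": 8, "neu": 2, "neg": 0},   # 2.2
    (False, True): {"pos": 4, "neu": 2, "neg": 0},   # 2.3
    (False, False): {"pos": 3, "neu": 2, "neg": 0},  # 2.4
}


def multi_rv(judgements: List[Judgement], has_read: bool) -> int:
    """Rule 3: combine several judgements by counting positives and negatives."""
    pos = sum(j.polarity == "pos" for j in judgements)
    neg = sum(j.polarity == "neg" for j in judgements)
    if pos > 0 and neg == 0:   # 3.x.1
        return 6 if has_read else 4
    if pos == 0 and neg > 0:   # 3.x.6
        return 0
    if pos > neg:              # 3.x.2
        return 4 if has_read else 3
    if pos == neg:             # 3.x.3, and 3.x.4 when both counts are zero
        return 2
    return 1                   # 3.x.5


def relevance_value(judgements: List[Judgement], catalogued: Optional[str] = None) -> int:
    """Map the selected judgements for one work to a relevance value."""
    if catalogued == "post":   # 1.1
        return 8
    if catalogued == "pre":    # 1.2
        return 0
    if not judgements:
        raise ValueError("at least one judgement is required")
    if len(judgements) == 1:
        j = judgements[0]
        return SINGLE_RV[(j.by_creator, j.has_read)][j.polarity]
    # Assumption: the has_read branch applies only if all judgements are has_read.
    return multi_rv(judgements, all(j.has_read for j in judgements))
```

The resulting values would serve as the graded relevance levels in the Qrels.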
If you have questions, please send an email to marijn.koolen@uva.nl.