For centuries books were the dominant source of information, but how we acquire, share, and publish information is changing in fundamental ways due to the Web. The goal of the Social Book Search Track is to investigate techniques to support users in searching and navigating professional metadata and user-generated content from social media as well as providing a forum for the exchange of research ideas and contributions. Towards this goal the track is building appropriate evaluation benchmarks complete with test collections for focused, social and semantic search tasks.
This year the Social Book Search Track consists of two tasks:
The system-oriented Suggestion task is similar to the 2013 Social Book Search task. However, we want to focus more on recommendation, so the topics are enriched with the user catalogue data (i.e. the books that the topic creator had in her personal catalogue at the time of creating the topic). In addition, we will release a large set of anonymised user profiles from other LT forum members, so task participants can run recommendation experiments.
Schedule:
One of the challenges is dealing with a mixture of professional and social metadata, which differ both in quantity and in kind. Professional metadata is often based on controlled vocabularies to describe topical information, with a minimal set of subject headings or classification information. Social metadata comes in the form of reviews that vary widely in length, opinion, clarity and seriousness, and in the aspects of the book they discuss, such as writing style, comprehensiveness, engagement, accuracy, recency, topical coverage, diversity and genre.
The task attempts to address questions such as:
Book search is highly complex. Searchers may want to read reviews and ratings from others to inform their decisions. When searching for themselves, their relevance criteria may be very different from when they are searching for someone else (as a birthday present or merely to help someone in their search). They may be searching for books in genres or about topics they are familiar with, in which case a profile of their reading habits may be helpful, but they may also be searching for new genres and/or topics, for which little preference information is available.
The document collection consists of 2.8 million book descriptions with metadata from Amazon and LibraryThing. From Amazon there is formal metadata such as book title, author, publisher, publication year, library classification codes, Amazon categories and similar product information, as well as user-generated content in the form of user ratings and reviews. From LibraryThing, there are user tags and user-provided metadata on awards, book characters, locations and blurbs. There are additional records from the British Library and the Library of Congress. To get access to the document collection, participants have to sign a licence agreement.
Participants are allowed to submit up to 6 runs in standard TREC format. Any field in the topic statement may be used as well as any information in the user profiles. The topics and user profiles for 2014 are available on the Document Collection page. The submission deadline is 10 May 2014.
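As an illustration of the run format, the sketch below writes ranked results in the conventional six-column TREC layout (topic ID, the literal `Q0`, document ID, rank, score, run tag). The function name, the topic and book identifiers, and the per-topic cut-off are hypothetical and not part of the track guidelines; please check the official submission instructions for the exact requirements.

```python
from typing import Dict, List


def format_trec_run(
    scores: Dict[str, Dict[str, float]],
    run_tag: str,
    max_per_topic: int = 1000,  # assumed cut-off, not taken from the guidelines
) -> List[str]:
    """Turn {topic_id: {doc_id: score}} into TREC-style run lines."""
    lines = []
    for topic_id in sorted(scores):
        # Highest score first; ties are broken by document ID for determinism.
        ranked = sorted(scores[topic_id].items(), key=lambda kv: (-kv[1], kv[0]))
        for rank, (doc_id, score) in enumerate(ranked[:max_per_topic], start=1):
            lines.append(f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}")
    return lines


if __name__ == "__main__":
    # Hypothetical topic and book identifiers, for illustration only.
    demo = {"topic-1": {"book-A": 0.42, "book-B": 0.87}}
    print("\n".join(format_trec_run(demo, "my-run-1")))
```

Each of the up to 6 runs would get its own run tag and its own file.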
Each topic statement contains the title and message of a LibraryThing member who requested book suggestions, as well as the name of the discussion in which the message was posted. The topic statements are enriched with a user profile of the topic creator, which contains information about the books catalogued by these members, including tags and ratings. In addition, there is a large set of 94,000 anonymised user profiles from LibraryThing, which can be used to derive recommendations based on collaborative filtering. The topics and user profiles for 2014 are available on the Document Collection page.
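To give an idea of how the user profiles could be used for collaborative filtering, here is a minimal sketch of a user-based approach. It assumes that each profile has already been reduced to a set of catalogued book IDs; the data layout, the function names and the choice of Jaccard similarity are assumptions made for illustration, not the format of the released profiles or a recommended method.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two catalogues, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(
    target: Set[str],
    profiles: Dict[str, Set[str]],
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Score books the topic creator has not catalogued by summing the
    similarity of every other user who has catalogued them."""
    scores: Dict[str, float] = defaultdict(float)
    for books in profiles.values():
        sim = jaccard(target, books)
        if sim == 0.0:
            continue  # users with no shared books contribute nothing
        for book in books - target:
            scores[book] += sim
    # Highest score first; ties are broken by book ID for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    # Hypothetical catalogues, for illustration only.
    creator = {"a", "b"}
    others = {"u1": {"a", "b", "c"}, "u2": {"a", "d"}, "u3": {"x"}}
    print(recommend(creator, others))
```

Ratings and tags from the profiles are ignored here; a real run would probably want to combine such scores with retrieval scores from the topic text.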
The goal of the Interactive SBS task is to investigate how book searchers deal with professional metadata and user-generated content at different stages of the search process. For the task, we provide two book search interfaces and two tasks, one goal-oriented and one non-goal task. Participating teams have to recruit a minimum of 20 users. The user data (interaction logs and questionnaire data) are shared among all participating teams that manage to recruit at least 20 users. Registration details are below.
Two different experimental tasks will be assigned to users, one goal-oriented and one non-goal task.
The goal-oriented task will be a subject search task: You are looking for some interesting physics and mathematics books for a layperson. You have heard about the Feynman books but you have never really read anything in this area. You would like to find an "interesting facts" sort of book on mathematics.
The non-goal task will follow Borlund's situations. In this case, a common scenario identified from existing research will be used to prime participants. This will not be just about finding specific books, but about an "experience." The scenario (which will be the same across all observations) will describe a non-intentional interaction (no predetermined information need), along the lines of: Imagine that you are sitting in a doctor's office, a bookstore, the airport, a pub or coffee shop. You have 10 minutes to spend looking for books using the Amazon/LibraryThing collection. Insert in your "bag" (an interface object that will work like a shopping cart without the ecommerce checkout) any books that you found unexpected, surprising or novel and make notes on the stuff you found interesting along the way.
Half of the users will browse and search the Amazon/LibraryThing (A/LT) collection using an experimental multistage interface designed to support different stages of the search process. The other half of the users will browse and search the same collection using a more traditional interface.
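How users are split over the two interfaces is handled by the experimental system and is not specified here. Purely as an illustration, a balanced random assignment could look like the sketch below; the function name, the interface labels and the seeding are assumptions, not a description of the actual system.

```python
import random
from typing import Dict, List


def assign_interfaces(user_ids: List[str], seed: int = 0) -> Dict[str, str]:
    """Randomly assign half of the users to each interface."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(user_ids)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # The first half gets the multistage interface, the rest the traditional
    # one; with an odd number of users the traditional group is one larger.
    return {
        uid: ("multistage" if i < half else "traditional")
        for i, uid in enumerate(shuffled)
    }


if __name__ == "__main__":
    # 20 hypothetical users, the minimum a team has to recruit.
    users = [f"user-{n:02d}" for n in range(1, 21)]
    for uid, interface in sorted(assign_interfaces(users).items()):
        print(uid, interface)
```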
The experimental system will use pre- and post-interaction questionnaires and logging of user interactions in order to capture user behaviour. The task attempts to address research questions such as:
To conduct the task we will use the web-based experimental system developed by the PROMISE NoE at the University of Sheffield. The following data will be collected:
Participants in the study must be adults (18 or older).
Participants will receive web-based access to the experimental system, which comes with a standardized protocol for pre-questionnaire, task-based interaction and post-questionnaire. Each user will need approximately 25 minutes for the entire experiment.
Participating research groups agree to:
22-31 April | Registration |
1-31 May | Data gathering |
5 June | Release of shared data pool to all participants |
15 June | CLEF working notes papers due |
For questions regarding this track, please contact marijn.koolen@uva.nl