Overview

For centuries books were the dominant source of information, but how we acquire, share, and publish information is changing in fundamental ways due to the Web. The goal of the Social Book Search Track is to investigate techniques to support users in searching and navigating professional metadata and user-generated content from social media, and to provide a forum for the exchange of research ideas and contributions. Towards this goal, the track is building appropriate evaluation benchmarks complete with test collections for focused, social and semantic search tasks.

This year the Social Book Search Track consists of two tasks:

Suggestion task: a system-oriented batch retrieval/recommendation task, similar to the previous year,
Interactive task: a user-oriented interactive task in which we want to gather user data on searching with different search tasks and different search interfaces.

Suggestion task

The system-oriented Suggestion task is similar to the 2013 Social Book Search task. However, we want to focus more on recommendation, so the topics are enriched with the user catalogue data (i.e. the books that the topic creator had in her personal catalogue at the time of creating the topic). In addition, we will release a large set of anonymised user profiles from other LibraryThing (LT) forum members, so task participants can run recommendation experiments.

Schedule:

release topics and profiles on 1 March
run submissions due 10 May
evaluation results distributed 15 May
working note papers due 7 June
overview papers due 30 June

One of the challenges is dealing with a mixture of professional and social metadata, which differ both in quantity and in kind. Professional metadata is often based on controlled vocabularies to describe topical information, with a minimal set of subject headings or classification information. Social metadata comes in the form of reviews that vary widely in length, opinion, clarity, seriousness and in the aspects of the book they discuss, such as writing style, comprehensiveness, engagement, accuracy, recency, topical coverage and diversity, and genre.

The task attempts to address questions such as:

Is the function of professional metadata to cluster search results, or to steer the user in the right direction? Or is it merely a fall-back option for books where no social metadata is available?
Do professional and social metadata serve different purposes? For what type of book search needs are they useful, necessary or sufficient?
When and to what extent is user preference information useful?

Book search is highly complex. Searchers may want to read reviews and ratings from others to inform their decisions. When searching for themselves, their relevance criteria may be very different from when they are searching for someone else (as a birthday present or merely to help someone in their search). They may be searching for books in genres or about topics they are familiar with, in which case a profile of their reading habits may be helpful, but they may also be searching for new genres and/or topics, for which little preference information is available.

Document Collection

The document collection consists of 2.8 million book descriptions with metadata from Amazon and LibraryThing. From Amazon there is formal metadata like book title, author, publisher, publication year, library classification codes, Amazon categories and similar product information, as well as user-generated content in the form of user ratings and reviews. From LibraryThing, there are user tags and user-provided metadata on awards, book characters and locations, and blurbs. There are additional records from the British Library and the Library of Congress. To get access to the document collection, participants have to sign a licence agreement.

Submissions

Participants are allowed to submit up to 6 runs in standard TREC format. Any field in the topic statement may be used as well as any information in the user profiles. The topics and user profiles for 2014 are available on the Document Collection page. The submission deadline is 10 May 2014.
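As an illustration of the run format, the following is a minimal Python sketch that writes ranked results as whitespace-separated TREC run lines (topic ID, the literal Q0, document ID, rank, score, run tag). The function name, the topic ID, the document IDs, the scores and the run tag are hypothetical and chosen only for the example; the identifiers the track expects are defined by the released topics and collection, not by this sketch.

```python
def format_trec_run(rankings, run_tag):
    """Return TREC run lines: topic_id Q0 doc_id rank score run_tag."""
    lines = []
    for topic_id in sorted(rankings):
        # Sort by descending score; break ties on document ID for determinism.
        ranked = sorted(rankings[topic_id].items(), key=lambda kv: (-kv[1], kv[0]))
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            lines.append(f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}")
    return lines


# Hypothetical topic ID, document IDs and scores, for illustration only.
example = {"1116": {"0439023521": 12.5, "0316015849": 9.25}}
run_lines = format_trec_run(example, "myrun1")
print("\n".join(run_lines))
```

Each topic's results are ranked from 1 upwards, and the same run tag is repeated on every line so that the run can be identified during evaluation.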

User Profiles

Each topic statement contains the title and message of a LibraryThing member who requested book suggestions, as well as the name of the discussion in which the message was posted. The topic statements are enriched with a user profile of the topic creator, which contains information about the books catalogued by that member, including tags and ratings. In addition, there is a large set of 94,000 anonymised user profiles from LibraryThing, which can be used to derive recommendations based on collaborative filtering. The topics and user profiles for 2014 are available on the Document Collection page.
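One simple way to use the profiles for collaborative filtering is sketched below in Python: books the topic creator has not catalogued are scored by the summed catalogue overlap (Jaccard similarity) with other users who hold them. This is only a sketch under stated assumptions, not the track's method. The profile representation (user ID mapped to a set of book IDs), the function names and the example data are all hypothetical, and the sketch ignores the tags and ratings that the real profiles contain.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two sets of book IDs (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target_books, profiles, top_k=10):
    """Score books the target user lacks by summed neighbour similarity."""
    scores = defaultdict(float)
    for books in profiles.values():
        sim = jaccard(target_books, books)
        if sim == 0.0:
            continue  # users with no shared books contribute nothing
        for book in books - target_books:
            scores[book] += sim
    # Highest score first; ties broken on book ID for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical profiles: user ID -> set of catalogued book IDs.
profiles = {
    "u1": {"b1", "b2", "b3"},
    "u2": {"b2", "b4"},
    "u3": {"b5"},
}
suggestions = recommend({"b1", "b2"}, profiles)
print(suggestions)
```

Here "b3" ranks above "b4" because user u1 shares two books with the target user while u2 shares only one. Weighting by ratings or tags would be a natural refinement.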

Interactive task

The goal of the Interactive SBS task is to investigate how book searchers deal with professional metadata and user-generated content at different stages of the search process. For the task, we provide two book search interfaces and two tasks, one goal-oriented and one non-goal task. Participating teams have to recruit a minimum of 20 users. The user data (interaction logs and questionnaire data) are shared among all participating teams that manage to recruit at least 20 users. Registration details are below.

Task motivation

The goal of the Interactive Social Book Search (SBS) task is to investigate how book searchers use professional metadata and user-generated content at different stages of the search process. The purpose of the task is to gauge user interaction and user experience in social book search by observing user activity with a large collection of rich book descriptions under controlled and simulated conditions, while letting as much "real-life" experience as possible enter the experiment. The output will be a rich data set that includes user profiles, selected individual differences (such as a motivation to explore), a log of user interactivity, and a structured set of questions about the experience.

Task definition

Two different experimental tasks will be assigned to users, one goal-oriented and one non-goal task.

The goal-oriented task will be a subject search task: You are looking for some interesting physics and mathematics books for a layperson. You have heard about the Feynman books but you have never really read anything in this area. You would like to find an "interesting facts" sort of book on mathematics.

The non-goal task will follow Borlund's situations. In this case, a common scenario identified from existing research will be used to prime participants. This will not be just about finding specific books, but about an "experience." The scenario (which will be the same across all observations) will describe a non-intentional interaction (no predetermined information need), along the lines of: Imagine that you are sitting in a doctor's office, a bookstore, the airport, a pub or coffee shop. You have 10 minutes to spend looking for books using the Amazon/LibraryThing collection. Insert in your "bag" (an interface object that will work like a shopping cart without the ecommerce checkout) any books that you found unexpected, surprising or novel and make notes on the stuff you found interesting along the way.

Document collection

The Interactive task will use the same collection as the Suggestion task. The Amazon/LibraryThing (A/LT) collection consists of book descriptions for 1.5 million books. The descriptions are taken from Amazon.com and are enriched with user-generated content from LibraryThing. The data was crawled by the University of Duisburg-Essen in early 2009. Each book description contains publisher-supplied metadata (book title, author, publisher, year of publication), subject metadata (classification code, subject headings) and user-generated content (Amazon user ratings and reviews and LibraryThing user tags).

Data Collection & Experimental System Set-up

Half of the users will browse and search the A/LT collection using an experimental multistage interface designed to support different stages of the search process. The other half of the users will browse and search the same collection using a more traditional interface.

The experimental system will use pre- and post-interaction questionnaires and logging of user interactions in order to capture user behaviour. The task attempts to address research questions such as:

How do searchers use professional metadata and user-generated content in book search?
Do professional and social metadata serve different purposes?
How can different stages in the book search process be supported?
What patterns emerge from a non-goal based social book search task?

To conduct the task we will use the web-based experimental system developed by the PROMISE NoE at the University of Sheffield. The following data will be collected:

user profile (using questionnaires), e.g. age, gender, level of education, native language, all other languages spoken fluently, country of residence
user behaviour/cognition (using questionnaires)
user actions (from logs), using the following metrics: queries, mouse clicks, result pages, documents viewed, documents saved, etc.
user experience with browsing (using questionnaires)
motivation (using questionnaires)

User constraints

Participants in the study must be adults (18 or older).

Participants will receive web-based access to the experimental system, which comes with a standardized protocol for pre-questionnaire, task-based interaction and post-questionnaire. Each user will need approximately 25 minutes for the entire experiment.

Requirements for participation

Participating research groups agree to:

use the standard protocol that specifies instructions to the participant, the task the participant will do, and the data that will be collected,
use the standard interface provided by Interactive Social Book Search group,
process a minimum of 20 users, of whom 10 should be observed in a lab-based environment and 10 can access the system remotely,
conduct all observations in English (since the collection is in English, other languages cannot be served),
agree on how the communal data will be used (shared data use of all observations),
specify a research question for data analysis of the shared data points before data is shared (to avoid redundant analyses).

Interactive task timeline

22-31 April	Registration
1-31 May	Data gathering
5 June	Release of shared data pool to all participants
15 June	CLEF working notes papers due

Registration

To register for this task and get access to the experimental system, please send an email to marijn.koolen@uva.nl.

Organizers

Marijn Koolen (University of Amsterdam)
Toine Bogers (Aalborg University Copenhagen)
Antoine Doucet (University of Caen)
Preben Hansen (Stockholm University)
Hugo Huurdeman (University of Amsterdam)
Jaap Kamps (University of Amsterdam)
Gabriella Kazai (Microsoft Research Cambridge)
Monica Landoni (University of Lugano)
Birger Larsen (Aalborg University Copenhagen)
Vivien Petras (Humboldt University Berlin)
Michael Preminger (Oslo and Akershus University College of Applied Sciences)
Mette Skov (Aalborg University Copenhagen)
Elaine Toms (University of Sheffield)

For questions regarding this track, please contact marijn.koolen@uva.nl
