INEX 2011 Data Centric Track
Announcements
Results for the Ad Hoc task are available
Overview
Current approaches proposed for keyword
search on XML data can be categorized into two broad classes: one for
document-centric XML, where the structure is simple and long text fields
predominate; the other for data-centric XML, where the structure is very rich
and carries important information about objects and their relationships. The
INEX 2011 Data Centric Track is investigating retrieval over a strongly
structured collection of documents based on IMDB. There are two tasks. The
Ad Hoc Search Task has informational
requests to be answered by the entities in IMDB (movies, actors, directors,
etc.). The Faceted Search Task asks
for a restricted list of facets and facet-values that will optimally guide the
searcher toward relevant information.
Data Collection
The track uses the IMDB data collection
generated from the plain text files published on the IMDb web site on April 10,
2010. There are two kinds of objects in the collection, movies and persons
involved in movies, e.g. actors/actresses, directors, producers and so on. Each
object is richly structured. For example, each movie has title, rating,
directors, actors, plot, keywords, genres, release dates, trivia, etc.; and each
person has name, birth date, biography, filmography, etc. Please refer to the
movie.dtd and person.dtd in the data collection respectively. Each XML file
contains information about one object, i.e. a single movie or person. In total,
the IMDB data collection contains 4,418,081 XML files, including 1,594,513
movies, 1,872,471 actors, 129,137 directors who did not act in any movie,
178,117 producers who neither directed nor acted in any movie, and 643,843 other
people involved in movies who neither produced, directed, nor acted in any movie.
2011 IMDB Collection (1.4GB)
Information courtesy of The Internet Movie Database (http://www.imdb.com). Used with permission. It is available for personal and non-commercial use. See the IMDb Licence.
1. Ad Hoc Search Task
Each XML document in the data collection
represents a structured object in XML format, e.g. a movie or person. The task
is to return a ranked list of results (objects, or equivalently documents in the
collection) estimated relevant to the user’s information need.
Topics
Each participating group will be asked to submit a total of 3 topics, one for each of the categories below:
- Known-item: Topics that ask for a particular object (movie or person). Example:
"I am searching for the version of the movie "Titanic" in which the two major
characters are called Jack and Rose respectively". For these topics the relevant
answer is a single document (or a few documents). We will ask participants to
submit the file name(s) of the relevant document(s).
- List: Topics that ask for a list of objects (movies or
persons). For example: "Find movies about drugs that are based on a true story",
"Find movies about the era of ancient Rome".
- Informational: Topics that ask for
information about any topic/movie/person contained in the collection. For
example: "Find information about the making of The Lord of the Rings and similar
movies", "I want to know more about Ingmar Bergman and the movies he
directed".
We can classify all the fields in the collection into three types: categorical
(e.g. genre, keyword, director), numerical (e.g. rating, release_date, year),
and free-text (e.g. title, plot, trivia, quote). All submitted topics should
involve, at least, one free-text field. A list of all the fields contained in
the collection
as well as their types
can be found in
this file.
We ask participants to submit
challenging topics that are not easily solved by
a current search engine or a DB system. Both Content Only (CO) and
Content And Structure (CAS) variants of the information need are requested. An
online system will be provided in which participants can try out their topics to
check their output.
You can download the official topic set for this task here.
Submission Format
Each participant may submit up to 3
runs. Each run can contain a maximum of 1000 results per topic, ordered by
decreasing value of relevance. The results of one run must be contained in one
submission file (i.e. up to 3 files can be submitted in total). For relevance
assessment and evaluation of the results we require submission files to be in
the format described below.
The submission format is a variant of the familiar TREC
format.
<qid> Q0 <file> <rank> <rsv> <run_id>
Here:
-
the first column is the topic number.
-
the second column is the query number within that topic.
This is currently unused and should always be Q0.
-
the third column is the file name (without .xml) from
which a result is retrieved.
-
the fourth column is the rank of the result.
-
the fifth column shows the score (integer or floating
point) that generated the ranking. This score MUST be in descending
(non-increasing) order and is important to include so that we can handle
tied scores (for a given run) in a uniform fashion (the evaluation routines
rank documents from these scores, not from your ranks). If you want the
precise ranking you submit to be evaluated, the SCORES must reflect that
ranking.
-
the sixth column is called the "run tag" and should be a
unique identifier for your group AND for the method used. That is, each run
should have a different tag that identifies the group and the method that
produced the run. Please change the tag from year to year, since often we
compare across years (for graphs and such). Also run tags must contain 12 or
fewer letters and numbers, with NO punctuation, to facilitate labeling
graphs with the tags.
An example submission is:
2011001 Q0 9996 1 0.9999 2011UniXRun1
2011001 Q0 9997 2 0.9998 2011UniXRun1
2011001 Q0 person_9989 3 0.9997 2011UniXRun1
Here are three results for topic
"2011001". The first result is the movie from the file 9996.xml. The second
result is the movie from the file 9997.xml, and the third result is the person
from the file person_9989.xml.
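As an illustration, the format constraints above (six whitespace-separated columns, a `Q0` second column, a non-increasing score within each topic, and a run tag of at most 12 letters and digits) could be checked with a small sketch like the following; the class RunValidator is our own, not part of any official tooling.

```java
import java.util.List;

// Hypothetical helper (not part of the track software): checks the
// "<qid> Q0 <file> <rank> <rsv> <run_id>" format, the run-tag rule
// (at most 12 letters and digits, no punctuation), and that scores are
// non-increasing within each topic, since evaluation ranks by score.
public class RunValidator {
    public static boolean isValid(List<String> lines) {
        String prevQid = null;
        double prevScore = Double.POSITIVE_INFINITY;
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length != 6 || !cols[1].equals("Q0")) return false;
            if (!cols[5].matches("[A-Za-z0-9]{1,12}")) return false;
            double score = Double.parseDouble(cols[4]);
            // scores must not increase within the same topic
            if (cols[0].equals(prevQid) && score > prevScore) return false;
            prevQid = cols[0];
            prevScore = score;
        }
        return true;
    }
}
```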
Relevance Assessments
Relevance assessment will be conducted by participating groups.
All the submitted results for each query will be
pooled and assessors are asked to identify all relevant results in the pool
using the INEX assessment tool.
Evaluation
The effectiveness of the retrieval
results submitted by the participants will be evaluated using standard IR
metrics, e.g. precision, recall, MAP, P@5, P@10, and NDCG.
2. Faceted Search Task
Given an exploratory or broad query, the
search system may return a large number of results. Faceted search is a way to
help users navigate through the large set of query results to quickly identify
the results of interest. It presents the user a list of facet-values to choose
from along with the list of results. By choosing from the suggested
facet-values, the user can refine the query and thus narrow down the list of
candidate results. Then, the system may present a new list of facet-values for
the user to further refine the query. The interactive process continues until
the user finds the items of interest. The key issue in faceted search is to
recommend appropriate facet-values for the user to refine the query and thus
quickly identify what he/she really wants in the large set of results. The task
aims to investigate and evaluate different techniques and strategies of
recommending facets and facet-values to the user at each step in a search
session.
Topics
Each participating group will be asked to create a set of candidate topics representative of real user needs. Each group should submit 4 topics: two of them specifying a specific point of interest (facet) given a general topic proposed by the organizers and the other two doing the same but based on a general topic proposed by the participants.
Each submitted topic should refine a general topic by exploring one or several subtopics/facets of it. The submitted topic should be restricted enough to be satisfied by a minimum of 10 and a maximum of 50 results.
a) Specifying a specific point of interest given a general topic
We ask participants to refine two of the following general topics by exploring one or several subtopics/facets. Each topic should be related to a different general topic from the following list:
"trained animals", "dogme", "food", "asian cinema", "art house", "silent movies",
"second world war", "animation", "nouvelle vague", "wuxia".
b) Proposing a general topic and specifying its subtopics/facets
Here we ask participants to propose two general topics of their own and refine them by specifying some specific point(s) of interest. Please check that your general topic produces approximately between 1000 and 2000 results.
You can download the official topic set for this task here.
Submission Format
Each participant may submit up to 3 runs, one of which must use this result file. Each run consists of the following two or three files:
(1) Result file. It contains a ranked list of maximum 2000 results per topic. The submission format is the same as that of ad hoc search task.
(2) Facet-Value file. It contains a list or
hierarchy of recommended facet-values for each topic, in which each node
represents a facet-value and all of its children form the new recommended
facet-value list when the user selects this facet-value to refine the query. The
maximum number of recommended facet-values on each level of the hierarchy is
restricted to 20. The submission is an XML file conforming to the
following DTD.
<!ELEMENT run (topic+)>
<!ATTLIST run rid ID #REQUIRED>
<!ELEMENT topic (fv+)>
<!ATTLIST topic tid ID #REQUIRED>
<!ELEMENT fv (fv*)>
<!ATTLIST fv f CDATA #REQUIRED
v CDATA #REQUIRED>
Here:
-
the root element is <run>, which has an ID type
attribute, rid, representing the
unique identifier of the run. It must be identical with that in the
Result file of the same run.
-
the <run> contains one or more <topic>s. The ID type
attribute, tid, in each <topic>
gives the topic number.
-
each <topic> includes a list of <fv>s. Each <fv> shows a
facet-value pair, with f attribute being the facet and
v attribute being the value. The facet is expressed as an XPath
expression. The set of all the possible facets represented as XPath
expressions in the IMDB data collection can be found in
this file. We allow only
categorical or numerical fields to be possible facets. Free-text
fields are not considered. Each facet-value pair corresponds to a
facet-value condition to refine the query. For example, <fv
f="/movie/overview/directors/director" v="Yimou Zhang"> corresponds to the
query condition /movie/overview/directors/director="Yimou Zhang". Note that
by default the comparison operator between the facet and value is "=", but
if you submit the third file, Faceted Search Module (FSM) as described next,
you can implement other possible operators, e.g. ">", "<", "between", and so
on.
-
the <fv>s can be nested to form a hierarchy of
facet-values.
An example submission is:
<run rid="2011UniXRun1">
<topic tid="2011001">
<fv f="/movie/overview/directors/director" v="Yimou Zhang">
<fv f="/movie/cast/actors/actor/name" v="Li Gong">
<fv f="/movie/overview/releasedates/releasedate" v="2002"/>
<fv f="/movie/overview/releasedates/releasedate" v="2003"/>
</fv>
<fv f="/movie/cast/actors/actor/name" v="Ziyi Zhang">
<fv f="/movie/overview/releasedates/releasedate" v="2005"/>
</fv>
</fv>
...
</topic>
<topic tid="2011002">
...
</topic>
...
</run>
Here for the topic "2011001", the search
system first recommends the facet-value condition
/movie/overview/directors/director="Yimou Zhang" among other facet-value
conditions, which are on the same level of the hierarchy. If the user selects
this facet-value condition to refine the query, the faceted search system will
recommend a new list of facet-value conditions, which are
/movie/cast/actors/actor/name="Li Gong" and /movie/cast/actors/actor/name="Ziyi
Zhang", for the user to choose from to further refine the query. If the user
then selects /movie/cast/actors/actor/name="Li Gong", the system will recommend
the facet-value conditions /movie/overview/releasedates/releasedate="2002" and
/movie/overview/releasedates/releasedate="2003". Note that the
facet-value conditions selected to refine the query form a path in the tree,
e.g. /movie/overview/directors/director="Yimou Zhang",
/movie/cast/actors/actor/name="Li Gong" and
/movie/overview/releasedates/releasedate="2003". They must not be duplicated,
i.e. no facet-value condition may occur twice on the path.
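As a sketch, a submitted Facet-Value file could be checked against the two structural constraints above (at most 20 recommendations per level, and no facet-value condition repeated on a root-to-leaf path); the class FvFileCheck and its helpers are our own, not part of the official tools.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical checker (not an official tool): verifies that each level of
// a Facet-Value file recommends at most 20 <fv> elements and that no
// facet-value pair repeats along any root-to-leaf selection path.
public class FvFileCheck {
    public static boolean check(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            for (Element topic : children(doc.getDocumentElement(), "topic")) {
                if (!checkLevel(topic, new HashSet<>())) return false;
            }
            return true;
        } catch (Exception e) {
            return false; // unparsable input is treated as invalid
        }
    }

    private static boolean checkLevel(Element parent, Set<String> path) {
        List<Element> fvs = children(parent, "fv");
        if (fvs.size() > 20) return false; // per-level limit
        for (Element fv : fvs) {
            String pair = fv.getAttribute("f") + "::" + fv.getAttribute("v");
            if (!path.add(pair)) return false; // duplicate on the path
            if (!checkLevel(fv, path)) return false;
            path.remove(pair); // backtrack before visiting sibling subtrees
        }
        return true;
    }

    private static List<Element> children(Element e, String name) {
        List<Element> out = new ArrayList<>();
        NodeList nl = e.getChildNodes();
        for (int i = 0; i < nl.getLength(); i++) {
            if (nl.item(i) instanceof Element && name.equals(nl.item(i).getNodeName())) {
                out.add((Element) nl.item(i));
            }
        }
        return out;
    }
}
```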
(3) FacetedSearch.jar file. Instead of submitting
a static hierarchy of facet-values, participants are given the freedom to
dynamically generate lists of recommended facet-values and even change the
ranking order of the candidate result list at each step in the search session.
This is achieved by submitting a self-implemented dynamically linkable module,
called Faceted Search Module (FSM). It implements the FacetedSearchInterface
defined as the following:
public interface FacetedSearchInterface {
    public String[] openQuery(String queryID, String[] resultList, String[] fvList);
    public String[] selectFV(String facet, String value, String[] selectedFV);
    public String[] refineQuery(String facet, String value, String[] selectedFV);
    public String[] expandFacet(String facet);
    public void closeQuery(String queryID);
}

public class FacetedSearch implements FacetedSearchInterface {
    // to be implemented by the participant
}
The User Simulation System (USS) used in evaluation will
interact with the FSM to simulate a faceted search session. The USS starts to
evaluate a run by instantiating a FacetedSearch object. For each query to be
evaluated, the USS first invokes openQuery() method to initialize the object
with the query id, initial result list and recommended facet-value list for this
query. The result list is actually the list of values in the third column of the
Result file, i.e. the names (without .xml) of retrieved files. The initial
recommended facet-value list can be the top-level facet-value pairs shown in the
Facet-Value file, or empty. If it is empty, then the openQuery() method would
generate and return the initial list of recommended facet-values for the given
result list. Each facet-value pair, in the input or output to the openQuery()
method, is encoded into a String in the format "<facet>::<value>", for example,
"/movie/overview/directors/director::Yimou Zhang".
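A tiny helper illustrating this encoding (the class FvCodec is hypothetical, and we assume "::" does not occur inside an XPath facet expression):

```java
// Hypothetical codec for the "<facet>::<value>" strings exchanged between
// the USS and the FSM. Assumes "::" never appears inside a facet path.
public class FvCodec {
    public static String encode(String facet, String value) {
        return facet + "::" + value;
    }

    // returns { facet, value }, split at the first "::"
    public static String[] decode(String encoded) {
        int i = encoded.indexOf("::");
        return new String[] { encoded.substring(0, i), encoded.substring(i + 2) };
    }
}
```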
After opening a query, the USS then simulates a user’s
behaviors in a faceted search system based on some user models. When the
simulated user selects a facet-value to refine the query, the selectFV()
method would be called to return a new list of recommended facet-values; and the
refineQuery() method would be called to return a new list of candidate results
satisfying all the selected facet-value conditions. The inputs to both methods
are the currently selected facet and value, along with the list of previously
selected facet-values. The outputs of both methods are the new list of
recommended facet-values and new list of candidate results respectively. Each
String encodes a single facet-value pair or result in the same format as
in the openQuery() method. If the user could not find a relevant facet-value to
refine the query in the recommended list, he/she could probably expand the
facet-value list by choosing a facet among all possible facets, examine all its
values and then select one to refine the query. In such a case, the USS invokes
the expandFacet() method with the name of the facet to be expanded as the input
and the list of all possible values of this facet as output. Observe that in the
specification of FacetedSearchInterface, we do not restrict facet-value
comparisons to equality; any other semantics is possible, since the
interpretation of facet-value conditions is encapsulated in the
implementation of FacetedSearchInterface. Thus, given the same facet, different
systems may return different sets of possible values, depending on whether
and how they cluster values.
When the search session of a query ends, the closeQuery()
method is invoked. The FacetedSearch object will be used as a persistent object
over the entire evaluation of a run. That is, different queries in the same run
will be evaluated using the same FacetedSearch object. But different runs may
have different implementations of the FacetedSearch class.
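To make the protocol above concrete, here is a minimal, purely illustrative FacetedSearch sketch: it serves recommendations from a static map keyed by the selection path and does no real result filtering. Everything beyond the interface (which is repeated for completeness) is our own assumption, not a reference implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The interface from the task specification, repeated for completeness.
interface FacetedSearchInterface {
    String[] openQuery(String queryID, String[] resultList, String[] fvList);
    String[] selectFV(String facet, String value, String[] selectedFV);
    String[] refineQuery(String facet, String value, String[] selectedFV);
    String[] expandFacet(String facet);
    void closeQuery(String queryID);
}

// Minimal sketch: recommendations come from a static map keyed by the
// selection path; refineQuery returns the unfiltered result list. A real
// FSM would recompute recommendations and filter results at each step.
public class FacetedSearch implements FacetedSearchInterface {
    private final Map<List<String>, String[]> hierarchy = new HashMap<>();
    private String[] results = new String[0];

    public String[] openQuery(String queryID, String[] resultList, String[] fvList) {
        this.results = resultList;
        hierarchy.put(List.of(), fvList); // top-level recommendations
        return fvList;
    }

    public String[] selectFV(String facet, String value, String[] selectedFV) {
        List<String> path = new ArrayList<>(Arrays.asList(selectedFV));
        path.add(facet + "::" + value); // the "<facet>::<value>" encoding
        return hierarchy.getOrDefault(path, new String[0]);
    }

    public String[] refineQuery(String facet, String value, String[] selectedFV) {
        return results; // sketch: no actual filtering by the selected conditions
    }

    public String[] expandFacet(String facet) {
        return new String[0]; // sketch: a real FSM enumerates the facet's values
    }

    public void closeQuery(String queryID) {
        hierarchy.clear();
    }
}
```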
Relevance Assessments
For faceted search, the set of all possible relevant results to a
broad query is typically huge. The goal of faceted search is not to identify all
of them, but to help users identify a small set of interest among them through
gradual query refinements. We will use two approaches to decide the set of
interest for a broad topic.
- Subtopic:
Assessors from a third party such as Amazon Mechanical Turk will be asked to
decide on a subtopic under the broad topic by adding more constraints to the
original query, which will result in a small set of relevant results. Then
Amazon Mechanical Turk workers or participating groups are asked to assess this subtopic
on the set of pooled results of the broad topic. The set of relevant results for
this subtopic will be treated as the set of results that a user might be
interested in when he/she submits the broad topic.
- Random: Randomly select a result from the result list as the target
result. This approach has been adopted in most research papers on faceted
search, and it makes sense because, given a broad topic, different users'
interests can vary greatly, so each result could be relevant to a specific
subtopic of a particular user.
Evaluation
For its very first year, we will try two
types of evaluation approaches and metrics to gain a better understanding of
the problem.
-
NDCG: The relevance of the
list or hierarchy of recommended facet-values is evaluated based on the
relevance of the data covered by these facet-values, measured by NDCG for
example. The details of this evaluation methodology are given in
this article (bib info).
-
User Simulation: The
effectiveness of a faceted search system is evaluated by measuring the
interaction cost, i.e. the amount of effort spent by a user in meeting his/her
information needs. To avoid an expensive user study and to make the evaluation
repeatable, we will apply a user simulation methodology like that used in [1, 2]
to measure the costs. Detailed specifications on the evaluation methodology
based on user simulation will be distributed in June.
[1] J. Koren, Y. Zhang, X. Liu,
Personalized Interactive Faceted Search,
WWW 2008.
[2] A. Kashyap, V. Hristidis, M.
Petropoulos, FACeTOR: Cost-Driven
Exploration of Faceted Query Results, CIKM 2010.
New Schedule
- June 28: Topic submission deadline
- July 20: Topics distributed
- Sep. 26: Run submission deadline
- Oct. 1-Oct. 20: Relevance assessments
- Nov. 1: Relevance assessments and results available
Organizers
Qiuyue Wang, Renmin University of China
Georgina Ramírez Camps, Universitat Pompeu Fabra
Maarten Marx, University of Amsterdam
Timm Meiser, Max-Planck-Institut für Informatik
Jaap Kamps, University of Amsterdam