INEX 2011 Data Centric Track
Announcements
Results for the Ad Hoc task are available
Overview
Current approaches proposed for keyword
search on XML data can be categorized into two broad classes: one for
document-centric XML, where the structure is simple and long text fields
predominate; the other for data-centric XML, where the structure is very rich
and carries important information about objects and their relationships. The
INEX 2011 Data Centric Track is investigating retrieval over a strongly
structured collection of documents based on IMDB. There are two tasks. The
Ad Hoc Search Task has informational
requests to be answered by the entities in IMDB (movies, actors, directors,
etc.). The Faceted Search Task asks
for a restricted list of facets and facet-values that will optimally guide the
searcher toward relevant information.
Data Collection
The track uses the IMDB data collection
generated from the plain text files published on the IMDb web site on April 10,
2010. There are two kinds of objects in the collection, movies and persons
involved in movies, e.g. actors/actresses, directors, producers and so on. Each
object is richly structured. For example, each movie has title, rating,
directors, actors, plot, keywords, genres, release dates, trivia, etc.; and each
person has name, birth date, biography, filmography, etc. Please refer to the
movie.dtd and person.dtd in the data collection respectively. Each XML file
contains information about one object, i.e. a single movie or person. In total,
the IMDB data collection contains 4,418,081 XML files, including 1,594,513
movies, 1,872,471 actors, 129,137 directors who did not act in any movie,
178,117 producers who neither directed nor acted in any movie, and 643,843 other
people involved in movies who neither produced, directed, nor acted in any movie.
2011 IMDB Collection (1.4GB)
Information courtesy of The Internet Movie Database (http://www.imdb.com). Used with permission. It is available for personal and non-commercial use. See the IMDb Licence.
1. Ad Hoc Search Task
Each XML document in the data collection
represents a structured object in XML format, e.g. a movie or person. The task
is to return a ranked list of results (objects, or equivalently documents in the
collection) estimated relevant to the user’s information need.
Topics
Each participating group will be asked to submit a total of 3 topics, one for each of the categories below:
- Known-item: Topics that ask for a particular object (movie or person). Example:
"I am searching for the version of the movie "Titanic" in which the two major
characters are called Jack and Rose respectively". For these topics the relevant
answer is a single document (or a few documents). We will ask participants to
submit the file name(s) of the relevant document(s).
- List: Topics that ask for a list of objects (movies or
persons). For example: "Find movies about drugs that are based on a true story",
"Find movies about the era of ancient Rome".
- Informational: Topics that ask for
information about any topic/movie/person contained in the collection. For
example: "Find information about the making of The Lord of the Rings and similar
movies", "I want to know more about Ingmar Bergman and the movies he
directed".
We can classify all the fields in the collection into three types: categorical
(e.g. genre, keyword, director), numerical (e.g. rating, release_date, year),
and free-text (e.g. title, plot, trivia, quote). All submitted topics should
involve, at least, one free-text field. A list of all the fields contained in
the collection
as well as their types
can be found in
this file.
We ask participants to submit
challenging topics that are not easily solved by
a current search engine or a DB system. Both Content Only (CO) and
Content And Structure (CAS) variants of the information need are requested. An
online system will be provided in which participants can try out their topics to
check their output.
You can download the official topic set for this task here.
Submission Format
Each participant may submit up to 3
runs. Each run can contain a maximum of 1000 results per topic, ordered by
decreasing value of relevance. The results of one run must be contained in one
submission file (i.e. up to 3 files can be submitted in total). For relevance
assessment and evaluation of the results we require submission files to be in
the format described below.
The submission format is a variant of the familiar TREC
format.
<qid> Q0 <file> <rank> <rsv> <run_id>
Here:
-
the first column is the topic number.
-
the second column is the query number within that topic.
This is currently unused and should always be Q0.
-
the third column is the file name (without .xml) from
which a result is retrieved.
-
the fourth column is the rank of the result.
-
the fifth column shows the score (integer or floating
point) that generated the ranking. This score MUST be in descending
(non-increasing) order and is important to include so that we can handle
tied scores (for a given run) in a uniform fashion (the evaluation routines
rank documents from these scores, not from your ranks). If you want the
precise ranking you submit to be evaluated, the SCORES must reflect that
ranking.
-
the sixth column is called the "run tag" and should be a
unique identifier for your group AND for the method used. That is, each run
should have a different tag that identifies the group and the method that
produced the run. Please change the tag from year to year, since often we
compare across years (for graphs and such). Also run tags must contain 12 or
fewer letters and numbers, with NO punctuation, to facilitate labeling
graphs with the tags.
An example submission is:
2011001 Q0 9996 1 0.9999 2011UniXRun1
2011001 Q0 9997 2 0.9998 2011UniXRun1
2011001 Q0 person_9989 3 0.9997 2011UniXRun1
Here are three results for topic
"2011001". The first result is the movie from the file 9996.xml. The second
result is the movie from the file 9997.xml, and the third result is the person
from the file person_9989.xml.
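As an illustration, the format constraints above (six whitespace-separated columns, a `Q0` second column, a non-increasing score within each topic, and a run tag of at most 12 letters and digits) could be checked with a small sketch like the following; the class RunValidator is our own, not part of any official tooling.

```java
import java.util.List;

// Hypothetical helper (not part of the track software): checks the
// "<qid> Q0 <file> <rank> <rsv> <run_id>" format, the run-tag rule
// (at most 12 letters and digits, no punctuation), and that scores are
// non-increasing within each topic, since evaluation ranks by score.
public class RunValidator {
    public static boolean isValid(List<String> lines) {
        String prevQid = null;
        double prevScore = Double.POSITIVE_INFINITY;
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length != 6 || !cols[1].equals("Q0")) return false;
            if (!cols[5].matches("[A-Za-z0-9]{1,12}")) return false;
            double score = Double.parseDouble(cols[4]);
            // scores must not increase within the same topic
            if (cols[0].equals(prevQid) && score > prevScore) return false;
            prevQid = cols[0];
            prevScore = score;
        }
        return true;
    }
}
```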
Relevance Assessments
Relevance assessment will be conducted by participating groups.
All the submitted results for each query will be
pooled and assessors are asked to identify all relevant results in the pool
using the INEX assessment tool.
Evaluation
The effectiveness of the retrieval
results submitted by the participants will be evaluated using standard IR
metrics, e.g. precision, recall, MAP, P@5, P@10, and NDCG.
2. Faceted Search Task
Given an exploratory or broad query, the
search system may return a large number of results. Faceted search is a way to
help users navigate through the large set of query results to quickly identify
the results of interest. It presents the user a list of facet-values to choose
from along with the list of results. By choosing from the suggested
facet-values, the user can refine the query and thus narrow down the list of
candidate results. Then, the system may present a new list of facet-values for
the user to further refine the query. The interactive process continues until
the user finds the items of interest. The key issue in faceted search is to
recommend appropriate facet-values for the user to refine the query and thus
quickly identify what he/she really wants in the large set of results. The task
aims to investigate and evaluate different techniques and strategies of
recommending facets and facet-values to the user at each step in a search
session.
Topics
Each participating group will be asked to create a set of candidate topics representative of real user needs. Each group should submit 4 topics: two of them specifying a specific point of interest (facet) given a general topic proposed by the organizers and the other two doing the same but based on a general topic proposed by the participants.
Each submitted topic should refine a general topic by exploring one or several subtopics/facets of it. The submitted topic should be restricted enough to be satisfied by a minimum of 10 and a maximum of 50 results.
a) Specifying a specific point of interest given a general topic
We ask participants to refine two of the following general topics by exploring one or several subtopics/facets. Each topic should be related to a different general topic from the following list:
"trained animals", "dogme", "food", "asian cinema", "art house", "silent movies",
"second world war", "animation", "nouvelle vague", "wuxia".
b) Proposing a general topic and specifying its subtopics/facets
Here we ask participants to propose two general topics of their own and refine them by specifying some specific point(s) of interest. Please check that your general topic produces approximately between 1000 and 2000 results.
You can download the official topic set for this task here.
Submission Format
Each participant may submit up to 3 runs, one of which must use this result file. Each run consists of the following two or three files:
(1) Result file. It contains a ranked list of maximum 2000 results per topic. The submission format is the same as that of ad hoc search task.
(2) Facet-Value file. It contains a list or
hierarchy of recommended facet-values for each topic, in which each node
represents a facet-value and all of its children form the new recommended
facet-value list when the user selects this facet-value to refine the query. The
maximum number of recommended facet-values on each level of the hierarchy is
restricted to 20. The submission is an XML file conforming to the
following DTD.
<!ELEMENT run (topic+)>
<!ATTLIST run rid ID #REQUIRED>
<!ELEMENT topic (fv+)>
<!ATTLIST topic tid ID #REQUIRED>
<!ELEMENT fv (fv*)>
<!ATTLIST fv f CDATA #REQUIRED
v CDATA #REQUIRED>
Here:
-
the root element is <run>, which has an ID type
attribute, rid, representing the
unique identifier of the run. It must be identical with that in the
Result file of the same run.
-
the <run> contains one or more <topic>s. The ID type
attribute, tid, in each <topic>
gives the topic number.
-
each <topic> includes a list of <fv>s. Each <fv> shows a
facet-value pair, with f attribute being the facet and
v attribute being the value. The facet is expressed as an XPath
expression. The set of all the possible facets represented as XPath
expressions in the IMDB data collection can be found in
this file. We allow only
categorical or numerical fields to be possible facets. Free-text
fields are not considered. Each facet-value pair corresponds to a
facet-value condition to refine the query. For example, <fv
f="/movie/overview/directors/director" v="Yimou Zhang"> corresponds to the
query condition /movie/overview/directors/director="Yimou Zhang". Note that
by default the comparison operator between the facet and value is "=", but
if you submit the third file, Faceted Search Module (FSM) as described next,
you can implement other possible operators, e.g. ">", "<", "between", and so
on.
-
the <fv>s can be nested to form a hierarchy of
facet-values.
An example submission is:
<run rid="2011UniXRun1">
<topic tid="2011001">
<fv f="/movie/overview/directors/director" v="Yimou Zhang">
<fv f="/movie/cast/actors/actor/name" v="Li Gong">
<fv f="/movie/overview/releasedates/releasedate" v="2002"/>
<fv f="/movie/overview/releasedates/releasedate" v="2003"/>
</fv>
<fv f="/movie/cast/actors/actor/name" v="Ziyi Zhang">
<fv f="/movie/overview/releasedates/releasedate" v="2005"/>
</fv>
</fv>
...
</topic>
<topic tid="2011002">
...
</topic>
...
</run>
Here for the topic "2011001", the search
system first recommends the facet-value condition
/movie/overview/directors/director="Yimou Zhang" among other facet-value
conditions, which are on the same level of the hierarchy. If the user selects
this facet-value condition to refine the query, the faceted search system will
recommend a new list of facet-value conditions, which are
/movie/cast/actors/actor/name="Li Gong" and /movie/cast/actors/actor/name="Ziyi
Zhang", for the user to choose from to further refine the query. If the user
then selects /movie/cast/actors/actor/name="Li Gong", the system will recommend
the facet-value conditions /movie/overview/releasedates/releasedate="2002" and
/movie/overview/releasedates/releasedate="2003". Note that the
facet-value conditions selected to refine the query form a path in the tree,
e.g. /movie/overview/directors/director="Yimou Zhang",
/movie/cast/actors/actor/name="Li Gong" and
/movie/overview/releasedates/releasedate="2003". They must not be duplicated,
i.e. no facet-value condition may occur twice on the path.
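As a sketch, a submitted Facet-Value file could be checked against the two structural constraints above (at most 20 recommendations per level, and no facet-value condition repeated on a root-to-leaf path); the class FvFileCheck and its helpers are our own, not part of the official tools.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical checker (not an official tool): verifies that each level of
// a Facet-Value file recommends at most 20 <fv> elements and that no
// facet-value pair repeats along any root-to-leaf selection path.
public class FvFileCheck {
    public static boolean check(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            for (Element topic : children(doc.getDocumentElement(), "topic")) {
                if (!checkLevel(topic, new HashSet<>())) return false;
            }
            return true;
        } catch (Exception e) {
            return false; // unparsable input is treated as invalid
        }
    }

    private static boolean checkLevel(Element parent, Set<String> path) {
        List<Element> fvs = children(parent, "fv");
        if (fvs.size() > 20) return false; // per-level limit
        for (Element fv : fvs) {
            String pair = fv.getAttribute("f") + "::" + fv.getAttribute("v");
            if (!path.add(pair)) return false; // duplicate on the path
            if (!checkLevel(fv, path)) return false;
            path.remove(pair); // backtrack before visiting sibling subtrees
        }
        return true;
    }

    private static List<Element> children(Element e, String name) {
        List<Element> out = new ArrayList<>();
        NodeList nl = e.getChildNodes();
        for (int i = 0; i < nl.getLength(); i++) {
            if (nl.item(i) instanceof Element && name.equals(nl.item(i).getNodeName())) {
                out.add((Element) nl.item(i));
            }
        }
        return out;
    }
}
```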
(3) FacetedSearch.jar file. Instead of submitting
a static hierarchy of facet-values, participants are given the freedom to
dynamically generate lists of recommended facet-values and even change the
ranking order of the candidate result list at each step in the search session.
This is achieved by submitting a self-implemented dynamically linkable module,
called Faceted Search Module (FSM). It implements the FacetedSearchInterface
defined as the following:
public interface FacetedSearchInterface {
    public String[] openQuery(String queryID, String[] resultList, String[] fvList);
    public String[] selectFV(String facet, String value, String[] selectedFV);
    public String[] refineQuery(String facet, String value, String[] selectedFV);
    public String[] expandFacet(String facet);
    public void closeQuery(String queryID);
}

public class FacetedSearch implements FacetedSearchInterface {
    // to be implemented by the participant
}
The User Simulation System (USS) used in evaluation will
interact with the FSM to simulate a faceted search session. The USS starts to
evaluate a run by instantiating a FacetedSearch object. For each query to be
evaluated, the USS first invokes openQuery() method to initialize the object
with the query id, initial result list and recommended facet-value list for this
query. The result list is actually the list of values in the third column of the
Result file, i.e. the names (without .xml) of retrieved files. The initial
recommended facet-value list can be the top-level facet-value pairs shown in the
Facet-Value file, or empty. If it is empty, then the openQuery() method would
generate and return the initial list of recommended facet-values for the given
result list. Each facet-value pair, in the input or output to the openQuery()
method, is encoded into a String in the format "<facet>::<value>", for example,
"/movie/overview/directors/director::Yimou Zhang".
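A tiny helper illustrating this encoding (the class FvCodec is hypothetical, and we assume "::" does not occur inside an XPath facet expression):

```java
// Hypothetical codec for the "<facet>::<value>" strings exchanged between
// the USS and the FSM. Assumes "::" never appears inside a facet path.
public class FvCodec {
    public static String encode(String facet, String value) {
        return facet + "::" + value;
    }

    // returns { facet, value }, split at the first "::"
    public static String[] decode(String encoded) {
        int i = encoded.indexOf("::");
        return new String[] { encoded.substring(0, i), encoded.substring(i + 2) };
    }
}
```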
After opening a query, the USS then simulates a user’s
behaviors in a faceted search system based on some user models. When the
simulated user selects a facet-value to refine the query, the selectFV()
method would be called to return a new list of recommended facet-values; and the
refineQuery() method would be called to return a new list of candidate results
satisfying all the selected facet-value conditions. The inputs to both methods
are the currently selected facet and value, along with the list of previously
selected facet-values. The outputs of both methods are the new list of
recommended facet-values and new list of candidate results respectively. Each
String encodes a single facet-value pair or result in the same format as
in the openQuery() method. If the user could not find a relevant facet-value to
refine the query in the recommended list, he/she could probably expand the
facet-value list by choosing a facet among all possible facets, examine all its
values and then select one to refine the query. In such a case, the USS invokes
the expandFacet() method with the name of the facet to be expanded as the input
and the list of all possible values of this facet as output. Observe that in the
specification of FacetedSearchInterface, we do not restrict facet-value
comparisons to equality; any other semantics is possible, since the
interpretation of facet-value conditions is encapsulated in the
implementation of FacetedSearchInterface. Thus, given the same facet, different
systems may return different sets of possible values, depending on whether
and how they cluster values.
When the search session of a query ends, the closeQuery()
method is invoked. The FacetedSearch object will be used as a persistent object
over the entire evaluation of a run. That is, different queries in the same run
will be evaluated using the same FacetedSearch object. But different runs may
have different implementations of the FacetedSearch class.
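To make the protocol above concrete, here is a minimal, purely illustrative FacetedSearch sketch: it serves recommendations from a static map keyed by the selection path and does no real result filtering. Everything beyond the interface (which is repeated for completeness) is our own assumption, not a reference implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The interface from the task specification, repeated for completeness.
interface FacetedSearchInterface {
    String[] openQuery(String queryID, String[] resultList, String[] fvList);
    String[] selectFV(String facet, String value, String[] selectedFV);
    String[] refineQuery(String facet, String value, String[] selectedFV);
    String[] expandFacet(String facet);
    void closeQuery(String queryID);
}

// Minimal sketch: recommendations come from a static map keyed by the
// selection path; refineQuery returns the unfiltered result list. A real
// FSM would recompute recommendations and filter results at each step.
public class FacetedSearch implements FacetedSearchInterface {
    private final Map<List<String>, String[]> hierarchy = new HashMap<>();
    private String[] results = new String[0];

    public String[] openQuery(String queryID, String[] resultList, String[] fvList) {
        this.results = resultList;
        hierarchy.put(List.of(), fvList); // top-level recommendations
        return fvList;
    }

    public String[] selectFV(String facet, String value, String[] selectedFV) {
        List<String> path = new ArrayList<>(Arrays.asList(selectedFV));
        path.add(facet + "::" + value); // the "<facet>::<value>" encoding
        return hierarchy.getOrDefault(path, new String[0]);
    }

    public String[] refineQuery(String facet, String value, String[] selectedFV) {
        return results; // sketch: no actual filtering by the selected conditions
    }

    public String[] expandFacet(String facet) {
        return new String[0]; // sketch: a real FSM enumerates the facet's values
    }

    public void closeQuery(String queryID) {
        hierarchy.clear();
    }
}
```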
Relevance Assessments
For faceted search, the set of all possible relevant results to a
broad query is typically huge. The goal of faceted search is not to identify all
of them, but to help users identify a small set of interest among them through
gradual query refinements. We will use two approaches to decide the set of
interest for a broad topic.
- Subtopic:
Assessors from a third party such as Amazon Mechanical Turk will be asked to
decide on a subtopic under the broad topic by adding more constraints to the
original query, which will result in a small set of relevant results. Then
Amazon Mechanical Turk workers or participating groups are asked to assess this subtopic
on the set of pooled results of the broad topic. The set of relevant results for
this subtopic will be treated as the set of results that a user might be
interested in when he/she submits the broad topic.
- Random: Randomly select a result from the result list as the target
result. This approach has been adopted in most research papers on faceted
search, and it makes sense because, given a broad topic, different users'
interests can vary greatly, so each result could be relevant to a specific
subtopic of a particular user.
Evaluation
For its very first year, we will try two
types of evaluation approaches and metrics to gain a better understanding of
the problem.
-
NDCG: The relevance of the
list or hierarchy of recommended facet-values is evaluated based on the
relevance of the data covered by these facet-values, measured by NDCG for
example. The details of this evaluation methodology are given in
this article (bib info).
-
User Simulation: The
effectiveness of a faceted search system is evaluated by measuring the
interaction cost, i.e. the amount of effort spent by a user in meeting his/her
information needs. To avoid an expensive user study and to make the evaluation
repeatable, we will apply a user simulation methodology like that used in [1, 2]
to measure the costs. Detailed specifications on the evaluation methodology
based on user simulation will be distributed in June.
[1] J. Koren, Y. Zhang, X. Liu,
Personalized Interactive Faceted Search,
WWW 2008.
[2] A. Kashyap, V. Hristidis, M.
Petropoulos, FACeTOR: Cost-Driven
Exploration of Faceted Query Results, CIKM 2010.
New Schedule
- June 28: Topic submission deadline
- July 20: Topics distributed
- Sep. 26: Run submission deadline
- Oct. 1-Oct. 20: Relevance assessments
- Nov. 1: Relevance assessments and results available
Organizers
Qiuyue Wang, Renmin University of China
Georgina Ramírez Camps, Universitat Pompeu Fabra
Maarten Marx, University of Amsterdam
Timm Meiser, Max-Planck-Institut für Informatik
Jaap Kamps, University of Amsterdam