Traditional search engines identify whole documents that are relevant to a user's information need; the task of locating the relevant information within the document is left to the user. Next generation search engines will perform both tasks; they will identify relevant parts of relevant documents. A search engine that performs such a task is referred to as focused and the discipline is known as Focused Retrieval. The main goal of INEX is to promote the evaluation of focused retrieval by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results.
Focused Retrieval takes many forms: Passage Retrieval from long documents, Element Retrieval from XML documents, Page Retrieval from books, and Question Answering. Conducting experiments to measure the retrieval performance of a search engine (focused or otherwise) is time consuming and expensive - so much so that it is impractical for any one institute to perform the task on its own. Worse, performance depends not only on the search engine itself but also on the document collection, the search requests, and the searcher's judgment of the relevance of results, necessitating the comparative evaluation of different search engines under exactly the same conditions. In order to measure performance gains over time (and across institutes) it is necessary to standardize the performance evaluation to the extent that the results are scientifically reproducible. That is, standard test collections, metrics, and methodology are essential for progress.
The standardized testing of search engines is performed by Information Retrieval Evaluation Forums. The main four forums are TREC, CLEF, NTCIR, and INEX. INEX is unique because, unlike the others, it provides the means to evaluate focused retrieval search engines.
What INEX Provides
An IR test collection (consisting of a set of documents, a set of information needs (queries), and the answers to those information needs) is needed in order to measure the performance of a search engine. For comparative experiments across the research community, the community must agree on which test collections should and should not be used and under what circumstances. These test collections must be available to the community, distributed to it, and kept up-to-date.
In 2002 INEX licensed a collection of IEEE articles for use in XML element retrieval experiments. In 2005 this collection was expanded with more IEEE articles. In 2006 the IEEE collection was complemented with an XML dump of Wikipedia, which was itself updated in 2009. The Lonely Planet Guide has also been used, and since 2007 a collection of scanned books (licensed from Microsoft) has also been made available for book retrieval experiments. It is important to note that INEX does not own the document sets; it only standardizes and distributes them.
Each year INEX subscribers (participants) provide sample queries (called topics) they believe are suitable for experimental purposes. These are collected, verified, and de-duplicated by INEX before being distributed back to the participants as a new set of topics. Participants then run the topics through their search engines and submit their results back to INEX. From the submitted result sets, a set of documents is chosen for evaluation using a technique known as pooling. These documents are then distributed back to the original authors of the topics, who judge which are relevant and which are not for each topic. In this way the relevance of a document to a query is not known before the participant submits their runs, and no one person is responsible for creating the set of topics or for deciding which documents are relevant to which topics.
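The pooling step can be sketched as follows. This is an illustrative sketch only: the run format, the pool depth, and the function names are assumptions for this example, not INEX's actual implementation.

```python
# Illustrative sketch of pooling: the top-k results from every submitted
# run for a topic are merged into a single deduplicated pool of documents
# to be judged. Run format and pool depth are assumptions, not INEX's own.

def build_pool(runs, depth=100):
    """runs: dict mapping a run name to its ranked list of document ids.
    Returns the set of documents to be judged for one topic."""
    pool = set()
    for ranked_docs in runs.values():
        pool.update(ranked_docs[:depth])  # take the top `depth` from each run
    return pool

runs = {
    "run_A": ["d1", "d2", "d3"],
    "run_B": ["d2", "d4", "d5"],
}
print(sorted(build_pool(runs, depth=2)))  # ['d1', 'd2', 'd4']
```

Because only pooled documents are judged, the cost of assessment grows with the pool depth and the number of runs, not with the size of the whole collection.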
Of note is that INEX neither creates the topic set nor makes the judgments. INEX facilitates the process by allowing research groups interested in Focused Retrieval to identify themselves, and provides a centralized facility to make the community collaboration possible.
Finally, the participants' results lists are scored using community-agreed metrics and the results are published. Again, INEX does not own the metrics, the software that generates the scores, or the results of the experiments. INEX does, however, make sure the same software implementation of the same metric is applied in the same way to all results lists so that the results can be compared fairly.
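As a hedged illustration of what scoring a run against relevance judgments looks like, here is average precision, a standard document-level IR metric. INEX's own focused-retrieval metrics are more elaborate; this sketch only shows the general shape of metric-based evaluation.

```python
# Sketch of scoring one run for one topic with average precision.
# The metric shown is generic; INEX uses its own focused-retrieval metrics.

def average_precision(ranked, relevant):
    """ranked: list of document ids in rank order from one run.
    relevant: set of ids judged relevant for the topic.
    Returns the average of precision values at each relevant rank."""
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant) if relevant else 0.0

# d1 is relevant at rank 1 (precision 1/1), d3 at rank 3 (precision 2/3):
print(average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}))  # 0.8333...
```

Using one shared implementation of such a function for every submitted run is what makes the published scores directly comparable.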
Workshop and Publications
Each December INEX participants gather for a workshop. There they present details of their retrieval approaches and how successful they were. This workshop is a properly chaired scientific meeting with formal paper presentations as well as informal get-togethers. As INEX does not control the evaluation process (it facilitates it), discussions on the document sets, the methodology, the metrics, new experiments, and so on also occur at the workshop. Finally, decisions as to how INEX will function in the following year are made and presented back to workshop participants during the last workshop session.
Two sets of workshop proceedings are published. The first is the pre-proceedings, a collection of the draft papers submitted by participants for presentation at the workshop. The second is a collection of peer-reviewed papers submitted by participants after the workshop, once they have had time to re-evaluate their work in light of it. These peer-reviewed papers have recently been published by Springer in the Lecture Notes in Computer Science series.
Participation in INEX
There are no restrictions on participation in INEX. It is, however, expected that participants will submit topics, submit the results from their search engines, and assess topics. If participants are not prepared to participate fully, then INEX cannot provide the necessary services for the continuation of the community-supported experiments.