Generative retrieval (GR) has recently become a highly active area of information retrieval (IR). In contrast to the traditional "index-retrieve-then-rank" pipeline, the GR paradigm aims to consolidate all information within a corpus into a single model. Typically, a sequence-to-sequence model is trained to map a query directly to the identifiers of its relevant documents (docids). This tutorial offers an introduction to the core concepts of the GR paradigm and a comprehensive overview of recent advances in its foundations and applications.
We start with preliminaries covering the foundations and problem formulations of GR. We then turn to recent progress in docid design, training approaches, inference strategies, and applications of GR. We end by outlining open challenges and issuing a call for future GR research. The tutorial is intended for both researchers and industry practitioners interested in developing novel GR solutions or applying them in real-world scenarios.
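For readers who prefer notation, the mapping from a query to a docid is commonly written as an autoregressive factorization. The formulation below is a generic sketch, not necessarily the notation used in the tutorial; the symbols $q$ (query), $z = (z_1, \dots, z_T)$ (docid token sequence), $\theta$ (model parameters), and $\mathcal{D}$ (training set of query-docid pairs) are our own assumptions.

$$
P(z \mid q; \theta) = \prod_{t=1}^{T} P(z_t \mid z_{<t}, q; \theta),
\qquad
\mathcal{L}(\theta) = -\sum_{(q, z) \in \mathcal{D}} \log P(z \mid q; \theta)
$$

Under this reading, training minimizes $\mathcal{L}(\theta)$ by maximum likelihood, and retrieval amounts to decoding the highest-scoring docids for a new query.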
Our tutorial is scheduled for November 26, 2023, from 13:00 to 16:30 (GMT+8). Please note that the presentation slides may still be revised. [Slides]
| Time | Section | Presenter |
|---|---|---|
| 13:00–13:10 | Section 1: Introduction | Maarten de Rijke |
| 13:10–13:30 | Section 2: Definition & Preliminaries | Jiafeng Guo |
| 13:30–14:30 | Section 3: Docid designs | Yubao Tang |
| 14:30–14:45 | Coffee break (15 min) | |
| 14:45–15:20 | Section 4: Training approaches | Ruqing Zhang |
| 15:20–15:40 | Section 5: Inference strategies | Ruqing Zhang |
| 15:40–16:00 | Section 6: Applications | Yubao Tang |
| 16:00–16:10 | Section 7: Challenges & Opportunities | Maarten de Rijke |
| 16:10–16:30 | Q & A | All |
The tutorial extensively covers papers highlighted in bold.
Unstructured atomic integers
Naively structured strings
Semantically structured strings
Product quantization strings
Titles
URLs
Pseudo queries
Important terms
Constrained beam search with prefix tree
Constrained greedy search with inverted index
Constrained beam search with FM-index
Aggregation functions
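To make one of the inference strategies listed above concrete, here is a minimal sketch of constrained beam search with a prefix tree. It is an illustration under stated assumptions, not the implementation of any paper covered in the tutorial: the docids are toy digit strings, and `toy_log_prob` is a hypothetical stand-in for the scores a trained sequence-to-sequence decoder would produce. The prefix tree restricts each decoding step to tokens that can still lead to a valid docid.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

EOS = "</s>"  # end-of-docid marker (an assumption of this sketch)


def build_trie(docids: Sequence[str]) -> Dict[str, dict]:
    """Build a prefix tree over docids, treating each character as a token."""
    root: Dict[str, dict] = {}
    for docid in docids:
        node = root
        for tok in list(docid) + [EOS]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: Dict[str, dict], prefix: Tuple[str, ...]) -> List[str]:
    """Return the tokens that may follow `prefix` without leaving the corpus."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []
        node = node[tok]
    return sorted(node.keys())


def constrained_beam_search(
    trie: Dict[str, dict],
    log_prob: Callable[[Tuple[str, ...], str], float],
    beam_size: int = 2,
    max_len: int = 8,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Beam search in which every expansion is restricted by the prefix tree."""
    beams: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    finished: List[Tuple[Tuple[str, ...], float]] = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok in allowed_next(trie, prefix):
                new_score = score + log_prob(prefix, tok)
                if tok == EOS:
                    finished.append((prefix, new_score))  # complete docid
                else:
                    candidates.append((prefix + (tok,), new_score))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]  # keep the best partial docids
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


def toy_log_prob(prefix: Tuple[str, ...], token: str) -> float:
    """Hypothetical scorer standing in for a trained decoder (ignores prefix)."""
    prefs = {"1": 0.6, "2": 0.3}
    return math.log(prefs.get(token, 0.1))


DOCIDS = ["112", "120", "230"]  # toy corpus of docids
trie = build_trie(DOCIDS)
results = constrained_beam_search(trie, toy_log_prob, beam_size=2)
ranked_docids = ["".join(prefix) for prefix, _ in results]
print(ranked_docids)
```

Because every expansion is drawn from `allowed_next`, each returned sequence is guaranteed to be a docid from the corpus. An unconstrained decoder has no such guarantee and may generate identifiers that correspond to no document.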
@inproceedings{tang-2023-recent,
author = {Tang, Yubao and Zhang, Ruqing and Guo, Jiafeng and de Rijke, Maarten},
booktitle = {SIGIR-AP 2023: 1st International ACM SIGIR Conference on Information Retrieval in the Asia Pacific},
month = {November},
publisher = {ACM},
title = {Recent Advances in Generative Information Retrieval},
year = {2023}
}