Faster BERT-based re-ranking through Candidate Passage Extraction

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

Abstract

Most modern information retrieval systems take a multi-step approach to retrieving documents relevant to a query: they first retrieve a set of candidate documents and then re-rank those candidates. The most effective re-ranking methods use a transformer-based classifier to score documents. Since many documents exceed the input length of transformers, they are split into passages, each passage is classified independently, and the passage scores are aggregated into an overall document score. As transformers are slow due to their quadratic attention mechanism, we investigate whether passing only the most promising passages of each document to the classifier can reduce inference time on longer documents while maintaining comparable effectiveness. We explore three methods of passage extraction and find that these approaches are effective, performing comparably to the state of the art while significantly reducing run-time, with the best results coming from a graph-based passage-ranking algorithm.
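The abstract does not specify the three extraction methods, so the following is only a minimal sketch of the general idea under stated assumptions. It uses a TextRank-style, query-biased PageRank over a bag-of-words passage similarity graph as a stand-in for a graph-based passage ranker. The function names, window sizes, damping factor, the `classifier` callable, and the use of max aggregation are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def split_into_passages(text, size=50, stride=25):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    passages = []
    for start in range(0, len(words), stride):
        passages.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return passages


def bag_of_words(text):
    """Lowercased term counts; a deliberately simple text representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_passages(passages, query, damping=0.85, iterations=50):
    """Assumed stand-in: query-biased PageRank on a passage similarity graph."""
    n = len(passages)
    if n == 0:
        return []
    vectors = [bag_of_words(p) for p in passages]
    query_vector = bag_of_words(query)
    # Edge weight = similarity between two distinct passages.
    weights = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    # Teleport distribution favours passages that resemble the query.
    bias = [cosine(v, query_vector) for v in vectors]
    total = sum(bias)
    bias = [b / total for b in bias] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(scores[j] * weights[j][i] / out_sums[j]
                            for j in range(n) if out_sums[j])
            for i in range(n)
        ]
    return scores


def extract_candidates(document, query, k=2):
    """Keep the k highest-ranked passages, returned in document order."""
    passages = split_into_passages(document)
    scores = rank_passages(passages, query)
    top = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)[:k]
    return [passages[i] for i in sorted(top)]


def score_document(document, query, classifier, k=2):
    """Run the (expensive) classifier on k passages only; aggregate by max."""
    candidates = extract_candidates(document, query, k)
    return max((classifier(query, p) for p in candidates), default=0.0)
```

Here `classifier` stands for any callable that maps a `(query, passage)` pair to a relevance score, for example a wrapper around a fine-tuned BERT model. The saving comes from calling it on `k` passages per document instead of on every passage, while the cheap ranking step runs over all passages.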
Original language: English
Title of host publication: The Twenty-Ninth Text REtrieval Conference (TREC 2020)
Pages: 1-5
Number of pages: 5
Publication status: Published - 31 Dec 2020
