Context Aware Query Rewriting for Text Rankers using LLM

AI-generated keywords: Document Ranking, Query Rewriting, Context-Aware Prompting, Large-Language Models (LLMs), Natural Language Question Generation

AI-generated Key Points

  • Comprehensive framework for leveraging large-language models (LLMs) to improve document ranking through query rewriting
  • Proposed approach called context-aware query rewriting (CAR)
  • CAR improves passage ranking by up to 33% and document ranking by up to 28% over a ranker fine-tuned on the original queries
  • Importance of considering surrounding context in paraphrasing
  • Challenges associated with developing the CAR framework
  • Suggestion of a more principled approach for identifying and filtering ambiguous queries
  • Mention of other related approaches in query rewriting, such as statistical methods and generative models
  • Discussion on recent advancements in natural language question generation for query reformulation
  • Relevance feedback methods used in e-commerce domains mentioned
  • Experimental results demonstrate effectiveness of the CAR framework in improving retrieval performance
  • Potential of LLMs for improved document ranking highlighted

Authors: Abhijit Anand, Venktesh V, Vinay Setty, Avishek Anand

License: CC BY-SA 4.0

Abstract: Query rewriting refers to an established family of approaches that are applied to underspecified and ambiguous queries to overcome the vocabulary mismatch problem in document ranking. Queries are typically rewritten during query processing time for better query modelling for the downstream ranker. With the advent of large-language models (LLMs), there have been initial investigations into using generative approaches to generate pseudo documents to tackle this inherent vocabulary gap. In this work, we analyze the utility of LLMs for improved query rewriting for text ranking tasks. We find that there are two inherent limitations of using LLMs as query re-writers -- concept drift when using only queries as prompts and large inference costs during query processing. We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR) to leverage the benefits of LLMs for query understanding. Firstly, we rewrite ambiguous training queries by context-aware prompting of LLMs, where we use only relevant documents as context. Unlike existing approaches, we use LLM-based query rewriting only during the training phase. Eventually, a ranker is fine-tuned on the rewritten queries instead of the original queries during training. In our extensive experiments, we find that fine-tuning a ranker using re-written queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task when compared to the baseline performance of using original queries.
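
The training-time rewriting step described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `TrainingExample`, `build_context_aware_prompt`, `rewrite_training_queries`, the prompt wording, and the `fake_llm` stand-in are hypothetical and are not taken from the paper. The sketch also assumes that the ambiguous queries have already been selected upstream and that `llm` is any callable mapping a prompt string to generated text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TrainingExample:
    query: str          # original (possibly ambiguous) training query
    relevant_doc: str   # a document judged relevant to that query


def build_context_aware_prompt(query: str, relevant_doc: str,
                               max_doc_chars: int = 1000) -> str:
    """Show the LLM the query together with a relevant document.

    The wording is illustrative only; it is not the prompt used in the paper.
    """
    context = relevant_doc[:max_doc_chars]  # keep the prompt bounded
    return (
        "Rewrite the query as a clear natural-language question, "
        "using the document as context.\n"
        f"Document: {context}\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )


def rewrite_training_queries(examples: Iterable[TrainingExample],
                             llm: Callable[[str], str]) -> List[TrainingExample]:
    """Replace each training query with its LLM rewrite; documents are unchanged.

    Falls back to the original query if the LLM returns only whitespace.
    """
    rewritten = []
    for ex in examples:
        prompt = build_context_aware_prompt(ex.query, ex.relevant_doc)
        new_query = llm(prompt).strip()
        rewritten.append(TrainingExample(new_query or ex.query, ex.relevant_doc))
    return rewritten


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call so the sketch runs offline."""
    query = prompt.rsplit("Query: ", 1)[1].split("\n", 1)[0]
    return f"What is {query}?"


if __name__ == "__main__":
    data = [TrainingExample("python gil",
                            "The global interpreter lock (GIL) in CPython ...")]
    for ex in rewrite_training_queries(data, fake_llm):
        print(ex.query)
```

The rewritten examples would then replace the originals when fine-tuning the ranker. Because the LLM is called only while preparing training data, no LLM inference cost is incurred at query processing time.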

Submitted to arXiv on 31 Aug. 2023

Results of the summarizing process for the arXiv paper: 2308.16753v1

This work presents a framework for leveraging large-language models (LLMs) to improve document ranking through query rewriting. The proposed approach, context-aware query rewriting (CAR), addresses two limitations of using LLMs as query re-writers: concept drift when only the query is used as the prompt, and large inference costs at query processing time. CAR prompts the LLM with relevant documents as context and applies rewriting only during training. A ranker fine-tuned on the rewritten queries improves by up to 33% on passage ranking and up to 28% on document ranking over the baseline that uses the original queries. The authors also discuss the importance of considering surrounding context in paraphrasing and highlight challenges associated with developing their framework. They suggest a more principled approach for identifying and filtering ambiguous queries and mention related approaches to query rewriting, such as statistical methods and generative models. Additionally, they discuss recent advancements in natural language question generation for query reformulation and relevance feedback methods used in e-commerce domains. Overall, this work highlights the potential of LLMs for improved document ranking.
Created on 15 Jan. 2024
