Learning to Rank Context for Named Entity Recognition Using a Synthetic Dataset

Authors: Arthur Amalvy (LIA), Vincent Labatut (LIA), Richard Dufour (LS2N - équipe TALN)

The 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023, Singapore, Singapore

Abstract: While recent pre-trained transformer-based models can perform named entity recognition (NER) with great accuracy, their limited range remains an issue when applied to long documents such as whole novels. To alleviate this issue, a solution is to retrieve relevant context at the document level. Unfortunately, the lack of supervision for such a task means one has to settle for unsupervised approaches. Instead, we propose to generate a synthetic context retrieval training dataset using Alpaca, an instruction-tuned large language model (LLM). Using this dataset, we train a neural context retriever based on a BERT model that is able to find relevant context for NER. We show that our method outperforms several retrieval baselines for the NER task on an English literary dataset composed of the first chapter of 40 books.
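
The retrieve-then-tag idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give one, so everything here is an assumption. In particular, the function names (`overlap_score`, `retrieve_context`, `build_ner_input`) are hypothetical, and the token-overlap scorer is only a stand-in for the trained BERT-based retriever, which would plug in through the `scorer` parameter.

```python
from typing import Callable, List


def overlap_score(query: str, candidate: str) -> float:
    """Stand-in relevance scorer: Jaccard overlap of lowercase tokens.

    ASSUMPTION: this replaces the paper's neural (BERT-based) retriever
    purely for illustration; it is not the method from the paper.
    """
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def retrieve_context(
    sentence: str,
    document: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
    k: int = 1,
) -> List[str]:
    """Rank the other sentences of the document and return the top k."""
    candidates = [s for s in document if s != sentence]
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted(candidates, key=lambda s: scorer(sentence, s), reverse=True)
    return ranked[:k]


def build_ner_input(sentence: str, context: List[str]) -> str:
    """Prepend the retrieved context to the sentence before NER tagging."""
    return " ".join(context + [sentence])


if __name__ == "__main__":
    document = [
        "Gandalf was a wizard of great renown.",
        "It rained all morning.",
        "Gandalf knocked on the door.",
    ]
    sentence = document[2]
    context = retrieve_context(sentence, document, k=1)
    print(build_ner_input(sentence, context))
```

Passing the scorer as a parameter keeps the ranking step independent of how relevance is computed, so a learned model could replace the overlap heuristic without changing the rest of the pipeline.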

Submitted to arXiv on 16 Oct. 2023

arXiv identifier: 2310.10118v1
