An Effective System for Multi-format Information Extraction

AI-generated keywords: Information Extraction, Multiple Slots, Event Extraction, Named Entity Recognition, Multi-Task Learning

AI-generated Key Points

  • System for LIC-2021 multi-format Information Extraction (IE) task
  • Evaluation of information extraction from multiple dimensions
  • Multiple slots relation extraction
  • Event extraction at sentence-level and document-level
  • Methods employed to address challenges in the competition
  • Schema disintegration method for relation extraction subtask
  • Voting-based method for maximizing model utilization
  • Conversion of sentence-level event extraction into Named Entity Recognition (NER) task
  • Pointer labeling based approach for efficient event extraction
  • Auxiliary trigger recognition model for aiding event extraction
  • Integration of trigger features using multi-task learning mechanism
  • Encoder-Decoder based method with Transformer-alike decoder architecture for document-level event extraction subtask
  • Achieved results and rankings on test set leaderboard:
      • Relation extraction: F1 score of 79.887%
      • Sentence-level event extraction: F1 score of 85.179%
      • Document-level event extraction: F1 score of 70.828%
  • Room for improvement in the system:
      • Unannotated triples negatively impacting performance in relation extraction
      • Challenges in processing long text in the document-level event extraction subtask
      • Correctly extracting two arguments of one event when they are far apart in a sentence or document requires further study
  • Funding support from various sources.
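
The voting-based method named in the key points can be illustrated with a minimal sketch. It assumes each model outputs a set of (subject, predicate, object) triples and that a triple is kept when at least `min_votes` models predict it. The function name, the threshold, and the sample triples are hypothetical, and the system's actual voting rule may differ.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def vote(predictions: Iterable[Set[Triple]], min_votes: int) -> Set[Triple]:
    """Keep every triple predicted by at least `min_votes` models."""
    counts: Counter = Counter()
    for model_output in predictions:
        # Each model contributes at most one vote per triple.
        counts.update(set(model_output))
    return {triple for triple, n in counts.items() if n >= min_votes}


# Hypothetical outputs of three relation extraction models for one sentence.
model_a = {("Film A", "director", "Person X"), ("Film A", "actor", "Person Y")}
model_b = {("Film A", "director", "Person X")}
model_c = {("Film A", "director", "Person X"), ("Film A", "actor", "Person Z")}

result = vote([model_a, model_b, model_c], min_votes=2)
```

Raising `min_votes` favors precision, while lowering it favors recall; the right threshold would normally be tuned on a development set.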

Authors: Yaduo Liu, Longhui Zhang, Shujuan Yin, Xiaofeng Zhao, Feiliang Ren

NLPCC-Evaluation 2021
License: CC BY-NC-SA 4.0

Abstract: The multi-format information extraction task in the 2021 Language and Intelligence Challenge is designed to comprehensively evaluate information extraction from different dimensions. It consists of a multiple-slots relation extraction subtask and two event extraction subtasks that extract events at the sentence level and the document level. Here we describe our system for this multi-format information extraction competition task. Specifically, for the relation extraction subtask, we convert it to a traditional triple extraction task and design a voting-based method that makes full use of existing models. For the sentence-level event extraction subtask, we convert it to an NER task and use a pointer-labeling-based method for extraction. Furthermore, considering that the annotated trigger information may be helpful for event extraction, we design an auxiliary trigger recognition model and use a multi-task learning mechanism to integrate the trigger features into the event extraction model. For the document-level event extraction subtask, we design an Encoder-Decoder based method and propose a Transformer-alike decoder. Finally, our system ranks No. 4 on the test set leaderboard of this multi-format information extraction task, and its F1 scores for the relation extraction, sentence-level event extraction, and document-level event extraction subtasks are 79.887%, 85.179%, and 70.828%, respectively. The code of our model is available at https://github.com/neukg/MultiIE.
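
To make the pointer labeling idea in the abstract concrete, here is a minimal decoding sketch. It assumes the model emits, for one label type, a start probability and an end probability per token, and that each predicted start is paired with the nearest predicted end at or after it. The function name, the 0.5 threshold, and the sample probabilities are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,
) -> List[Tuple[int, int]]:
    """Pair each start pointer with the nearest end pointer at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        # `ends` is in ascending order, so the first match is the nearest one.
        for e in ends:
            if e >= s:
                spans.append((s, e))  # inclusive token indices
                break
    return spans


# Hypothetical per-token probabilities for one argument role over six tokens.
start_probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.1]
end_probs = [0.1, 0.2, 0.7, 0.1, 0.9, 0.1]

spans = decode_spans(start_probs, end_probs)
```

In a full system there would be one such pair of pointer sequences per label type, so that overlapping spans with different labels can be extracted from the same sentence.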

Submitted to arXiv on 16 Aug. 2021

Results of the summarizing process for the arXiv paper: 2108.06957v1

This paper presents our system for the LIC-2021 multi-format Information Extraction (IE) task. The task evaluates information extraction along several dimensions: multiple-slots relation extraction, and event extraction at both the sentence level and the document level. We use a different method for each subtask. For the relation extraction subtask, we handle the multiple-O-values schema with a schema disintegration method, which converts the subtask into a traditional triple extraction task. We also design a voting-based method that makes full use of existing models. For the sentence-level event extraction subtask, we convert the problem into a Named Entity Recognition (NER) task and use a pointer-labeling-based approach for extraction. Because annotated trigger information can help event extraction, we also develop an auxiliary trigger recognition model and integrate its trigger features into the event extraction model through a multi-task learning mechanism. For the document-level event extraction subtask, we propose an Encoder-Decoder based method with a Transformer-alike decoder. Our system ranks No. 4 on the test set leaderboard of this multi-format IE task, with F1 scores of 79.887% for relation extraction, 85.179% for sentence-level event extraction, and 70.828% for document-level event extraction.
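
The schema disintegration step described above can be sketched roughly as follows. The sketch assumes a multiple-O-values relation is stored as one subject, one predicate, and an object with several named slots, and that each filled slot becomes its own flat triple. The record layout, the `predicate|slot` naming scheme, and the sample data are illustrative assumptions, not the paper's actual format.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def disintegrate(record: Dict) -> List[Triple]:
    """Split one multi-slot relation into one plain triple per filled slot."""
    subject = record["subject"]
    predicate = record["predicate"]
    triples = []
    for slot, value in record["object"].items():
        # Hypothetical naming scheme: "<predicate>|<slot>" is the flat relation label.
        triples.append((subject, f"{predicate}|{slot}", value))
    return triples


def reassemble(triples: List[Triple]) -> List[Dict]:
    """Inverse step: group flat triples back into multi-slot records."""
    grouped: Dict[Tuple[str, str], Dict[str, str]] = {}
    for subject, flat_predicate, value in triples:
        predicate, slot = flat_predicate.split("|", 1)
        grouped.setdefault((subject, predicate), {})[slot] = value
    return [
        {"subject": s, "predicate": p, "object": slots}
        for (s, p), slots in grouped.items()
    ]


# Hypothetical multi-slot record: a box office figure qualified by a region.
record = {
    "subject": "Film A",
    "predicate": "box_office",
    "object": {"@value": "100 million", "inArea": "China"},
}

flat = disintegrate(record)
restored = reassemble(flat)
```

Note that this naive regrouping merges all slots that share a subject and predicate, so it cannot separate two distinct records for the same pair; a real system would need an additional rule for that case.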
However, there is still room for improvement in our system. Many triples are not annotated, which hurts relation extraction performance. Processing long text remains challenging in the document-level event extraction subtask. Correctly extracting two arguments of one event when they are far apart in a sentence or document also requires further study. In conclusion, our system addresses the main challenges of the LIC-2021 multi-format IE task and achieves competitive performance, with the issues above left for future research.

This work is supported by the National Key R&D Program of China (No. 2018YFC0830701), the National Natural Science Foundation of China (No. 61572120), the Fundamental Research Funds for the Central Universities (No. N181602013 and N171602003), the Ten Thousand Talent Program (No. ZX20200035), and Liaoning Distinguished Professor (No. XLYC1902057).
Created on 26 Aug. 2023
