KLUE: Korean Language Understanding Evaluation

AI-generated keywords: KLUE benchmark NLU tasks NER task pretrained language models annotation protocols

AI-generated Key Points

  • The paper introduces the Korean Language Understanding Evaluation (KLUE) benchmark, consisting of 8 NLU tasks in Korean.
  • The tasks include Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition (NER), Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking.
  • The authors built these tasks from scratch using diverse source corpora while respecting copyrights.
  • Annotation protocols were designed with ethical considerations in mind.
  • Suitable evaluation metrics and fine-tuning recipes for pretrained language models are provided for each task.
  • Two pretrained language models (KLUE-BERT and KLUE-RoBERTa) are released to reproduce baseline models on KLUE and facilitate future research.
  • Preliminary experiments show that KLUE-RoBERTa-large outperforms other baselines and existing open-source Korean PLMs.
  • Performance degrades only minimally when personally identifiable information in the pretraining corpus is replaced, suggesting that privacy and NLU capability are not at odds with each other.
  • BPE tokenization combined with morpheme-level pre-tokenization is effective in tasks involving morpheme-level tagging, detection, and generation.
  • Comprehensive documentation on creating KLUE is provided to accelerate Korean NLP research and facilitate similar resources for other languages in the future.
  • Section 2 discusses source corpora selection criteria; Section 3 presents detailed information about each task; Section 4 focuses on the Named Entity Recognition (NER) task.
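
The key point about BPE with morpheme-level pre-tokenization can be illustrated with a toy sketch. This is not the authors' tokenizer: it assumes the text has already been split into morphemes by some analyzer (represented here as whitespace-separated units), and the helper names `learn_bpe`, `merge_pair`, `apply_bpe`, and `tokenize` are hypothetical. The point it shows is that BPE merges are learned and applied inside each morpheme, so no subword token ever crosses a morpheme boundary.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Merge every adjacent occurrence of `pair` in a tuple of symbols."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(morphemes, num_merges):
    """Learn up to `num_merges` BPE merges from a list of morphemes.

    Assumption: `morphemes` is the output of a morphological analyzer,
    so pair counts never span two different morphemes.
    """
    vocab = Counter(tuple(m) for m in morphemes)  # start from characters
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(morpheme, merges):
    """Segment one morpheme by replaying the learned merges in order."""
    symbols = tuple(morpheme)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def tokenize(morpheme_segmented_text, merges):
    """Tokenize text whose morphemes are separated by whitespace."""
    tokens = []
    for morpheme in morpheme_segmented_text.split():
        tokens.extend(apply_bpe(morpheme, merges))
    return tokens


# Toy corpus of pre-segmented morphemes (illustrative only).
corpus = ["학교", "에", "갔", "다", "학교", "에서"]
merges = learn_bpe(corpus, num_merges=2)
tokens = tokenize("학교 에서 갔 다", merges)
```

Because segmentation happens per morpheme, a tagging model can map each subword token back to exactly one morpheme, which is one plausible reason this combination helps morpheme-level tasks.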

Authors: Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jungwoo Ha, Kyunghyun Cho

76 pages, 10 figures, 36 tables
License: CC BY-SA 4.0

Abstract: We introduce the Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. We build all of the tasks from scratch from diverse source corpora while respecting copyrights, to ensure accessibility for anyone without any restrictions. With ethical considerations in mind, we carefully design annotation protocols. Along with the benchmark tasks and data, we provide suitable evaluation metrics and fine-tuning recipes for pretrained language models for each task. We furthermore release the pretrained language models (PLMs), KLUE-BERT and KLUE-RoBERTa, to help reproduce baseline models on KLUE and thereby facilitate future research. We make a few interesting observations from the preliminary experiments using the proposed KLUE benchmark suite, already demonstrating the usefulness of this new benchmark suite. First, we find KLUE-RoBERTa-large outperforms other baselines, including multilingual PLMs and existing open-source Korean PLMs. Second, we see minimal degradation in performance even when we replace personally identifiable information in the pretraining corpus, suggesting that privacy and NLU capability are not at odds with each other. Lastly, we find that using BPE tokenization in combination with morpheme-level pre-tokenization is effective in tasks involving morpheme-level tagging, detection, and generation. In addition to accelerating Korean NLP research, our comprehensive documentation on creating KLUE will facilitate creating similar resources for other languages in the future. KLUE is available at https://klue-benchmark.com/.

Submitted to arXiv on 20 May 2021


Results of the summarizing process for the arXiv paper: 2105.09680v1

The paper introduces the Korean Language Understanding Evaluation (KLUE) benchmark, which consists of 8 natural language understanding (NLU) tasks in Korean. These tasks are Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition (NER), Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. The authors built these tasks from scratch using diverse source corpora while respecting copyrights, to ensure accessibility for anyone without restrictions. They also designed annotation protocols with ethical considerations in mind. In addition to the benchmark tasks and data, the authors provide suitable evaluation metrics and fine-tuning recipes for pretrained language models for each task. They also release two pretrained language models (PLMs), KLUE-BERT and KLUE-RoBERTa, to help reproduce baseline models on KLUE and facilitate future research. Preliminary experiments using the proposed KLUE benchmark suite yield three observations. First, KLUE-RoBERTa-large outperforms other baselines, including multilingual PLMs and existing open-source Korean PLMs. Second, performance degrades only minimally when personally identifiable information in the pretraining corpus is replaced, suggesting that privacy and NLU capability are not at odds with each other. Lastly, using BPE tokenization in combination with morpheme-level pre-tokenization is effective in tasks involving morpheme-level tagging, detection, and generation. The paper also provides comprehensive documentation on creating KLUE, to accelerate Korean NLP research and facilitate the creation of similar resources for other languages in the future.
Section 2 discusses source corpora selection criteria and provides details about the selected corpora; Section 3 presents detailed information about each task in the KLUE benchmark suite; Section 4 focuses on the Named Entity Recognition (NER) task, including dataset construction, evaluation metrics, related work, and conclusions. The authors use two corpora, WIKITREE and NSMC, to incorporate both formal and informal writing styles in the NER task. WIKITREE is a news article corpus suitable for NER because its formal sentences contain many entity types, while NSMC consists of colloquial reviews of movies and TV shows, providing a noisy dataset that broadens the application field of NER models.
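
Since the summary mentions evaluation metrics for the NER task without spelling them out, here is a minimal sketch of one common choice, an entity-level F1 computed from BIO tag sequences. It is an illustration under stated assumptions, not the paper's official scorer: it micro-averages over all entities, which may differ from the exact averaging used in KLUE, and the function names and the tag labels `PS` and `LC` in the example are placeholders chosen for illustration.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO sequence.

    `end` is exclusive. An I- tag that does not continue an open entity of
    the same type is ignored (a simplifying assumption of this sketch).
    """
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged entity-level precision, recall, and F1.

    A predicted entity counts as correct only if its boundaries and its
    type both match a gold entity exactly.
    """
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_spans(gold), extract_spans(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# One sentence: the prediction finds the first entity but misses the second.
gold = [["B-PS", "I-PS", "O", "B-LC"]]
pred = [["B-PS", "I-PS", "O", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
```

Exact-match scoring of this kind is strict on noisy, colloquial text such as reviews, where boundary errors are more likely than in news articles.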
Created on 13 Sep. 2023
