Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models
AI-generated Key Points
- The study focuses on Pretrained Language Models (PLMs) and their ability to capture cross-lingual word sense knowledge.
- PLMs can be finetuned for tasks such as translation and multilingual word sense disambiguation (WSD), but they often struggle to disambiguate word senses in a zero-shot setting.
- The authors introduce Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context, to address this issue.
- As model size increases, PLMs encode more cross-lingual word sense knowledge and make better use of context to improve WLT performance.
- Building on C-WLT, the authors propose a zero-shot approach for WSD and test it on 18 languages from the XL-WSD dataset. The method outperforms fully supervised baselines on recall for many evaluation languages without additional training or finetuning.
- The study compares its approach with prior work on multilingual WSD using automatic metrics such as recall and the Jaccard index. Ensembling English, Chinese, and Russian as target languages with English prompts achieved a balance between recall and the Jaccard index.
- Although it runs zero-shot from a pretrained language model, the method achieves higher recall than prior work in 11 of the 18 source languages, showing that translation-based approaches can identify correct sense labels as well as or better than supervised methods.
- Limitations of the approach include its reliance on the availability of high-quality translations and the fact that it does not consider polysemous words with multiple senses within one language.
- Future research directions are suggested to address these limitations and improve applicability.
- Overall, this study presents a first step towards leveraging cross-lingual knowledge inside PLMs for robust zero-shot reasoning in any language.
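The evaluation described in the key points (a set of predicted senses scored by recall and Jaccard index, with several target languages ensembled) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy sense inventory, the function names, the choice to ensemble by taking the union of the senses suggested by each target language, and the per-instance definition of recall as "at least one gold sense is predicted". The summary does not specify these details.

```python
from typing import Dict, Set

# Hypothetical toy inventory for the Spanish word "banco":
# sense id -> target language -> translations that express that sense.
INVENTORY: Dict[str, Dict[str, Set[str]]] = {
    "banco.finance": {"en": {"bank"}, "zh": {"银行"}, "ru": {"банк"}},
    "banco.seat": {"en": {"bench"}, "zh": {"长椅"}, "ru": {"скамья"}},
}


def senses_for_translation(lang: str, translation: str) -> Set[str]:
    """Return the senses whose translations in `lang` include `translation`."""
    return {
        sense
        for sense, by_lang in INVENTORY.items()
        if translation in by_lang.get(lang, set())
    }


def ensemble_senses(translations: Dict[str, str]) -> Set[str]:
    """Combine target languages by union (an assumed ensembling rule)."""
    predicted: Set[str] = set()
    for lang, translation in translations.items():
        predicted |= senses_for_translation(lang, translation)
    return predicted


def recall_hit(predicted: Set[str], gold: Set[str]) -> bool:
    """True if at least one gold sense is among the predicted senses."""
    return bool(predicted & gold)


def jaccard(predicted: Set[str], gold: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = predicted | gold
    return len(predicted & gold) / len(union) if union else 0.0


# Hypothetical model outputs: the Russian translation points at the wrong sense.
model_outputs = {"en": "bank", "zh": "银行", "ru": "скамья"}
predicted = ensemble_senses(model_outputs)
gold = {"banco.finance"}
print(recall_hit(predicted, gold), jaccard(predicted, gold))
```

In this toy case the extra target language keeps the gold sense in the predicted set but also adds a wrong sense, which lowers the Jaccard index. That is the kind of trade-off between recall and Jaccard index that the key points mention.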
Authors: Haoqiang Kang, Terra Blevins, Luke Zettlemoyer
Abstract: Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating word sense in a zero-shot setting. To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context. We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance. Building on C-WLT, we introduce a zero-shot approach for WSD, tested on 18 languages from the XL-WSD dataset. Our method outperforms fully supervised baselines on recall for many evaluation languages without additional training or finetuning. This study presents a first step towards understanding how to best leverage the cross-lingual knowledge inside PLMs for robust zero-shot reasoning in any language.
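As a rough illustration of what "prompting the model to translate a given word in context" might look like, here is a minimal sketch of a prompt builder. The template wording, the function name `build_cwlt_prompt`, and the example sentence are hypothetical and are not taken from the paper; the actual prompts used by the authors may differ.

```python
def build_cwlt_prompt(context: str, word: str, source_lang: str, target_lang: str) -> str:
    """Build a prompt asking for the translation of `word` as used in `context`.

    The template below is an assumed example, not the paper's prompt.
    The prompt ends with an open quote so a language model can complete
    it with the translated word.
    """
    if word not in context:
        raise ValueError("the word to translate should appear in the context")
    return (
        f'In the {source_lang} sentence "{context}", '
        f'the word "{word}" is translated into {target_lang} as "'
    )


prompt = build_cwlt_prompt(
    "Me senté en el banco del parque.", "banco", "Spanish", "English"
)
print(prompt)
```

Including the full sentence is what distinguishes this from plain word-level translation: without the context, the model has no signal for choosing between translations such as "bank" and "bench".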