In this study, we introduce new large labeled datasets on bias in three languages: Italian, Dutch, and German. Each dataset contains almost 2 million samples and is accompanied by a lexicon of sensitive terms for bias detection in that language. Our experiments reveal bias in all ten evaluated datasets across five languages, including benchmark datasets on the English GLUE/SuperGLUE leaderboards. We use state-of-the-art (SotA) multilingual pretrained models such as mT5 and mBERT to benchmark on these datasets.
The research is motivated by recent events highlighting the prevalence of social bias in AI and large language models (LLMs). We estimate bias in multiple datasets by comparing several bias evaluation methods, including the bipol metric, which offers explainability. We also confirm the assumption that toxic comments contain bias by annotating 200 samples drawn from a toxic dataset, a sample size that corresponds to a 95% confidence level and a 7% margin of error for that population. To ensure annotation quality, we include 30 gold samples with unanimous agreement from the original data. As part of verifying this assumption, we run experiments with the SotA pretrained multilingual models mT5-small and mBERT-base and compare their macro F1 performance.
Our findings indicate that many datasets exhibit male bias (prejudice against women) along with other forms of bias. The literature review covers previous efforts to measure and mitigate bias in languages other than English, which emphasize gender bias as well as biases related to origin and age. Dutch studies in particular commonly focus on binary gender bias.
Overall, this study contributes new labeled datasets, lexica of sensitive terms, models, and code for detecting bias in multiple languages. By shedding light on the presence of bias across datasets and languages, we aim to help address the challenge of social bias in AI systems.
- Introduction of new large labeled datasets on bias in Italian, Dutch, and German
- Bias detected in all evaluated datasets across five languages, including datasets on the English GLUE/SuperGLUE leaderboards
- Use of state-of-the-art multilingual pretrained models such as mT5 and mBERT for benchmarking
- Motivation from recent events highlighting social bias in AI and large language models (LLMs)
- Comparison of several bias evaluation methods, including the bipol metric, which offers explainability
- Confirmation of bias in toxic comments through annotation of 200 samples, at a confidence level of 95% and an error margin of 7%
- Identification of male bias along with other forms of bias in many datasets
- Emphasis on gender bias as well as biases related to origin and age in previous research
- Focus on binary gender bias in Dutch language studies
- Experiments with the SotA pretrained multilingual models mT5-small and mBERT-base, compared by macro F1, to verify the assumption that toxic comments contain bias
- Annotation by multiple annotators, with gold samples included to ensure quality
- Contribution of new labeled datasets, lexica of sensitive terms, models, and code for detecting bias in multiple languages
Summary
- New big labeled datasets were introduced to look for unfairness in Italian, Dutch, and German languages.
- Unfairness was found in all datasets across five languages, including English leaderboards for language tasks.
- Advanced multilingual models like mT5 and mBERT were used to compare and test the datasets.
- Recent events showing social unfairness in AI inspired this work.
- Different methods were compared to find unfairness, such as the bipol metric.
Definitions
- Datasets: Collections of information or data used for research or analysis.
- Bias: Unfair preferences or prejudices that affect decisions or outcomes.
- Multilingual: Capable of understanding or using multiple languages.
- Pretrained models: Models that are already trained on a large dataset before being used for specific tasks.
Introduction
In recent years, there has been growing concern about the presence of bias in artificial intelligence (AI) systems and large language models (LLMs). These systems are trained on vast amounts of data, which can often reflect societal biases and prejudices. As a result, AI systems may perpetuate these biases when making decisions or generating text. To address this issue, researchers have been working to develop methods for detecting and mitigating bias in AI.
In this study, we introduce new large labeled datasets on bias in three languages: Italian, Dutch, and German. Each dataset contains almost 2 million samples and is accompanied by a lexicon of sensitive terms for bias detection in that language. Our goal is to estimate bias in multiple datasets by comparing several evaluation methods, using state-of-the-art multilingual pretrained models such as mT5 and mBERT.
Motivation
The motivation behind our research stems from recent events highlighting the prevalence of social bias in AI and LLMs. For example, studies have shown that facial recognition software can exhibit racial biases due to imbalanced training data. Additionally, natural language processing (NLP) models have been found to generate biased text based on their training data.
To address these issues, it is crucial to understand the extent of bias present in different datasets and languages. Identifying these biases is a first step towards developing fairer and more inclusive AI systems.
Methodology
To conduct our study, we used state-of-the-art multilingual pretrained models such as mT5-small and mBERT-base to benchmark on our newly introduced datasets as well as on existing benchmark datasets from the English GLUE/SuperGLUE leaderboards. We compared several bias evaluation methods, including the bipol metric, which offers explainability.
Additionally, multiple annotators labeled 200 samples from a toxic dataset to check the assumption that toxic comments contain bias. We included 30 gold samples with unanimous agreement from the original data to verify annotation quality.
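The annotation sample reported in this summary (200 samples at a 95% confidence level and a 7% error margin) can be sanity-checked with a standard sample-size calculation. The sketch below assumes Cochran's formula for a large population, a z-score of 1.96 for 95% confidence, and the most conservative proportion p = 0.5. The summary does not state which formula the authors used, so this is an illustration rather than their procedure, and the function name is hypothetical.

```python
import math


def required_sample_size(z: float, margin: float, p: float = 0.5) -> int:
    """Cochran's sample-size formula for a large population (assumed here).

    z      -- z-score for the confidence level (1.96 for 95%)
    margin -- desired margin of error as a fraction (0.07 for 7%)
    p      -- expected proportion; 0.5 gives the largest (safest) sample
    """
    raw = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 6))


if __name__ == "__main__":
    n = required_sample_size(z=1.96, margin=0.07)
    print(f"Minimum samples for 95% confidence, 7% margin: {n}")
```

Under these assumptions the formula gives a minimum of 196 samples, which is consistent with annotating 200.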
Findings
Our experiments revealed that bias exists in all ten datasets of five languages evaluated, including benchmark datasets on the English GLUE/SuperGLUE leaderboards. This highlights the need for further research and efforts to mitigate biases in AI systems.
We also found that many datasets exhibit male bias (prejudice against women) along with other forms of bias. Our literature review highlighted previous efforts to measure and mitigate bias in languages other than English, emphasizing gender biases as well as biases related to origin and age.
For Dutch language studies specifically, binary gender bias is a common focus. To verify our assumption that toxic comments contain bias, we conducted experiments using the SotA pretrained multilingual models mT5-small and mBERT-base and compared their macro F1 performance. Our findings confirmed that toxic comments do contain bias, further emphasizing the importance of addressing this issue.
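Macro F1, the score used to compare mT5-small and mBERT-base, is the unweighted mean of the per-class F1 scores, so a minority class counts as much as the majority class. The following is a minimal sketch of that computation. The label names and the toy predictions are hypothetical and are not taken from the study.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred), key=str)
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Hypothetical gold labels and model predictions for six comments.
    gold = ["biased", "biased", "unbiased", "unbiased", "unbiased", "unbiased"]
    pred = ["biased", "unbiased", "unbiased", "unbiased", "unbiased", "biased"]
    print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

In this toy case the "biased" class scores 0.5 and the "unbiased" class 0.75, so the macro average is 0.625, even though four of the six predictions are correct.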
Conclusion
In conclusion, our study contributes new labeled datasets, lexica of sensitive terms, models, and code for detecting bias in multiple languages. By shedding light on the presence of bias across datasets and languages, we aim to help address the challenge of social bias in AI systems.
Moving forward, it is crucial for researchers and developers to continue working towards developing fairer AI systems by identifying and mitigating biases present in training data. Additionally, more efforts should be made towards creating diverse and inclusive datasets to train these systems on.
Overall, this study highlights the importance of considering biases when developing AI systems and provides valuable resources for future research on this topic. With continued efforts towards understanding and addressing social biases in AI, we can work towards creating a more equitable future for all individuals impacted by these technologies.