arxiv:2404.11726

Investigating Gender Bias in Turkish Language Models

Published on Apr 17, 2024

Abstract

Language models are trained mostly on Web data, which often contains social stereotypes and biases that the models can inherit. This has potentially negative consequences, as models can amplify these biases in downstream tasks or applications. However, prior research has primarily focused on the English language, especially in the context of gender bias. In particular, grammatically gender-neutral languages such as Turkish are underexplored despite representing different linguistic properties to language models with possibly different effects on biases. In this paper, we fill this research gap and investigate the significance of gender bias in Turkish language models. We build upon existing bias evaluation frameworks and extend them to the Turkish language by translating existing English tests and creating new ones designed to measure gender bias in the context of Türkiye. Specifically, we also evaluate Turkish language models for their embedded ethnic bias toward Kurdish people. Based on the experimental results, we attribute possible biases to different model characteristics such as the model size, their multilingualism, and the training corpora. We make the Turkish gender bias dataset publicly available.
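
For readers who want a concrete picture of what an embedding-based bias test can look like, here is a minimal WEAT/SEAT-style association sketch. The checkpoint and the Turkish word lists are illustrative assumptions, not the paper's actual test sets or methodology.

```python
# Minimal sketch of a WEAT/SEAT-style association probe with a Turkish encoder.
# The checkpoint and the Turkish word lists below are illustrative assumptions,
# not the paper's actual tests.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "dbmdz/bert-base-turkish-cased"  # assumed example checkpoint (BERTurk)
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states of a word or short phrase."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

def association(target: str, attrs_a: list, attrs_b: list) -> float:
    """Mean cosine similarity of `target` to attribute set A minus set B."""
    cos = torch.nn.functional.cosine_similarity
    t = embed(target)
    sim_a = torch.stack([cos(t, embed(a), dim=0) for a in attrs_a]).mean()
    sim_b = torch.stack([cos(t, embed(b), dim=0) for b in attrs_b]).mean()
    return (sim_a - sim_b).item()

# Illustrative gendered attribute words and occupation targets (assumptions).
male_attrs = ["erkek", "adam", "oğlan"]
female_attrs = ["kadın", "kız", "anne"]
for occupation in ["mühendis", "hemşire", "doktor", "öğretmen"]:
    print(occupation, round(association(occupation, male_attrs, female_attrs), 4))
```

Under this (deliberately simplistic) pooling scheme, a positive score means the occupation embedding sits closer to the male attribute words than to the female ones; the paper's actual evaluation uses established bias test suites adapted to Turkish.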

Community

Hey @orhunc and @malteos,

Really great work on investigating gender bias!

I have some comments :)

Indeed, the uncased variants of the BERTurk models used accent stripping, which is not a good idea. The 32k and 128k vocab variants were trained on the same corpus of roughly 35GB of text.

Additionally, I pretrained models (ELECTRA and ConvBERT architectures) on the Turkish mC4 split with 242GB of training data:

  • dbmdz/electra-base-turkish-mc4-cased-discriminator
  • dbmdz/electra-base-turkish-mc4-uncased-discriminator
  • dbmdz/convbert-base-turkish-mc4-cased
  • dbmdz/convbert-base-turkish-mc4-uncased

For the uncased variants of these models, I avoided accent stripping.

So maybe these models could also be tested/analyzed, because they were trained on much more, and potentially more biased, web data.
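
If one wanted to run the same kind of embedding-association probe over these checkpoints, a minimal loop could look like the sketch below. It reuses the embed()/association() helpers from the earlier snippet, and the word choices are again illustrative assumptions rather than a validated bias measurement.

```python
# Sketch: reusing embed()/association() from the earlier snippet to probe the
# mC4 checkpoints listed above. Scores are illustrative only.
from transformers import AutoModel, AutoTokenizer

MC4_CHECKPOINTS = [
    "dbmdz/electra-base-turkish-mc4-cased-discriminator",
    "dbmdz/electra-base-turkish-mc4-uncased-discriminator",
    "dbmdz/convbert-base-turkish-mc4-cased",
    "dbmdz/convbert-base-turkish-mc4-uncased",
]

for checkpoint in MC4_CHECKPOINTS:
    # Rebind the module-level tokenizer/model that embed() uses,
    # so the same helpers work for each checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    model.eval()
    score = association("mühendis", ["erkek", "adam"], ["kadın", "kız"])
    print(checkpoint, round(score, 4))
```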
