---
widget:
- text: "gelirken bir litre [MASK] aldım."
  example_title: "ürün"
---

# turkish-tiny-bert-uncased

This is a Turkish Tiny uncased BERT model, developed to fill the gap in small-sized BERT models for Turkish. Because the model is uncased, it does not distinguish between `turkish` and `Turkish`.

#### ⚠ Uncased use requires manual lowercase conversion

Due to a [known issue](https://github.com/huggingface/transformers/issues/6680) with the tokenizer, the `do_lower_case = True` flag should **NOT** be used with the tokenizer. Instead, convert your text to lower case yourself before tokenizing:

```python
# Map the dotless capital "I" to "ı" first, then lowercase the rest.
text.replace("I", "ı").lower()
```

Be aware that this model may produce biased predictions: it was trained primarily on crawled data, which can inherently contain various biases.

Other relevant information can be found in the [paper](https://arxiv.org/abs/2307.14134).

```python
from transformers import AutoTokenizer, BertForMaskedLM, pipeline

# Load the masked-language-model weights.
model = BertForMaskedLM.from_pretrained(r"turkish-tiny-bert-uncased")
# or
# model = BertForMaskedLM.from_pretrained(r"turkish-tiny-bert-uncased", from_tf = True)

tokenizer = AutoTokenizer.from_pretrained(r"turkish-tiny-bert-uncased")

# Build a fill-mask pipeline and predict the masked token.
# The input must already be lowercased (see the note above).
unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
unmasker("gelirken bir litre [MASK] aldım.")  # "I bought a liter of [MASK] on the way."

# [{'score': 0.202457457780838,
#   'token': 2417,
#   'token_str': 'su',
#   'sequence': 'gelirken bir litre su aldım.'},
#  {'score': 0.09290537238121033,
#   'token': 11818,
#   'token_str': 'benzin',
#   'sequence': 'gelirken bir litre benzin aldım.'},
#  {'score': 0.07785643637180328,
#   'token': 2026,
#   'token_str': '##den',
#   'sequence': 'gelirken bir litreden aldım.'},
#  {'score': 0.06889808923006058,
#   'token': 2299,
#   'token_str': '##yi',
#   'sequence': 'gelirken bir litreyi aldım.'},
#  {'score': 0.03152570128440857,
#   'token': 2647,
#   'token_str': '##ye',
#   'sequence': 'gelirken bir litreye aldım.'}]
```

# Acknowledgments

- Research supported with Cloud TPUs from [Google's TensorFlow Research Cloud](https://sites.research.google/trc/about/) (TFRC). Thanks for providing access to the TFRC ❤️
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗

# License

MIT
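
# Lowercasing helper (sketch)

The manual lowercase conversion described in the uncased-use note above can be wrapped in a small helper so that every input goes through the same step before it reaches the tokenizer. This is only a minimal sketch and not part of the original model card: the function name `turkish_lower` is hypothetical, and it applies nothing beyond the `replace("I", "ı").lower()` recipe given above.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using the recipe from the uncased-use note.

    Hypothetical helper: the name is an illustration, not part of the
    model or of the transformers library.
    """
    # Python's str.lower() would turn "I" into "i"; Turkish expects the
    # dotless "ı", so that mapping is applied before lowercasing.
    # Note: str.lower() maps "İ" (U+0130) to "i" followed by U+0307.
    # The source does not say whether that needs extra handling.
    return text.replace("I", "ı").lower()


if __name__ == "__main__":
    print(turkish_lower("IŞIK"))     # dotless I handled explicitly
    print(turkish_lower("Turkish"))  # plain ASCII text is simply lowercased
```

The result of `turkish_lower(...)` is what would be passed to the `unmasker` pipeline shown in the usage example, in place of raw mixed-case text.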