|
--- |
|
license: other |
|
language: |
|
- tr |
|
library_name: transformers |
|
pipeline_tag: text2text-generation |
|
--- |
|
|
|
|
|
# Model Card for TURNA |
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
TURNA is a Turkish language model based on the UL2 framework, making it suitable for both understanding and generation tasks.
|
|
|
Evaluations across three generation and six understanding tasks in Turkish show that TURNA outperforms several multilingual models and competes with monolingual Turkish models in understanding tasks. |
|
|
|
The model is shared with the public to be used solely for non-commercial academic research purposes. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
- **Developed by:** Bogazici University Computer Engineering Department TABILAB (special thanks to VNGRS-AI for sharing their tokenizer) |
|
- **Funded by:** We thank the Google TPU Research Cloud program for providing us with credits to pretrain our model on TPU v3-8 machines. |
|
<!-- - **Shared by [optional]:** [More Information Needed] --> |
|
- **Model type:** Transformer-based encoder-decoder |
|
- **Language(s) (NLP):** Turkish |
|
- **License:** The model is shared with the public to be used solely for non-commercial academic research purposes. |
|
|
|
### Model Sources
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
- **Repository:** [Training code](https://github.com/boun-tabi-LMG/turna), [Finetuning library](https://github.com/boun-tabi-LMG/turkish-lm-tuner) |
|
- **Paper:** [Arxiv paper](https://arxiv.org/abs/) |
|
|
|
## Uses |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
### Direct Use |
|
|
|
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
|
|
This model can be used for research purposes. Given an input text, the model predicts the words that follow it.
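For research use, inference follows the standard `transformers` seq2seq API. The snippet below is a minimal sketch; the Hub model ID and the prompt are illustrative assumptions, so check the repository for the exact model name:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Model ID assumed for illustration; verify the exact Hub name in the repository.
model_id = "boun-tabi-LMG/TURNA"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode a Turkish prompt and let the model generate a continuation.
inputs = tokenizer("Türkiye'nin başkenti", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)
```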
|
|
|
### Downstream Use |
|
|
|
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app --> |
|
|
|
This model can be finetuned using [our library](https://github.com/boun-tabi-LMG/turkish-lm-tuner) to solve your own task involving Turkish language. |
|
|
|
This model can be further trained to be more helpful, less harmful, and better suited to dialog use cases.
|
|
|
### Out-of-Scope Use |
|
|
|
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. --> |
|
|
|
Any commercial or malicious use.
|
|
|
## Bias, Risks, and Limitations |
|
|
|
We refer to Flan-T5's [official model card](https://arxiv.org/pdf/2210.11416.pdf):
|
|
|
> Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application. |
|
|
|
### Ethical considerations and risks |
|
|
|
> ... (ed. The model) is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data. |
|
|
|
### Known Limitations |
|
|
|
> ... (ed. The model) has not been tested in real world applications. |
|
|
|
### Sensitive Use
|
|
|
> ... (ed. The model) should not be applied for any unacceptable use cases, e.g., generation of abusive speech. |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
You can find the technical usage guidance at our library's Github [page](https://github.com/boun-tabi-LMG/turkish-lm-tuner). |
|
|
|
## Training Details |
|
|
|
Refer to the paper for more information. |
|
|
|
## Evaluation |
|
|
|
Refer to the paper for more information. |
|
|
|
## Environmental Impact |
|
|
|
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly --> |
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
- **Hardware Type:** TPU v3-8 |
|
- **Hours used:** Approximately 400 hours
|
- **Cloud Provider:** Google Cloud |
|
- **Compute Region:** europe-west4-a |
|
- **Carbon Emitted:** 64.52 kg CO₂eq
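As a rough sketch of how the calculator arrives at such a figure: emissions ≈ hardware power draw × hours used × the regional grid's carbon intensity. The power-draw and grid-intensity numbers below are illustrative assumptions only, not the values actually used for this card:

```python
def estimate_emissions_kg(hours: float, power_kw: float, grid_kg_co2_per_kwh: float) -> float:
    """Estimate training emissions: energy consumed (kWh) times grid carbon intensity."""
    return hours * power_kw * grid_kg_co2_per_kwh

# Illustrative placeholder inputs: ~400 hours, assuming ~0.45 kW average draw
# and ~0.36 kg CO2eq/kWh for the compute region's grid.
print(round(estimate_emissions_kg(400, 0.45, 0.36), 2))  # prints 64.8
```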
|
|
|
## Technical Specifications |
|
|
|
Refer to the paper for more information. |
|
|
|
## Citation |
|
|
|
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. --> |
|
|
|
**BibTeX:** |
|
|
|
Coming soon! |
|
|
|
**APA:** |
|
|
|
Coming soon! |
|
|
|
## Model Card Authors |
|
|
|
Paper authors. |
|
|
|
## Model Card Contact |
|
|
|
Onur Güngör |
|
|
|
|
|
|