Model Card for Model ID
Model Description
Given two Turkish words, the model predicts whether or not they share an affix. It is fine-tuned from dbmdz/bert-base-turkish-cased on a task similar to NLI, but at the word level and with two labels. It was created as a final project for one of my classes.
- Developed by: Scoup123
- Model type: BERT
- Language(s) (NLP): Turkish
- Finetuned from model [optional]: dbmdz/bert-base-turkish-cased
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: in progress
Uses
It can be used in morphological analysis tasks.
Direct Use
It can probably be used on Turkish word pairs without additional fine-tuning.
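A minimal sketch of direct use with the Hugging Face `transformers` library follows. Several things here are assumptions, not facts from this card: `MODEL_ID` is a hypothetical placeholder (the published model ID is not given), the two words are assumed to be encoded as a sentence pair because the task is described as NLI-like, and the meaning of each label index is taken from the model's own `id2label` config rather than documented here.

```python
import math

# Hypothetical placeholder: this card does not state the published model ID.
MODEL_ID = "<user>/<model-name>"


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_pair(word_a, word_b, model_id=MODEL_ID):
    """Return (label, probability) for a pair of Turkish words."""
    # Imported lazily so that softmax() can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: the words are passed as a text pair, as in NLI-style inputs.
    inputs = tokenizer(word_a, word_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Label names come from the model config; this card does not document
    # which index means "shared affix".
    return model.config.id2label[best], probs[best]
```

Once `MODEL_ID` is replaced with the real repository name, a call such as `predict_pair("evler", "kitaplar")` (two words carrying the plural suffix) would return the predicted label and its probability. Check the label names in the model config before interpreting the result.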
Training Details
Training Data
scoup123/affixfinder
The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).
Evaluation
- Test Accuracy: 0.9874
- Precision: 0.9874
- Recall: 0.9874
- F1 Score: 0.9874
It should be used with caution, as these scores are suspiciously high.
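For readers who want to re-check the reported numbers on their own data, the four metrics can be computed from binary predictions roughly as follows. This is a sketch, not the evaluation script used for this model: the card does not state how precision, recall, and F1 were averaged, and the labels below are made-up illustration data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a two-label task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only (1 = the words share an affix).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Running the same function on a set of word pairs that were never seen during training is one way to check whether the high scores hold up.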
Testing Data, Factors & Metrics
Testing Data
A test split was created from the training data.
Summary
This model aims to create an affix identifier for Turkish.
Model Examination [optional]
The model was created only recently, so further testing is needed to confirm that it actually works. Verify its behavior on your own data before using it.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: Free Colab T4 GPU
- Hours used: ~2.5 hours
- Cloud Provider: Google
- Compute Region: Europe
- Carbon Emitted: [More Information Needed]
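The missing emissions figure could be approximated with the usual back-of-the-envelope arithmetic: energy used multiplied by the carbon intensity of the local grid. The sketch below makes two assumptions. It uses the T4's rated board power of 70 W as an upper bound on actual draw, and it leaves the grid intensity as a caller-supplied parameter, because the card only says "Europe" and intensity varies widely between countries. The Machine Learning Impact calculator may apply additional factors, so treat the result as a rough estimate only.

```python
def estimate_emissions_kg(power_watts, hours, grid_kg_co2_per_kwh):
    """Estimate emissions as energy (kWh) times grid intensity (kg CO2eq/kWh)."""
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_co2_per_kwh


T4_TDP_WATTS = 70   # rated board power of an NVIDIA T4, used as an upper bound
HOURS_USED = 2.5    # training time reported in this card

# Energy is independent of the region: about 0.175 kWh at full rated power.
energy_kwh = T4_TDP_WATTS / 1000.0 * HOURS_USED
```

Calling `estimate_emissions_kg(T4_TDP_WATTS, HOURS_USED, intensity)` with the published intensity for the actual compute region gives the estimate in kg CO2eq.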
Citation [optional]
APA:
Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.
Model Card Authors [optional]
Kaan Bayar
Model Card Contact