Model Card for Model ID

Model Description

Given two Turkish words, the model predicts whether they share an affix. It was fine-tuned from dbmdz/bert-base-turkish-cased on a binary classification task that resembles NLI but operates on word pairs rather than sentence pairs. It was created as a final project for one of my classes.

  • Developed by: Scoup123
  • Model type: BERT
  • Language(s) (NLP): Turkish
  • Finetuned from model [optional]: dbmdz/bert-base-turkish-cased

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: in progress

Uses

It can be used in morphological analysis tasks.

Direct Use

It can probably be used on Turkish word pairs without additional fine-tuning.
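A minimal sketch of how direct use might look with the Hugging Face transformers library. It assumes the model was saved as a standard sequence-classification checkpoint and that the two words are encoded as a text pair, as in NLI; neither detail is confirmed by this card. The names `load_affix_classifier`, `decode_logits`, and `predict_pair` are hypothetical, and the model ID in the usage comment is a placeholder.

```python
import math


def load_affix_classifier(model_id):
    """Load tokenizer and model from the Hub (needs transformers and network access)."""
    # Imported lazily so decode_logits stays usable without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    return tokenizer, model


def decode_logits(logits_row, id2label):
    """Turn one row of raw logits into (label, probability) using a softmax."""
    peak = max(logits_row)
    exps = [math.exp(x - peak) for x in logits_row]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def predict_pair(tokenizer, model, word_a, word_b):
    """Classify one word pair. Assumption: the pair is passed as two segments."""
    import torch

    inputs = tokenizer(word_a, word_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Read label names from the checkpoint config instead of guessing which
    # class index means "shares an affix".
    return decode_logits(logits[0].tolist(), model.config.id2label)


# Usage (placeholder model ID, replace with the actual repository name):
# tokenizer, model = load_affix_classifier("your-namespace/your-model-id")
# print(predict_pair(tokenizer, model, "evler", "kitaplar"))
```

Reading `model.config.id2label` avoids hard-coding a label order; if the checkpoint only exposes generic names such as `LABEL_0` and `LABEL_1`, their meaning has to be checked against the training data.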

Training Details

Training Data

scoup123/affixfinder

The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).

Evaluation

  • Test accuracy: 0.9874
  • Precision: 0.9874
  • Recall: 0.9874
  • F1 score: 0.9874

Note: The model should be used with caution, as these scores are suspiciously high.
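One possible reason all four reported values coincide is micro-averaging: in single-label classification, micro-averaged precision, recall, and F1 all equal accuracy. The card does not state which averaging was used, so this is only a guess. The sketch below, in plain Python with made-up toy labels, computes both micro-averaged and per-class figures; per-class numbers are usually more informative when checking scores that look too good.

```python
from collections import Counter


def _confusion_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    return tp, fp, fn


def _prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts, returning 0.0 on empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def per_class_metrics(y_true, y_pred):
    tp, fp, fn = _confusion_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    return {label: _prf(tp[label], fp[label], fn[label]) for label in labels}


def micro_metrics(y_true, y_pred):
    tp, fp, fn = _confusion_counts(y_true, y_pred)
    # Summing over labels: every error is exactly one FP and one FN,
    # so micro precision, recall and F1 all collapse to accuracy.
    result = _prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    result["accuracy"] = sum(tp.values()) / len(y_true)
    return result


# Toy labels for illustration only (not taken from the actual test set).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(micro_metrics(toy_true, toy_pred))
print(per_class_metrics(toy_true, toy_pred))
```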

Testing Data, Factors & Metrics

Testing Data

A test split was created from the training data.
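Because the test split comes from the same data as the training set, one check worth running is whether test pairs, or the words in them, also occur in training. The card does not describe how the split was made, so the sketch below is only a suggested diagnostic. It assumes each example can be reduced to a `(word_a, word_b)` tuple; the function names and the toy pairs are hypothetical.

```python
def pair_key(word_a, word_b):
    """Order-insensitive key, so (a, b) and (b, a) count as the same pair."""
    return frozenset((word_a, word_b))


def leakage_report(train_pairs, test_pairs):
    """Count test pairs that overlap with the training split."""
    train_keys = {pair_key(a, b) for a, b in train_pairs}
    train_words = {word for pair in train_pairs for word in pair}

    # Exact duplicates: the same two words appear together in training.
    duplicated = sum(pair_key(a, b) in train_keys for a, b in test_pairs)
    # Softer overlap: both words were seen in training, possibly in other pairs.
    both_seen = sum(a in train_words and b in train_words for a, b in test_pairs)

    return {
        "test_pairs": len(test_pairs),
        "pairs_also_in_train": duplicated,
        "pairs_with_both_words_seen": both_seen,
    }


# Toy pairs for illustration only.
toy_train = [("evler", "kitaplar"), ("evde", "okulda")]
toy_test = [("kitaplar", "evler"), ("evler", "okulda"), ("masalar", "evde")]
print(leakage_report(toy_train, toy_test))
```

A high share of overlapping pairs would suggest the reported scores partly reflect memorization rather than generalization to unseen words.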

Summary

This model aims to create an affix identifier for Turkish.

Model Examination [optional]

I have only just created the model, so further testing is needed to confirm that it actually works. You should verify that it works for your use case before relying on it.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: Free Colab T4 GPU
  • Hours used: ~2.5 hours
  • Cloud Provider: Google
  • Compute Region: Europe
  • Carbon Emitted: [More Information Needed]
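A rough back-of-the-envelope sketch of how the missing figure could be estimated: energy is power draw times runtime, and emissions are energy times the grid's carbon intensity. This is a simplification and not the calculator's actual implementation. The 70 W default is the T4's rated board power, used here as an upper bound on actual draw; the carbon intensity is a required input that depends on the actual data-center region, which this card only gives as "Europe", so no emission value is claimed here. The function name and the `pue` parameter are hypothetical.

```python
def estimate_emissions_kg(hours, carbon_intensity_kg_per_kwh, power_watts=70.0, pue=1.0):
    """Estimate CO2-equivalent emissions in kilograms.

    hours: GPU runtime in hours (about 2.5 according to this card).
    carbon_intensity_kg_per_kwh: grid carbon intensity; must be looked up
        for the real compute region, it is not given in this card.
    power_watts: assumed average GPU power draw (70 W is the T4's rated maximum).
    pue: assumed data-center power usage effectiveness (1.0 means no overhead).
    """
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Usage: supply a carbon intensity for the actual region.
# emissions = estimate_emissions_kg(hours=2.5, carbon_intensity_kg_per_kwh=...)
```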

Citation [optional]

APA:

Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.

Model Card Authors [optional]

Kaan Bayar

Model Card Contact

kaan.bayar13@gmail.com

Safetensors

  • Model size: 111M params
  • Tensor type: F32