---
license: apache-2.0
datasets:
- alibayram/turkish_mmlu
language:
- tr
base_model:
- google-t5/t5-small
---
# fine-tuned-t5-small-turkish-mmlu

This model is a fine-tuned version of [T5-Small](https://huggingface.co/google-t5/t5-small) for question answering, trained on the [Turkish MMLU](https://huggingface.co/datasets/alibayram/turkish_mmlu) dataset, which consists of questions from Turkish academic and professional exams, including KPSS and TUS. The model takes a Turkish question as input and generates the correct answer, leveraging T5's text-to-text architecture for Turkish-language question answering.
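
As a minimal inference sketch using the `transformers` pipeline; the repo id `cuneytkaya/fine-tuned-t5-small-turkish-mmlu` is inferred from this card's title and should be confirmed before use:

```python
from transformers import pipeline

# Repo id inferred from this card's title; confirm before use.
qa = pipeline(
    "text2text-generation",
    model="cuneytkaya/fine-tuned-t5-small-turkish-mmlu",
)

# Example Turkish question: "What is the capital of Turkey?"
result = qa("Türkiye'nin başkenti neresidir?", max_new_tokens=32)
print(result[0]["generated_text"])
```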

### Training Data

The model was trained on the Turkish MMLU dataset linked above. If you use this model, please cite the dataset:

```bibtex
@dataset{bayram_2024_13378019,
  author    = {Bayram, M. Ali},
  title     = {{Turkish MMLU: Yapay Zeka ve Akademik Uygulamalar İçin En Kapsamlı ve Özgün Türkçe Veri Seti}},
  month     = aug,
  year      = 2024,
  publisher = {Zenodo},
  version   = {v1.2},
  doi       = {10.5281/zenodo.13378019},
  url       = {https://doi.org/10.5281/zenodo.13378019}
}
```

#### Training Hyperparameters

```text
learning_rate=5e-5
per_device_train_batch_size=8
per_device_eval_batch_size=8
num_train_epochs=3
weight_decay=0.01
```
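
For context, a minimal sketch of how these hyperparameters could plug into a `Seq2SeqTrainer` run; the dataset column names (`question`, `answer`) and split names are assumptions, not confirmed by this card:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

# The "question"/"answer" column names and the "train" split are
# assumptions; adjust to the dataset's actual schema.
dataset = load_dataset("alibayram/turkish_mmlu")

def preprocess(batch):
    model_inputs = tokenizer(batch["question"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["answer"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

# The hyperparameters from the model card, passed as training arguments.
args = Seq2SeqTrainingArguments(
    output_dir="fine-tuned-t5-small-turkish-mmlu",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```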

#### Metrics

Training loss was monitored during fine-tuning to track learning progress and guard against overfitting. After 3 epochs, the model achieved a training loss of 0.0749, indicating a close fit to the training data; evaluation on held-out data would be needed to confirm generalization.
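
As a sketch, the monitored loss values can be read back from the trainer's log history after training (assuming the `trainer` object from the training sketch above):

```python
# Each logging step appends a dict to log_history; entries with a "loss"
# key record the training loss at that step.
losses = [entry["loss"] for entry in trainer.state.log_history if "loss" in entry]
print(f"final training loss: {losses[-1]:.4f}")
```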