Update README.md

Files changed (1)

README.md CHANGED

@@ -118,6 +118,7 @@ We evaluated the models on the following datasets:
 #### Evaluation of English Benchmark datasets
 - **llama-3.2-1b** consistently leads across all tasks in both 0-shot and 5-shot settings, with top scores of **0.75** in **PIQA** and **0.64** in **BoolQ**.
 - **hishab/titulm-llama-3.2-1b-v1.0** shows competitive performance but generally scores lower than **llama-3.2-1b**, particularly in the 5-shot setting.
+- This gap is expected, as we trained the model only on Bangla text.
 | Model                                | Shots  | MMLU         | BoolQ      | Commonsense QA     | OpenBook QA     | PIQA      |
 |--------------------------------------|--------|--------------|------------|--------------------|-----------------|-----------|