---
license: llama3
language:
- tr
---

# CERE-LLMA-3-8b-TR

This model is a fine-tuned version of the Llama 3 8B Large Language Model (LLM) for Turkish. It was trained on high-quality Turkish instruction sets created from various open-source and internal resources. The Turkish instruction dataset was carefully annotated so that the model carries out Turkish instructions accurately and in an organized manner.

## Model Details

- **Base Model**: Llama 3 8B based LLM
- **Tokenizer Extension**: Specifically extended for Turkish
- **Training Dataset**: Cleaned Turkish raw data with 5 billion tokens, custom Turkish instruction sets
- **Training Method**: Initially with DoRA, followed by fine-tuning with LoRA

## Open LLM Turkish Leaderboard v0.2 Evaluation Results

| Metric | Value |
|---|---|
| Avg. | |
| AI2 Reasoning Challenge_tr | |
| HellaSwag_tr | |
| MMLU_tr | |
| TruthfulQA_tr | |
| Winogrande_tr | |
| GSM8k_tr | |
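
The Training Method entry under Model Details names DoRA and LoRA but gives no hyperparameters or code. As a rough illustration of how the two adapter methods differ when their weights are merged into a base matrix, here is a minimal pure-Python sketch. It is not the training code used for this model: the toy matrices, the values of `alpha` and `r`, and the function names (`lora_merge`, `dora_merge`) are all assumptions chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def column_norms(w):
    """Euclidean norm of each column of w."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*w)]


def lora_merge(w0, b, a, alpha, r):
    """LoRA: W = W0 + (alpha / r) * B @ A, with B of shape (d, r) and A of shape (r, k)."""
    delta = matmul(b, a)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


def dora_merge(w0, b, a, alpha, r, m):
    """DoRA: W = m * V / ||V||_c, where V is the LoRA-updated matrix and
    m holds one learned magnitude per column."""
    v = lora_merge(w0, b, a, alpha, r)
    norms = column_norms(v)
    return [[m[j] * v_ij / norms[j] for j, v_ij in enumerate(row)] for row in v]


# Toy example (hypothetical values, rank r = 1).
w0 = [[1.0, 0.0], [0.0, 2.0]]   # frozen base weight
b = [[1.0], [0.0]]              # trainable low-rank factor B (2 x 1)
a = [[0.0, 1.0]]                # trainable low-rank factor A (1 x 2)
m = column_norms(w0)            # DoRA starts its magnitudes at the base column norms

w_lora = lora_merge(w0, b, a, alpha=2.0, r=1)
w_dora = dora_merge(w0, b, a, alpha=2.0, r=1, m=m)
print(w_lora)
print(w_dora)
```

In this sketch LoRA changes both the direction and the length of each weight column through the low-rank product, while DoRA renormalizes the updated columns and sets their length from the separate magnitude vector `m`, so the low-rank factors only steer direction.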