---
license: llama2
language:
- ru
metrics:
- accuracy
---

# ruadapt_llama2_7b_v0.1

This model is a version of TheBloke/Llama-2-7B-fp16 whose embeddings and LM head were fine-tuned on a 33 GB Russian dataset.
It achieves the following results on the evaluation set:
- Loss: 2.7569
- Accuracy: 0.4617

Instruct version: https://huggingface.co/rccmsu/ruadapt_saiga2_7b_v0.1

## Model description

A Russian adaptation of LLaMa-2-7B obtained by replacing the tokenizer; see the paper below for details.

Paper: Tikhomirov M., Chernyshev D. Impact of Tokenization on LLaMa Russian Adaptation. arXiv preprint arXiv:2312.02598, 2023.

## Intended uses & limitations

The model is distributed under the LLAMA 2 COMMUNITY LICENSE AGREEMENT.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 2
- total_train_batch_size: 192
- total_eval_batch_size: 96
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: linear
- num_epochs: 2.0

### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
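
## How to use

A minimal usage sketch with 🤗 Transformers is shown below. The repository id `rccmsu/ruadapt_llama2_7b_v0.1` is assumed from the naming of the instruct version linked above; the prompt and generation settings are only illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id (inferred from the instruct-version URL); adjust if it differs.
model_id = "rccmsu/ruadapt_llama2_7b_v0.1"

# Load the Russian-adapted tokenizer and the fp16 weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires `accelerate`
)

# This is a base (non-instruct) model, so plain text continuation works best.
prompt = "Москва — столица"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For instruction-following use cases, prefer the instruct version linked above rather than prompting this base model directly.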