---
language:
- en
license: apache-2.0
datasets:
- teknium/OpenHermes-2.5
- abhinand/ultrachat_200k_sharegpt
model-index:
- name: TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 33.79
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 58.72
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 24.52
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 36.22
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 60.93
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 5.38
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft
      name: Open LLM Leaderboard
---

# TinyLLaMA OpenHermes2.5 [Work in Progress]

This is a finetune of the TinyLLaMA base model, trained on [OpenHermes 2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) and [UltraChat 200k](https://huggingface.co/datasets/abhinand/ultrachat_200k_sharegpt) for a single epoch.

Training was generously supported by [Jarvislabs.ai](https://jarvislabs.ai/).

If you appreciate this work and would like to support its continued development, consider [buying me a coffee](https://www.buymeacoffee.com/abhinand.b). Your support is invaluable and greatly appreciated.

[!["Buy Me A Coffee"](https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png)](https://www.buymeacoffee.com/abhinand.b)

Axolotl config (axolotl version: `0.4.0`):

```yaml
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

trust_remote_code: true
is_llama_derived_model: true

# huggingface repo
datasets:
  - path: teknium/OpenHermes-2.5
    type: sharegpt
    conversation: chatml
    train_on_split: train

  - path: abhinand/ultrachat_200k_sharegpt
    type: sharegpt
    conversation: chatml
    train_on_split: train

load_in_4bit: false
load_in_8bit: false
bf16: true # require >=ampere
chat_template: chatml

dataset_prepared_path: last_run_prepared_path
hub_model_id: abhinand/TinyLlama-1.1B-OpenHermes-2.5-Chat-v1.0
group_by_length: false

val_set_size: 0.0
sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj
  - gate_proj
  - down_proj
  - up_proj
lora_modules_to_save:
  - embed_tokens
  - lm_head
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

output_dir: /home/tiny-llama/trained_models

gradient_accumulation_steps: 2
micro_batch_size: 32
eval_batch_size: 32
num_epochs: 1
logging_steps: 1
save_steps: 50
save_total_limit: 3
save_safetensors: true
gradient_checkpointing: true

lr_scheduler: cosine
optimizer: "adamw_bnb_8bit"
adam_beta2: 0.95
adam_epsilon: 0.00001
weight_decay: 0.1
learning_rate: 0.0005
max_grad_norm: 1.0
warmup_ratio: 0.05
# warmup_steps: 100

flash_attention: true

# Resume from a specific checkpoint dir
resume_from_checkpoint:
# If resume_from_checkpoint isn't set and you simply want it to start where it left off.
# Be careful with this being turned on between different models.
# auto_resume_from_checkpoints: true

# wandb configuration if you're using it
# Make sure your `WANDB_API_KEY` environment variable is set (recommended) or you login to wandb with `wandb login`.
wandb_mode: # "offline" to save run metadata locally and not sync to the server, "disabled" to turn off wandb
wandb_project: "tiny-llama-sft"
wandb_name:
wandb_run_id:

special_tokens:
  bos_token: ""
  eos_token: ""
  unk_token: ""
tokens: # these are delimiters
  - "<|im_start|>"
  - "<|im_end|>"
```

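The config above trains on conversations in the `chatml` format and registers `<|im_start|>` and `<|im_end|>` as delimiter tokens. The following is a minimal sketch of how a prompt in that format could be assembled, assuming the common ChatML layout (the role follows `<|im_start|>`, the content starts on the next line and is closed by `<|im_end|>`). The helper `format_chatml` is hypothetical and not part of axolotl or this repository; the template actually shipped with the tokenizer may differ in details such as whitespace.

```python
from typing import Dict, List

# Delimiter tokens as listed under `tokens:` in the axolotl config.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]],
                  add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a ChatML-style prompt.

    Hypothetical helper for illustration; assumes the common ChatML layout.
    """
    parts = []
    for message in messages:
        parts.append(
            f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is LoRA?"},
]
prompt = format_chatml(messages)
print(prompt)
```

If the published tokenizer includes a chat template, calling `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from Transformers would normally be preferable to formatting prompts by hand.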
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 476
- num_epochs: 1

### Framework versions

- PEFT 0.8.2
- Transformers 4.38.0.dev0
- Pytorch 2.0.1
- Datasets 2.16.1
- Tokenizers 0.15.0

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_abhinand__TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |36.59|
|AI2 Reasoning Challenge (25-Shot)|33.79|
|HellaSwag (10-Shot)              |58.72|
|MMLU (5-Shot)                    |24.52|
|TruthfulQA (0-shot)              |36.22|
|Winogrande (5-shot)              |60.93|
|GSM8k (5-shot)                   | 5.38|
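
Two of the numbers reported above are derived from others: `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and `Avg.` is the plain mean of the six benchmark scores. The short sketch below reproduces both. The single-device setting (`num_devices = 1`) is an assumption, chosen because it is consistent with the reported total of 64; the card does not state the device count.

```python
# Values copied from the hyperparameter list and the results table.
train_batch_size = 32
gradient_accumulation_steps = 2
num_devices = 1  # assumption: not stated in the card, but consistent with 64

total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

scores = {
    "AI2 Reasoning Challenge (25-Shot)": 33.79,
    "HellaSwag (10-Shot)": 58.72,
    "MMLU (5-Shot)": 24.52,
    "TruthfulQA (0-shot)": 36.22,
    "Winogrande (5-shot)": 60.93,
    "GSM8k (5-shot)": 5.38,
}
average = sum(scores.values()) / len(scores)

print(total_train_batch_size)  # 64
print(round(average, 2))       # 36.59
```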