---
license: other
library_name: transformers
datasets:
- vicgalle/alpaca-gpt4
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
base_model:
- google/gemma-2b
model-index:
- name: Gemmalpaca-2B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 48.72
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 71.36
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 36.3
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 41.24
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.59
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 10.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Gemmalpaca-2B
      name: Open LLM Leaderboard
---

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/uwPjZeV-JQwKWrI7nHg4w.webp)

# Gemmalpaca-2B

This is the gemma-2b model, supervised fine-tuned on the [vicgalle/alpaca-gpt4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4) dataset.

It scores higher on average than gemma-2b-it, Google's chat version, on Nous' benchmark suite.

It's mostly a test to see how fine-tuning works with Gemma models on a well-known dataset. It turned out better than expected. :)

## 🔍 Applications

This model has a context length of 8k. I recommend using it with the Alpaca chat template and NOT the Gemma Instruct template (works perfectly with LM Studio). You also want to add `` as a stop token.

## ⚡ Quantized models

* **GGUF**: https://huggingface.co/mlabonne/Gemmalpaca-2B-GGUF

## 🏆 Evaluation

### Nous

Gemmalpaca-2B has a higher average score than gemma-2b and gemma-2b-it on Nous' benchmark suite (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval)). See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| [mlabonne/Gemmalpaca-2B](https://huggingface.co/mlabonne/Gemmalpaca-2B) [📄](https://gist.github.com/mlabonne/4b638752fc3227df566f9562064cb864) | 38.39 | 24.48 | 51.22 | 47.02 | 30.85 |
| [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) [📄](https://gist.github.com/mlabonne/db0761e74175573292acf497da9e5d95) | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| [google/gemma-2b](https://huggingface.co/google/gemma-2b) [📄](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |

### [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__Gemmalpaca-2B).

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |45.65|
|AI2 Reasoning Challenge (25-Shot)|48.72|
|HellaSwag (10-Shot)              |71.36|
|MMLU (5-Shot)                    |36.30|
|TruthfulQA (0-shot)              |41.24|
|Winogrande (5-shot)              |65.59|
|GSM8k (5-shot)                   |10.69|

## 🧩 Configuration

It was trained using [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) with the following configuration.

```yaml
base_model: alpindale/gemma-2b
model_type: GemmaForCausalLM
tokenizer_type: GemmaTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: vicgalle/alpaca-gpt4
    type: alpaca
dataset_prepared_path:
val_set_size: 0.01
output_dir: ./out

sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true

adapter: qlora
lora_model_dir:
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention:

warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
eval_table_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
  bos_token:
  eos_token:
  unk_token:
```

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
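
## 💻 Usage

The Applications section recommends the Alpaca chat template, and the training config uses `type: alpaca` for the dataset. Below is a minimal sketch of how such a prompt could be assembled in plain Python. It assumes the widely used Stanford Alpaca wording; the exact preamble applied during training is not stated in this card, and `build_alpaca_prompt` is a hypothetical helper, not part of any library.

```python
# Assumption: standard Stanford Alpaca preambles. The card only says
# "Alpaca chat template", so treat the exact wording as illustrative.
PREAMBLE_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)
PREAMBLE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request."
)


def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Return an Alpaca-style prompt that ends right where the model should answer.

    `context` is the optional "Input" field of the Alpaca format.
    """
    instruction = instruction.strip()
    context = context.strip()
    if context:
        return (
            f"{PREAMBLE_WITH_INPUT}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    return (
        f"{PREAMBLE_NO_INPUT}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    # Print one prompt so the layout can be inspected before sending it to a model.
    print(build_alpaca_prompt("Name three primary colors."))
```

The resulting string can be passed as the raw prompt to whichever inference backend you use. Remember to configure the stop token mentioned in the Applications section so generation ends after the response.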