---
license: other
license_name: yi-license
license_link: LICENSE
model-index:
- name: Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 65.96
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.89
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 74.76
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 57.08
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 55.5
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA
      name: Open LLM Leaderboard
---

THIS MODEL IS EXPERIMENTAL AND MIGHT BE BUGGY, I DIDN'T PERFECT THE STRENGTH OF DPO AND SFT YET. \
Submitting to the Open LLM Leaderboard with base model yi-34b-200k-llamafied to see whether there's a point in merging a LoRA over a LoRA when both have the same lora_r, or whether it doesn't matter.

Another AEZAKMI v2 finetune over Yi-34B-200K-rawrr-r3. I was able to squeeze in a sequence length of 2200 using Unsloth; the script I used is in this repo. Training took around 18 hours on a local RTX 3090 Ti. \
Will be uploading fp16 and exl2 soon. So far it seems like de-contaminating Yi worked nicely.

This LoRA goes over the Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3 LoRA. So first get Yi-34B-200K (llamafied), merge in Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3, then merge in this LoRA, as in the sketch below.
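A minimal sketch of that merge order using transformers and peft. The base model path and the rawrr LoRA repo ID below are assumptions (placeholders); substitute the llamafied Yi-34B-200K copy and the rawrr LoRA you actually use.

```python
# Sketch only: two-step LoRA merge (rawrr DPO LoRA first, then this AEZAKMI LoRA).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "path/to/Yi-34B-200K-llamafied"  # placeholder: your llamafied Yi-34B-200K checkpoint
RAWRR_LORA = "path/to/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3"  # placeholder repo/path
AEZAKMI_LORA = "adamo1139/Yi-34B-200K-AEZAKMI-RAW-2301-LoRA"

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto"
)

# Step 1: merge the rawrr DPO LoRA into the base weights.
model = PeftModel.from_pretrained(model, RAWRR_LORA)
model = model.merge_and_unload()

# Step 2: merge this AEZAKMI LoRA on top of the result.
model = PeftModel.from_pretrained(model, AEZAKMI_LORA)
model = model.merge_and_unload()

# Optionally save the fully merged fp16 model for later use.
model.save_pretrained("Yi-34B-200K-AEZAKMI-RAW-2301-merged")
tokenizer.save_pretrained("Yi-34B-200K-AEZAKMI-RAW-2301-merged")
```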
Credits to mlabonne (I used pieces of his Mistral fine-tuning script for dataset preparation), and to Daniel Han and Michael Han (Unsloth AI team).

[made with Unsloth](https://github.com/unslothai/unsloth)

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_adamo1139__Yi-34B-200K-AEZAKMI-RAW-2301-LoRA)

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |69.31|
|AI2 Reasoning Challenge (25-Shot)|65.96|
|HellaSwag (10-Shot)              |83.89|
|MMLU (5-Shot)                    |74.76|
|TruthfulQA (0-shot)              |57.08|
|Winogrande (5-shot)              |78.69|
|GSM8k (5-shot)                   |55.50|