
This model was created as an experiment in using LoRA extraction to replicate OpenChat-3.5-0106 with Mistral-7B-v0.2 as the base model instead of the original Mistral-7B-v0.1.

OpenChat-3.5-0106 is an excellent model, but it was based on Mistral-7B-v0.1, which has a context window of 8192 tokens. Mistral-7B-v0.2 has a context window of 32768 tokens. I could have extended the OpenChat-3.5 context myself with RoPE scaling and/or YaRN, but that has already been done: many models on HF do exactly that. Instead, I decided to try to replicate OpenChat-3.5-0106 using the LoRA extraction method available in mergekit. These are the steps I followed:

  • Extract a LoRA with rank 512 from OpenChat-3.5-0106, using imone's Mistral_7B_with_EOT_token as the base model.
  • Replicate imone's work by adding the EOT token to Mistral-7B-v0.2, creating Mistral-7B-v0.2_EOT.
  • Merge the LoRA's weights into the Mistral-7B-v0.2_EOT model.
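
The extract-then-merge idea in the steps above can be illustrated on toy matrices. The following is a minimal NumPy sketch, assuming that LoRA extraction amounts to a truncated SVD of the difference between the fine-tuned and base weights. It is not mergekit's code, the function names (extract_lora, merge_lora, relative_error) are hypothetical, and small random matrices stand in for real model weights.

```python
import numpy as np

def extract_lora(w_finetuned, w_base, rank):
    """Return (B, A) so that B @ A is the best rank-`rank` approximation
    (in Frobenius norm) of the weight delta w_finetuned - w_base."""
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank]  # shape (out_features, rank)
    a = vt[:rank, :]            # shape (rank, in_features)
    return b, a

def merge_lora(w_new_base, b, a):
    """Add the low-rank delta onto a (possibly different) base weight."""
    return w_new_base + b @ a

rng = np.random.default_rng(0)
d = 64  # toy hidden size; real layers are far larger

# Hypothetical stand-ins: the old base, a fine-tune of it, and a new base
# that is close to, but not identical with, the old one.
w_old_base = rng.normal(size=(d, d))
true_delta = 0.1 * rng.normal(size=(d, d))
w_finetuned = w_old_base + true_delta
w_new_base = w_old_base + 0.01 * rng.normal(size=(d, d))

def relative_error(rank):
    """Fraction of the fine-tuning delta lost by truncating to `rank`."""
    b, a = extract_lora(w_finetuned, w_old_base, rank)
    return np.linalg.norm(true_delta - b @ a) / np.linalg.norm(true_delta)

# The error shrinks as the rank grows and vanishes only at full rank.
errors = {r: relative_error(r) for r in (4, 16, 64)}

# Transplant a rank-16 delta onto the new base.
b16, a16 = extract_lora(w_finetuned, w_old_base, 16)
w_merged = merge_lora(w_new_base, b16, a16)
```

In this toy setting any rank below the full dimension discards part of the delta, which is the same reason a rank-limited LoRA is only an approximation of a full fine-tune.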

This is the result. This model is not meant for production use; it was created to test whether this method is viable for replacing the base model of fine-tuned models (when the tokenizer and weights have not been changed too much). I am uploading it here for evaluation. I don't expect this model to match the original OpenChat-3.5-0106, since a rank-512 LoRA is only an approximation of the full fine-tune. I have been able to extract LoRAs with higher rank, but I currently cannot merge them into the model because the memory requirements exceed what I have at my disposal. If you would like to support my work, check my Ko-Fi and/or Patreon.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                  Value
Avg.                    15.94
IFEval (0-Shot)         37.06
BBH (3-Shot)            10.91
MATH Lvl 5 (4-Shot)      3.85
GPQA (0-shot)            2.91
MuSR (0-shot)           20.57
MMLU-PRO (5-shot)       20.33
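
As a quick sanity check, the reported average appears to be the plain mean of the six benchmark scores. The short sketch below recomputes it from the values in the table; the variable names are illustrative only.

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 37.06,
    "BBH (3-Shot)": 10.91,
    "MATH Lvl 5 (4-Shot)": 3.85,
    "GPQA (0-shot)": 2.91,
    "MuSR (0-shot)": 20.57,
    "MMLU-PRO (5-shot)": 20.33,
}

# Unweighted mean, rounded to two decimals like the reported figure.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 15.94
```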

Model size: 7.24B params (Safetensors, tensor type BF16)