Locutusque/lr-experiment1-7B

Locutusque committed on Mar 12

Commit 6ec1080 • 1 Parent(s): 184813a

Update README.md

Files changed (1)

README.md +29 -1

@@ -12,4 +12,32 @@ language:
 The lr-experiment model series is a research project I'm conducting that I will be using to determine the best learning rate to use while fine-tuning Mistral. This model uses a learning rate of 2e-5 with a cosine scheduler and no warmup steps.
-I used Locutusque/Hercules-2.0-Mistral-7B as a base model, and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will be keeping track of evaluation results, and will comparing it to upcoming models.
+I used Locutusque/Hercules-2.0-Mistral-7B as a base model, and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will be keeping track of evaluation results, and will be comparing them to upcoming models.
+# Evals
+|              Tasks              |Version|Filter|n-shot| Metric |Value |   |Stderr|
+|---------------------------------|-------|------|------|--------|-----:|---|-----:|
+|agieval_nous                     |N/A    |none  |None  |acc     |0.3645|±  |0.0093|
+|                                 |       |none  |None  |acc_norm|0.3468|±  |0.0092|
+| - agieval_aqua_rat              |      1|none  |None  |acc     |0.2283|±  |0.0264|
+|                                 |       |none  |None  |acc_norm|0.2283|±  |0.0264|
+| - agieval_logiqa_en             |      1|none  |None  |acc     |0.2965|±  |0.0179|
+|                                 |       |none  |None  |acc_norm|0.3303|±  |0.0184|
+| - agieval_lsat_ar               |      1|none  |None  |acc     |0.2217|±  |0.0275|
+|                                 |       |none  |None  |acc_norm|0.1783|±  |0.0253|
+| - agieval_lsat_lr               |      1|none  |None  |acc     |0.4039|±  |0.0217|
+|                                 |       |none  |None  |acc_norm|0.3686|±  |0.0214|
+| - agieval_lsat_rc               |      1|none  |None  |acc     |0.4870|±  |0.0305|
+|                                 |       |none  |None  |acc_norm|0.4424|±  |0.0303|
+| - agieval_sat_en                |      1|none  |None  |acc     |0.6408|±  |0.0335|
+|                                 |       |none  |None  |acc_norm|0.5971|±  |0.0343|
+| - agieval_sat_en_without_passage|      1|none  |None  |acc     |0.3932|±  |0.0341|
+|                                 |       |none  |None  |acc_norm|0.3835|±  |0.0340|
+| - agieval_sat_math              |      1|none  |None  |acc     |0.3455|±  |0.0321|
+|                                 |       |none  |None  |acc_norm|0.2727|±  |0.0301|
+
+|   Groups   |Version|Filter|n-shot| Metric |Value |   |Stderr|
+|------------|-------|------|------|--------|-----:|---|-----:|
+|agieval_nous|N/A    |none  |None  |acc     |0.3645|±  |0.0093|
+|            |       |none  |None  |acc_norm|0.3468|±  |0.0092|
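
The README describes a learning rate of 2e-5 with a cosine scheduler and no warmup steps. As a rough illustration of what that schedule looks like, here is a minimal sketch of a half-cosine decay in plain Python. The function name `cosine_lr` and the total step count are hypothetical (the README does not state how many optimizer steps the run used), and this is not necessarily the exact scheduler implementation used for training.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0 with optional linear warmup.

    base_lr=2e-5 and warmup_steps=0 mirror the README; total_steps is an
    assumption supplied by the caller.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear warmup (never entered when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    TOTAL_STEPS = 1000  # hypothetical; not stated in the README
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: lr = {cosine_lr(step, TOTAL_STEPS):.3e}")
```

With no warmup, the very first optimizer step already uses the full 2e-5, and the rate falls to half of that at the midpoint of training before reaching zero at the end.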