Commit 6ec1080 (Locutusque)
Parent(s): 184813a

Update README.md

README.md CHANGED
@@ -12,4 +12,32 @@ language:
 
 The lr-experiment model series is a research project I'm conducting that I will be using to determine the best learning rate to use while fine-tuning Mistral. This model uses a learning rate of 2e-5 with a cosine scheduler and no warmup steps.
 
-I used Locutusque/Hercules-2.0-Mistral-7B as a base model, and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will be keeping track of evaluation results, and will be comparing it to upcoming models.
+I used Locutusque/Hercules-2.0-Mistral-7B as a base model, and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will be keeping track of evaluation results, and will be comparing it to upcoming models.
+
+# Evals
+
+| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
+|---------------------------------|-------|------|------|--------|-----:|---|-----:|
+|agieval_nous |N/A |none |None |acc |0.3645|± |0.0093|
+| | |none |None |acc_norm|0.3468|± |0.0092|
+| - agieval_aqua_rat | 1|none |None |acc |0.2283|± |0.0264|
+| | |none |None |acc_norm|0.2283|± |0.0264|
+| - agieval_logiqa_en | 1|none |None |acc |0.2965|± |0.0179|
+| | |none |None |acc_norm|0.3303|± |0.0184|
+| - agieval_lsat_ar | 1|none |None |acc |0.2217|± |0.0275|
+| | |none |None |acc_norm|0.1783|± |0.0253|
+| - agieval_lsat_lr | 1|none |None |acc |0.4039|± |0.0217|
+| | |none |None |acc_norm|0.3686|± |0.0214|
+| - agieval_lsat_rc | 1|none |None |acc |0.4870|± |0.0305|
+| | |none |None |acc_norm|0.4424|± |0.0303|
+| - agieval_sat_en | 1|none |None |acc |0.6408|± |0.0335|
+| | |none |None |acc_norm|0.5971|± |0.0343|
+| - agieval_sat_en_without_passage| 1|none |None |acc |0.3932|± |0.0341|
+| | |none |None |acc_norm|0.3835|± |0.0340|
+| - agieval_sat_math | 1|none |None |acc |0.3455|± |0.0321|
+| | |none |None |acc_norm|0.2727|± |0.0301|
+
+| Groups |Version|Filter|n-shot| Metric |Value | |Stderr|
+|------------|-------|------|------|--------|-----:|---|-----:|
+|agieval_nous|N/A |none |None |acc |0.3645|± |0.0093|
+| | |none |None |acc_norm|0.3468|± |0.0092|
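
For readers comparing models in this series, the schedule the README describes (peak learning rate 2e-5, cosine scheduler, no warmup steps) can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `cosine_lr`, the `total_steps` value, and the decay floor of zero are hypothetical and are not taken from the actual training run or its configuration.

```python
import math


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for a cosine schedule.

    Assumptions (not from the README): the rate decays to 0 at
    `total_steps`, and any warmup would be linear. With the default
    warmup_steps=0 the schedule starts directly at `peak_lr`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Optional linear warmup; skipped entirely when warmup_steps == 0.
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, (step - warmup_steps) / decay_steps))
    # Half-cosine from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 1000):
        print(s, cosine_lr(s, total))
```

With no warmup, the first optimizer step already uses the full 2e-5, so differences between runs in the series come from the peak rate rather than from a ramp-up phase.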