- **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
- **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
- **Finetuned from model:** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
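The finetuning data referenced above can be pulled directly from the Hub. A minimal sketch using the `datasets` library follows; the split and the record layout are assumptions, so check the dataset card before relying on them.

```python
from datasets import load_dataset

# Load the philosophy Q&A pairs used for finetuning.
# The "train" split and the column names are assumptions; see the dataset card.
dataset = load_dataset("sayhan/strix-philosophy-qa", split="train")
print(dataset[0])
```
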
---

- **LoRA rank:** 8
- **LoRA alpha:** 16
- **LoRA dropout:** 0
- **Rank-stabilized LoRA:** Yes
- **Number of epochs:** 3
- **Learning rate:** 1e-5
- **Batch size:** 2
- **Gradient accumulation steps:** 4
- **Weight decay:** 0.01
- **Target modules:**

```
- Query projection (`q_proj`)
- Key projection (`k_proj`)
- Value projection (`v_proj`)
- Output projection (`o_proj`)
- Gate projection (`gate_proj`)
- Up projection (`up_proj`)
- Down projection (`down_proj`)
```
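
For reference, a minimal, hypothetical sketch of how an adapter with the hyperparameters above could be configured using 🤗 PEFT and `transformers`. The actual training code is not part of this card; in particular, `use_rslora=True` is assumed to correspond to the rank-stabilized LoRA setting, and the output directory name is illustrative.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Base model listed in this card.
base = AutoModelForCausalLM.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")

lora_config = LoraConfig(
    r=8,               # LoRA rank
    lora_alpha=16,     # LoRA alpha
    lora_dropout=0.0,  # LoRA dropout
    use_rslora=True,   # rank-stabilized LoRA (assumed PEFT flag)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

training_args = TrainingArguments(
    output_dir="outputs",            # illustrative path
    num_train_epochs=3,
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    weight_decay=0.01,
)
# `model`, `training_args`, and the dataset would then be passed to a Trainer.
```

With a per-device batch size of 2 and 4 gradient accumulation steps, the effective batch size is 8 examples per optimizer step (per device).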