sayhan committed on
Commit
d82f3d8
1 Parent(s): 4b36bc8

Update README.md

Files changed (1): README.md (+23 -1)
README.md CHANGED
@@ -18,4 +18,26 @@ library_name: transformers
 - **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
 - **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
 - **Finetuned from model :** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
-- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
+- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
+---
+**LoRA rank:** 8
+**LoRA alpha:** 16
+**LoRA dropout:** 0
+**Rank-stabilized LoRA:** Yes
+**Number of epochs:** 3
+**Learning rate:** 1e-5
+**Batch size:** 2
+**Gradient accumulation steps:** 4
+**Weight decay:** 0.01
+**Target modules:**
+```
+- Query projection (`q_proj`)
+- Key projection (`k_proj`)
+- Value projection (`v_proj`)
+- Output projection (`o_proj`)
+- Gate projection (`gate_proj`)
+- Up projection (`up_proj`)
+- Down projection (`down_proj`)
+```
+
+
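The hyperparameters added in this commit map naturally onto a `peft`-style LoRA setup. A minimal sketch, with the values collected as plain dicts: the key names follow Hugging Face `peft`/`transformers` conventions (`r`, `lora_alpha`, `use_rslora`, `per_device_train_batch_size`, ...) and are an assumption for illustration, not taken from the commit itself.

```python
# Hypothetical sketch: the model card's LoRA hyperparameters, using
# peft-style key names (an assumption; the commit only lists the values).
lora_config = {
    "r": 8,                # LoRA rank
    "lora_alpha": 16,      # LoRA alpha
    "lora_dropout": 0.0,   # LoRA dropout
    "use_rslora": True,    # rank-stabilized LoRA
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
}

# Trainer-style arguments, again using transformers-style key names.
training_config = {
    "num_train_epochs": 3,
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "weight_decay": 0.01,
}

# With batch size 2 and 4 gradient-accumulation steps, each optimizer
# step effectively sees 2 * 4 = 8 examples.
effective_batch = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
```

Targeting all seven projection modules (attention plus MLP) applies LoRA across the full Mistral transformer block rather than only the attention weights.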