AtAndDev committed
Commit
4bcf161
1 Parent(s): 114712b

Update README.md

Files changed (1): README.md +37 -1
README.md CHANGED
@@ -4,4 +4,40 @@ datasets:
  - Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged
  language:
  - en
- ---
+ ---
+
+ ## Model overview
+ This model is fine-tuned from the base model *[Marx-3B-V2](https://huggingface.co/acrastt/Marx-3B-V2)* on *[a merged dataset of oasst1-en, alpaca-cleaned, and airoboros-2.1-no-code](https://huggingface.co/datasets/Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged)* (a loading sketch follows the list below).
+ - License: `Creative-Commons-Attribution-4.0`
+ - Language: `en`
+ - Size: `3.43B params`
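+
+ A minimal loading sketch with `transformers`; the repo id below is a placeholder, not this card's actual id:
+ ```python
+ # Minimal loading sketch with Hugging Face transformers.
+ # "your-org/your-model-id" is a placeholder; substitute this model's real repo id.
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "your-org/your-model-id"  # placeholder
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+ ```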
+
+ ## Prompt template
+ ```
+ ### SYSTEM:
+ <system_prompt_here>
+
+ ### HUMAN:
+ <prompter_message_here>
+
+ ### INPUT:
+ <input_text_here>
+
+ ### RESPONSE:
+ <leave_a_blank_line_here>
+ ```
+ *Note: If you don't have a system prompt or input text, omit those sections (including their `### SYSTEM:` / `### INPUT:` headers) from the prompt.*
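+
+ As an illustration of the template above, a small Python helper (hypothetical, not part of this repo) that assembles a prompt and drops the optional sections when they are empty:
+ ```python
+ # Hypothetical helper illustrating the prompt template above.
+ def build_prompt(human, system=None, input_text=None):
+     parts = []
+     if system:  # omit the SYSTEM section entirely if there is no system prompt
+         parts.append(f"### SYSTEM:\n{system}")
+     parts.append(f"### HUMAN:\n{human}")
+     if input_text:  # likewise omit the INPUT section if there is no input text
+         parts.append(f"### INPUT:\n{input_text}")
+     parts.append("### RESPONSE:\n")  # leave the response blank for the model to complete
+     return "\n\n".join(parts)
+
+ print(build_prompt("What is LoRA?"))
+ ```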
+
+ ## Training Details
+ This model took `2:40:54` to train with LoRA on a single `A100 40GB` GPU (a config sketch follows the list below).<br>
+ - *epochs*: `1`
+ - *train batch size*: `8`
+ - *eval batch size*: `8`
+ - *gradient accumulation steps*: `1`
+ - *maximum gradient norm*: `0.3`
+ - *learning rate*: `2e-4`
+ - *weight decay*: `0.001`
+ - *optimizer*: `paged_adamw_32bit`
+ - *learning rate schedule*: `cosine`
+ - *warmup ratio (linear)*: `0.03`
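+
+ The hyperparameters above map directly onto `transformers.TrainingArguments`. A minimal config sketch, assuming `peft` is used for the LoRA adapter; the LoRA rank/alpha/dropout are not stated in this card and are placeholders:
+ ```python
+ # Config sketch only; LoRA r/alpha/dropout below are placeholders, not this card's values.
+ from peft import LoraConfig
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="out",
+     num_train_epochs=1,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     gradient_accumulation_steps=1,
+     max_grad_norm=0.3,
+     learning_rate=2e-4,
+     weight_decay=0.001,
+     optim="paged_adamw_32bit",
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.03,
+ )
+ peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")  # placeholders
+ # These would be passed, together with the dataset, to a trainer such as trl's SFTTrainer.
+ ```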