Sharathhebbar24 committed on
Commit a6e6b98
1 Parent(s): 8f6336e

Update README.md

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -5,4 +5,24 @@ language:
  pipeline_tag: text-generation
  ---
 
- Sharathhebbar24/ssh_1.8B is a 1.8B model
+ Sharathhebbar24/ssh_1.8B is a 1.8B model
+
+ The model is a modified version of [qnguyen3/quan-1.8b-chat](https://huggingface.co/qnguyen3/quan-1.8b-chat)
+
+ ## Training hyperparameters
+ The following hyperparameters were used during training:
+
+ learning_rate: 2e-05
+ train_batch_size: 2
+ eval_batch_size: 2
+ seed: 42
+ distributed_type: multi-GPU
+ num_devices: 4
+ gradient_accumulation_steps: 4
+ total_train_batch_size: 32
+ total_eval_batch_size: 8
+ optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ lr_scheduler_type: cosine
+ lr_scheduler_warmup_steps: 100
+ num_epochs: 4
+
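
Reviewer note: the derived totals in the added hyperparameter list can be checked from the per-device values, and the warmup plus cosine settings imply a learning-rate curve. The Python sketch below is not taken from the commit. It assumes the usual convention that the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and it assumes a linear warmup followed by a half-cosine decay to zero. The total step count is not given in the README, so `HYPOTHETICAL_TOTAL_STEPS` is a made-up value used only for illustration.

```python
import math

# Values copied from the hyperparameter list in the diff above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 2              # per device
EVAL_BATCH_SIZE = 2               # per device
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 100


def total_train_batch_size() -> int:
    """Samples contributing to one optimizer step across all devices."""
    return TRAIN_BATCH_SIZE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS


def total_eval_batch_size() -> int:
    """Samples per evaluation step; no gradient accumulation at eval time."""
    return EVAL_BATCH_SIZE * NUM_DEVICES


def lr_at(step: int, total_steps: int) -> float:
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical: the README does not state the number of training steps.
    HYPOTHETICAL_TOTAL_STEPS = 1000
    print("total_train_batch_size:", total_train_batch_size())
    print("total_eval_batch_size:", total_eval_batch_size())
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step, HYPOTHETICAL_TOTAL_STEPS))
```

Under these assumptions the computed totals agree with the `total_train_batch_size: 32` and `total_eval_batch_size: 8` entries in the README, and the learning rate peaks at `2e-05` once the 100 warmup steps finish.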