---
license: bigscience-openrail-m
datasets:
- manojpreveen/Instruction_Tuning
---
Instruction-tuned GPT-NeoXT-20B model, trained on the Instruction Tuning dataset listed below (~560k examples) using ***Colossal AI***.

**Base Model:** togethercomputer/GPT-NeoXT-Chat-Base-20B (GPT-NeoXT-Chat-Base-20B-v0.16, fine-tuned on feedback data)
**Training Details:**
* Epochs: 5
* Batch Size: 16 instantaneous per device x 1 gradient accumulation step x 8 GPUs = 128 effective
* Max Length: 1024
* Weight Decay: 0
* Learning Rate: 2e-5
* Learning Rate Scheduler Type: Cosine
* Number of warmup steps: 240
* Machine: 8xA100 80GB

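The effective batch size and learning-rate schedule above can be sketched as follows. This is a minimal illustration, not the training code itself: `TOTAL_STEPS` is an assumption (the real value depends on epochs and dataset size), and the actual cosine-with-warmup implementation used in training may differ in detail.

```python
import math

# Values from the training details above
PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 1
NUM_GPUS = 8
EFFECTIVE_BATCH = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS  # 128

PEAK_LR = 2e-5      # Learning Rate
WARMUP_STEPS = 240  # Number of warmup steps
TOTAL_STEPS = 5000  # assumption: depends on epochs and dataset size

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))
```

The learning rate ramps linearly from 0 to 2e-5 over the first 240 steps, then follows a half-cosine down toward 0 by the final step.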
**Dataset Details:**

Dataset: manojpreveen/Instruction_Tuning

Files:
* stanford_alpaca_it_v2.csv
* ColossalChat.csv
* unified_chip2.csv
* iamai_summarization_v1.csv
* iamai_v1.csv
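The files above together make up the ~560k examples. A minimal sketch of reading and pooling them with Python's `csv` module; the assumption that every file shares a compatible header (e.g. instruction/response columns) should be checked against the actual schemas in manojpreveen/Instruction_Tuning:

```python
import csv

# The CSV files listed above, as published in the dataset repo
FILES = [
    "stanford_alpaca_it_v2.csv",
    "ColossalChat.csv",
    "unified_chip2.csv",
    "iamai_summarization_v1.csv",
    "iamai_v1.csv",
]

def load_corpus(paths):
    """Read each CSV and return all rows pooled into one list of dicts."""
    rows = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            # DictReader keys rows by the file's header line
            rows.extend(csv.DictReader(f))
    return rows
```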