manojpreveen committed 9209279 (1 parent: 95dbf9f) · Create README.md

Files changed (1): README.md (+24)
---
license: bigscience-openrail-m
datasets:
- manojpreveen/Instruction_Tuning
- manojpreveen/Conversational_Data
---
Instruction-tuned GPT-NeoXT-20B model, trained on the instruction-tuning datasets listed in the metadata above (~5.2M examples) using ***Colossal AI***.

**Base Model:** togethercomputer/GPT-NeoXT-Chat-Base-20B (GPT-NeoXT-Chat-Base-20B-v0.16, fine-tuned on feedback data)
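
For completeness, here is a minimal inference sketch using the standard `transformers` API. The repo id is a placeholder (the card does not state it), and the generation settings are illustrative, not the author's.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "manojpreveen/GPT-NeoXT-20B-instruct"  # hypothetical repo id -- substitute the real one
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 20B parameters; half precision reduces memory
    device_map="auto",          # requires `accelerate`; shards across available GPUs
)

prompt = "Explain instruction tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```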

**Training Details:**
* Epochs: 2
* Batch Size: 5 per device (instantaneous) × 1 gradient accumulation step × 8 GPUs = 40 effective
* Block Size: 2020
* Weight Decay: 0
* Learning Rate: 1e-6
* Learning Rate Scheduler Type: Cosine (see the scheduler sketch after this list)
* Number of warmup steps: 600
* Machine: 8×A100 80GB
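
The card states training used Colossal AI; as a rough equivalent, this sketch sets up the same schedule with `transformers` utilities. The step count is derived from the ~5.2M examples and the effective batch of 40 and is only an estimate (packing changes the true count); the `torch.nn.Linear` stand-in just keeps the snippet runnable.

```python
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)      # stand-in; the actual run optimizes the 20B model
num_examples = 5_200_000           # ~5.2M data points, per the card
effective_batch = 5 * 1 * 8        # per-device batch x grad accumulation x GPUs = 40
num_training_steps = 2 * (num_examples // effective_batch)  # 2 epochs

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6, weight_decay=0.0)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=600,
    num_training_steps=num_training_steps,
)
```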

**Training Data Specifics:**
* Labels and input ids are exactly the same (a standard causal language modeling setup).
* Block size is 2020; multiple instructions are packed together into each training example (see the sketch after this list).
* "###" is used as the EOS token in the data.
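
A minimal sketch of the packing scheme described above, assuming a Hugging Face tokenizer; the function name and sample instructions are illustrative.

```python
from transformers import AutoTokenizer

BLOCK_SIZE = 2020  # block size used during training, per the card
EOS = "###"        # EOS marker used in the data, per the card

tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")

def pack_instructions(instructions):
    """Join instructions with the EOS marker, tokenize the stream,
    and split it into fixed-size blocks of BLOCK_SIZE tokens."""
    text = EOS.join(instructions) + EOS
    ids = tokenizer(text)["input_ids"]
    blocks = [ids[i:i + BLOCK_SIZE] for i in range(0, len(ids), BLOCK_SIZE)]
    # Causal LM objective: labels are an exact copy of input_ids.
    return [{"input_ids": b, "labels": list(b)} for b in blocks]

examples = pack_instructions(["Translate 'bonjour' to English.", "Answer: hello"])
```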