aaabiao Chasell committed on
Commit e013265
1 Parent(s): 6e20f7e

Create README.md (#1)

- Create README.md (3fdf3fca1400c2767dd14fd7f0831d6a848f9a54)


Co-authored-by: YanGPT <Chasell@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ # OpenLLaMA 7Bv2 Model Card
+
+ ## Model Description
+
+ OpenLLaMA 7Bv2 is a language model trained to produce high-quality, contextually relevant text predictions. It is trained on a composite dataset that includes web-crawled data, scholarly articles, books, and question-answer pairs, chosen for broad domain coverage.
+
+ ## Training Data
+
+ The model was trained on a composite dataset that includes:
+
+ - Falcon RefinedWeb dataset
+ - StarCoder datasets
+ - Wikipedia, for encyclopedic knowledge
+ - Academic papers from arXiv, for scientific understanding
+ - A large collection of books spanning multiple genres
+ - Stack Exchange data curated by RedPajama
+
+ ## Training Procedure
+
+ - **Learning Rate:** A maximum learning rate of 3e-4 and a minimum learning rate of 3e-5.
+ - **Batch Size:** A batch size of 4 million tokens.
+ - **Learning Rate Scheduler:** The learning rate schedule closely follows the strategy used in Llama 2, decaying gradually from the maximum to the minimum learning rate.
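
The training procedure above gives only the learning rate bounds and says the schedule follows Llama 2. As a rough illustration, the sketch below assumes that means a linear warmup followed by cosine decay to the minimum rate, which is how the Llama 2 paper describes its schedule. The README does not give the warmup length or the total step count, so `WARMUP_STEPS` and `TOTAL_STEPS` are hypothetical placeholders, and the function is not the authors' training code.

```python
import math

MAX_LR = 3e-4          # maximum learning rate, from the model card
MIN_LR = 3e-5          # minimum learning rate, from the model card
WARMUP_STEPS = 2_000   # assumption: not stated in the model card
TOTAL_STEPS = 250_000  # assumption: at 4M tokens per batch this is about 1T tokens


def learning_rate(step, max_lr=MAX_LR, min_lr=MIN_LR,
                  warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Return the learning rate for a zero-based optimizer step."""
    if step < warmup_steps:
        # Linear warmup from near zero up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # Hold the floor once the schedule is exhausted.
        return min_lr
    # Cosine decay from max_lr (progress = 0) to min_lr (progress = 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {s:>7}: lr = {learning_rate(s):.3e}")
```

With these bounds the floor is 10% of the peak, so the rate never decays to zero; changing the two placeholder step counts stretches or compresses the curve without changing its endpoints.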