---
license: apache-2.0
---

#### Model Name: Granite-7b-base

#### License: Apache-2.0

#### Languages: Primarily English

#### Architecture: The model architecture is a replica of Meta’s Llama-2-7B base variant with multi-head attention (MHA), trained with a batch size of 1M tokens on 2T tokens.

#### Context Length: 4k tokens

#### Tokenizer: Llama2

#### Model Developers: IBM Research
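
The architecture line above identifies the model only by reference to Llama-2-7B, so a configuration sketch may be useful. The sketch below assumes the standard, publicly documented Llama-2-7B hyperparameters (32 layers, 32 attention heads, hidden size 4096, 32,000-token vocabulary); beyond the 4k context length and MHA stated above, these values are assumptions, not confirmed by this card.

```python
# Minimal sketch of what the stated architecture implies, assuming the
# public Llama-2-7B hyperparameters; values are assumptions drawn from
# the Llama-2-7B reference config, not from this model card.
from transformers import LlamaConfig

config = LlamaConfig(
    vocab_size=32000,             # Llama2 tokenizer vocabulary (assumed)
    hidden_size=4096,
    intermediate_size=11008,
    num_hidden_layers=32,
    num_attention_heads=32,
    num_key_value_heads=32,       # MHA: one KV head per query head
    max_position_embeddings=4096, # 4k-token context length
)
print(config)
```

Setting `num_key_value_heads` equal to `num_attention_heads` is what "MHA" amounts to in this config style; grouped-query variants would use fewer KV heads.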

Representing IBM’s commitment to open-source innovation, IBM has released granite-7b-base, a base pre-trained LLM from IBM’s Granite model series, under an Apache-2.0 license for community and commercial use. Granite-7b-base was pre-trained from scratch on IBM-curated data as an open reference implementation of Meta’s Llama-2-7B. In a commitment to data transparency and to fostering open innovation, the data sources, sampling proportions, and URLs for access are provided below.
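
Since the model follows the standard Llama architecture and tokenizer, loading it should follow the usual `transformers` pattern. The snippet below is a minimal sketch; the repository id `ibm-granite/granite-7b-base` is an assumption based on the model name and should be replaced with the actual Hugging Face repo path if it differs.

```python
# Minimal generation sketch using the standard transformers API.
# The repo id below is assumed from the model name, not confirmed here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-7b-base"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Granite model series is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```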

#### Pre-Training Data

The model was trained on 2T tokens, with sampling proportions designed to match the sampling distributions released in the Llama1 paper as closely as possible.
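
To make the sampling-proportion statement concrete, the sketch below converts per-source proportions into approximate token budgets for a 2T-token run. The proportions shown are the ones published in the Llama1 paper, used here purely as the stated matching target; Granite’s actual sources and proportions are the ones provided in this card.

```python
# Illustrative arithmetic only: turns sampling proportions into token
# budgets for a 2T-token run. The proportions are from the Llama1 paper
# (the stated matching target), not Granite's own data table.
TOTAL_TOKENS = 2_000_000_000_000  # 2T tokens

llama1_proportions = {  # percent of training tokens per source
    "CommonCrawl": 67.0,
    "C4": 15.0,
    "GitHub": 4.5,
    "Wikipedia": 4.5,
    "Books": 4.5,
    "ArXiv": 2.5,
    "StackExchange": 2.0,
}

for source, pct in llama1_proportions.items():
    tokens = TOTAL_TOKENS * pct / 100
    print(f"{source:>13}: {tokens / 1e9:8.1f}B tokens")
```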