Commit 7b526cd by bhatta1 · verified · 1 Parent(s): 0309dda

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -1,3 +1,11 @@
+ ---
+ viewer: false
+ license:
+ - apache-2.0
+ language:
+ - en
+ ---
+
  **Model Summary**

  Recently, IBM has introduced GneissWeb; a large dataset yielding around 10 trillion tokens that caters to the data quality and quantity requirements of training LLMs. The models trained using GneissWeb dataset outperform those trained on FineWeb 1.1.0 by 2.14 percentage points in terms of average score computed on a set of 11 commonly used benchmarks.
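
The lines added by this commit form a YAML front-matter block, delimited by `---` lines at the top of README.md. As a minimal sketch, and not the Hub's actual parser, the block can be separated from the Markdown body using only the Python standard library. The function name `split_front_matter` is hypothetical, and the sample text reuses the metadata from the diff with a shortened body.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Return (front_matter, body); front_matter is "" if none is present."""
    lines = text.splitlines()
    # Front matter must begin on the very first line with a bare '---'.
    if not lines or lines[0].strip() != "---":
        return "", text
    # Look for the closing '---' delimiter.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return front, body
    # No closing delimiter found: treat the whole text as body.
    return "", text


# Sample README text: metadata copied from the diff, body shortened.
readme = """---
viewer: false
license:
- apache-2.0
language:
- en
---

**Model Summary**
"""

front, body = split_front_matter(readme)
print(front)
```

The returned `front` string is still raw YAML. Turning it into a dictionary would need a YAML parser, which this sketch leaves out to stay dependency-free.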