princeton-nlp committed
Commit
3bc5cf3
1 Parent(s): 83be984

Update README.md

Files changed (1):
  1. README.md (+16, -14)
README.md CHANGED
@@ -18,26 +18,28 @@ model = AutoModelForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")
 
 We evaluate on an extensive set of downstream tasks including reasoning, reading comprehension, language modeling and knowledge intensive tasks. Our Sheared-LLaMA models outperform existing large language models.
 
-| Model | # Pre-training Tokens | Average Performance |
-| --- | --- | --- |
-| LLaMA2-7B | 2T | 64.6 |
+| Model | # Pre-training Tokens | Average Performance |
+| ------------------- | --------------------- | ------------------- |
+| LLaMA2-7B | 2T | 64.6 |
 
 **1.3B**
 
-| OPT-1.3B | 300B | 48.2 |
-| --- | --- | --- |
-| Pythia-1.4B | 300B | 48.9 |
-| Sheared-LLaMA-1.3B | 50B | 51.0 |
+| Model | # Pre-training Tokens | Average Performance |
+| ------------------- | --------------------- | ------------------- |
+| OPT-1.3B | 300B | 48.2 |
+| Pythia-1.4B | 300B | 48.9 |
+| **Sheared-LLaMA-1.3B** | **50B** | **51.0** |
 
 **3B**
 
-| OPT-2.7B | 300B | 51.4 |
-| --- | --- | --- |
-| Pythia-2.8B | 300B | 52.5 |
-| INCITE-Base-3B | 800B | 54.7 |
-| Open-LLaMA-3B-v1 | 1T | 55.1 |
-| Open-LLaMA-3B-v2 | 1T | 55.7 |
-| Sheared-LLaMA-2.7B | 50B | 56.7 |
+| Model | # Pre-training Tokens | Average Performance |
+| ------------------- | --------------------- | ------------------- |
+| OPT-2.7B | 300B | 51.4 |
+| Pythia-2.8B | 300B | 52.5 |
+| INCITE-Base-3B | 800B | 54.7 |
+| Open-LLaMA-3B-v1 | 1T | 55.1 |
+| Open-LLaMA-3B-v2 | 1T | 55.7 |
+| Sheared-LLaMA-2.7B | 50B | 56.7 |
 
 ## Bibtex
 ```
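The hunk context above shows the README's model-loading line. For reference, a minimal sketch of loading and sampling from this checkpoint with Hugging Face transformers; the tokenizer call, prompt, and generation settings are illustrative assumptions, not part of this commit:

```python
# Minimal usage sketch based on the from_pretrained call shown in the
# hunk context above. The prompt and generation settings are
# illustrative assumptions, not part of this commit.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")
model = AutoModelForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")

# Generate a short continuation to sanity-check the checkpoint.
inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```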