Commit e3b7323 by ZenQin
1 Parent(s): 05f111b

Update README.md

Files changed (1): README.md (+3, -4)
README.md CHANGED
@@ -2,15 +2,14 @@
  license: apache-2.0
  ---

- # JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
-
-
  <div align="center">
  <div>&nbsp;</div>
  <img src="https://cdn-uploads.huggingface.co/production/uploads/641de0213239b631552713e4/ieHnwuczidNNoGRA_FN2y.png" width="500"/>
  <img src="https://cdn-uploads.huggingface.co/production/uploads/641de0213239b631552713e4/UOsk9_zcbHpCCy6kmryYM.png" width="530"/>
  </div>

+ # JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
+
  ## Key Messages

  1. JetMoE-8B is **trained for less than $0.1 million**<sup>1</sup> **but outperforms LLaMA2-7B from Meta AI**, which has multi-billion-dollar training resources. LLM training can be **much cheaper than people generally thought**.
@@ -63,7 +62,7 @@ We use the same evaluation methodology as in the Open LLM leaderboard. For MBPP
  To our surprise, despite the lower training cost and computation, JetMoE-8B performs even better than LLaMA2-7B, LLaMA-13B, and DeepseekMoE-16B. Compared to a model with similar training and inference computation, like Gemma-2B, JetMoE-8B achieves better performance.

  ## Model Usage
- To load the models, you need to install this package:
+ To load the models, you need to install [this package](https://github.com/myshell-ai/JetMoE):
  ```
  pip install -e .
  ```
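
For reference, `pip install -e .` assumes you have cloned the linked repository and run the command from its root. Once the package is installed, the models are typically loaded through the Hugging Face `transformers` API. The snippet below is a minimal sketch, not part of this commit; the repo id `jetmoe/jetmoe-8b` and the `trust_remote_code=True` loading path are assumptions rather than details taken from this diff.

```python
# Minimal sketch (assumptions: repo id "jetmoe/jetmoe-8b" and a standard
# transformers loading path; check the linked package for the exact API).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jetmoe/jetmoe-8b")
model = AutoModelForCausalLM.from_pretrained(
    "jetmoe/jetmoe-8b",
    trust_remote_code=True,  # the MoE architecture may ship custom model code
)

# Generate a short continuation as a smoke test.
inputs = tokenizer("JetMoE is a mixture-of-experts model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```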