bigmoyan committed on
Commit
cc3990f
1 Parent(s): 57951e0

Update README.md

Files changed (1)
  1. README.md +16 -9
README.md CHANGED
@@ -8,13 +8,13 @@ tags:
8
  ---
9
  ## Model Card for lyraBELLE
10
 
11
- lyraBELLE is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.
12
 
13
- The inference speed of lyraBELLE has achieved **100x** acceleration upon the original version.
14
 
15
  Among its main features are:
16
 
17
- - weights: the original BELLE-7B-2M weights released by BelleGroup.
18
  - device: Nvidia Ampere architecture or newer (e.g. A100)
19
 
20
  Note that:
@@ -23,20 +23,27 @@ Note that:
23
  - **int8 mode**: not supported yet, please always set it to 0
24
  - **data type**: only `fp16` available.
25
 
 
 
26
  ## Speed
27
 
28
  ### test environment
29
 
 
 
30
  - device: Nvidia A100 40G
31
  - warmup: 10 rounds
32
  - precision: fp16
33
- - batch size for our version: 64 (maximum under A100 40G)
34
- - batch size for original: xx (maximum under A100 40G)
 
 
35
 
36
  |version|batch size|speed|
37
- |:-:|:-:|
38
- |original|xxx|
39
- |lyraBELLE|80|3030.36 tokens/sec|
 
40
 
41
 
42
 
@@ -81,7 +88,7 @@ print(output_texts)
81
 
82
  ## Citation
83
  ``` bibtex
84
- @Misc{lyraBELLE2023,
85
  author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
86
  title = {lyraBELLE: Accelerating BELLE by 100x+},
87
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
 
8
  ---
9
  ## Model Card for lyraBELLE
10
 
11
+ lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.
12
 
13
+ The inference speed of lyraBelle has achieved a **100x** speedup over the early original version. We are still working to improve performance further.
14
 
15
  Among its main features are:
16
 
17
+ - weights: original BELLE-7B-2M weights released by BelleGroup.
18
  - device: Nvidia Ampere architecture or newer (e.g. A100)
19
 
20
  Note that:
 
23
  - **int8 mode**: not supported yet, please always set it to 0
24
  - **data type**: only `fp16` available.
25
 
26
+
27
+
28
  ## Speed
29
 
30
  ### test environment
31
 
32
+ **It takes a few minutes to load the model, and much longer with the original version. Please be patient.**
33
+
34
  - device: Nvidia A100 40G
35
  - warmup: 10 rounds
36
  - precision: fp16
37
+ - batch size for our version: 96 (close to the maximum on an A100 40G)
38
+ - batch size for the original: xx (close to the maximum on an A100 40G)
39
+ - language: Chinese, kept the same within a batch.
40
+
41
 
42
  |version|batch size|speed|
43
+ |:-:|:-:|:-:|
44
+ |original|50|xx|
45
+ |lyraBELLE|50|xx|
46
+ |lyraBELLE|96|3507.00 tokens/sec|
47
 
48
 
49
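The test environment above lists warmup rounds, a fixed batch size, and a speed figure, but the commit does not show the benchmark script itself. As a rough illustration only, a throughput measurement along those lines might look like the sketch below. The names `measure_throughput`, `generate`, and `fake_generate` are hypothetical and are not part of lyraBELLE; `generate` is assumed to take a batch of prompts and return one list of token ids per prompt.

```python
import time
from typing import Callable, List


def measure_throughput(
    generate: Callable[[List[str]], List[List[int]]],
    prompts: List[str],
    warmup_rounds: int = 10,
    timed_rounds: int = 5,
) -> float:
    """Return generated tokens per second over the timed rounds.

    `generate` is a hypothetical callable standing in for the model's
    batched generation function.
    """
    # Warmup rounds run the same batch but are excluded from timing.
    for _ in range(warmup_rounds):
        generate(prompts)

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(timed_rounds):
        outputs = generate(prompts)
        # Count every token produced for every prompt in the batch.
        total_tokens += sum(len(tokens) for tokens in outputs)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def fake_generate(prompts: List[str]) -> List[List[int]]:
    # Stand-in for a real model: sleeps briefly, then returns
    # 8 dummy token ids per prompt.
    time.sleep(0.01)
    return [[0] * 8 for _ in prompts]


if __name__ == "__main__":
    batch = ["prompt"] * 4
    tps = measure_throughput(fake_generate, batch, warmup_rounds=2, timed_rounds=3)
    print(f"{tps:.2f} tokens/sec")
```

When `generate` launches GPU work asynchronously, the device would need to be synchronized before each clock reading, otherwise the measured time can be too short.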
 
 
88
 
89
  ## Citation
90
  ``` bibtex
91
+ @Misc{lyraBelle2023,
92
  author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
93
  title = {lyraBELLE: Accelerating BELLE by 100x+},
94
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},