---

## Model Card for lyraBELLE

lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.

The inference speed of lyraBelle has achieved **100x** acceleration over the early original version. We are still working hard to further improve its performance.

Among its main features are:

- weights: original BELLE-7B-2M weights released by BelleGroup.
- device: Nvidia Ampere architecture or newer (e.g. A100)

Note that:

- **int8 mode**: not supported yet, please always set it to 0
- **data type**: only `fp16` available.

## Speed

### Test environment

**It takes a few minutes to load the model, and much longer with the original version. Just be patient.**

- device: Nvidia A100 40G
- warmup: 10 rounds
- precision: fp16
- batch size for our version: 96 (almost the maximum under A100 40G)
- batch size for the original: xx (almost the maximum under A100 40G)
- language: Chinese, kept the same within a batch.

|version|batch size|speed|
|:-:|:-:|:-:|
|original|50|xx|
|lyraBELLE|50|xx|
|lyraBELLE|96|3507.00/sec|

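The warmup-then-measure methodology above can be sketched as a small timing loop: run the warmup rounds without timing, then average throughput over the timed rounds. This is only an illustration; the `generate` callable and the token counting below are hypothetical stand-ins, not the lyraBELLE API.

```python
import time

def benchmark(generate, batch, warmup_rounds=10, timed_rounds=10):
    """Run warmup rounds (excluded from timing), then return average
    throughput in tokens per second over the timed rounds."""
    for _ in range(warmup_rounds):  # warmup: stabilize caches and clocks
        generate(batch)
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(timed_rounds):
        outputs = generate(batch)
        total_tokens += sum(len(out) for out in outputs)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Hypothetical stand-in for a model's generate() call: returns a
# fixed-length "generation" per prompt so the sketch is runnable.
def fake_generate(batch):
    return [[0] * 32 for _ in batch]

speed = benchmark(fake_generate, batch=["你好"] * 96)
```

Real measurements would replace `fake_generate` with the model's batched generation call and count the tokens it actually emits.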
## Citation
``` bibtex
@Misc{lyraBelle2023,
  author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
  title = {lyraBELLE: Accelerating BELLE by 100x+},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},