bigmoyan committed on
Commit
194cf5c
1 Parent(s): bdd576c

Update README.md

  1. README.md +15 -9
README.md CHANGED
@@ -4,12 +4,11 @@ language:
  - en
  tags:
  - LLM
- - tensorRT
- - Belle
  ---
- ## Model Card for lyraBelle
 
- lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of Belle**.
 
  The inference speed of lyraBelle has achieved **10x** acceleration over the early original version. We are still working hard to further improve the performance.
 
@@ -17,7 +16,6 @@ Among its main features are:
 
  - weights: original BELLE-7B-2M weights released by BelleGroup.
  - device: Nvidia Ampere architecture or newer (e.g., A100)
- - batch_size: compiled with dynamic batch size, max batch_size = 8
 
  Note that:
  **Some interfaces/code were set for future use (see demo below).**
@@ -30,7 +28,15 @@ Note that:
  ### test environment
 
  - device: Nvidia A100 40G
- - batch size: 8
 
 
 
@@ -77,12 +83,12 @@ print(output_texts)
  ``` bibtex
  @Misc{lyraBelle2023,
  author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
- title = {lyraChatGLM: Accelerating Belle by 10x+},
- howpublished = {\url{https://huggingface.co/TMElyralab/lyraBelle}},
  year = {2023}
  }
  ```
 
  ## Report bug
- - start a discussion to report any bugs!--> https://huggingface.co/TMElyralab/lyraBelle/discussions
  - report bug with a `[bug]` mark in the title.
 
  - en
  tags:
  - LLM
+ - BELLE
  ---
+ ## Model Card for lyraBELLE
 
+ lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.
 
  The inference speed of lyraBelle has achieved **10x** acceleration over the early original version. We are still working hard to further improve the performance.
 
 
 
  - weights: original BELLE-7B-2M weights released by BelleGroup.
  - device: Nvidia Ampere architecture or newer (e.g., A100)
 
  Note that:
  **Some interfaces/code were set for future use (see demo below).**
 
  ### test environment
 
  - device: Nvidia A100 40G
+ - warmup: 10 rounds
+ - precision: fp16
+ - batch size for our version: 64 (maximum on A100 40G)
+ - batch size for original: xx (maximum on A100 40G)
+
+ |version|batch size|speed|
+ |:-:|:-:|:-:|
+ |original|xxx|
+ |lyraBELLE|80|3030.36/sec|
 
 
 
 
  ``` bibtex
  @Misc{lyraBelle2023,
  author = {Kangjian Wu and Zhengtao Wang and Bin Wu},
+ title = {lyraBELLE: Accelerating BELLE by 10x+},
+ howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
  year = {2023}
  }
  ```
 
  ## Report bug
+ - start a discussion to report any bugs!--> https://huggingface.co/TMElyralab/lyraBELLE/discussions
  - report bug with a `[bug]` mark in the title.