---
license: creativeml-openrail-m
language:
- en
tags:
- LLM
- tensorRT
- Belle
---
## Model Card for lyraBELLE

lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of Belle**.

The inference speed of lyraBelle achieves a **10x** acceleration over the early original version. We are still working hard to further improve the performance.

Among its main features are:

- weights: the original BELLE-7B-2M weights released by BelleGroup.
- device: Nvidia Ampere architecture or newer (e.g., A100).
- batch_size: compiled with dynamic batch size; max batch_size = 8.
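The Ampere requirement can be checked before loading the model. A minimal sketch; the helper name is illustrative and not part of lyraBelle:

```python
# Illustrative pre-flight check: the compiled engine targets Nvidia
# Ampere (compute capability 8.0, e.g. A100) or newer.

def meets_min_capability(capability, minimum=(8, 0)):
    """Return True if a (major, minor) compute capability is new enough."""
    return tuple(capability) >= tuple(minimum)

# With PyTorch installed, the running GPU's capability can be obtained via
#   torch.cuda.get_device_capability(0)
# and passed straight in:
print(meets_min_capability((8, 0)))  # A100 (Ampere): supported
print(meets_min_capability((7, 5)))  # T4 (Turing): too old
```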

## Speed

### Test environment

- device: Nvidia A100 40G
- batch size: 8
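Throughput numbers like these can be reproduced with a simple wall-clock harness. A minimal sketch; the `generate_fn` stub stands in for a real `model.generate` call and is purely illustrative:

```python
import time

def tokens_per_second(generate_fn, prompt, n_tokens, n_runs=3):
    """Best-of-n wall-clock throughput for a text-generation callable."""
    best = float("inf")
    for _ in range(n_runs):
        start = time.perf_counter()
        generate_fn(prompt)  # placeholder for model.generate(prompt, ...)
        best = min(best, time.perf_counter() - start)
    return n_tokens / best

# Stub generator so the harness runs standalone; swap in the real model.
rate = tokens_per_second(lambda p: time.sleep(0.01), "hello", n_tokens=512)
print(f"{rate:.0f} tokens/s")
```

Taking the best of several runs reduces noise from warm-up and scheduling jitter.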




## Model Sources

- **Repository:** <https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true>



## Uses

```python
from lyraBelle import LyraBelle

# Inference settings
data_type = "fp16"
# Demo prompt (see "Demo output" below): "It's about 25°C today, with light
# rain and some wind. I want to take a walk outside; what combination of
# clothes, pants, and shoes should I wear?"
prompts = "今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。"
model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512

# Load the compiled model (the final 0 follows the original example).
model = LyraBelle(model_dir, model_name, data_type, 0)
output_texts = model.generate(
    prompts,
    output_length=max_output_length,
    top_k=30,
    top_p=0.85,
    temperature=0.35,
    repetition_penalty=1.2,
    do_sample=True,
)
print(output_texts)
```
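The sampling arguments (`top_k`, `top_p`, `temperature`) jointly decide which tokens are eligible at each decoding step. Below is a self-contained sketch of the standard top-k/top-p (nucleus) sampling procedure they refer to; it illustrates the general technique, not lyraBelle's internals:

```python
import math
import random

def sample_next_token(logits, top_k=30, top_p=0.85, temperature=0.35):
    """Illustrative top-k/top-p (nucleus) sampling over a list of logits."""
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only the top_k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Within those, keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize over the kept tokens and sample one.
    norm = sum(probs[i] for i in kept)
    r = random.random() * norm
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

A low temperature such as 0.35 sharpens the distribution, so with `top_p=0.85` only a handful of high-probability tokens usually survive, which keeps answers focused.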
## Demo output

### input
今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。
(It's about 25°C today, with light rain and some wind. I want to take a walk outside; what combination of clothes, pants, and shoes should I wear?)

### output
建议穿着一件轻便的衬衫或T恤、一条牛仔裤和一双运动鞋或休闲鞋。如果下雨了可以带上一把伞。
(I suggest wearing a light shirt or T-shirt, a pair of jeans, and sneakers or casual shoes. If it rains, you can bring an umbrella.)

### TODO

We plan to implement a FasterTransformer version and publish a much faster release. Stay tuned!

## Citation
``` bibtex
@Misc{lyraBelle2023,
  author =       {Kangjian Wu and Zhengtao Wang and Bin Wu},
  title =        {lyraBelle: Accelerating Belle by 10x+},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBelle}},
  year =         {2023}
}
```

## Report bugs
- Start a discussion at https://huggingface.co/TMElyralab/lyraBelle/discussions to report any bugs.
- Include a `[bug]` mark in the title.