---
license: apache-2.0
language:
- en
tags:
- LLM
- BELLE
---

## Model Card for lyraBELLE

lyraBELLE is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of BELLE**.

In our benchmark, lyraBELLE generates tokens about **3.3x** faster than the original version.

Among its main features are:

- weights: the original BELLE-7B-2M weights released by BelleGroup.
- device: Nvidia Ampere architecture or newer (e.g., A100)

Note that: **some interfaces and code are reserved for future use (see the demo below).**

- **int8 mode**: not supported yet; always set it to 0.
- **data type**: only `fp16` is available.

## Speed

### test environment

- device: Nvidia A100 40G
- warmup: 10 rounds
- precision: fp16
- batch size: 64
- language: Chinese, kept the same within a batch.
- do_sample: True, so the model generates slightly different answers to the same question.

|version|speed|
|:-:|:-:|
|original|826.34 tokens/sec|
|lyraBELLE|2701.71 tokens/sec|

## Model Sources

- **Repository:** https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true

## Environment

- **docker image available** at https://hub.docker.com/repository/docker/bigmoyan/lyrallm/general. Pull the image with:

```
docker pull bigmoyan/lyrallm:v0.1
```

## Uses

```python
from lyraBelle import LyraBelle

data_type = "fp16"  # only fp16 is supported for now

# Prompt, in English: "It's about 25 degrees today with light rain and wind.
# I want to take a walk outdoors; what clothes, trousers, and shoes should I wear?"
prompts = "今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。"

model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512

# The last argument is the int8 mode flag; int8 is not supported yet, so pass 0.
model = LyraBelle(model_dir, model_name, data_type, 0)

output_texts = model.generate(
    prompts,
    output_length=max_output_length,
    top_k=30,
    top_p=0.85,
    temperature=0.35,
    repetition_penalty=1.2,
    do_sample=True,
)
print(output_texts)
```

## Demo output

### input

今天天气大概 25度,有点小雨,吹着风,我想去户外散步,应该穿什么样的衣服裤子鞋子搭配。

(English: It's about 25 degrees today with light rain and wind. I want to take a walk outdoors; what clothes, trousers, and shoes should I wear?)

### output

建议穿着一件轻便的衬衫或T恤、一条牛仔裤和一双运动鞋或休闲鞋。如果下雨了可以带上一把伞。

(English: I suggest wearing a light shirt or T-shirt, a pair of jeans, and sneakers or casual shoes. If it rains, you can bring an umbrella.)

## Citation

```bibtex
@Misc{lyraBELLE2023,
  author =       {Kangjian Wu and Zhengtao Wang and Bin Wu},
  title =        {lyraBELLE: Accelerating BELLE by 3x+},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraBELLE}},
  year =         {2023}
}
```

## Report bug

- Start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraBELLE/discussions
- Mark bug reports with `[bug]` in the title.
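
As a quick cross-check of the figures in the Speed section, the speedup can be recomputed from the two reported throughputs. The sketch below is plain arithmetic, not part of the lyraBELLE API; the variable names and the `token_budget` value are hypothetical and chosen only for illustration.

```python
# Throughput figures copied from the Speed table
# (tokens/sec on an A100 40G, fp16, batch size 64).
original_tps = 826.34
lyra_tps = 2701.71

# Speedup is the ratio of the two throughputs.
speedup = lyra_tps / original_tps

# Wall-clock time to generate a fixed number of tokens at each throughput.
# token_budget is a hypothetical value, not a number from the benchmark.
token_budget = 10_000
original_seconds = token_budget / original_tps
lyra_seconds = token_budget / lyra_tps

print(f"speedup: {speedup:.2f}x")
print(f"{token_budget} tokens: {original_seconds:.1f}s -> {lyra_seconds:.1f}s")
```

With the reported numbers the ratio comes out at about 3.27x. Throughput on other hardware, batch sizes, or prompt lengths may differ, so these values describe only the listed test environment.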