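This model is based on llama-13b and was instruction-tuned on the UltraChat dataset, which contains roughly 1.4 million multi-turn conversations. Training requires only a single GPU.
firefly-llama-13b was evaluated objectively on the 🤗 Hugging Face Open LLM Leaderboard.
On the leaderboard, firefly-llama-13b performs well: its average score is 0.2 points higher than vicuna-13b-1.1, 0.5 points lower than llama-2-13b-chat, and 0.6 points lower than vicuna-13b-v1.3. Judging by these scores, firefly-llama-13b is very close to vicuna-13b and llama-2-13b-chat 😎.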
Model | Average | ARC | HellaSwag | MMLU | TruthfulQA (MC) |
---|---|---|---|---|---|
Llama-2-70b-chat-hf | 66.8 | 64.6 | 85.9 | 63.9 | 52.8 |
vicuna-13b-v1.3 | 60 | 54.6 | 80.4 | 52.9 | 52.1 |
Llama-2-13b-chat-hf | 59.9 | 59 | 81.9 | 54.6 | 44.1 |
firefly-llama-13b | 59.4 | 59 | 79.7 | 49.1 | 49.6 |
vicuna-13b-1.1 | 59.2 | 52.7 | 80.1 | 51.9 | 52.1 |
guanaco-13B-HF | 59.1 | 57.8 | 83.8 | 48.3 | 46.7 |
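The Average column in the table above appears to be the unweighted mean of the four benchmark scores. The following is a minimal sketch that recomputes it from the table rows; the helper names `average` and `gap_to_firefly` are hypothetical and not part of the Firefly project or the leaderboard tooling.

```python
# Scores copied from the table above: (ARC, HellaSwag, MMLU, TruthfulQA MC).
SCORES = {
    "Llama-2-70b-chat-hf": (64.6, 85.9, 63.9, 52.8),
    "vicuna-13b-v1.3": (54.6, 80.4, 52.9, 52.1),
    "Llama-2-13b-chat-hf": (59.0, 81.9, 54.6, 44.1),
    "firefly-llama-13b": (59.0, 79.7, 49.1, 49.6),
    "vicuna-13b-1.1": (52.7, 80.1, 51.9, 52.1),
    "guanaco-13B-HF": (57.8, 83.8, 48.3, 46.7),
}


def average(model: str) -> float:
    """Unweighted mean of the four benchmark scores for one model."""
    scores = SCORES[model]
    return sum(scores) / len(scores)


def gap_to_firefly(model: str) -> float:
    """Average of firefly-llama-13b minus the average of `model`."""
    return average("firefly-llama-13b") - average(model)


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: avg={average(name):.2f}, firefly gap={gap_to_firefly(name):+.2f}")
```

The gaps quoted in the text are differences between the rounded Average values shown in the table, so the unrounded gaps printed here can differ from them by a few hundredths of a point.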
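Notably, the vicuna-13b models use full-parameter fine-tuning, which demands substantial training resources. firefly-llama-13b uses QLoRA instead, which can fine-tune a 13B model with as little as 16 GB of GPU memory.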
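A rough back-of-the-envelope calculation helps show why a 16 GB budget is plausible with QLoRA, which keeps the frozen base weights in 4-bit precision. The sketch below estimates the memory for the weights alone; it assumes a nominal 13 billion parameters and ignores quantization constants, LoRA adapter weights, optimizer state, and activations, so it is a lower bound rather than a measurement. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: nominal size of a "13B" model; the exact parameter count differs slightly.
NUM_PARAMS = 13e9

if __name__ == "__main__":
    for label, bits in (("fp16", 16), ("4-bit", 4)):
        print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 16-bit weights alone exceed 16 GB, while the 4-bit weights take roughly a quarter of that, leaving the rest of the budget for adapters, optimizer state, and activations.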
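For a detailed write-up, see the article: Firefly单卡复刻Vicuna-13B,Open LLM榜单🤗略高0.2分 (Firefly reproduces Vicuna-13B on a single GPU, scoring 0.2 points higher on the 🤗 Open LLM Leaderboard)
For more details, see the Firefly project.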