---
library_name: transformers
license: apache-2.0
base_model: google/gemma-7b-it
---

## Model Card for Firefly-Gemma

[gemma-7B-it-firefly](https://huggingface.co/yys/gemma-7B-it-firefly) is fine-tuned from [gemma-7b-it](https://huggingface.co/google/gemma-7b-it) to act as a helpful and harmless AI assistant.
We trained the model on the [firefly-train-1.1M](https://huggingface.co/datasets/YeungNLP/firefly-train-1.1M) dataset using LoRA.
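
For reference, a LoRA setup with the [peft](https://github.com/huggingface/peft) library looks roughly like the sketch below; the rank, alpha, and target modules are illustrative assumptions, not the exact configuration used to train this model.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("google/gemma-7b-it")

# Hyperparameters here are assumptions for illustration only
lora_config = LoraConfig(
    r=16,                     # low-rank dimension (assumed)
    lora_alpha=32,            # scaling factor (assumed)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```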


<img src="gemma-7B-it-firefly.jpg" width="250">


## Performance
We evaluated the model on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

## Usage
The chat template of our model is the same as that of the official gemma-7b-it:
```text
<bos><start_of_turn>user
Write a hello world program<end_of_turn>
<start_of_turn>model
```
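
If this checkpoint's tokenizer ships the same chat template as the base model (worth verifying for a fine-tune), you can render the prompt with `apply_chat_template` instead of assembling the string by hand:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("yys/gemma-7B-it-firefly")
messages = [{"role": "user", "content": "Write a hello world program"}]
# tokenize=False returns the rendered prompt string instead of token ids
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```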

You can also run end-to-end generation with the following code:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name_or_path = "yys/gemma-7B-it-firefly"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)

# "Write me a poem about machine learning."
input_text = "给我写一首关于机器学习的诗歌。"

# Wrap the prompt in the chat template shown above before tokenizing
messages = [{"role": "user", "content": input_text}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
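
On a GPU, you will likely want to load a 7B model in half precision; a typical variant (assuming the `accelerate` package is installed for `device_map="auto"`):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "yys/gemma-7B-it-firefly",
    torch_dtype=torch.bfloat16,  # half precision; halves memory vs. float32
    device_map="auto",           # requires the accelerate package
)
```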