---
datasets:
- OpenAssistant/oasst1
pipeline_tag: text-generation
---

# Falcon-7b-chat-oasst1

Falcon-7b-chat-oasst1 is a chatbot-like model for dialogue generation. It was built by fine-tuning [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) on the [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) dataset.

This model was fine-tuned in 8-bit using 🤗 [peft](https://github.com/huggingface/peft) adapters, [transformers](https://github.com/huggingface/transformers), and [bitsandbytes](https://github.com/TimDettmers/bitsandbytes).

- Training relied on Low-Rank Adaptation ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)): instead of fine-tuning the entire model, only a small set of adapter weights is trained and then loaded into the model (see the sketch after this list).

- Training took approximately 6 hours on a single NVIDIA A100-SXM 40GB GPU (via Google Colab).

- See the attached [Notebook](https://huggingface.co/intellio-NLP/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparameters) used to train the model.
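
For illustration, here is a minimal sketch of the kind of 8-bit LoRA setup described above. The hyperparameter values (`r`, `lora_alpha`, dropout) are assumptions for illustration only; see the linked notebook for the values actually used.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Load the base model in 8-bit via bitsandbytes
base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    load_in_8bit=True,
    device_map="auto",
    trust_remote_code=True,
)
base_model = prepare_model_for_int8_training(base_model)

lora_config = LoraConfig(
    r=16,                                # adapter rank (assumed value)
    lora_alpha=32,                       # scaling factor (assumed value)
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    lora_dropout=0.05,                   # (assumed value)
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```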

## Model Summary

- **Model Type:** Causal decoder-only
- **Language(s) (NLP):** English (primarily)
- **Base Model:** [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) (License: [TII Falcon LLM License](https://huggingface.co/tiiuae/falcon-7b#license), commercial use permitted)
- **Dataset:** [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) (License: [Apache 2.0](https://huggingface.co/datasets/OpenAssistant/oasst1/blob/main/LICENSE), commercial use permitted)

### Model Date

May 30, 2023

## Quick Start

To prompt the chat model, use the following format:

```
<human>: [Instruction]
<bot>:
```
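
As a convenience, a small helper like the following (illustrative, not part of the released code) can wrap a user instruction in this template:

```python
def format_prompt(instruction: str) -> str:
    """Wrap a user instruction in the <human>/<bot> template the model expects."""
    return f"<human>: {instruction}\n<bot>:"

prompt = format_prompt("Create a list of things to do in San Francisco.")
```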

### Example Dialogue

**Prompter**:

```
<human>: My name is Daniel. Write a short email to my closest friends inviting them to come to my home on Friday for a dinner party, I will make the food but tell them to BYOB.
<bot>:
```

**Falcon-7b-chat-oasst1**:

```
Dear friends,

I am so excited to host a dinner party at my home this Friday! I will be making a delicious meal, but I would love for you to bring your favorite bottle of wine to share with everyone.

Please let me know if you can make it and if you have any dietary restrictions I should be aware of. I look forward to seeing you soon!

Best,
Daniel
```

**Prompter**:

```
<human>: Create a list of things to do in San Francisco.
<bot>:
```

**Falcon-7b-chat-oasst1**:

> Coming soon.

### Direct Use

This model has been fine-tuned on conversation trees from [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) and should only be used on data of a similar nature.

### Out-of-Scope Use

Production use without adequate assessment of risks and mitigations; any use cases which may be considered irresponsible or harmful.

## Bias, Risks, and Limitations

This model is trained mostly on English data and will not generalize appropriately to other languages. Furthermore, as it is trained on a large-scale corpus representative of the web, it will carry the stereotypes and biases commonly encountered online.

### Recommendations

We recommend that users of this model develop guardrails and take appropriate precautions for any production use.

## How to Get Started with the Model

### Setup

```python
# Install and import packages
!pip install -q -U bitsandbytes loralib einops
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Login to HF
from huggingface_hub import notebook_login

notebook_login()  # use a personal HF token for access to intellio-NLP
```

### GPU Inference in 8-bit

This requires a GPU with at least 12GB of memory.
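
Before loading, you can sanity-check that the active GPU meets this requirement (an optional snippet, not part of the original notebook):

```python
import torch

# Check that GPU 0 has enough memory (~12GB) for 8-bit inference
props = torch.cuda.get_device_properties(0)
total_gb = props.total_memory / 1024**3
print(f"{props.name}: {total_gb:.1f} GB")
assert total_gb >= 12, "8-bit inference needs a GPU with at least ~12GB of memory"
```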

```python
# Load the model
peft_model_id = "intellio-NLP/falcon-7b-chat-oasst1"
config = PeftConfig.from_pretrained(peft_model_id)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
    use_auth_token=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
```

```python
# Run the model
prompt = """<human>: My name is Daniel. Write a long email to my closest friends inviting them to come to my home on Friday for a dinner party, I will make the food but tell them to BYOB.
<bot>:"""

batch = tokenizer(
    prompt,
    padding=True,
    truncation=True,
    return_tensors='pt'
)
batch = batch.to('cuda:0')

with torch.cuda.amp.autocast():
    output_tokens = model.generate(
        input_ids=batch.input_ids,
        max_new_tokens=200,
        temperature=0.7,
        top_p=0.7,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Inspect outputs
print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
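
The decoded string echoes the prompt before the model's reply. A small post-processing step (illustrative) keeps only the text after the final `<bot>:` marker:

```python
# Keep only the model's reply, dropping the echoed prompt
generated = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
response = generated.split("<bot>:")[-1].strip()
print(response)
```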

## Reproducibility

- See the attached [Notebook](https://huggingface.co/intellio-NLP/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparameters) used to train the model.

### CUDA Info

- CUDA Version: 12.0
- GPU Name: NVIDIA A100-SXM
- Max Memory: `{0: "37GB"}`
- Device Map: `{"": 0}`
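
The `Max Memory` and `Device Map` values above correspond to the `max_memory` and `device_map` arguments accepted by `from_pretrained`. A hedged sketch of how they would be passed:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    load_in_8bit=True,
    device_map={"": 0},        # place the entire model on GPU 0
    max_memory={0: "37GB"},    # cap memory usage on GPU 0
    trust_remote_code=True,
)
```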

### Package Versions Employed

- `torch`==2.0.1+cu118
- `transformers`==4.30.0.dev0
- `peft`==0.4.0.dev0
- `accelerate`==0.19.0
- `bitsandbytes`==0.39.0
- `einops`==0.6.1