---
datasets:
- Abirate/english_quotes
pipeline_tag: text-generation
---

# How to run in Google Colab

Note: this must be run on a GPU runtime (in Colab: Runtime > Change runtime type).

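Before installing anything, it can help to confirm that the runtime actually exposes a GPU. The following is a minimal sketch, not part of the original notebook; the helper name `describe_accelerator` is hypothetical, and it only relies on `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
import importlib.util


def describe_accelerator():
    """Return a short description of the accelerator visible to PyTorch.

    Hypothetical helper for illustration; it is not part of any library.
    """
    # Avoid an ImportError if torch is not installed in this environment
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    if not torch.cuda.is_available():
        return "no CUDA GPU detected; switch the runtime type to GPU"
    return f"CUDA GPU detected: {torch.cuda.get_device_name(0)}"


print(describe_accelerator())
```

If the message reports that no GPU was detected, the model loading step further down is expected to fail, since it places the model on GPU 0.
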
```python
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
# accelerate is needed for the device_map argument used when loading the model
!pip install -q -U accelerate
```

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "EleutherAI/gpt-neox-20b"

# 4-bit NF4 quantization with double quantization; computation runs in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map={"": 0} places the whole model on GPU 0
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map={"": 0}
)
```

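As a rough illustration of why the model is loaded in 4-bit, the sketch below estimates the memory taken by the weights alone at different precisions. It assumes a round figure of 20 billion parameters (taken from the model name, not an exact count), and the helper `weight_memory_gib` is hypothetical. The estimate ignores quantization constants, activations, and the generation cache, so real usage will be higher.

```python
def weight_memory_gib(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB.

    Hypothetical helper: bits -> bytes -> GiB, nothing else is counted.
    """
    return n_params * bits_per_param / 8 / 2**30


# Assumption: roughly 20 billion parameters, as the model name suggests
N_PARAMS = 20e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights take about 37 GiB at 16 bits and about 9.3 GiB at 4 bits, which is the difference between not fitting and plausibly fitting on a single Colab GPU.
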
```python
from peft import PeftModel

# PeftModel.from_pretrained loads the trained LoRA adapter weights onto the base model;
# get_peft_model with a LoraConfig would only attach a newly initialized adapter.
model = PeftModel.from_pretrained(model, "suarkadipa/gpt-neox-20b-english-quotes")
```

```python
text = "Yaya Toure "
device = "cuda:0"

# Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(text, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Example output: Yaya Touré was born in the Ivory Coast, but moved to France at the age
```
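
To try several prompts without repeating the tokenize, generate, and decode steps, they can be wrapped in a small function. This is a minimal sketch; `complete` is a hypothetical helper name, and it assumes `model` and `tokenizer` are the objects created above. Any extra keyword arguments are forwarded unchanged to `model.generate`.

```python
def complete(model, tokenizer, prompt, device="cuda:0", max_new_tokens=20, **generate_kwargs):
    """Generate a continuation of `prompt` and return it as text.

    Hypothetical convenience wrapper around the steps shown above.
    """
    # Tokenize the prompt and move the resulting tensors to the target device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    # Extra keyword arguments (for example do_sample=True) go straight to generate
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, **generate_kwargs)
    # The decoded text includes the prompt followed by the generated tokens
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, `complete(model, tokenizer, "Yaya Toure ")` repeats the generation above, and `complete(model, tokenizer, "Life is ", do_sample=True)` samples a continuation instead of decoding greedily.
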