|
--- |
|
library_name: transformers |
|
license: mit |
|
language: |
|
- en |
|
--- |
|
|
|
# Rolema 7B |
|
|
|
Rolema 7B is a large language model that works effectively under a 4-bit quantization process. |
|
Rolema 7B is based on the backbone of the Gemma-7B model by Google. |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. |
|
|
|
- **Developed by:** Min Si Thu |
|
- **Model type:** Text Generation Large Language Model |
|
- **Language(s) (NLP):** English |
|
- **License:** MIT |
|
|
|
### How to use |
|
|
|
Installing Libraries |
|
|
|
```bash |
|
%%capture |
|
%pip install -U bitsandbytes |
|
%pip install -U transformers |
|
%pip install -U peft |
|
%pip install -U accelerate |
|
%pip install -U trl |
|
%pip install -U datasets |
|
``` |
|
|
|
Code Implementation |
|
|
|
```python |
|
import torch |
|
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig |
|
from peft import PeftModel, PeftConfig |
|
|
|
base_model = "google/gemma-7b-it" |
|
adapter_model = "jojo-ai-mst/rolema-7b-it" |
|
|
|
# Load base model(Gemma 7B-it) |
|
bnbConfig = BitsAndBytesConfig( |
|
load_in_4bit = True, |
|
bnb_4bit_quant_type="nf4", |
|
bnb_4bit_compute_dtype=torch.bfloat16, |
|
) |
|
|
|
model = AutoModelForCausalLM.from_pretrained(base_model,quantization_config=bnbConfig,) # device_map="auto" autosplit for cuda |
|
model = PeftModel.from_pretrained(model, adapter_model) |
|
tokenizer = AutoTokenizer.from_pretrained(base_model) |
|
|
|
model = model.to("cuda") |
|
|
|
inputs = tokenizer("How to learn programming", return_tensors="pt") |
|
|
|
inputs = inputs.to("cuda") |
|
|
|
outputs = model.generate(input_ids=inputs["input_ids"], max_new_tokens=1000) |
|
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0]) |
|
``` |
|
|
|
|