|
--- |
|
license: apache-2.0 |
|
tags: |
|
- jamba |
|
datasets: |
|
- teknium/OpenHermes-2.5 |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# This is highly experimental and should be viewed as purely testing right now. Jamba has been very hard to train but I wanted to see how it did on one of the best datasets we have access to. I believe in transparent development so all *best* working iterations, even if they are a bit wonky, will be pushed here |
|
|
|
--- |
|
## Training |
|
|
|
|
|
### Open-Hermes-2.0 (Only first 1500 examples): **[ 1530/125193 4:46:45 < 386:48:08, 0.09 it/s, Epoch 0.01/1]** |
|
|
|
|
|
```py |
|
from trl import SFTTrainer |
|
import torch |
|
from peft import LoraConfig |
|
from transformers import AutoTokenizer, TrainingArguments |
|
from transformers import BitsAndBytesConfig |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
# Initialize or load your tokenizer and model here |
|
tokenizer = AutoTokenizer.from_pretrained("ai21labs/Jamba-v0.1") |
|
tokenizer.padding_side = 'right' |
|
tokenizer.padding_side = 'left' |
|
|
|
max_seq_length = 4096 |
|
|
|
lora_config = LoraConfig( |
|
r=8, |
|
lora_alpha=16, |
|
target_modules=["embed_tokens", "x_proj", "in_proj", "out_proj"], |
|
lora_dropout=0.2, |
|
task_type="CAUSAL_LM", |
|
bias="none" |
|
) |
|
|
|
trainer = SFTTrainer( |
|
model=model, |
|
train_dataset=train_dataset, |
|
dataset_text_field="text", |
|
max_seq_length=max_seq_length, |
|
tokenizer=tokenizer, |
|
args=TrainingArguments( |
|
num_train_epochs=1, |
|
lr_scheduler_type='linear', |
|
learning_rate=2e-5, |
|
per_device_train_batch_size=1, |
|
gradient_accumulation_steps=8, |
|
gradient_checkpointing=True, |
|
warmup_steps=10, |
|
weight_decay=0.2, |
|
fp16=not torch.cuda.is_bf16_supported(), |
|
bf16=torch.cuda.is_bf16_supported(), |
|
logging_steps=1, |
|
save_steps=100, |
|
output_dir="outputs", |
|
optim="paged_adamw_8bit", |
|
seed=42, |
|
), |
|
) |
|
|
|
# Set environment variables for PyTorch memory management |
|
import os |
|
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128,expandable_segments:True" |
|
``` |
|
|