TechxGenus/Mini-Jamba-v2 · [Request] Potential Release Of Training Code?

Lyte

Apr 1, 2024

Hello there! I hope you're doing well today. I am planning to run training tests on different data sets, and I was wondering if you could share your training code with me. I haven't started yet, so if it's impossible, I totally understand, and it's no problem.

TechxGenus

Owner Apr 2, 2024

It is compatible with Hugging Face's tools and can be trained like any other LLM, following the example in the official repository:

from datasets import load_dataset
from trl import SFTTrainer
from peft import LoraConfig
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, AutoConfig

model_id = "<Your Path Here>"  # replace with your model path
tokenizer = AutoTokenizer.from_pretrained(model_id)
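# To start from existing weights, load them with from_pretrained instead:
# model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, device_map='auto')
# Build the model from its config alone, so the weights are newly initialized rather than loaded
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_config(
    config=config,
    trust_remote_code=True,
    device_map='auto'
)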

dataset = load_dataset("Abirate/english_quotes", split="train")
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    logging_dir='./logs',
    logging_steps=10,
    learning_rate=2e-3
)
# Optional: to train with LoRA, uncomment this block and the peft_config line below
# lora_config = LoraConfig(
#     r=8,
#     target_modules=["embed_tokens", "x_proj", "in_proj", "out_proj"],
#     task_type="CAUSAL_LM",
#     bias="none"
# )
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    # peft_config=lora_config,
    train_dataset=dataset,
    dataset_text_field="quote",
)

trainer.train()

To try different configurations, I modified config.json.
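For anyone who would rather script the config.json edits than make them by hand, here is a minimal sketch using only the standard library. The helper name `write_config_variant` is hypothetical, and the keys in the usage note (`hidden_size`, `num_hidden_layers`) are assumptions about what the file contains, so check them against the actual config.json before relying on them.

```python
import json
from pathlib import Path


def write_config_variant(src_path, dst_path, overrides):
    """Copy a config.json, replacing only the keys given in `overrides`.

    Hypothetical helper for illustration; not part of any library.
    """
    config = json.loads(Path(src_path).read_text(encoding="utf-8"))

    # Refuse keys the source config does not already have, so a typo
    # is reported instead of being written out as a new, unused key.
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"keys not present in source config: {unknown}")

    config.update(overrides)

    dst = Path(dst_path)
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Something like `write_config_variant("config.json", "small/config.json", {"hidden_size": 256, "num_hidden_layers": 4})` would write a smaller variant, assuming those keys exist in the source file. Pointing `model_id` at the directory holding the edited file should then make `AutoConfig.from_pretrained` pick up the new values, provided the other files the model needs are in that directory too.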

Lyte

Apr 2, 2024

Thanks for sharing! At the time, I assumed you had (re)written the architecture manually, lol.