---
license: apache-2.0
tags:
- jamba
datasets:
- teknium/OpenHermes-2.5
pipeline_tag: text-generation
---
# This is highly experimental and should be viewed as purely testing right now. Jamba has been very hard to train, but I wanted to see how it did on one of the best datasets we have access to. I believe in transparent development, so all *best* working iterations, even if they are a bit wonky, will be pushed here.
---
## Training
### OpenHermes-2.5 (stopped after the first ~1,530 of 125,193 steps): **[1530/125193 4:46:45 < 386:48:08, 0.09 it/s, Epoch 0.01/1]**
```py
from trl import SFTTrainer
import os
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)

# Set environment variables for PyTorch memory management before any CUDA work starts
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128,expandable_segments:True"

# Load the tokenizer for the base Jamba model
tokenizer = AutoTokenizer.from_pretrained("ai21labs/Jamba-v0.1")
tokenizer.padding_side = 'left'

# NOTE: `model` was not defined in the original snippet; a 4-bit quantized load is
# assumed here, since BitsAndBytesConfig is imported and a paged 8-bit optimizer is used below
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-v0.1",
    quantization_config=quantization_config,
    device_map="auto",
)

# NOTE: `train_dataset` was not defined either; SFTTrainer expects a "text" column,
# so a minimal (assumed) flattening of the OpenHermes-2.5 conversations is sketched here
def to_text(example):
    turns = [f"{turn['from']}: {turn['value']}" for turn in example["conversations"]]
    return {"text": "\n".join(turns)}

train_dataset = load_dataset("teknium/OpenHermes-2.5", split="train").map(to_text)

max_seq_length = 4096

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["embed_tokens", "x_proj", "in_proj", "out_proj"],
    lora_dropout=0.2,
    task_type="CAUSAL_LM",
    bias="none",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    peft_config=lora_config,  # apply the LoRA adapters defined above
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=TrainingArguments(
        num_train_epochs=1,
        lr_scheduler_type='linear',
        learning_rate=2e-5,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        gradient_checkpointing=True,
        warmup_steps=10,
        weight_decay=0.2,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        save_steps=100,
        output_dir="outputs",
        optim="paged_adamw_8bit",
        seed=42,
    ),
)

trainer.train()
```
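
## Inference

For quick testing, here is a minimal generation sketch. It assumes the weights pushed to this repo load directly with `AutoModelForCausalLM`; the repo id and prompt below are placeholders, so adjust them to whichever iteration is currently uploaded.

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; point this at whichever iteration is currently pushed here
model_id = "Severian/Jamba-Hercules"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; swap in a quantized load if memory is tight
    device_map="auto",
)

prompt = "Explain what makes the Jamba architecture different from a plain Transformer."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If memory is tight, a quantized load (as in the training script above) can be used instead of the bf16 load shown here.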