---
license: apache-2.0
---

Slightly modified mpt-30b, with a few updates (gradient checkpointing, etc.) to make it compatible with the qlora training code.

Original model: https://huggingface.co/mosaicml/mpt-30b

My fork of qlora with mpt-30b support: https://github.com/jondurbin/qlora

Differences in the qlora scripts:

- requires adding `--mpt True` for mpt-based models
- uses `--num_train_epochs` instead of `--max_steps`
- uses the airoboros prompt format (mostly 1:1 with vicuna) rather than alpaca, and expects an input file in JSONL format with "instruction" and "response" fields (see the sample after the training command below)

Full example of tuning (used for airoboros-mpt-30b-gpt4-1.4):

```
source /workspace/venv/bin/activate

export WANDB_API_KEY=[redacted]
export WANDB_PROJECT=airoboros-mpt-30b-gpt4-1.4

python qlora.py \
  --model_name_or_path ./mpt-30b \
  --output_dir ./$WANDB_PROJECT-checkpoints \
  --num_train_epochs 3 \
  --logging_steps 1 \
  --save_strategy steps \
  --data_seed 11422 \
  --save_steps 75 \
  --save_total_limit 3 \
  --evaluation_strategy "no" \
  --eval_dataset_size 2 \
  --max_new_tokens 8192 \
  --dataloader_num_workers 3 \
  --logging_strategy steps \
  --remove_unused_columns False \
  --do_train \
  --lora_r 64 \
  --lora_alpha 16 \
  --lora_modules all \
  --double_quant \
  --quant_type nf4 \
  --bf16 \
  --bits 4 \
  --warmup_ratio 0.03 \
  --lr_scheduler_type constant \
  --dataset ./instructions.jsonl \
  --dataset_format airoboros \
  --model_max_len 8192 \
  --gradient_checkpointing \
  --per_device_train_batch_size 6 \
  --gradient_accumulation_steps 16 \
  --learning_rate 0.0001 \
  --adam_beta2 0.999 \
  --max_grad_norm 0.3 \
  --lora_dropout 0.05 \
  --weight_decay 0.0 \
  --seed 11422 \
  --trust_remote_code \
  --mpt True \
  --report_to wandb
```
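
For reference, here is a minimal sketch of what the `./instructions.jsonl` input is expected to look like: one JSON object per line with "instruction" and "response" keys. The field names come from the notes above; the example text itself is purely illustrative and not taken from the actual training data.

```
# Sketch only: write two illustrative JSONL records in the expected shape.
cat > instructions.jsonl <<'EOF'
{"instruction": "Explain gradient checkpointing in one sentence.", "response": "Gradient checkpointing saves memory by recomputing intermediate activations during the backward pass instead of storing them."}
{"instruction": "What does LoRA train?", "response": "LoRA trains small low-rank adapter matrices on top of frozen base model weights."}
EOF
```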