---
license: other
---

# OpenAssistant LLaMA 30B SFT 7 HF

This is the HF-format repo of [OpenAssistant's LLaMA 30B SFT 7](https://huggingface.co/OpenAssistant/oasst-sft-7-llama-30b-xor).

It is the result of merging the XORs from the above repo with the original LLaMA 30B weights (an illustrative sketch of the XOR merge, plus a loading example, appears at the end of this card).

This is epoch 7 of OpenAssistant's training of a LLaMA 30B model.

## Want to support my work?

I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.

So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models and to work on various AI projects.

Donors will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.

* Patreon: coming soon! (just awaiting approval)
* Ko-Fi: https://ko-fi.com/TheBlokeAI
* Discord: https://discord.gg/UBgz4VXf

# Original model card

```
llama-30b-sft-7:
  dtype: fp16
  log_dir: "llama_log_30b"
  learning_rate: 1e-5
  model_name: /home/ubuntu/Open-Assistant/model/model_training/.saved/llama-30b-super-pretrain/checkpoint-3500
  #model_name: OpenAssistant/llama-30b-super-pretrain
  output_dir: llama_model_30b
  deepspeed_config: configs/zero3_config_sft.json
  weight_decay: 0.0
  residual_dropout: 0.0
  max_length: 2048
  use_flash_attention: true
  warmup_steps: 20
  gradient_checkpointing: true
  gradient_accumulation_steps: 12
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 3
  eval_steps: 101
  save_steps: 485
  num_train_epochs: 4
  save_total_limit: 3
  use_custom_sampler: true
  sort_by_length: false
  #save_strategy: steps
  save_strategy: epoch
  datasets:
    - oasst_export:
        lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk"
        input_file_path: 2023-04-12_oasst_release_ready_synth.jsonl.gz
        val_split: 0.05
    - vicuna:
        val_split: 0.05
        max_val_set: 800
        fraction: 1.0
    - dolly15k:
        val_split: 0.05
        max_val_set: 300
    - grade_school_math_instructions:
        val_split: 0.05
    - code_alpaca:
        val_split: 0.05
        max_val_set: 250
```

- **OASST dataset paper:** https://arxiv.org/abs/2304.07327
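
# How the XOR merge works (illustrative sketch)

OpenAssistant publishes this model as XOR deltas against the original LLaMA 30B weights, and this repo is the result of applying those deltas. For real conversions, use the conversion script provided in the XOR repo linked above. The snippet below is only a simplified sketch of the underlying idea (XOR-ing each delta file with the matching original file, byte for byte); the file names, extensions, and one-delta-file-per-shard layout are assumptions, not the official tooling.

```python
# Simplified illustration of an XOR weight merge. NOT the official OpenAssistant
# tooling; paths and the one-delta-file-per-shard layout are assumptions.
from pathlib import Path

import numpy as np


def xor_merge_file(xor_path: Path, base_path: Path, out_path: Path) -> None:
    """XOR a delta file with the matching original LLaMA 30B file, byte for byte."""
    delta = np.fromfile(xor_path, dtype=np.uint8)
    base = np.fromfile(base_path, dtype=np.uint8)
    if delta.size != base.size:
        raise ValueError(f"Size mismatch for {xor_path.name}")
    np.bitwise_xor(delta, base).tofile(out_path)


xor_dir = Path("oasst-sft-7-llama-30b-xor")   # XOR deltas from OpenAssistant
base_dir = Path("llama-30b-hf")               # original LLaMA 30B in HF format
out_dir = Path("oasst-sft-7-llama-30b")       # merged output
out_dir.mkdir(exist_ok=True)

for xor_file in sorted(xor_dir.glob("*.bin")):
    xor_merge_file(xor_file, base_dir / xor_file.name, out_dir / xor_file.name)
```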
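
# Loading the merged model (example sketch)

Because this repo contains the merged weights in HF format, it can be loaded directly with `transformers`. A minimal sketch follows; the repo id and the prompt string are assumptions to adjust for your setup, `device_map="auto"` requires `accelerate`, and a 30B fp16 model needs roughly 65 GB of GPU/CPU memory.

```python
# Minimal loading/generation sketch. The repo id below is an assumption:
# substitute the actual id of this repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/OpenAssistant-SFT-7-Llama-30B-HF"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# OpenAssistant SFT models typically use the <|prompter|> / <|assistant|> format;
# check the upstream model card if generations look malformed.
prompt = "<|prompter|>What is the OASST dataset?</s><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```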