Get qlora mistral-7b fine tuning working on a single 4090 (#708) 295b266 lukemarsden committed on Oct 10, 2023 (see the QLoRA sketch after this log)
Fix: Higher vram usage for mistral and sample_packing (#691) 669f1d0 Nanobit committed on Oct 6, 2023
prepared dataset caching, other misc fixes (#665) e50a64e winglian committed on Oct 3, 2023
eval_table isn't quite stable enough to be in default llama configs (#637) d887ad8 winglian committed on Sep 26, 2023
more sane defaults for openllama 3b used for quickstarts (#602) 674c576 winglian committed on Sep 19, 2023
btlm and falcon monkey patches for flash attn (#566) 6b9b229 winglian committed on Sep 17, 2023
Add training callback to send predictions to WandB table (#521) 5b67ea9 Glavin001 committed on Sep 13, 2023
recommend padding when using sample packing (#531) 3437149 winglian committed on Sep 6, 2023
Add support for GPTQ using native transformers/peft (#468) 3355706 winglian committed on Sep 5, 2023
pad_to_worst_case_seq_len boolean, for testing memory limits (#498) 8e197f6 Birch-san and tmm1 committed on Aug 28, 2023
Feat(cfg): Add code-llama configs for all sizes (#479) 3513071 mhenrichsen committed on Aug 27, 2023
new llama-2 default settings (#370) fdffef5 mhenrichsen (Mads Henrichsen) committed on Aug 14, 2023
Add wandb_entity to wandb options, update example configs, update README (#361) 7019509 Morgan McGuire and winglian committed on Aug 12, 2023
Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum 16bb627 winglian committed on Jun 14, 2023
Merge pull request #193 from OpenAccess-AI-Collective/config-fixes-20230612 94f310c winglian committed on Jun 12, 2023
Merge pull request #132 from utensil/falcon-7b-qlora c8242de Nanobit committed on Jun 8, 2023
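
The first entry in this log (#708) is about fitting QLoRA fine-tuning of Mistral-7B onto a single 24 GB RTX 4090. As a minimal sketch of the general technique (not taken from that commit), the base model is loaded in 4-bit with bitsandbytes and wrapped with LoRA adapters via peft; the model id, LoRA rank, and target modules below are illustrative assumptions, not values from the axolotl config.

# Minimal sketch (assumptions, not the axolotl implementation): 4-bit QLoRA setup
# for Mistral-7B using transformers + peft + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-v0.1"  # illustrative choice of base model

# Quantize weights to 4-bit NF4 so the 7B model fits on a single 24 GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Prepare the quantized model for training (casts norms to fp32, enables input grads).
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; rank and target modules are illustrative.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

In axolotl itself the equivalent setup is driven by a YAML example config rather than hand-written code; the commits above track changes to those example configs.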