Add training callback to send predictions to WandB table (#521) 5b67ea9 unverified Glavin001 committed on Sep 13, 2023
Fix pretraining with iterable/streaming Dataset (#556) 2f586d1 unverified Jan Philipp Harries committed on Sep 13, 2023
recommend padding when using sample packing (#531) 3437149 unverified winglian committed on Sep 6, 2023
Add support for GPTQ using native transformers/peft (#468) 3355706 unverified winglian committed on Sep 5, 2023
move is_llama_derived_model into normalize_config (#524) 44454ae unverified tmm1 committed on Sep 4, 2023
ReLoRA implementation (with quantization) (#322) bde3c5a unverified chargoddard winglian committed on Aug 24, 2023
recast loralayer, norm, lmhead + embed token weights per original qlora (#393) 96deb6b unverified winglian committed on Aug 21, 2023
Fix(config): Update handling of deepspeed config (#404) c01015f unverified Nanobit committed on Aug 15, 2023
try to detect accelerate and only use device_map=None in that case (#373) 094fc2c unverified tmm1 committed on Aug 13, 2023