Mixtral: More correct MoE, lower loss (#932) 86487c2 unverified casperhansen committed on Dec 10, 2023
update to latest transformers for mixstral support (#929) 35f9b0f unverified winglian committed on Dec 10, 2023
fixing prompt template of chatml by removal of linebreak (#922) 03c6318 unverified timlim123 Timothy Lim committed on Dec 9, 2023
fix(tokenizer): handle fast tokenizer properly for bos/eos (#914) fde091c unverified Nanobit committed on Dec 8, 2023
feat: add check for quantized model (#913) a581e9f unverified Nanobit winglian committed on Dec 4, 2023
Support device_map=sequential & max_memory config parameters (#903) 992e742 unverified Bryan Thornbury winglian committed on Dec 4, 2023
feature: loss watchdog for terminating training runs that are failing (#899) 58ec8b1 unverified user735 Karl-Johan Alm committed on Dec 4, 2023
Remove learning rate scheduler in deepspeed config to avoid conflict (#909) 476a205 unverified Haoxiang-Wang committed on Dec 4, 2023
ensure merged model matches the training dtype (#902) 1d21aa6 unverified winglian committed on Nov 29, 2023
Determine FSDP/deepspeed settings on device select. (#883) 71b7ea3 unverified user735 Karl-Johan Alm winglian committed on Nov 29, 2023
update datasets version to cut down the warnings due to pyarrow arg change (#897) 6a4562a unverified winglian committed on Nov 25, 2023
fix: warning should not show if eval_batch_size not provided (#896) 7ee3c4c unverified Nanobit committed on Nov 25, 2023
chore(doc): Add info on changing role in sharegpt (#886) 9fc29e0 unverified Nanobit committed on Nov 22, 2023
try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867) 0de1457 unverified winglian committed on Nov 16, 2023
allow overriding of model_config parameters from the YML (#853) 1bc1186 unverified winglian committed on Nov 16, 2023
add e2e tests for checking functionality of resume from checkpoint (#865) b3a61e8 unverified winglian committed on Nov 16, 2023
lint fix that didn't get caught by linter (#866) 332984d unverified winglian committed on Nov 15, 2023
Update data.py for signature generation (#851) 48630f5 unverified MilesQLi winglian committed on Nov 15, 2023
Docs: add instructions to 1-click launching on public clouds (#862) b33c1d5 unverified zongheng committed on Nov 15, 2023
feat(doc): add more info on train_on_split (#855) 306fe19 unverified Nanobit committed on Nov 15, 2023
include the suffix modified string in ascii art (#852) 614cff4 unverified fpreiss committed on Nov 15, 2023
don't compile deepspeed or bitsandbytes from source (#837) f544ab2 unverified winglian committed on Nov 9, 2023