qwerrwe/src/axolotl/utils/config.py

Commit History

E2e device cuda (#575)

2414673
unverified

winglian committed on Sep 15, 2023

Model parallel (#538)

f6060a6
unverified

winglian committed on Sep 13, 2023

Add training callback to send predictions to WandB table (#521)

5b67ea9
unverified

Glavin001 committed on Sep 13, 2023

Fix pretraining with iterable/streaming Dataset (#556)

2f586d1
unverified

Jan Philipp Harries committed on Sep 13, 2023

Early stopping metric (#537)

e30f1e3
unverified

winglian committed on Sep 8, 2023

recommend padding when using sample packing (#531)

3437149
unverified

winglian committed on Sep 6, 2023

Add support for GPTQ using native transformers/peft (#468)

3355706
unverified

winglian committed on Sep 5, 2023

move is_llama_derived_model into normalize_config (#524)

44454ae
unverified

tmm1 committed on Sep 4, 2023

ReLoRA implementation (with quantization) (#322)

bde3c5a
unverified

chargoddard, winglian committed on Aug 24, 2023

recast loralayer, norm, lmhead + embed token weights per original qlora (#393)

96deb6b
unverified

winglian committed on Aug 21, 2023

Fix(config): Update handling of deepspeed config (#404)

c01015f
unverified

Nanobit committed on Aug 15, 2023

try to detect accelerate and only use device_map=None in that case (#373)

094fc2c
unverified

tmm1 committed on Aug 13, 2023

improve GPU logging to break out pytorch cache and system mem

7b55fe6

tmm1 committed on Aug 13, 2023

extract module for working with cfg

8cec513

tmm1 committed on Aug 13, 2023
