qwerrwe / src/axolotl/train.py

Commit History

- don't use load and push together (#1284) · ea00dd0 · winglian
- support for true batches with multipack (#1230) · 00568c1 · winglian
- Peft deepspeed resume (#1227) · c67fb71 · winglian
- workaround for transformers bug requireing do_sample for saveing pretrained (#1206) · ba944e6 · winglian
- Mixtral fixes 20240124 (#1192) [skip ci] · 54d2ac1 · winglian
- keep gate in fp32 for 16 bit loras (#1105) · da97285 · winglian
- feat: enable trl's autounwrap (#1060) · b432889 · Nanobit
- fix model card upload for PEFT models (#1043) · 31d2350 · hamel
- RL/DPO (#935) · f243c21 · winglian
- add config to model card (#1005) · 85dd4d5 · hamel
- fix: switch to using the HuggingFace Transformers NEFT implementation (#941) · ef24342 · dg-kalle
- Fix Deepspeed loading (#950) · 5ea3aa3 · winglian
- support for mamba (#915) · 40a6362 · winglian
- use accelerate logging for zero/main loggin only · b2430ce · winglian
- cleanup verbosity a bit · 4c834bf · winglian
- refactor neft patch to be more re-usable similar to trl's impl (#796) · 827ec3d · winglian
- create a model card with axolotl badge (#624) · 501958b · winglian
- set fsdp state dict (#584) · be75668 · Jan Philipp Harries
- let hf trainer handle torch compile (#516) · a4e1bb6 · winglian, tmm1
- misc fixes/improvements (#513) · a546ca2 · winglian
- split train from other cli options (#503) · b21e4a2 · winglian