Commits · Dovakiins/qwerrwe

feat: validate sample packing requires flash_attention (#1465)

bf4cd67
unverified

Nanobit committed on Apr 5

Support loading datasets saved via save_to_disk (#1432)

e634118
unverified

fozziethebeat committed on Mar 29

make sure to capture non-null defaults from config validation (#1415)

601b77b
unverified

winglian committed on Mar 26

fix(dataset): normalize tokenizer config and change hash from tokenizer class to tokenizer path (#1298)

ff939d8
unverified

Nanobit committed on Mar 25

strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428)

2a1589f
unverified

winglian committed on Mar 21

Feat: Add sharegpt multirole (#1137)

40a88e8
unverified

Nanobit committed on Mar 19

ORPO (#1419)

2ea70eb
unverified

winglian committed on Mar 18

Train parameters exclusively in specific ranges (#1390)

05bcc9e
unverified

seungduk committed on Mar 14

Add Glaive conversation format support (#1365)

b7d8a7d
unverified

Brian Fitzgerald

winglian committed on Mar 11

plain input/output prompt strategy w/o chat templates (#1346)

4d09b42
unverified

winglian committed on Mar 4

run tests again on Modal (#1289) [skip ci]

0001862
unverified

winglian committed on Feb 29

fix for protected model_ namespace w pydantic (#1345)

6b3b271
unverified

winglian committed on Feb 28

more fixes 20240228 (#1342) [skip ci]

0f985e1
unverified

winglian committed on Feb 28

Pydantic 2.x cfg (#1239)

cc3cebf
unverified

winglian committed on Feb 26

make mlflow optional (#1317)

5894f0e
unverified

winglian committed on Feb 26

Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273)

8430db2
unverified

jinwonkim93 committed on Feb 13

Pretrain transforms (#1261)

c7cf381
unverified

winglian committed on Feb 6

relora: magnitude pruning of the optimizer (#1245)

8c2e05a
unverified

winglian committed on Feb 6

support for true batches with multipack (#1230)

00568c1
unverified

winglian committed on Feb 1

Support for additional_special_tokens (#1221) [skip ci]

25e037f
unverified

DreamGenX

winglian committed on Jan 31

Peft lotfq (#1222)

4cb7900
unverified

winglian committed on Jan 28

ADD: warning if hub_model_id ist set but not any save strategy (#1202)

af29d81
unverified

JohanWork

winglian committed on Jan 26

Feat/chatml add system message (#1117)

98b4762
unverified

mhenrichsen Mads Henrichsen

winglian committed on Jan 25

Phi2 multipack (#1173)

814aee6
unverified

winglian committed on Jan 23

DPO cleanup (#1126)

7523d1f
unverified

winglian

plaguss HF staff committed on Jan 23

Feat(test): Add tests for alpaca chatml prompt tokenizer (#1088)

5439707
unverified

JohanWork

Nanobit committed on Jan 23

Falcon embeddings (#1149) [skip docker]

e799e08
unverified

winglian committed on Jan 23

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

782b6a4
unverified

winglian

Nanobit committed on Jan 22

Deprecate max packed sequence len (#1141)

2ce5c0d
unverified

winglian committed on Jan 20

Multipack simplify for Mixtral (#1142)

6910e6a
unverified

winglian committed on Jan 18

Add shifted sparse attention (#973) [skip-ci]

1d70f24
unverified

jrc joecummings

winglian committed on Jan 18

Add `layers_to_transform` for `lora_config` (#1118)

8487b97
unverified

xzuyn committed on Jan 16

Enable or disable bf16 support based on availability (#1116)

0865613
unverified

Simon Hällqvist committed on Jan 14

keep gate in fp32 for 16 bit loras (#1105)

da97285
unverified

winglian committed on Jan 12

add gptneox embeddings, fix phi2 inputs, also fix the casting (#1083)

78c5b19
unverified

winglian committed on Jan 11

update sharegpt conversations when chatml chat template is set (#1075) [skip ci]

0ce1a65
unverified

winglian committed on Jan 10

fix: `train_on_inputs: true` ignored for sharegpt (#1045) [skip ci]

043c386
unverified

Nanobit

winglian committed on Jan 10

be more robust about checking embedding modules for lora finetunes (#1074) [skip ci]

0f10080
unverified

winglian committed on Jan 10

attempt to also run e2e tests that needs gpus (#1070)

788649f
unverified

winglian committed on Jan 10

fix double eos token for chatml (#1054) [skip ci]

651b7a3
unverified

winglian committed on Jan 9

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6

RL/DPO (#935)

f243c21

winglian committed on Jan 4

bump transformers and update attention class map name (#1023)

bcc78d8
unverified

winglian committed on Jan 3

Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787)

1ffa386
unverified

Nanobit committed on Dec 22, 2023

fix mistral prompt assembly (#982)

7bbaac9
unverified

hamel committed on Dec 21, 2023

Fix prompt assembly for llama (#952)

5ada140
unverified

hamel

tokestermw committed on Dec 14, 2023

Respect sequence_len in config for `type: llama2_chat` (#926)

f1de29d
unverified

hamel committed on Dec 12, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

