Commits · Dovakiins/qwerrwe

Pretrain transforms (#1261)

c7cf381
unverified

winglian committed on Feb 6, 2024

Peft lotfq (#1222)

4cb7900
unverified

winglian committed on Jan 28, 2024

Update qlora.yml - remove `max_packed_sequence_len` (#1210) [skip ci]

5407ddd
unverified

7flash committed on Jan 26, 2024

add colab example (#1196) [skip ci]

ee0b5f6
unverified

JohanWork committed on Jan 25, 2024

Mixtral fixes 20240124 (#1192) [skip ci]

54d2ac1
unverified

winglian committed on Jan 24, 2024

Phi2 multipack (#1173)

814aee6
unverified

winglian committed on Jan 23, 2024

Fine-Tuning Mistral-7b for Real-World Chatbot Applications Using Axolotl (Lora used) (#1155)

cc25039
unverified

Tilemachos Chatzipapas twenty8th

winglian committed on Jan 23, 2024

Falcon embeddings (#1149) [skip docker]

e799e08
unverified

winglian committed on Jan 23, 2024

set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci]

782b6a4
unverified

winglian

Nanobit committed on Jan 22, 2024

Add shifted sparse attention (#973) [skip-ci]

1d70f24
unverified

jrc joecummings

winglian committed on Jan 18, 2024

pin model_revision for phi2 (#1123)

c1b741d
unverified

winglian committed on Jan 14, 2024

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8, 2024

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6, 2024

fix: lint (#1037)

8ba27f3
unverified

Nanobit committed on Jan 3, 2024

added tiny llama examples for lora and qlora (#1027)

c75f916
unverified

Tim Dolan committed on Jan 3, 2024

Set eval_sample_packing to false in mistral config.yaml (#1003)

384b817
unverified

Kevin Sydney committed on Dec 28, 2023

Add an example config for finetuning a 34B model on a 24GB GPU (#1000)

6ef46f8
unverified

Evan Griffiths committed on Dec 25, 2023

set output_router_logits for mixtral config: (#995)

628b754
unverified

winglian committed on Dec 22, 2023

change val size (#992)

93ebec1
unverified

mhenrichsen committed on Dec 22, 2023

Fix Deepspeed loading (#950)

5ea3aa3
unverified

winglian committed on Dec 13, 2023

new evals_per_epoch and saves_per_epoch to make things cleaner (#944)

5f79b82
unverified

winglian committed on Dec 12, 2023

Mixtral official (#942)

7fabc4d
unverified

winglian committed on Dec 12, 2023

update to latest transformers for mixstral support (#929)

35f9b0f
unverified

winglian committed on Dec 10, 2023

Mixtral multipack (#928)

68b227a
unverified

winglian committed on Dec 10, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

feature: loss watchdog for terminating training runs that are failing (#899)

58ec8b1
unverified

user735 Karl-Johan Alm committed on Dec 4, 2023

fix: remove FA for qwen examples (#900)

a48dbf6
unverified

Nanobit committed on Nov 27, 2023

Feat: Add Qwen (#894)

1115c50
unverified

Nanobit committed on Nov 25, 2023

Phi update 202311 (#876)

9bf854e
unverified

winglian committed on Nov 17, 2023

various bugfixes (#856)

1470650
unverified

winglian committed on Nov 15, 2023

don't compile deepspeed or bitsandbytes from source (#837)

f544ab2
unverified

winglian committed on Nov 9, 2023

fix eval_steps to be a sane default (#797)

8b79ff0
unverified

winglian committed on Oct 28, 2023

disable eval table w sample packing in examples (#778)

9b43e7e
unverified

winglian committed on Oct 23, 2023

simplify by removing duplicate base_model_config (#772)

2d8def6
unverified

winglian committed on Oct 23, 2023

Implement fused modules (#747)

15d3a65
unverified

casperhansen

winglian committed on Oct 21, 2023

Fix: lowercase `True` values in config (#713)

ace70b3
unverified

atgctg committed on Oct 10, 2023

Get qlora mistral-7b fine tuning working on a single 4090 (#708)

295b266
unverified

lukemarsden committed on Oct 10, 2023

fix unneeded space (#699)

f91db19
unverified

mhenrichsen committed on Oct 7, 2023

lint

83a950b
unverified

mhenrichsen committed on Oct 7, 2023

new lr, sample pack

4c8ddf2

mhenrichsen committed on Oct 6, 2023

Fix: Higher vram usage for mistral and sample_packing (#691)

669f1d0
unverified

Nanobit committed on Oct 6, 2023

Adding qlora config for Mistral (#675)

d4a88e4
unverified

Abhishek Mishra committed on Oct 6, 2023

prepared dataset caching, other misc fixes (#665)

e50a64e
unverified

winglian committed on Oct 3, 2023

Update mistral/README.md (#647)

b88f515
unverified

Adarsh Shirawalmath committed on Sep 28, 2023

Feat: Add example for Mistral (#644)

eb41f76
unverified

Nanobit committed on Sep 28, 2023

eval_table isn't quite stable enough to be in default llama configs (#637)

d887ad8
unverified

winglian committed on Sep 26, 2023

Feat: Add support for upstream FA2 (#626)

19a600a
unverified

Nanobit committed on Sep 26, 2023

default model changed

4fecbfe

mhenrichsen committed on Sep 24, 2023

support to disable exllama for gptq (#604)

faecff9
unverified

winglian committed on Sep 19, 2023
