Fac256xc / src

Commit History

feat: release action
4381fec

boris committed on

feat: use default from_pretrained function
4ac66e4

boris committed on

feat(train): use new HF _do_init api
6b84155

boris committed on

fix: model compatible with do_init
f3a8cbb

boris committed on

feat: layernorm > rmsnorm in long runs
0f2cf98

boris committed on

fix: use correctly cache during inference + allow unscan (#170)
42968cf

boris committed on

feat: vmap optimizer (#166)
b993d27

boris committed on

feat: scan layers + gradient checkpointing (#161)
07a6f9a

boris committed on

Merge branch 'main' of https://github.com/borisdayma/dalle-mini into main
bcd360f

boris committed on

feat: better multi-node support (#158)
728a3c3

boris committed on

feat(text): support emojis (#154)
7ef7bd9

boris committed on

fix: smelu
7f2f8ed

boris committed on

fix: sinkformer
2c583b3

boris committed on

fix: support smelu
a2dcee4

boris committed on

feat: allow relative position (#156)
769d20a

boris committed on

feat: sinkhorn in lse mode (#155)
00d4661

boris committed on

fix: sinkformer gradient
eed4896

boris committed on

feat(model): allow bias (#152)
361a994

boris committed on

feat: add sinkformer + custom final ln + pre-ln (#151)
f139b0b

boris committed on

feat: placeholders for more config
69bcbeb

boris committed on

feat: force final ln in encoder
32f4ba5

boris committed on

feat: allow more configurations
5bd4c20

boris committed on

fix: DeepNet doesn't scale weights of embedding/output layers (#150)
503d6b4

Shuming Ma committed on

feat: remove unecessary LN
02824a7

boris committed on

feat: add cogview
472c4cc

boris committed on

fix(textnormalizer): consider utf8 on windows (#148)
3b8d8cb

illtellyoulater committed on

feat: implement transformer variants (#144)
542378c

boris committed on

feat(data): super conditioning (#141)
7939874

boris committed on

feat: support pod (#139)
803ccbf

boris committed on

feat: handle gradient checkpointing
5173ec7

boris committed on

feat: load from bucket
1c4e839

boris committed on

feat: reduce artifact space + offset step
34cf91c

boris committed on

feat: restore weights on CPU
5f954fc

boris committed on

fix: position embedding for generate method
ebac379

boris committed on

fix: typo
68cc185

boris committed on

fix: load from checkpoint
44b7c3e

boris committed on

feat(modeling): simplify abstract_init
fa72aa7

boris committed on

feat(train) - handle multiple nodes (#130)
0952927

boris committed on

feat: handle model parallel
1bb3269

boris committed on

fix: style
386f839

boris committed on

style(tokenizer): remove unused variables
605df32

boris committed on

feat: use fast tokenizer
767d78a

boris committed on

feat(train): improve pjit speed
f254058

boris committed on

fix(train): consider correct batch size
b7c7458

boris committed on

feat(train): distributed_shampoo with pjit
cc34d07

boris committed on

style: unsused import
7a176b9

boris committed on

feat(model): clean way to load on cpu
12f323d

boris committed on

feat(train): no batch dimension with pjit
df1fe19

boris committed on

feat(train): progress on pjit
49597a2

boris committed on

feat: use_artifact if run existing
a5ed112

boris committed on