dalle-mini/src/dalle_mini

Commit History

fix: sinkformer
2c583b3

boris committed

fix: support smelu
a2dcee4

boris committed

feat: allow relative position (#156)
769d20a

boris committed

feat: sinkhorn in lse mode (#155)
00d4661

boris committed

fix: sinkformer gradient
eed4896

boris committed

feat(model): allow bias (#152)
361a994

boris committed

feat: add sinkformer + custom final ln + pre-ln (#151)
f139b0b

boris committed

feat: placeholders for more config
69bcbeb

boris committed

feat: force final ln in encoder
32f4ba5

boris committed

feat: allow more configurations
5bd4c20

boris committed

fix: DeepNet doesn't scale weights of embedding/output layers (#150)
503d6b4

Shuming Ma committed

feat: remove unecessary LN
02824a7

boris committed

feat: add cogview
472c4cc

boris committed

fix(textnormalizer): consider utf8 on windows (#148)
3b8d8cb

illtellyoulater committed

feat: implement transformer variants (#144)
542378c

boris committed

feat(data): super conditioning (#141)
7939874

boris committed

feat: support pod (#139)
803ccbf

boris committed

feat: handle gradient checkpointing
5173ec7

boris committed

feat: load from bucket
1c4e839

boris committed

feat: reduce artifact space + offset step
34cf91c

boris committed

feat: restore weights on CPU
5f954fc

boris committed

fix: position embedding for generate method
ebac379

boris committed

fix: typo
68cc185

boris committed

fix: load from checkpoint
44b7c3e

boris committed

feat(modeling): simplify abstract_init
fa72aa7

boris committed

feat(train) - handle multiple nodes (#130)
0952927

boris committed

feat: handle model parallel
1bb3269

boris committed

fix: style
386f839

boris committed

style(tokenizer): remove unused variables
605df32

boris committed

feat: use fast tokenizer
767d78a

boris committed

feat(train): improve pjit speed
f254058

boris committed

fix(train): consider correct batch size
b7c7458

boris committed

feat(train): distributed_shampoo with pjit
cc34d07

boris committed

style: unsused import
7a176b9

boris committed

feat(model): clean way to load on cpu
12f323d

boris committed

feat(train): no batch dimension with pjit
df1fe19

boris committed

feat(train): progress on pjit
49597a2

boris committed

feat: use_artifact if run existing
a5ed112

boris committed

Load from wandb artifact (#121)
f69b21b

boris committed

Style (isort).
f9d51f7

Pedro Cuenca committed

Tokenizer, config, model can be loaded from wandb.
7e48337

Pedro Cuenca committed

feat(data): support accumulation in non-streaming
88c8e06

boris committed

feat: custom gradient accumulation
2d07559

boris committed

Change import order again.
2b2be9b

Pedro Cuenca committed

Fix import order to make isort happy.
64d99b2

Pedro Cuenca committed

Accept changes suggested by linter.
9f522b8

Pedro Cuenca committed

Never consider local dirs as remote wandb references.
08dd098

Pedro Cuenca committed

Store resolved path after loading model.
55a631d

Pedro Cuenca committed

Override from_pretrained to support wandb artifacts.
1023afa

Pedro Cuenca committed

fix(data): no shuffling of validation data
ddcbc6a

boris committed