Commit History

feat: allow relative position (#156)
769d20a
unverified

boris committed on

feat: sinkhorn in lse mode (#155)
00d4661
unverified

boris committed on

feat(demo): update model
b9a1a7d

boris committed on

fix: sinkformer gradient
eed4896

boris committed on

feat(model): allow bias (#152)
361a994
unverified

boris committed on

feat(train): google-cloud-storage is optional
02b2308

boris committed on

feat(train): rename logged config
955dc20

boris committed on

feat: add sinkformer + custom final ln + pre-ln (#151)
f139b0b
unverified

boris committed on

feat: placeholders for more config
69bcbeb

boris committed on

feat: add mini_glu config
a7e5050

boris committed on

feat: force final ln in encoder
32f4ba5

boris committed on

feat: allow more configurations
5bd4c20

boris committed on

fix: DeepNet doesn't scale weights of embedding/output layers (#150)
503d6b4
unverified

Shuming Ma committed on

feat: remove unecessary LN
02824a7

boris committed on

feat: update mini config
d9a16f2

boris committed on

feat: add cogview
472c4cc

boris committed on

fix(textnormalizer): consider utf8 on windows (#148)
3b8d8cb
unverified

illtellyoulater committed on

feat: implement transformer variants (#144)
542378c
unverified

boris committed on

feat(train): log norm and histograms (#143)
b7b619a
unverified

boris committed on

feat(data): super conditioning (#141)
7939874
unverified

boris committed on

feat: support pod (#139)
803ccbf
unverified

boris committed on

fix: no gradient checkpointing for new model
2e02683

boris committed on

feat: no gradient checkpointing for params init
b798ed3

boris committed on

feat: update configs
79557f9

boris committed on

feat(dev): require datasets
3d64598

boris committed on

fix(train): consider schedule offset
bc4734f

boris committed on

feat(train): local jax cache
9f5e879

boris committed on

feat: add bucket reference to artifact
d368fb6

boris committed on

style: lint
d5d442a

boris committed on

feat: handle gradient checkpointing
5173ec7

boris committed on

feat: load from bucket
1c4e839

boris committed on

feat(train): save to bucket
50498e6

boris committed on

feat: reduce artifact space + offset step
34cf91c

boris committed on

feat(demo): update reference
e558000

boris committed on

feat: restore weights on CPU
5f954fc

boris committed on

feat(train): simplify tokenizer loading
4cb21dd

boris committed on

doc: update README
db5a22a

boris committed on

feat: cleanup notebook
5a390e8

boris committed on

feat: wandb required for checkpoints
38c2c4e

boris committed on

feat(demo): uncomment pip install
094e178

boris committed on

feat: improve inference demo
35fe578

boris committed on

fix: position embedding for generate method
ebac379

boris committed on

feat(train): use compilation cache
da9367c

boris committed on

fix: typo
68cc185

boris committed on

fix: load from checkpoint
44b7c3e

boris committed on

fix: style
d483294

boris committed on

Merge branch 'main' of https://github.com/borisdayma/dalle-mini into main
0a691de

boris committed on

feat: log num_parameters early
7cfe576

boris committed on

fix: distributed shampoo class
696422e

boris committed on

feat: update distributed_shampoo
5996680

boris committed on