Commit History
Fixed language and team members.
02513c3
versae
Model at 210k steps, mlm acc 0.6537
7ec1ea9
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
4cad39c
versae
Adding pad_to_multiple_of=16
986ff4e
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
037bc96
versae
Model at 182k steps, mlm acc 0.6494
73f6af5
versae
Changed print to logger
1c5d797
versae
Preparing code for final runs
ea0132b
versae
Improved version of conversion script Flax → PyTorch
346a10a
versae
Fixed widget example
3f4b8d4
versae
Fix config for checkpoint
3950061
versae
Changed and added vocab and tokenizer
29e26bb
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
61f6971
versae
New Flax model
300e533
versae
Fixes to mc4 fork
8bd9e95
versae
Fixes treatment of jsonl
7b22f12
versae
Fix format for filepaths
7d6bbb2
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
13757a8
versae
Adding reading streaming files from local disk
4e4228c
versae
Base model at 105k steps
f7ba030
versae
Update .gitattributes
b020d07
patrickvonplaten
Fixes and defaults
a5b19d7
versae
Adding Numpy random number generator
f562f06
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
f965ae3
versae
Adding random sampling
60b6f6b
versae
Adding config and models for the hub widget
d75240e
versae
Adding missing import
79555ba
versae
Adding base config and organizing configs
9c5541b
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
36b7dde
versae
Adding sampling to mc4
3f09f56
versae
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
9072c50
versae
New tokenizer
eb4e77c
versae
Adjust batch size for extracting tokens
8b9ba87
versae
Scripts for perplexity sampling and fixes
853cd83
versae
Remove unused imports
d5cede4
edugp
Merge branch 'main' of https://huggingface.co/flax-community/bertin-roberta-large-spanish into main
840171b
edugp
Add script to generate dataset of embeddings and perplexities. Add script to generate t-SNE plot for embedding and perplexity visualization.
a81e575
edugp
Adding correct models 10k steps
fe7ff35
versae
Updating run script
a1f93c9
versae
Adding checkpointing, wandb, and new mlm script
d988382
versae
Epoch 1 Flax model
48f8c78
versae
Changed batch size
a95f7b8
versae
Changed execution mode
40f69ff
versae
Initial test with BETO's corpus
2835721
versae
:sparkles: Added test_script and a folder for scripts
2a963f0
Pablo
:see_no_evil: Added .gitignore file
de633ab
Pablo
Update README
e14e482
versae
README
af3cb2c
versae