PreTraining

| Architecture | Weights | PreTraining Loss | PreTraining Perplexity |
|---|---|---|---|
| roberta-base | huggingface/hub | 0.3488 | 3.992 |
| bert-base-uncased | huggingface/hub | 0.3909 | 6.122 |
| electra-large | huggingface/hub | 0.723 | 6.394 |
| albert-base | huggingface/hub | 0.7343 | 7.76 |
| electra-small | huggingface/hub | 0.9226 | 11.098 |
| electra-base | huggingface/hub | 0.9468 | 8.783 |
| distilbert-base-uncased | huggingface/hub | 1.082 | 7.963 |