PreTraining

Architecture	Weights	PreTraining Loss	PreTraining Perplexity
roberta-base	huggingface/hub	0.3488	3.992
bert-base-uncased	huggingface/hub	0.3909	6.122
electra-large	huggingface/hub	0.723	6.394
albert-base	huggingface/hub	0.7343	7.76
electra-small	huggingface/hub	0.9226	11.098
electra-base	huggingface/hub	0.9468	8.783
distilbert-base-uncased	huggingface/hub	1.082	7.963

Safetensors
Model size: 125M params
Tensor type: I64, F32
Task: Fill-Mask
