ModernBERT-Embed-Unsupervised

modernbert-embed-unsupervised is the unsupervised checkpoint trained with the contrastors library for 1 epoch over the 235M weakly-supervised contrastive pairs curated in Nomic Embed.

We suggest using modernbert-embed for embedding tasks.
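For readers who still want to inspect this unsupervised checkpoint directly, the following is a minimal sketch rather than an official recipe. It assumes the checkpoint loads through the sentence-transformers library (with a transformers version that includes ModernBERT) and that it expects the `search_query: ` and `search_document: ` task prefixes used by other Nomic Embed models. Neither assumption is confirmed by this card. The helper names `cosine_similarity` and `rank_documents` are hypothetical.

```python
import math

# Repository name as it appears in the model tree heading of this card.
MODEL_ID = "nomic-ai/modernbert-embed-base-unsupervised"


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query, documents, model_id=MODEL_ID):
    """Return (score, document) pairs sorted from most to least similar."""
    # Imported lazily so cosine_similarity works without the library installed.
    # ASSUMPTION: the checkpoint is loadable via sentence-transformers.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # ASSUMPTION: Nomic-style task prefixes; not stated on this card.
    query_vec = model.encode("search_query: " + query)
    doc_vecs = model.encode(["search_document: " + d for d in documents])
    scored = [
        (cosine_similarity(query_vec, vec), doc)
        for vec, doc in zip(doc_vecs, documents)
    ]
    return sorted(scored, reverse=True)
```

Calling `rank_documents("What is ModernBERT?", ["ModernBERT is an encoder model.", "Bananas are yellow."])` downloads the weights on first use and returns the documents ordered by similarity to the query.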

Performance

modernbert-embed-unsupervised performs similarly to nomic-embed-text-v1_unsup on MTEB, averaging 60.03 versus 59.9 across the 56 datasets:

Model	Average (56)	Classification (12)	Clustering (11)	Pair Classification (3)	Reranking (4)	Retrieval (15)	STS (10)	Summarization (1)
nomic-embed-text-v1_unsup	59.9	71.2	42.5	83.7	55.0	48.0	80.8	30.7
modernbert-embed-unsupervised	60.03	72.11	44.34	82.78	55.0	47.05	80.33	31.2
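As a sanity check on how the columns relate, the reported Average (56) can be approximately reproduced as the mean of the category scores weighted by the number of datasets in each category. The sketch below assumes the final column covers the single remaining dataset (56 minus the 55 listed in the other columns). Because the category scores are already rounded, the result only matches the reported average to within rounding.

```python
# Number of datasets per task category, taken from the table header.
# ASSUMPTION: the final column holds the one remaining dataset (56 - 55).
TASK_COUNTS = [12, 11, 3, 4, 15, 10, 1]

# Per-category scores copied from the table rows, in column order.
SCORES = {
    "nomic-embed-text-v1_unsup": [71.2, 42.5, 83.7, 55.0, 48.0, 80.8, 30.7],
    "modernbert-embed-unsupervised": [72.11, 44.34, 82.78, 55.0, 47.05, 80.33, 31.2],
}


def weighted_average(scores, counts):
    """Mean of category scores weighted by dataset count."""
    total = sum(score * count for score, count in zip(scores, counts))
    return total / sum(counts)


for name, scores in SCORES.items():
    print(f"{name}: {weighted_average(scores, TASK_COUNTS):.2f}")
# nomic-embed-text-v1_unsup: 59.85
# modernbert-embed-unsupervised: 60.03
```

Both values agree with the Average (56) column (59.9 and 60.03), which suggests the average is taken over individual datasets rather than over categories.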

Model size: 149M params (Safetensors, tensor type F32)

Model tree for nomic-ai/modernbert-embed-base-unsupervised

Base model: answerdotai/ModernBERT-base
Finetuned (80): this model

Evaluation results

Self-reported results on the test sets:

Dataset	Metric	Value
MTEB AmazonCounterfactualClassification (en)	accuracy	76.209
MTEB AmazonCounterfactualClassification (en)	ap	39.251
MTEB AmazonCounterfactualClassification (en)	f1	70.152
MTEB AmazonPolarityClassification	accuracy	91.661
MTEB AmazonPolarityClassification	ap	88.673
MTEB AmazonPolarityClassification	f1	91.653
MTEB AmazonReviewsClassification (en)	accuracy	46.768
MTEB AmazonReviewsClassification (en)	f1	46.153
MTEB ArguAna	map_at_1	24.964
MTEB ArguAna	map_at_10	39.891
