nomic-embed-text-v1-unsupervised: A Reproducible Long Context (8192) Text Embedder

nomic-embed-text-v1-unsupervised is a text encoder with an 8192-token context length. This checkpoint is taken after the contrastive pretraining stage of the multi-stage contrastive training that produces the final model. We are releasing it to open-source the training artifacts from our Nomic Embed Text tech report.
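For readers unfamiliar with contrastive pretraining, here is a minimal, generic sketch of an InfoNCE-style loss with in-batch negatives. It is an illustration only, not the training code for this checkpoint: the function name `info_nce_loss`, the temperature value, and the toy vectors are all assumptions, and the tech report remains the authoritative description of the actual objective.

```python
import math


def info_nce_loss(queries, documents, temperature=0.05):
    """Mean InfoNCE loss over a batch (illustrative sketch).

    queries[i] is paired with documents[i]; every other document in the
    batch serves as a negative. The temperature is an assumed value.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    q = [normalize(v) for v in queries]
    d = [normalize(v) for v in documents]

    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i to every document, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, dj)) / temperature for dj in d]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the matching document.
        total += log_denominator - logits[i]
    return total / len(q)
```

The loss is small when each query is most similar to its own document and grows when a query is closer to another document in the batch.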

If you want to use a model to extract embeddings, we suggest using nomic-embed-text-v1.
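As a rough sketch of what extracting and comparing embeddings could look like, the snippet below assumes the `sentence-transformers` package is installed and that the model is loaded with `trust_remote_code=True`. The helper names `embed` and `cosine_similarity` are hypothetical, and the model card for nomic-embed-text-v1 should be checked for the exact loading options and any task prefix the inputs need.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(texts, model_name="nomic-ai/nomic-embed-text-v1"):
    """Return one embedding (list of floats) per input text.

    Assumption: the model loads through sentence-transformers with
    trust_remote_code=True. The nomic-embed-text-v1 card describes task
    prefixes such as "search_document: "; confirm them there, since this
    helper passes the texts through unchanged.
    """
    # Imported lazily so cosine_similarity works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name, trust_remote_code=True)
    return model.encode(texts).tolist()
```

A typical call would be `vectors = embed([text_a, text_b])` followed by `cosine_similarity(vectors[0], vectors[1])`; the first call downloads the model weights.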

Join the Nomic Community

Evaluation results

All scores are self-reported on the test set.

Dataset                                        Metric      Score
MTEB AmazonCounterfactualClassification (en)   accuracy    76.985
MTEB AmazonCounterfactualClassification (en)   ap          39.472
MTEB AmazonCounterfactualClassification (en)   f1          70.592
MTEB AmazonPolarityClassification              accuracy    87.540
MTEB AmazonPolarityClassification              ap          83.161
MTEB AmazonPolarityClassification              f1          87.523
MTEB AmazonReviewsClassification (en)          accuracy    46.808
MTEB AmazonReviewsClassification (en)          f1          46.263
MTEB ArguAna                                   map_at_1    30.583
MTEB ArguAna                                   map_at_10   46.170
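
To make the map_at_1 and map_at_10 figures easier to interpret, here is a small sketch of mean average precision at a cutoff k. It uses one common definition (precision summed at each relevant hit within the top k, divided by the number of relevant documents); the exact normalization used by the MTEB tooling may differ, and the function names and toy data are hypothetical. The sketch returns values in the range 0 to 1, whereas the scores above appear to be on a 0 to 100 scale.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k):
    """Average precision for one query, truncated at rank k."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def map_at_k(rankings, qrels, k):
    """Mean of average_precision_at_k over all queries in qrels.

    rankings: query id -> list of document ids, best first.
    qrels:    query id -> iterable of relevant document ids.
    """
    scores = [average_precision_at_k(rankings[q], qrels[q], k) for q in qrels]
    return sum(scores) / len(scores)
```

With a single relevant document per query, map_at_1 is the fraction of queries whose top result is correct, while map_at_10 also gives partial credit (1/rank) when the correct document appears lower in the top ten.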
