nomic-embed-text-v1-ablated: A Reproducible Long Context (8192) Text Embedder

nomic-embed-text-v1-ablated is a text encoder with a context length of 8192. This checkpoint was trained on a modified version of the training dataset, which differs from the dataset used to train our final model. We are releasing it to help understand the impact that subsets of our training data had on model outcomes. This release is part of our commitment to open-source training artifacts from our Nomic Embed Text tech report.

If you want to use a model to extract embeddings, we suggest using nomic-embed-text-v1.

Evaluation results (self-reported, test sets)

MTEB AmazonCounterfactualClassification (en): accuracy 78.672, ap 42.738, f1 72.800
MTEB AmazonPolarityClassification: accuracy 90.414, ap 87.088, f1 90.392
MTEB AmazonReviewsClassification (en): accuracy 47.808, f1 47.257
MTEB ArguAna: map_at_1 30.370, map_at_10 45.748
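
The map_at_1 and map_at_10 figures reported for ArguAna are mean average precision at a rank cutoff. As a rough illustration, here is a minimal sketch of one common definition, assuming binary relevance judgments. The function names and the toy data are hypothetical, and the MTEB evaluation code may normalise differently. Scores are multiplied by 100 only to match the scale of the reported numbers.

```python
from typing import List, Sequence, Set, Tuple


def average_precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Average precision over the top-k results for a single query."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: normalise by the total number of relevant documents.
    return precision_sum / len(relevant_ids)


def map_at_k(runs: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Mean of the per-query average precision at cutoff k."""
    scores = [average_precision_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two queries, each with exactly one relevant document.
toy_runs = [
    (["d3", "d1", "d7"], {"d1"}),  # relevant document retrieved at rank 2
    (["d5", "d2", "d9"], {"d5"}),  # relevant document retrieved at rank 1
]

map_1 = map_at_k(toy_runs, k=1)
map_10 = map_at_k(toy_runs, k=10)
print(f"map_at_1  = {100 * map_1:.3f}")
print(f"map_at_10 = {100 * map_10:.3f}")
```

In this toy example the first query only finds its relevant document at rank 2, so it contributes nothing at cutoff 1 and 0.5 at cutoff 10, which is why map_at_10 is higher than map_at_1.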