SentenceTransformer based on redis/langcache-embed-v1

This is a sentence-transformers model fine-tuned from redis/langcache-embed-v1 on the triplet dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: redis/langcache-embed-v1
Maximum Sequence Length: 8192 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Training Dataset:
- triplet
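
As a rough illustration of the similarity function listed above, cosine similarity can be sketched in plain Python. The helper name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for this example. The model itself produces 768-dimensional vectors, and the library computes the score for you.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors for illustration only, not real model output.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(parallel, orthogonal)
```

Because the score depends only on direction, scaling a vector by a positive constant leaves it unchanged.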

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
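
The pooling configuration above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the first token position rather than averaged over all tokens. The following is a minimal sketch of that idea in plain Python. The function names and the toy 4-dimensional token vectors are assumptions for illustration, and the library's actual Pooling module operates on batched tensors.

```python
def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings):
    """Average over tokens; shown only for contrast (disabled in the config above)."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]


# Toy sequence of 3 tokens with 4 dimensions each (the real model uses 768).
tokens = [
    [0.5, -1.0, 2.0, 0.0],  # position 0: the [CLS] token
    [1.5, 1.0, 0.0, 4.0],
    [1.0, 3.0, 1.0, 2.0],
]
sentence_embedding = cls_pool(tokens)
print(sentence_embedding)
```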

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v2")
# Run inference
sentences = [
    'What are some examples of crimes understood as a moral turpitude?',
    'What are some examples of crimes of moral turpitude?',
    'What are some examples of crimes understood as a legal aptitude?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
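
One way to read the similarity matrix is to look up, for each sentence, the other sentence that scores highest. The sketch below assumes the matrix has already been converted to nested Python lists (for example with `similarities.tolist()`). The helper name `nearest_neighbors` is hypothetical, and the scores in `toy_scores` are placeholders, not real model output.

```python
def nearest_neighbors(similarity_matrix):
    """For each row i, return (j, score) for the best-scoring column j != i."""
    results = []
    for i, row in enumerate(similarity_matrix):
        # Skip the diagonal: every sentence is trivially most similar to itself.
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        if not candidates:
            raise ValueError("need at least two sentences to compare")
        best_score, best_j = max(candidates)
        results.append((best_j, best_score))
    return results


# Placeholder scores for three sentences (illustrative values only).
toy_scores = [
    [1.0, 0.9, 0.4],
    [0.9, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]
for i, (j, score) in enumerate(nearest_neighbors(toy_scores)):
    print(f"sentence {i} is closest to sentence {j} (score {score})")
```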

Training Details

Dataset: triplet
Size: 36,864 training samples
Columns: anchor, positive, negative_1, negative_2, and negative_3

Samples:

anchor	positive	negative_1	negative_2	negative_3
`Is life really what I make of it?`	`Life is what you make it?`	`Is life hardly what I take of it?`	`Life is not entirely what I make of it.`	`Is life not what I make of it?`
`When you visit a website, can a person running the website see your IP address?`	`Does every website I visit knows my public ip address?`	`When you avoid a website, can a person hiding the website see your MAC address?`	`When you send an email, can the recipient see your physical location?`	`When you visit a website, a person running the website cannot see your IP address.`
`What are some cool features about iOS 10?`	`What are the best new features of iOS 10?`	`iOS 10 received criticism for its initial bugs and performance issues, and some users found the redesigned apps less intuitive compared to previous versions.`	`What are the drawbacks of using Android 14?`	`iOS 10 was widely criticized for its bugs, removal of beloved features, and generally being a downgrade from previous versions.`

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "CachedMultipleNegativesRankingLoss",
    "matryoshka_dims": [768,512,256,128,64],
    "matryoshka_weights": [1,1,1,1,1],
    "n_dims_per_step": -1
}
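
To make the loss configuration above more concrete, here is a simplified sketch of how a Matryoshka-style objective combines a ranking loss over several truncated embedding sizes. It makes several assumptions: the function names are hypothetical, the vectors are toy 4-dimensional values, the scale of 20.0 is assumed to match the library default, and only the explicit negatives of one row are scored. The actual CachedMultipleNegativesRankingLoss also uses other in-batch examples as negatives and applies gradient caching to support large batches, which this sketch omits.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def ranking_loss(anchor, positive, negatives, scale=20.0):
    """Cross-entropy in which the positive should outscore every negative.

    Inputs are unit vectors, so the dot product equals cosine similarity.
    """
    candidates = [positive] + negatives
    logits = [scale * sum(a * c for a, c in zip(anchor, cand)) for cand in candidates]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum_exp - logits[0]


def matryoshka_loss(anchor, positive, negatives, dims, weights):
    """Weighted sum of the ranking loss at each truncated dimensionality."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        a = truncate_and_normalize(anchor, dim)
        p = truncate_and_normalize(positive, dim)
        ns = [truncate_and_normalize(n, dim) for n in negatives]
        total += weight * ranking_loss(a, p, ns)
    return total


# Toy embeddings (illustrative only); dims [4, 2] stand in for [768, 512, 256, 128, 64].
anchor = [1.0, 0.5, 0.0, 0.2]
positive = [0.9, 0.6, 0.1, 0.1]
negatives = [[-1.0, 0.2, 0.5, 0.0], [0.0, -1.0, 0.3, 0.4]]
print(matryoshka_loss(anchor, positive, negatives, dims=[4, 2], weights=[1, 1]))
```

Because every prefix of the embedding is trained against the same objective, a truncated vector (for example the first 256 components, renormalized) can be used in place of the full one. The `truncate_dim` argument of `SentenceTransformer` is one way to request this at load time.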

Evaluation

Citation

Redis Langcache-embed Models

We encourage you to cite our work if you use our models or build upon our findings.

@inproceedings{langcache-embed-v1,
    title = "Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data",
    author = "Gill, Cechmanek, Hutcherson, Rajamohan, Agarwal, Gulzar, Singh, Dion",
    month = "04",
    year = "2025",
    url = "https://arxiv.org/abs/2504.02268",
}

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
