RockyEmbed-Marco

RockyEmbed-Marco is a lightweight, high-performance text embedding model built by contrastively fine-tuning RockyEmbed on the MS MARCO dataset. It is designed for efficient real-world retrieval tasks such as semantic search, RAG (Retrieval-Augmented Generation), and question answering systems.


Overview

Modern embedding models often rely on large-scale architectures with billions of parameters. RockyEmbed-Marco takes a different approach:

Compact (~90M parameters)

Efficient (CPU-friendly inference)

Task-optimized (fine-tuned for retrieval)

Production-ready (designed for real-world RAG systems)

This model builds on RockyEmbed, which is pre-trained using distillation and optimized training strategies, and enhances it through contrastive learning on MS MARCO to improve retrieval quality.


Model Architecture

Base Model: RockyEmbed

Parameters: ~90M

Embedding Dimension: (add your dimension here, e.g., 768 or 1024)

Training Strategy:

Stage 1: Distillation-based pretraining

Stage 2: Contrastive fine-tuning (MS MARCO)


Training Details

Dataset

MS MARCO Passage Ranking Dataset

Large-scale dataset for training retrieval systems

Contains real-world queries and relevant passages

Objective Function

InfoNCE (Contrastive Loss)

The model learns to:

Pull semantically similar query-passage pairs closer

Push irrelevant pairs apart in embedding space
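The pull/push behaviour described above can be illustrated with a minimal, dependency-free sketch of InfoNCE using in-batch negatives. This is not the training code used for RockyEmbed-Marco: the function names, the temperature of 0.05, the use of cosine similarity, and the toy 2-D embeddings are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(query_embs, passage_embs, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives.

    query_embs[i] is paired with passage_embs[i]; every other passage in
    the batch is treated as a negative for query i.
    The temperature value is an assumption, not taken from the model card.
    """
    losses = []
    for i, q in enumerate(query_embs):
        logits = [cosine(q, p) / temperature for p in passage_embs]
        # Numerically stable log-sum-exp over all passages in the batch
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching passage (index i) as the target
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy 2-D embeddings, made up for illustration
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]    # each passage lies near its own query
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # each passage lies near the other query

loss_aligned = info_nce(queries, aligned)
loss_shuffled = info_nce(queries, shuffled)
print(loss_aligned, loss_shuffled)
```

The loss is close to zero when each query sits next to its own passage and grows large when the pairs are mismatched, which is the signal that drives relevant pairs together and irrelevant pairs apart during fine-tuning.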


Evaluation

RockyEmbed-Marco is evaluated on both benchmark datasets and real-world RAG scenarios:

MTEB, Quora subset (Massive Text Embedding Benchmark)

Main Score: 0.64

RAGAS evaluation, Quora subset (RAG-specific metrics)

Context Precision: 0.0583

Answer Correctness: 0.4717

These evaluations cover two aspects:

Retrieval capability relative to model size

Practical effectiveness in downstream RAG pipelines
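The card does not state which metric the MTEB main score refers to. For MTEB retrieval tasks it is typically nDCG@10, so the sketch below assumes that metric and shows how it is computed for a single query. The function names and the relevance labels are hypothetical, and the ideal ranking is built only from the retrieved list, which is a simplification that holds when every relevant document appears in that list.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k for one query.

    ranked_relevances holds the relevance label of each retrieved document,
    in the order the model ranked them (1 = relevant, 0 = not relevant).
    Simplification: the ideal ranking is derived from the same list.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# Hypothetical labels: relevant documents ranked 2nd and 4th
imperfect = ndcg_at_k([0, 1, 0, 1])
# Hypothetical labels: relevant documents ranked 1st and 2nd
perfect = ndcg_at_k([1, 1, 0, 0])
print(imperfect, perfect)
```

A benchmark score is the mean of this per-query value over all evaluation queries, so it rewards placing relevant passages near the top of the ranking rather than merely retrieving them.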


Usage

Installation

pip install torch transformers sentence-transformers


Loading the Model

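from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub
model = SentenceTransformer("fermacsys/RockyEmbed_Marco")

# Encode a batch of sentences; returns one embedding vector per sentence
embeddings = model.encode([
    "What is contrastive learning?",
    "Explain retrieval augmented generation",
])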


Example: Semantic Search

from sentence_transformers import util

query = "What is RAG?"
documents = [
    "RAG stands for Retrieval-Augmented Generation.",
    "Transformers are deep learning models.",
    "Contrastive learning improves embeddings.",
]

# Encode the query and the documents as tensors
query_emb = model.encode(query, convert_to_tensor=True)
doc_emb = model.encode(documents, convert_to_tensor=True)

# Cosine similarity between the query and each document (shape: 1 x 3)
scores = util.cos_sim(query_emb, doc_emb)

print(scores)
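The example above prints a raw similarity matrix. In a search or RAG pipeline the usual next step is to rank the documents and keep the best few. The helper below is a minimal sketch of that step: `top_k` is a hypothetical name, and the scores are made-up values standing in for `scores[0].tolist()` from the example above so that the snippet runs without downloading the model.

```python
def top_k(scores, documents, k=2):
    """Return the k (score, document) pairs with the highest scores."""
    ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]

documents = [
    "RAG stands for Retrieval-Augmented Generation.",
    "Transformers are deep learning models.",
    "Contrastive learning improves embeddings.",
]

# Hypothetical similarity scores, made up for illustration;
# with a loaded model these would come from scores[0].tolist()
example_scores = [0.71, 0.12, 0.35]

results = top_k(example_scores, documents)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

In a RAG pipeline, the passages returned by such a helper would be the context passed to the generator.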


Use Cases

🔍 Semantic Search

📚 Document Retrieval

🤖 Retrieval-Augmented Generation (RAG)

💬 Question Answering Systems

🧠 Embedding-based Clustering


Design Philosophy

RockyEmbed-Marco is built with the following principles:

Efficiency over scale → smaller models, competitive performance

Practicality → optimized for real-world pipelines

Stability → improved training techniques to avoid gradient issues

Accessibility → usable on limited hardware (CPU-friendly)


Key Insights

Contrastive fine-tuning significantly improves retrieval quality

Smaller models can compete with larger ones when trained effectively

Evaluating on RAG tasks is essential; benchmark scores alone are not enough


Future Work

Multi-domain fine-tuning

Hard-negative mining improvements

Multilingual support

Integration with lightweight LLM pipelines


Contributing

Contributions, ideas, and improvements are welcome. Feel free to open issues or submit pull requests.


License

(Add your license here, e.g., MIT / Apache 2.0)


Contact

Pranav Upadhyaya

📧 pranavupadhyaya52@gmail.com

🔗 ORCID: https://orcid.org/0009-0008-8887-4349

