How to use fermacsys/RockyEmbed_Marco with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("feature-extraction", model="fermacsys/RockyEmbed_Marco", trust_remote_code=True)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("fermacsys/RockyEmbed_Marco", trust_remote_code=True, dtype="auto")
RockyEmbed-Marco
RockyEmbed-Marco is a lightweight, high-performance text embedding model built by contrastively fine-tuning RockyEmbed on the MS MARCO dataset. It is designed for efficient real-world retrieval tasks such as semantic search, RAG (Retrieval-Augmented Generation), and question answering systems.
Overview
Modern embedding models often rely on large-scale architectures with billions of parameters. RockyEmbed-Marco takes a different approach:
Compact (~90M parameters)
Efficient (CPU-friendly inference)
Task-optimized (fine-tuned for retrieval)
Production-ready (designed for real-world RAG systems)
This model builds on RockyEmbed, which is pre-trained using distillation and optimized training strategies, and enhances it through contrastive learning on MS MARCO to improve retrieval quality.
Model Architecture
Base Model: RockyEmbed
Parameters: ~90M
Embedding Dimension: (add your dimension here, e.g., 768 or 1024)
Training Strategy:
Stage 1: Distillation-based pretraining
Stage 2: Contrastive fine-tuning (MS MARCO)
Training Details
Dataset
MS MARCO Passage Ranking Dataset
Large-scale dataset for training retrieval systems
Contains real-world queries and relevant passages
Objective Function
InfoNCE (Contrastive Loss)
The model learns to:
Pull semantically similar query-passage pairs closer
Push irrelevant pairs apart in embedding space
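The pull/push behaviour described above can be sketched as an InfoNCE loss with in-batch negatives. This is a minimal pure-Python illustration, not the training code used for RockyEmbed-Marco: the function name `info_nce`, the temperature of 0.05, and the use of the other passages in the batch as negatives are all assumptions, since the card does not say how negatives or the temperature were chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, passages, temperature=0.05):
    """Mean InfoNCE loss where queries[i] is paired with passages[i].

    Every other passage in the batch is treated as a negative for query i
    (an assumption made for this sketch).
    """
    losses = []
    for i, query in enumerate(queries):
        logits = [cosine(query, passage) / temperature for passage in passages]
        # Numerically stable log-sum-exp over all passages in the batch.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Negative log-probability of picking the paired passage.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D "embeddings": aligned pairs versus deliberately swapped pairs.
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
print(info_nce(toy_queries, [[1.0, 0.0], [0.0, 1.0]]))  # matched pairs
print(info_nce(toy_queries, [[0.0, 1.0], [1.0, 0.0]]))  # swapped pairs
```

The loss is small when each query is closest to its own passage and grows when a different passage in the batch is closer, which is the behaviour the two bullet points describe.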
Evaluation
RockyEmbed-Marco is evaluated on both benchmark datasets and real-world RAG scenarios:
MTEB (Massive Text Embedding Benchmark), Quora subset
Main Score: 0.64
RAGAS evaluation (RAG-specific metrics), Quora subset
Context Precision: 0.0583
Answer Correctness: 0.4717
These evaluations demonstrate:
Strong retrieval capability relative to model size
Practical effectiveness in downstream RAG pipelines
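As a reading aid for the Context Precision figure reported in this section, the sketch below follows the rank-weighted formulation commonly given for that metric: the mean of precision@k over the ranks that hold a relevant chunk. The function name `context_precision` and the hand-written relevance flags are illustrative assumptions. RAGAS derives its relevance verdicts with an LLM judge, and this is not its implementation.

```python
def context_precision(relevance):
    """Rank-weighted precision for one query.

    relevance: list of 0/1 flags, one per retrieved chunk, in rank order
    (1 means the chunk was judged relevant to the question).
    """
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    weighted_sum = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # precision@k, counted only at ranks that hold a relevant chunk
            weighted_sum += hits / k
    return weighted_sum / total_relevant


# Hand-made flags: the same single relevant chunk at rank 1 versus rank 3.
print(context_precision([1, 0, 0]))
print(context_precision([0, 0, 1]))
```

Under this formulation the score rewards placing relevant chunks early in the ranking, so it reflects ordering quality as well as whether relevant chunks were retrieved at all.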
Usage
Installation
pip install torch transformers sentence-transformers
Loading the Model
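from sentence_transformers import SentenceTransformer
# Hub model ID; trust_remote_code=True matches the Transformers loading instructions for this model
model = SentenceTransformer("fermacsys/RockyEmbed_Marco", trust_remote_code=True)
embeddings = model.encode([
    "What is contrastive learning?",
    "Explain retrieval augmented generation",
])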
Example: Semantic Search
from sentence_transformers import util
query = "What is RAG?"
documents = [
    "RAG stands for Retrieval-Augmented Generation.",
    "Transformers are deep learning models.",
    "Contrastive learning improves embeddings.",
]
query_emb = model.encode(query, convert_to_tensor=True)
doc_emb = model.encode(documents, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)
print(scores)
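`util.cos_sim` returns a score matrix with one row per query and one column per document, so a final ranking step is needed to turn it into search results. Here is a minimal sketch in plain Python, assuming the scores for one query have been converted to a list of floats (for example via `scores[0].tolist()`). The helper name `rank_documents` and the scores in the demo are made up for illustration and are not outputs of the model.

```python
def rank_documents(scores, documents, top_k=3):
    """Return the top_k (document, score) pairs, highest score first."""
    if len(scores) != len(documents):
        raise ValueError("scores and documents must have the same length")
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Placeholder scores for illustration only (not produced by RockyEmbed-Marco).
demo_documents = [
    "RAG stands for Retrieval-Augmented Generation.",
    "Transformers are deep learning models.",
    "Contrastive learning improves embeddings.",
]
demo_scores = [0.8, 0.2, 0.3]

for document, score in rank_documents(demo_scores, demo_documents, top_k=2):
    print(f"{score:.2f}  {document}")
```

In a RAG pipeline, the top-ranked passages from a step like this are what get passed to the generator as context.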
Use Cases
🔍 Semantic Search
📚 Document Retrieval
🤖 Retrieval-Augmented Generation (RAG)
💬 Question Answering Systems
🧠 Embedding-based Clustering
Design Philosophy
RockyEmbed-Marco is built with the following principles:
Efficiency over scale → smaller models, competitive performance
Practicality → optimized for real-world pipelines
Stability → improved training techniques to avoid gradient issues
Accessibility → usable on limited hardware (CPU-friendly)
Key Insights
Contrastive fine-tuning significantly improves retrieval quality
Smaller models can compete with larger ones when trained effectively
Evaluation on RAG tasks is essential, not just on standard benchmarks
Future Work
Multi-domain fine-tuning
Hard-negative mining improvements
Multilingual support
Integration with lightweight LLM pipelines
Contributing
Contributions, ideas, and improvements are welcome. Feel free to open issues or submit pull requests.
License
(Add your license here, e.g., MIT / Apache 2.0)
Contact
Pranav Upadhyaya
📧 pranavupadhyaya52@gmail.com
🔗 ORCID: https://orcid.org/0009-0008-8887-4349
Model tree for fermacsys/RockyEmbed_Marco
Base model
fermacsys/rocky-embed