faisalmumtaz committed · Commit 7ba274d (verified) · 1 Parent(s): 1ccb9d7

Upload CodeCompass-Embed v2 — #1 on CSN-Python (NDCG@10=0.979), 12-task CoIR eval

Files changed (1)
  1. README.md +17 -19
README.md CHANGED
@@ -52,13 +52,11 @@ model-index:
 
  ## Model Highlights
 
- - 🏆 **#1 on CodeSearchNet-Python** — NDCG@10 = 0.979, beating SFR-Embedding-Code (0.951) by +2.9%
- - 🥇 **#1 on CodeTrans-DL** — Code translation between deep learning frameworks
- - **494M parameters**, 896-dim embeddings runs on consumer GPUs
- - 🔄 **Bidirectional attention** (converted from causal LLM)
- - 🎯 **Mean pooling** with L2 normalization
- - 📏 Trained at 512 tokens, extrapolates to longer sequences via RoPE
- - 🌐 **Multi-language**: Python, Java, JavaScript, Go, Ruby, PHP
+ - **Code search from natural language** — find relevant code snippets across Python, Java, JavaScript, Go, Ruby, PHP
+ - **Competitive with models 3× smaller and larger** — 494M params, 896-dim embeddings
+ - **Bidirectional attention** — all 24 layers converted from causal for better embedding quality
+ - **Lightweight** — runs on consumer GPUs, trained at 512 tokens with RoPE extrapolation for longer inputs
+ - **Versatile** — supports NL→Code, Code→Code, Q&A, and Text→SQL retrieval via instruction templates
 
  ## Model Details
 
@@ -76,16 +74,16 @@ model-index:
 
  Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025). All scores are NDCG@10. Sorted by CSN-Python.
 
- | Model | Params | CSN-Py | CodeTrans | Text2SQL | SO-QA | CodeFeedback | Apps |
- |-------|--------|--------|-----------|----------|-------|--------------|------|
- | **CodeCompass-Embed (ours)** | **494M** | **0.979** 🏆 | **0.286** 🏆 | **0.736** | **0.834** | **0.814** | **0.349** |
- | SFR-Embedding-Code | 400M | 0.951 | 0.268 | 0.995 | 0.911 | 0.726 | 0.221 |
- | Jina-Code-v2 | 161M | 0.944 | 0.274 | 0.517 | 0.887 | 0.698 | 0.154 |
- | CodeRankEmbed | 137M | 0.938 | 0.260 | 0.769 | 0.899 | 0.717 | 0.199 |
- | Snowflake-Arctic-Embed-L | 568M | 0.915 | 0.196 | 0.540 | 0.872 | 0.650 | 0.144 |
- | BGE-M3 | 568M | 0.898 | 0.219 | 0.573 | 0.850 | 0.644 | 0.145 |
- | BGE-Base-en-v1.5 | 109M | 0.894 | 0.213 | 0.527 | 0.858 | 0.642 | 0.142 |
- | CodeT5+-110M | 110M | 0.870 | 0.179 | 0.328 | 0.815 | 0.580 | 0.118 |
+ | Model | Params | CSN-Py | CodeTrans | Text2SQL | SO-QA | CodeFeedback | Apps | Avg |
+ |-------|--------|--------|-----------|----------|-------|--------------|------|-----|
+ | CodeCompass-Embed (ours) | 494M | **0.979** | **0.286** | 0.736 | 0.834 | **0.814** | **0.349** | 0.666 |
+ | SFR-Embedding-Code | 400M | 0.951 | 0.268 | **0.995** | **0.911** | 0.726 | 0.221 | **0.679** |
+ | Jina-Code-v2 | 161M | 0.944 | 0.274 | 0.517 | 0.887 | 0.698 | 0.154 | 0.579 |
+ | CodeRankEmbed | 137M | 0.938 | 0.260 | 0.769 | 0.899 | 0.717 | 0.199 | 0.630 |
+ | Snowflake-Arctic-Embed-L | 568M | 0.915 | 0.196 | 0.540 | 0.872 | 0.650 | 0.144 | 0.553 |
+ | BGE-M3 | 568M | 0.898 | 0.219 | 0.573 | 0.850 | 0.644 | 0.145 | 0.555 |
+ | BGE-Base-en-v1.5 | 109M | 0.894 | 0.213 | 0.527 | 0.858 | 0.642 | 0.142 | 0.546 |
+ | CodeT5+-110M | 110M | 0.870 | 0.179 | 0.328 | 0.815 | 0.580 | 0.118 | 0.482 |
 
  ### Multi-Language Code Search (CodeSearchNet)
 
@@ -102,7 +100,7 @@ Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025).
 
  | Task | NDCG@10 | MRR@10 |
  |------|---------|--------|
- | **codesearchnet-python** | **0.979** 🏆 | **0.976** |
+ | codesearchnet-python | 0.979 | 0.976 |
  | stackoverflow-qa | 0.834 | 0.810 |
  | codefeedback-st | 0.814 | 0.775 |
  | codesearchnet-go | 0.797 | 0.767 |
@@ -112,7 +110,7 @@ Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025).
  | codesearchnet-javascript | 0.621 | 0.578 |
  | codesearchnet-ruby | 0.579 | 0.535 |
  | apps | 0.349 | 0.307 |
- | codetrans-dl | 0.286 🏆 | 0.164 |
+ | codetrans-dl | 0.286 | 0.164 |
  | cosqa | 0.209 | 0.165 |
  | **Average (12 tasks)** | **0.623** | **0.577** |
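The card's highlights describe mean pooling with L2 normalization over the model's token embeddings. A minimal NumPy sketch of that pooling step, assuming standard array shapes and a 0/1 attention-mask convention (not taken from this repo's actual code):

```python
import numpy as np

def mean_pool_l2(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean-pool token embeddings over non-padding positions, then L2-normalize.

    hidden_states:  (batch, seq_len, dim) final-layer token embeddings
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(axis=1)                   # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                # guard against all-pad rows
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)                   # unit-length embeddings

# Toy check: two real tokens plus one padding token that must be ignored
h = np.array([[[1.0, 0.0], [3.0, 0.0], [99.0, 99.0]]])  # (1, 3, 2)
m = np.array([[1, 1, 0]])
emb = mean_pool_l2(h, m)  # mean of [1,0] and [3,0] is [2,0], normalized to [1,0]
```

Because the outputs are unit-length, cosine similarity between a query and a candidate snippet reduces to a plain dot product, which is the usual scoring step in dense retrieval setups like this one.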
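All benchmark scores above are NDCG@10, with MRR@10 alongside it in the per-task table. For reference, a generic sketch of both metrics over one query's ranked results (a simplification, not the CoIR evaluation harness: ideal DCG here is computed from the retrieved list only, whereas a full evaluation uses all judged relevant documents):

```python
import math

def ndcg_at_k(rels, k=10):
    """NDCG@k for one query. `rels` holds the relevance of each retrieved doc,
    in ranked order (0/1 for binary judgments); DCG discounts by log2(rank+1)."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(sorted(rels, reverse=True)[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def mrr_at_k(rels, k=10):
    """Reciprocal rank of the first relevant doc within the top k, else 0."""
    for i, r in enumerate(rels[:k]):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0

# One relevant doc: at rank 1 both metrics are 1.0; at rank 2,
# NDCG@10 = 1/log2(3) ~= 0.631 while MRR@10 = 0.5
print(ndcg_at_k([0, 1, 0, 0]), mrr_at_k([0, 1, 0, 0]))
```

The reported per-task numbers are these quantities averaged over every query in the task's test set.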