faisalmumtaz committed · Commit 7ba274d (verified) · 1 Parent(s): 1ccb9d7

Upload CodeCompass-Embed v2 — #1 on CSN-Python (NDCG@10=0.979), 12-task CoIR eval

Files changed (1)
  1. README.md +17 -19
README.md CHANGED
@@ -52,13 +52,11 @@ model-index:
 
  ## Model Highlights
 
- - 🏆 **#1 on CodeSearchNet-Python** — NDCG@10 = 0.979, beating SFR-Embedding-Code (0.951) by +2.9%
- - 🥇 **#1 on CodeTrans-DL** — Code translation between deep learning frameworks
- - **494M parameters**, 896-dim embeddings runs on consumer GPUs
- - 🔄 **Bidirectional attention** (converted from causal LLM)
- - 🎯 **Mean pooling** with L2 normalization
- - 📏 Trained at 512 tokens, extrapolates to longer sequences via RoPE
- - 🌐 **Multi-language**: Python, Java, JavaScript, Go, Ruby, PHP
+ - **Code search from natural language** — find relevant code snippets across Python, Java, JavaScript, Go, Ruby, PHP
+ - **Competitive with models 3× smaller and larger** — 494M params, 896-dim embeddings
+ - **Bidirectional attention** — all 24 layers converted from causal for better embedding quality
+ - **Lightweight** — runs on consumer GPUs, trained at 512 tokens with RoPE extrapolation for longer inputs
+ - **Versatile** — supports NL→Code, Code→Code, Q&A, and Text→SQL retrieval via instruction templates
 
  ## Model Details
 
@@ -76,16 +74,16 @@ model-index:
 
  Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025). All scores are NDCG@10. Sorted by CSN-Python.
 
- | Model | Params | CSN-Py | CodeTrans | Text2SQL | SO-QA | CodeFeedback | Apps |
- |-------|--------|--------|-----------|----------|-------|--------------|------|
- | **CodeCompass-Embed (ours)** | **494M** | **0.979** 🏆 | **0.286** 🏆 | **0.736** | **0.834** | **0.814** | **0.349** |
- | SFR-Embedding-Code | 400M | 0.951 | 0.268 | 0.995 | 0.911 | 0.726 | 0.221 |
- | Jina-Code-v2 | 161M | 0.944 | 0.274 | 0.517 | 0.887 | 0.698 | 0.154 |
- | CodeRankEmbed | 137M | 0.938 | 0.260 | 0.769 | 0.899 | 0.717 | 0.199 |
- | Snowflake-Arctic-Embed-L | 568M | 0.915 | 0.196 | 0.540 | 0.872 | 0.650 | 0.144 |
- | BGE-M3 | 568M | 0.898 | 0.219 | 0.573 | 0.850 | 0.644 | 0.145 |
- | BGE-Base-en-v1.5 | 109M | 0.894 | 0.213 | 0.527 | 0.858 | 0.642 | 0.142 |
- | CodeT5+-110M | 110M | 0.870 | 0.179 | 0.328 | 0.815 | 0.580 | 0.118 |
+ | Model | Params | CSN-Py | CodeTrans | Text2SQL | SO-QA | CodeFeedback | Apps | Avg |
+ |-------|--------|--------|-----------|----------|-------|--------------|------|-----|
+ | CodeCompass-Embed (ours) | 494M | **0.979** | **0.286** | 0.736 | 0.834 | **0.814** | **0.349** | 0.666 |
+ | SFR-Embedding-Code | 400M | 0.951 | 0.268 | **0.995** | **0.911** | 0.726 | 0.221 | **0.679** |
+ | Jina-Code-v2 | 161M | 0.944 | 0.274 | 0.517 | 0.887 | 0.698 | 0.154 | 0.579 |
+ | CodeRankEmbed | 137M | 0.938 | 0.260 | 0.769 | 0.899 | 0.717 | 0.199 | 0.630 |
+ | Snowflake-Arctic-Embed-L | 568M | 0.915 | 0.196 | 0.540 | 0.872 | 0.650 | 0.144 | 0.553 |
+ | BGE-M3 | 568M | 0.898 | 0.219 | 0.573 | 0.850 | 0.644 | 0.145 | 0.555 |
+ | BGE-Base-en-v1.5 | 109M | 0.894 | 0.213 | 0.527 | 0.858 | 0.642 | 0.142 | 0.546 |
+ | CodeT5+-110M | 110M | 0.870 | 0.179 | 0.328 | 0.815 | 0.580 | 0.118 | 0.482 |
 
  ### Multi-Language Code Search (CodeSearchNet)
 
@@ -102,7 +100,7 @@ Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025).
 
  | Task | NDCG@10 | MRR@10 |
  |------|---------|--------|
- | **codesearchnet-python** | **0.979** 🏆 | **0.976** |
+ | codesearchnet-python | 0.979 | 0.976 |
  | stackoverflow-qa | 0.834 | 0.810 |
  | codefeedback-st | 0.814 | 0.775 |
  | codesearchnet-go | 0.797 | 0.767 |
@@ -112,7 +110,7 @@ Evaluated on the [CoIR Benchmark](https://github.com/CoIR-team/coir) (ACL 2025).
  | codesearchnet-javascript | 0.621 | 0.578 |
  | codesearchnet-ruby | 0.579 | 0.535 |
  | apps | 0.349 | 0.307 |
- | codetrans-dl | 0.286 🏆 | 0.164 |
+ | codetrans-dl | 0.286 | 0.164 |
  | cosqa | 0.209 | 0.165 |
  | **Average (12 tasks)** | **0.623** | **0.577** |
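The card's highlights describe mean pooling with L2 normalization over the model's token embeddings. A minimal NumPy sketch of that pooling step, assuming standard array shapes and a 0/1 attention-mask convention (not taken from this repo's actual code):

```python
import numpy as np

def mean_pool_l2(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean-pool token embeddings over non-padding positions, then L2-normalize.

    hidden_states:  (batch, seq_len, dim) final-layer token embeddings
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(axis=1)                   # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                # guard against all-pad rows
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)                   # unit-length embeddings

# Toy check: two real tokens plus one padding token that must be ignored
h = np.array([[[1.0, 0.0], [3.0, 0.0], [99.0, 99.0]]])  # (1, 3, 2)
m = np.array([[1, 1, 0]])
emb = mean_pool_l2(h, m)  # mean of [1,0] and [3,0] is [2,0], normalized to [1,0]
```

Because the outputs are unit-length, cosine similarity between a query and a candidate snippet reduces to a plain dot product, which is the usual scoring step in dense retrieval setups like this one.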
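All benchmark scores above are NDCG@10, with MRR@10 alongside it in the per-task table. For reference, a generic sketch of both metrics over one query's ranked results (a simplification, not the CoIR evaluation harness: ideal DCG here is computed from the retrieved list only, whereas a full evaluation uses all judged relevant documents):

```python
import math

def ndcg_at_k(rels, k=10):
    """NDCG@k for one query. `rels` holds the relevance of each retrieved doc,
    in ranked order (0/1 for binary judgments); DCG discounts by log2(rank+1)."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(sorted(rels, reverse=True)[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def mrr_at_k(rels, k=10):
    """Reciprocal rank of the first relevant doc within the top k, else 0."""
    for i, r in enumerate(rels[:k]):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0

# One relevant doc: at rank 1 both metrics are 1.0; at rank 2,
# NDCG@10 = 1/log2(3) ~= 0.631 while MRR@10 = 0.5
print(ndcg_at_k([0, 1, 0, 0]), mrr_at_k([0, 1, 0, 0]))
```

The reported per-task numbers are these quantities averaged over every query in the task's test set.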