---
license: mit
language:
- en
- code
tags:
- code-search
- embeddings
- onnx
- sentence-similarity
- cqs
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed (ONNX export)

ONNX export of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a 137M-parameter code search embedder built on `Snowflake/snowflake-arctic-embed-m-long`. Exported for use with [cqs](https://github.com/jamie8johnson/cqs)'s ONNX Runtime embedding pipeline; no PyTorch dependency required.

This is a faithful conversion of the upstream weights — no fine-tuning, no quantization. License and behavior match the upstream model.

## Specs

- **Base:** `nomic-ai/CodeRankEmbed` (137M params, 768-dim, 8192 max seq)
- **Format:** ONNX (FP32)
- **Pooling:** Mean
- **Query prefix:** `Represent this query for searching relevant code: ` (required — see the sketch below and the usage section)
- **Document prefix:** none
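
The prefix convention, concretely: queries get the fixed prefix, documents never do. A minimal sketch; the helper names are illustrative, not part of any API:

```python
QUERY_PREFIX = "Represent this query for searching relevant code: "

def format_query(text: str) -> str:
    # Queries must carry the prefix; the upstream card marks it as required.
    return QUERY_PREFIX + text

def format_document(text: str) -> str:
    # Documents (code) are embedded as-is, with no prefix.
    return text
```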

## Production Eval (cqs v3.v2 fixture, 2026-05-01)

Run against cqs's production fixture (218 queries: 109 test + 109 dev) on the cqs codebase itself. Numbers are with cqs's full hybrid-search stack (dense + FTS + SPLADE blend, name-boost, type-boost, MMR off). Bold cells mark the best score in each row:

| split | metric | BGE-large (1024-dim) | **CodeRankEmbed (768-dim)** | v9-200k (768-dim) |
|-------|--------|---------------------:|----------------------------:|------------------:|
| test  | R@1    | 43.1%                | 42.2%                        | **45.9%**         |
| test  | R@5    | 69.7%                | 67.9%                        | **70.6%**         |
| test  | R@20   | **83.5%**            | 79.8%                        | 80.7%             |
| dev   | R@1    | 45.9%                | **47.7%**                    | 46.8%             |
| dev   | R@5    | **77.1%**            | 69.7%                        | 68.8%             |
| dev   | R@20   | **86.2%**            | 81.7%                        | 81.7%             |

**Verdict:** CodeRankEmbed edges out BGE-large on dev R@1, stays close on the test split, and trails on dev R@5/R@20. It is the best fit when you want a code-specialist embedder at well under half of BGE-large's parameter count without giving up too much on diverse natural-language queries. cqs ships it as an opt-in preset (not the default): set `CQS_EMBEDDING_MODEL=nomic-coderank` or use `cqs slot create coderank --model nomic-coderank`.

## Usage

### With cqs

```bash
# Full reindex with this model
export CQS_EMBEDDING_MODEL=nomic-coderank
cqs index --force

# Or, for slot-based comparisons:
cqs slot create coderank --model nomic-coderank
cqs index --slot coderank --force
```

cqs handles the query-prefix wiring automatically. Documents are encoded without a prefix per the upstream convention.

### Direct ONNX

```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

# Fetch the exported graph from this repo and open an inference session.
model_path = hf_hub_download("jamie8johnson/CodeRankEmbed-onnx", "model.onnx")
session = ort.InferenceSession(model_path)
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/CodeRankEmbed")

# Query prefix is REQUIRED
query = "Represent this query for searching relevant code: find functions that validate email addresses"
code = "def validate_email(addr): ..."  # no prefix on documents

q_inputs = tokenizer(query, return_tensors="np", padding=True, truncation=True, max_length=8192)
# First output is the token-level hidden states: (batch, seq_len, 768).
q_hidden = session.run(None, dict(q_inputs))[0]

# Mean-pool over the token dimension (mask-aware), then L2-normalize for cosine similarity.
mask = q_inputs["attention_mask"][..., None].astype(np.float32)
q_emb = (q_hidden * mask).sum(axis=1) / mask.sum(axis=1)
q_emb /= np.linalg.norm(q_emb, axis=1, keepdims=True)
```
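
To score `code` against the query, embed it the same way (no prefix) and take a dot product of the unit vectors. A minimal continuation of the sketch above; the `embed` helper is illustrative and just packages the tokenize, run, pool, normalize steps:

```python
def embed(texts):
    # Tokenize -> run ONNX graph -> mask-aware mean-pool -> L2-normalize.
    enc = tokenizer(texts, return_tensors="np", padding=True, truncation=True, max_length=8192)
    hidden = session.run(None, dict(enc))[0]
    mask = enc["attention_mask"][..., None].astype(np.float32)
    pooled = (hidden * mask).sum(axis=1) / mask.sum(axis=1)
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

q_emb = embed([query])           # prefixed query
d_emb = embed([code])            # raw document, no prefix
score = float(q_emb @ d_emb.T)   # cosine similarity (both unit-norm)
```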

## License

MIT, inherited from the upstream `nomic-ai/CodeRankEmbed` model.

## Citation

Please cite the upstream model:

```bibtex
@misc{nomic-coderank-embed,
  author    = {Nomic AI},
  title     = {CodeRankEmbed},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```