add task relevance note below results table
README.md
CHANGED
@@ -62,6 +62,8 @@ Results on the [CoIR benchmark](https://github.com/CoIR-team/coir) (NDCG@10, `mt
 | potion-retrieval-32M | 32M | 32.10 | 4.22 | 31.80 | 36.71 | 45.11 | 38.64 | 29.97 | 32.62 | 8.70 | 56.26 | 36.93 |
 | potion-base-32M | 32M | 31.42 | 3.37 | 29.58 | 34.77 | 42.69 | 37.88 | 28.51 | 30.55 | 14.61 | 53.36 | 38.88 |
 
+Not all CoIR tasks are equally relevant for the primary use case of retrieving code context given a natural language query. **CosQA** and **CodeFeedback (ST/MT)** are the most representative: they match a developer-style NL query against a code corpus with a genuine semantic gap. **COIRCodeSearchNetRetrieval** goes in the opposite direction (code query → text), and the **CodeTransOcean** tasks cover cross-language translation, a distinct problem. The hybrid row uses min-max score normalization with equal weighting (alpha=0.5) between dense and BM25 retrieval.
+
 ## Model Details
 
 | Property | Value |
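The hybrid scoring mentioned in the added note can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes each retriever produces one score per candidate document for a query, and the function names (`minmax`, `hybrid_scores`) are hypothetical.

```python
def minmax(scores):
    """Min-max normalize a list of scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking signal from this retriever.
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(dense, bm25, alpha=0.5):
    """Combine dense and BM25 scores after min-max normalization.

    alpha=0.5 gives equal weighting, as described in the note.
    """
    d = minmax(dense)
    b = minmax(bm25)
    return [alpha * ds + (1 - alpha) * bs for ds, bs in zip(d, b)]
```

Normalizing before mixing matters because raw BM25 scores and dense cosine similarities live on different scales; min-max puts both on [0, 1] per query before the weighted sum.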
|