baa.ai · Merino-Large-v2

One model that does both halves of RAG retrieval — bi-encoder embedding and cross-encoder reranking — over a single shared word-embedding table. A 1024-dimensional multilingual model, ~872M parameters, by BAA AI (Black Sheep AI).

Get the optimal model for your data

Merino-Large-v2 is a strong, cost-efficient default. But the best embedder + reranker is corpus-specific — the ideal choice depends on your documents and your notion of relevance. baa.ai offers exclusive tooling that identifies the optimal embedding and reranking models for your specific data, so you ship the smallest models that maximize document recovery on your corpus. For a tailored recommendation, reach out to baa.ai.

What it is

A two-role retrieval model built around a shared input word-embedding matrix. The bi-encoder embedder and the large cross-encoder reranker are both built on the xlm-roberta-large backbone, so the word-embedding table is stored once and injected into the reranker at load time. The result is a smaller download with no measured quality loss and no retraining.
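The storage saving can be illustrated with a toy checkpoint. This is a sketch, not the loader shipped in this repo: the key name `embeddings.word_embeddings.weight` and the helpers `pack`, `load_reranker`, and `shared_table_params` are hypothetical, plain lists stand in for tensors, and the toy uses identical tables in both roles for simplicity. The size estimate assumes the stock xlm-roberta-large vocabulary of 250,002 tokens at hidden size 1024.

```python
# Hypothetical key under which each role keeps its word-embedding table.
WORD_EMB = "embeddings.word_embeddings.weight"


def pack(embedder_state, reranker_state):
    """Build a checkpoint that stores the word-embedding table a single time."""
    # Drop the reranker's copy; the embedder's copy will serve both roles.
    slim_reranker = {k: v for k, v in reranker_state.items() if k != WORD_EMB}
    return {"embedder": dict(embedder_state), "reranker": slim_reranker}


def load_reranker(checkpoint):
    """Rebuild a full reranker state by injecting the shared table at load."""
    state = dict(checkpoint["reranker"])
    state[WORD_EMB] = checkpoint["embedder"][WORD_EMB]
    return state


def shared_table_params(vocab_size=250_002, hidden=1024):
    """Parameters saved by storing one table instead of two (assumed sizes)."""
    return vocab_size * hidden


# Toy 3-token, 2-dimensional table standing in for the real one.
table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
embedder = {WORD_EMB: table, "encoder.layer.0.weight": [[1.0]]}
reranker = {WORD_EMB: table, "classifier.weight": [[2.0]]}

ckpt = pack(embedder, reranker)
restored = load_reranker(ckpt)
print(WORD_EMB in ckpt["reranker"])  # is the table stored with the reranker?
print(restored == reranker)          # does injection restore the full state?
print(shared_table_params())         # parameters not duplicated on disk
```

Under those assumed sizes the shared table is roughly 256M parameters, which is the portion of the download that is not duplicated.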

Embed role: bi-encoder, 1024-d, L2-normalized. Prepend "query: " to queries.
Rerank role: cross-encoder, single relevance logit per (query, document) pair.
Router: call .embed(...) or .rerank(...).
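Because the embed role returns L2-normalized vectors, cosine similarity between a query and a document reduces to a plain dot product. A minimal sketch, using hand-written 3-d vectors in place of the 1024-d output; `l2_normalize`, `dot`, and `rank_by_dot` are illustrative helpers, not part of this repo's API.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot(query_vec, doc_vecs):
    """Return document indices sorted by similarity to the query, best first."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


q = l2_normalize([1.0, 2.0, 0.0])
docs = [l2_normalize(v) for v in ([0.0, 0.0, 3.0], [2.0, 4.0, 0.0], [1.0, 0.0, 0.0])]
order = rank_by_dot(q, docs)
print(order)
```

The second document points in the same direction as the query and ranks first; the first is orthogonal to it and ranks last.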

Usage

from modeling_baa import BaaEmbeddingReranker   # included in this repo

m = BaaEmbeddingReranker("baa-ai/Merino-Large-v2")
query = "how does a cross-encoder reranker work?"

# Embed role: one 1024-d vector per input text; is_query=True flags query text.
qv = m.embed([query], is_query=True)[0]
dv = m.embed(["a cross-encoder scores a (query, document) pair jointly",
              "bi-encoders embed query and document separately for fast retrieval"])

# Rerank role: one relevance score per (query, document) pair.
ranked = m.rerank(query,
                  ["a cross-encoder scores a (query, document) pair jointly",
                   "the mitochondria is the powerhouse of the cell"])
# -> [(doc, score), ...] sorted best-first
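
The two roles are usually chained: the bi-encoder narrows a large corpus to a shortlist, and the cross-encoder rescores only that shortlist. The sketch below assumes `embed` returns one vector per input text and `rerank` returns (doc, score) pairs sorted best-first, as in the usage above. `retrieve_then_rerank` and its `shortlist` parameter are hypothetical, and the two roles are passed in as callables so the function can be exercised without loading weights.

```python
def retrieve_then_rerank(query, docs, embed, rerank, shortlist=20):
    """Two-stage retrieval: dot-product shortlist, then cross-encoder rescoring.

    embed(texts, is_query=False) -> one vector per text (assumed L2-normalized)
    rerank(query, docs)          -> list of (doc, score), best first
    """
    query_vec = embed([query], is_query=True)[0]
    doc_vecs = embed(docs)
    # With unit-length vectors the dot product is the cosine similarity.
    sims = [sum(float(q) * float(d) for q, d in zip(query_vec, vec))
            for vec in doc_vecs]
    # Keep the indices of the most similar documents, best first.
    keep = sorted(range(len(docs)), key=lambda i: sims[i], reverse=True)[:shortlist]
    return rerank(query, [docs[i] for i in keep])
```

With the model loaded as `m`, a call might look like `retrieve_then_rerank("some question", corpus, m.embed, m.rerank)`, where `corpus` is a hypothetical list of document strings.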

Specs


Embedding dim	1024
Parameters	~872M (embedder + reranker, shared word-embedding table)
Languages	multilingual
Max sequence length	512 tokens
Hardware	CPU / edge / GPU
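
A document longer than the maximum sequence length does not fit in a single pass, so one common workaround is to split it into overlapping windows and embed or rerank each window separately. The sketch below is tokenizer-agnostic and works on any list of tokens; `windows` and its `overlap` parameter are hypothetical. In practice the budget would need to be smaller than 512 to leave room for special tokens and, in the rerank role, for the query.

```python
def windows(tokens, max_len=512, overlap=64):
    """Split a token sequence into overlapping windows of at most max_len."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be in [0, max_len)")
    step = max_len - overlap
    out = []
    for start in range(0, len(tokens), step):
        out.append(tokens[start:start + max_len])
        # Stop once a window reaches the end of the sequence.
        if start + max_len >= len(tokens):
            break
    return out


chunks = windows(list(range(1000)))
print([len(c) for c in chunks])
```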

License & attribution

BAA Contributions (shared-embedding architecture, router/loader code, packaging, weights, docs) are proprietary to BAA AI (Black Sheep AI) — see LICENSE.
Incorporates the xlm-roberta-large backbone under the MIT License — see LICENSE-xlm-roberta-large.txt.
