DREAM-0.5B

DREAM-0.5B is a LoRA adapter for dense retrieval embeddings, trained with autoregressive language-model supervision. It was introduced in the paper "DREAM: Dense Retrieval Embeddings via Autoregressive Modeling".

This repository contains the PEFT adapter only. Load it together with the base model: Qwen/Qwen2.5-0.5B.

The official code repository is available at yixuantt/DREAM.

How to Run Inference

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

model_id = "yixuantt/DREAM-0.5B"
base_id = "Qwen/Qwen2.5-0.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, model_id)
model.eval()

@torch.no_grad()
def encode(texts, max_length=512):
    inputs = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    ).to(model.device)
    outputs = model(**inputs, output_hidden_states=True, use_cache=False)
    hidden = outputs.hidden_states[-1]
    # Pool the last non-padding token. This works for both left and right padding.
    last_idx = inputs["attention_mask"].size(1) - 1 - inputs["attention_mask"].flip(dims=[1]).argmax(dim=1)
    emb = hidden[torch.arange(hidden.size(0), device=hidden.device), last_idx]
    return F.normalize(emb.float(), p=2, dim=-1)

queries = encode(["What is DREAM?"])
docs = encode(["DREAM trains dense retrievers with autoregressive supervision."])
scores = queries @ docs.T
print(scores)
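
The comment in `encode` says the last-token index works for both left and right padding. The sketch below mirrors that index arithmetic on toy attention masks, in plain Python rather than PyTorch so it runs without downloading the model. The helper name `last_token_index` and the example masks are illustrative assumptions, not part of the DREAM code.

```python
def last_token_index(mask):
    """Return the position of the last 1 in a 0/1 attention-mask row.

    Hypothetical helper: mirrors, for a single row, the expression
    `mask.size(1) - 1 - mask.flip(dims=[1]).argmax(dim=1)` used in `encode`.
    """
    flipped = mask[::-1]
    # Like argmax, pick the first position that holds the maximum value.
    offset = max(range(len(flipped)), key=flipped.__getitem__)
    return len(mask) - 1 - offset


right_padded = [1, 1, 1, 0, 0]  # real tokens first, padding at the end
left_padded = [0, 0, 1, 1, 1]   # padding first, real tokens at the end

print(last_token_index(right_padded))  # 2
print(last_token_index(left_padded))   # 4
```

In both cases the index lands on the final real token, which is the position whose hidden state `encode` pools. In this sketch a row with no real tokens resolves to the last position, which is a padding token, so it is probably safest to filter out empty inputs before encoding.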