Checkpoints for "Improving Long-Context Retrieval with Multi-Prefix Embedding"
This repository contains model checkpoints and pre-computed embeddings for the anonymous submission "Improving Long-Context Retrieval with Multi-Prefix Embedding".
Repository Structure
models/
  fixed-64-epoch1/          # Ablation: fixed 64-token prefix length
  maxp-train-epoch1/        # Baseline: MaxP trained model
  nochunk-epoch1/           # Baseline: single-vector (no chunking)
  prand-32to1024-epoch1/    # Proposed: random prefix lengths (32-1024 tokens)
encode/
  browsecomp-plus/          # Pre-computed embeddings - BrowseComp-Plus
  longembed/                # Pre-computed embeddings - LongEmbed (2WikiMQA, NarrativeQA, QMSum, SummScreenFD)
  mldr-en/                  # Pre-computed embeddings - MLDR (English)
Each model folder contains a LoRA adapter (rank 16, alpha 64) fine-tuned from Qwen/Qwen3-Embedding-0.6B
for feature extraction, along with tokenizer files and a checkpoint-625/ subfolder with the
intermediate checkpoint at the end of epoch 1 (including optimizer state).
Usage
Load a LoRA adapter with PEFT:
from peft import PeftModel
from transformers import AutoModel

base = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
model = PeftModel.from_pretrained(
    base,
    "anonymoussubmission111/mpe-checkpoints",
    subfolder="models/prand-32to1024-epoch1",
)
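Once the adapter is loaded, texts can be turned into vectors. The sketch below is a minimal example, not the paper's encoding pipeline: it assumes last-token pooling followed by L2 normalization, which is the convention of the Qwen3-Embedding base model, and it embeds each text as a single vector. The multi-prefix procedure itself is not reproduced here. The helper names `last_token_pool` and `embed` and the `max_length` default are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def last_token_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Return the hidden state of the last non-padding token of each sequence."""
    # If every row ends with a real token, the batch is left-padded (or unpadded),
    # so the final position is the last token for all rows.
    if bool(attention_mask[:, -1].sum() == attention_mask.shape[0]):
        return hidden[:, -1]
    # Otherwise the batch is right-padded: index each row at (length - 1).
    last = attention_mask.sum(dim=1) - 1
    rows = torch.arange(hidden.shape[0], device=hidden.device)
    return hidden[rows, last]


def embed(model, tokenizer, texts, max_length=1024):
    """Encode a list of strings into L2-normalized vectors (assumed pooling, see above)."""
    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=max_length, return_tensors="pt"
    )
    device = next(model.parameters()).device
    batch = {name: tensor.to(device) for name, tensor in batch.items()}
    with torch.no_grad():
        out = model(**batch)
    pooled = last_token_pool(out.last_hidden_state, batch["attention_mask"])
    return F.normalize(pooled, p=2, dim=1)


# Usage (downloads the base tokenizer; `model` is the PeftModel loaded above):
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
#   vectors = embed(model, tokenizer, ["a query", "a long document ..."])
```

Because the vectors are normalized, a dot product between a query vector and a document vector gives their cosine similarity.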
Pre-computed embeddings in encode/ are stored as .pkl files (pickled numpy arrays)
and can be loaded directly to reproduce retrieval results without re-encoding.
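As a rough sketch of how such a file could be used, the snippet below loads a pickled array and ranks documents by cosine similarity. It assumes each .pkl file holds a single 2-D array of shape (n_items, dim); if a file instead stores a tuple or dict (for example, embeddings together with IDs), the loader needs to be adapted. The function names and any file paths are hypothetical and are not taken from the repository.

```python
import pickle

import numpy as np


def load_embeddings(path):
    """Load a pickled array, assumed to have shape (n_items, dim)."""
    # Only unpickle files from a source you trust: pickle can execute code.
    with open(path, "rb") as f:
        obj = pickle.load(f)
    return np.asarray(obj, dtype=np.float32)


def top_k(queries, docs, k=10):
    """Return (indices, scores) of the k most cosine-similar docs per query."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = q @ d.T                       # shape: (n_queries, n_docs)
    k = min(k, scores.shape[1])
    idx = np.argsort(-scores, axis=1)[:, :k]
    return idx, np.take_along_axis(scores, idx, axis=1)
```

With query and document arrays loaded through `load_embeddings`, `top_k(query_vecs, doc_vecs, k=10)` returns one row of document indices per query, ordered from most to least similar.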