dhuynh95 posted an update Jan 17
🪟 32k-context BERT for embeddings and RAG on long corpora

Monarch Mixer is a new architecture that enables long-context BERT for large corpora and can be fine-tuned for long-context retrieval.

Quite interesting and important, as BERT is still the most widely used LLM in production for "old school" tasks like classification, NER, and embeddings, and it is also a key component of RAG pipelines.

Paper: https://arxiv.org/abs/2310.12109
Blog: https://hazyresearch.stanford.edu/blog/2024-01-11-m2-bert-retrieval
GitHub: https://github.com/HazyResearch/m2
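For context, here is a minimal sketch of how one of these M2-BERT retrieval checkpoints can be used to embed a long document. It assumes the `togethercomputer/m2-bert-80M-32k-retrieval` checkpoint on the Hub with its custom embedding head (hence `trust_remote_code=True`); the exact output key may differ between releases.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint name; the M2 project publishes 2k/8k/32k retrieval variants.
checkpoint = "togethercomputer/m2-bert-80M-32k-retrieval"

# Monarch Mixer ships custom model code, so remote code must be trusted.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, trust_remote_code=True
)
# M2-BERT reuses the standard BERT tokenizer, extended to a 32k context window.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=32768)

document = "A very long document that would not fit in a vanilla 512-token BERT..."
inputs = tokenizer(
    document,
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=32768,
)
with torch.no_grad():
    outputs = model(**inputs)

# The custom head returns a document-level embedding (assumed key name).
embedding = outputs["sentence_embedding"]
print(embedding.shape)  # e.g. torch.Size([1, 768])
```

Embedding a query the same way and ranking documents by cosine similarity gives a basic long-context RAG retriever.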

I'm a fan of this community project: training sector-specific 32K-context BERT embedding models 🤗