@bokesyo on Hugging Face: "It's time to switch from bge to Memex! We introduce Memex: OCR-free Visual…"

It's time to switch from bge to Memex! We introduce Memex: OCR-free Visual Document Embedding Model as Your Personal Librarian.

The model takes only images as document-side inputs and produces vectors representing document pages. Memex is trained on over 200k query-visual document pairs, including textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually intensive documents.
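To make the retrieval step concrete, here is a minimal sketch of how page vectors could be ranked against a query vector. It assumes cosine similarity as the scoring function, which is a common choice for embedding retrieval but is not confirmed by this post. The tiny hand-written vectors are placeholders standing in for real model outputs, and `cosine` and `rank_pages` are hypothetical helper names, not part of the model's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_pages(query_vec, page_vecs, top_k=3):
    """Return (page_index, score) pairs for the top_k most similar pages."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Placeholder embeddings: in practice each row would come from encoding
# one page image, and query_vec from encoding the question text.
page_vecs = [
    [1.0, 0.0, 0.0],  # page 0
    [0.0, 1.0, 0.0],  # page 1
    [0.7, 0.7, 0.0],  # page 2
]
query_vec = [0.0, 2.0, 0.0]

results = rank_pages(query_vec, page_vecs, top_k=2)
for page_index, score in results:
    print(f"page {page_index}: score {score:.3f}")
```

Because the page vectors depend only on the page images, they can be computed once and stored, so only the query needs to be encoded at question time.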

Our model is capable of:

😋 Help you read a long visually-intensive or text-oriented PDF document and find the pages that answer your question.

🤗 Help you build a personal library and retrieve book pages from a large collection of books.

🤩 It has only 2.8B parameters, and has the potential to run on your PC.

🐵 It works like a human: it reads and comprehends with vision and remembers multimodal information in its hippocampus.
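As a rough check on the "run on your PC" point above, the memory needed just to hold 2.8B parameters can be estimated from the numeric precision. This is back-of-the-envelope arithmetic only: it covers weights alone, ignores activations and runtime overhead, and the precisions listed are assumptions, since the post does not say which formats the released checkpoint supports. `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 2.8e9  # parameter count stated in the post

# Bytes per parameter for some common precisions (assumed, not confirmed).
precisions = {"fp32": 4, "fp16/bf16": 2, "int8": 1}

estimates = {name: weight_memory_gb(NUM_PARAMS, size)
             for name, size in precisions.items()}

for name, gigabytes in estimates.items():
    print(f"{name}: about {gigabytes:.1f} GB for weights")
```

At half precision the weights come to roughly 5.6 GB, which is within reach of many consumer GPUs, though actual usage will be higher once activations are included.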

The model is open-sourced at RhapsodyAI/minicpm-visual-embedding-v0

Everyone is welcome to try our online demo at bokesyo/minicpm-visual-embeeding-v0-demo
