bokesyo
posted an update 23 days ago
It's time to switch from bge to Memex! We introduce Memex: OCR-free Visual Document Embedding Model as Your Personal Librarian.

The model takes only images as document-side inputs and produces vectors representing document pages. Memex is trained on over 200k query-visual document pairs, including textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually intensive documents.

Our model is capable of:

😋 Help you read a long visually-intensive or text-oriented PDF document and find the pages that answer your question.

🤗 Help you build a personal library and retrieve book pages from a large collection of books.

🀩 It has only 2.8B parameters, and has the potential to run on your PC.

🐡 It works like human: read and comprehend with vision and remember multimodal information in hippocampus.

The model is open-sourced at RhapsodyAI/minicpm-visual-embedding-v0

Everyone is welcome to try our online demo at bokesyo/minicpm-visual-embeeding-v0-demo