It's time to switch from bge to Memex! We introduce Memex: an OCR-free Visual Document Embedding Model as Your Personal Librarian.
The model takes only images as document-side input and produces vectors representing document pages. Memex is trained on over 200k query-visual document pairs, covering textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually intensive documents.
Our model is capable of:
Help you read a long visually intensive or text-oriented PDF document and find the pages that answer your question.
Help you build a personal library and retrieve book pages from a large collection of books.
It has only 2.8B parameters, and has the potential to run on your PC.
It works like a human: it reads and comprehends with vision and remembers multimodal information in its hippocampus.
The model is open-sourced at RhapsodyAI/minicpm-visual-embedding-v0
Everyone is welcome to try our online demo at bokesyo/minicpm-visual-embeeding-v0-demo
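To make the page-retrieval idea concrete, here is a minimal sketch of ranking pages against a question once both have been turned into vectors. It does not call the model: the vectors are hand-written placeholders, the helper names `cosine` and `top_pages` are hypothetical, and cosine similarity is an assumed scoring choice rather than something confirmed for Memex.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_pages(query_vec, page_vecs, k=2):
    """Return the indices of the k pages most similar to the query."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [i for _, i in scored[:k]]

# Placeholder vectors (assumption): in practice each page vector would be
# produced by the embedding model from a page image, and the query vector
# from the question text.
page_vecs = [
    [0.9, 0.1, 0.0],  # page 0
    [0.1, 0.8, 0.1],  # page 1
    [0.0, 0.2, 0.9],  # page 2
]
query_vec = [0.2, 0.9, 0.0]

print(top_pages(query_vec, page_vecs, k=2))  # prints [1, 0]
```

Page vectors only need to be computed once per document, so a personal library can store them and score each new question against the stored set.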