diasbalmash (Dias Balmash)

liked a dataset 3 months ago

farabi-lab/kazakh-stt

Viewer • Updated Sep 26, 2024 • 204k • 29 • 1

liked a model 3 months ago

nvidia/stt_kk_ru_fastconformer_hybrid_large

Automatic Speech Recognition • Updated Sep 17, 2024 • 236k • 1

liked a model 4 months ago

transiteration/stt_kz_quartznet15x5

Automatic Speech Recognition • Updated Jan 22, 2024 • 14 • 3

liked a dataset 4 months ago

AmanMussa/kazakh-instruction-v2

Viewer • Updated Nov 16, 2023 • 52.2k • 39 • 5

liked a model 5 months ago

IrbisAI/Irbis-7b-Instruct_lora

Text Generation • Updated Jun 29, 2024 • 5 • 5

reacted to merve's post with 🚀 6 months ago

Forget any document retrievers, use ColPali 💥💥

Document retrieval is usually done through OCR + layout detection, but you lose a lot of information in between. Stop doing that! 🤓

ColPali uses a vision language model, which is better at document understanding 📑
ColPali: vidore/colpali (mit license!)
Blog post: https://huggingface.co/blog/manu/colpali
The authors also released a new benchmark for document retrieval:
ViDoRe Benchmark: vidore/vidore-benchmark-667173f98e70a1c0fa4db00d
ViDoRe Leaderboard: vidore/vidore-leaderboard

ColPali marries the idea of modern vision language models with retrieval 🤝

The authors apply contrastive fine-tuning to SigLIP on documents and pool the outputs (they call it BiSigLIP). Then they feed the patch embedding outputs to PaliGemma and create BiPali 🖇️
BiPali natively feeds image patch embeddings into an LLM, which enables ColBERT-like late interaction computations between text tokens and image patches (hence the name ColPali!) 🤩
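
To make the late interaction idea concrete, here is a minimal sketch of ColBERT-style MaxSim scoring: each query token embedding is matched to its most similar image patch embedding, and those maxima are summed into a page score. The vectors are hand-written toy values, and `normalize` and `late_interaction_score` are hypothetical names for illustration only, not part of the ColPali codebase or any library API.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def late_interaction_score(query_tokens, page_patches):
    """Sum, over query tokens, of the best cosine similarity to any page patch."""
    q = [normalize(t) for t in query_tokens]
    p = [normalize(patch) for patch in page_patches]
    return sum(
        max(sum(a * b for a, b in zip(token, patch)) for patch in p)
        for token in q
    )

# Toy embeddings (assumed values, 3 dimensions for readability).
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
pages = {
    "page_a": [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],                   # matches only the first token
    "page_b": [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]],  # matches both tokens
}

scores = {name: late_interaction_score(query, patches) for name, patches in pages.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy setup `page_b` ranks first because every query token finds a closely aligned patch. A real system would take the token and patch embeddings from the model rather than writing them by hand.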

The authors created the ViDoRe benchmark by collecting PDF documents and generating queries with Claude-3 Sonnet.
ColPali seems to be the most performant model on ViDoRe. Not only that, but it is also way faster than traditional PDF parsers!

liked a model 7 months ago

IrbisAI/Irbis-7b-v0.1

Text Generation • Updated Jun 29, 2024 • 199 • 14
