AudioRAG is becoming real! Just built a demo with ColQwen-Omni that does semantic search on raw audio, no transcription needed.
Drop in a podcast, ask your question, and it finds the exact audio chunks where the answer comes up. You can also get a written answer.
What’s exciting: it skips transcription, which makes it faster and lets it pick up emotion, ambient sound, and tone, so it surfaces results text search would miss.
- Demo: fdaudens/colqwen-omni-demo
- Blog post from ColQwen team: https://huggingface.co/blog/manu/colqwen-omni-omnimodal-retrieval
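
For anyone curious how "find the exact chunks" can work without transcripts, here is a rough sketch of the retrieval step. It is not the demo's actual code. It assumes the audio is cut into fixed 30-second windows, that the model returns several embedding vectors per chunk and per query, and that chunks are ranked with ColBERT-style late-interaction (MaxSim) scoring. The function names (`chunk_spans`, `maxsim_score`, `top_chunks`) are hypothetical, and random vectors stand in for real model embeddings.

```python
import numpy as np


def chunk_spans(duration_s, chunk_s=30.0):
    """Split [0, duration_s) into consecutive (start, end) windows in seconds.

    The 30-second default is an assumption for illustration only.
    """
    spans = []
    start = 0.0
    while start < duration_s:
        spans.append((start, min(start + chunk_s, duration_s)))
        start += chunk_s
    return spans


def maxsim_score(query_vecs, chunk_vecs):
    """Late-interaction score between one query and one chunk.

    Each query vector is matched to its most similar chunk vector
    (cosine similarity), and those best matches are summed.
    """
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sim = q @ c.T  # shape: (n_query_vecs, n_chunk_vecs)
    return float(sim.max(axis=1).sum())


def top_chunks(query_vecs, chunk_vecs_list, spans, k=3):
    """Return the k best (span, score) pairs, highest score first."""
    scores = [maxsim_score(query_vecs, c) for c in chunk_vecs_list]
    order = np.argsort(scores)[::-1][:k]
    return [(spans[i], scores[i]) for i in order]


# Placeholder data: random vectors stand in for real audio/query embeddings.
rng = np.random.default_rng(0)
spans = chunk_spans(95.0)  # a 95-second clip gives 4 chunks
chunk_vecs_list = [rng.normal(size=(20, 128)) for _ in spans]

# Build a fake query from vectors of the third chunk so it should rank first.
query_vecs = chunk_vecs_list[2][:5].copy()

for (start, end), score in top_chunks(query_vecs, chunk_vecs_list, spans, k=2):
    print(f"{start:.0f}s-{end:.0f}s  score={score:.2f}")
```

In a real pipeline the placeholder arrays would be replaced by embeddings from the model, and the returned time spans tell you where to seek in the recording.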