Miquel Farré

mfarre

AI & ML interests

I like everything video

Recent Activity

new activity 10 days ago

HuggingFaceFV/finevideo:Cleanup TTS

liked a Space 13 days ago

HuggingFaceH4/blogpost-scaling-test-time-compute

upvoted a paper 14 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

mfarre's activity

upvoted a paper 14 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 16 days ago • 131

upvoted a paper 2 months ago

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22 • 25

upvoted an article 3 months ago

FineVideo: behind the scenes

Article • Sep 23 • 27

upvoted 2 articles 4 months ago

Docmatix - a huge dataset for Document Visual Question Answering

Article • Jul 18 • 71

Scaling robotics datasets with video encoding

Article • Aug 27 • 34

upvoted a paper 4 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 124

Articles

SmolVLM - small yet mighty Vision Language Model

CinePile 2.0 - making stronger datasets with adversarial refinement

FineVideo: behind the scenes

Scaling robotics datasets with video encoding
