Emanuele Vivoli

emanuelevivoli

https://emanuelevivoli.github.io

AI & ML interests

I work on Comics/Manga :)

emanuelevivoli's activity

upvoted a paper 5 days ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published 9 days ago • 14

upvoted a collection 6 days ago

Comics Pick-A-Panel

Collection

Dataset, Models and Paper from ComicsPAP: understanding comic strips by picking the correct panel • 4 items • Updated 9 days ago • 2

authored 2 papers 9 days ago

HoloMine: A Synthetic Dataset for Buried Landmines Recognition using Microwave Holographic Imaging

Paper • 2502.21054 • Published 23 days ago

ComicsPAP: understanding comic strips by picking the correct panel

Paper • 2503.08561 • Published 12 days ago • 1

updated a dataset 10 days ago

VLR-CVC/ComicsPAP

Viewer • Updated 9 days ago • 80.6k • 3.88k • 12

upvoted 3 papers 12 days ago

R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning

Paper • 2503.05379 • Published 16 days ago • 33

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Paper • 2503.05132 • Published 16 days ago • 51

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Paper • 2503.06749 • Published 14 days ago • 24

liked 2 models 18 days ago

HuggingFaceTB/SmolVLM2-2.2B-Instruct

Image-Text-to-Text • Updated 17 days ago • 510k • 125

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated 2 days ago • 751k • 1.2k

upvoted a paper 18 days ago

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 20 days ago • 77

New activity in VLR-CVC/ComicsPAP 22 days ago

[bot] Conversion to Parquet

#1 opened about 1 month ago by parquet-converter

replied to maxiw's post 22 days ago

this is very interesting!

liked a model 22 days ago

allenai/olmOCR-7B-0225-preview

Image-Text-to-Text • Updated 27 days ago • 525k • 572

upvoted 2 papers about 1 month ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published about 1 month ago • 132

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 166

liked a dataset about 1 month ago

VLR-CVC/ComicsPAP

Viewer • Updated 9 days ago • 80.6k • 3.88k • 12

liked a Space about 1 month ago

Grandma Secret Sauce

🍝

Fetch and display recipes from web URL

upvoted a paper about 1 month ago

LM2: Large Memory Models

Paper • 2502.06049 • Published Feb 9 • 30

liked a model about 1 month ago

laion/CLIP-ViT-H-14-laion2B-s32B-b79K

Zero-Shot Image Classification • Updated Jan 22 • 1.43M • 363