LiveCC Collection Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) • 8 items • Updated about 19 hours ago • 3
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 11 days ago • 61
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 3 items • Updated 28 days ago • 90
Open-RS Collection Model weights & datasets in the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t" • 8 items • Updated Mar 21 • 11
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 12 items • Updated Mar 19 • 25
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 93
Cosmos Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 6 items • Updated about 9 hours ago • 14
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 9 items • Updated Mar 18 • 86
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 39 items • Updated 23 days ago • 104
Forgetting Transformer: Softmax Attention with a Forget Gate Paper • 2503.02130 • Published Mar 3 • 32