PixMo Collection • A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models • Paper • arXiv:2411.04905 • Published Nov 7, 2024
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning • Paper • arXiv:2409.12568 • Published Sep 19, 2024
Building and better understanding vision-language models: insights and future directions • Paper • arXiv:2408.12637 • Published Aug 22, 2024
LongVILA: Scaling Long-Context Visual Language Models for Long Videos • Paper • arXiv:2408.10188 • Published Aug 19, 2024
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models • Paper • arXiv:2408.08872 • Published Aug 16, 2024
EVLM: An Efficient Vision-Language Model for Visual Understanding • Paper • arXiv:2407.14177 • Published Jul 19, 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective • Paper • arXiv:2407.08583 • Published Jul 11, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs • Paper • arXiv:2407.04051 • Published Jul 4, 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output • Paper • arXiv:2407.03320 • Published Jul 3, 2024
RegMix: Data Mixture as Regression for Language Model Pre-training • Paper • arXiv:2407.01492 • Published Jul 1, 2024
MLCM: Multistep Consistency Distillation of Latent Diffusion Model • Paper • arXiv:2406.05768 • Published Jun 9, 2024
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs • Paper • arXiv:2404.05719 • Published Apr 8, 2024