Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models • arXiv:2409.17146 • Published 6 days ago • 86 upvotes
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions • arXiv:2409.18042 • Published 6 days ago • 28 upvotes
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models • arXiv:2409.17481 • Published 6 days ago • 42 upvotes
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? • arXiv:2409.15277 • Published 8 days ago • 33 upvotes
MURI: High-Quality Instruction Tuning Datasets for Low-Resource Languages via Reverse Instructions • arXiv:2409.12958 • Published 12 days ago • 6 upvotes
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines • arXiv:2409.12959 • Published 12 days ago • 35 upvotes
Training Language Models to Self-Correct via Reinforcement Learning • arXiv:2409.12917 • Published 13 days ago • 127 upvotes
jina-embeddings-v3: Multilingual Embeddings With Task LoRA • arXiv:2409.10173 • Published 16 days ago • 21 upvotes
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think • arXiv:2409.11355 • Published 15 days ago • 26 upvotes
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey • arXiv:2409.11564 • Published 14 days ago • 18 upvotes
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution • arXiv:2409.12191 • Published 13 days ago • 68 upvotes
Can Large Language Models Unlock Novel Scientific Research Ideas? • arXiv:2409.06185 • Published 22 days ago • 10 upvotes
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation • arXiv:2409.06820 • Published 21 days ago • 58 upvotes
PiTe: Pixel-Temporal Alignment for Large Video-Language Model • arXiv:2409.07239 • Published 21 days ago • 11 upvotes
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking • arXiv:2403.09629 • Published Mar 14 • 72 upvotes
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale • arXiv:2409.08264 • Published 19 days ago • 42 upvotes
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? • arXiv:2409.07703 • Published 20 days ago • 62 upvotes
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers • arXiv:2409.04109 • Published 26 days ago • 38 upvotes
MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications • arXiv:2409.07314 • Published 21 days ago • 50 upvotes
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery • arXiv:2409.05591 • Published 23 days ago • 26 upvotes
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark • arXiv:2409.02813 • Published 28 days ago • 27 upvotes
VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges • arXiv:2409.01071 • Published 30 days ago • 26 upvotes
CogVLM2: Visual Language Models for Image and Video Understanding • arXiv:2408.16500 • Published Aug 29 • 56 upvotes
Writing in the Margins: Better Inference Pattern for Long Context Retrieval • arXiv:2408.14906 • Published Aug 27 • 137 upvotes
Building and better understanding vision-language models: insights and future directions • arXiv:2408.12637 • Published Aug 22 • 110 upvotes
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations • arXiv:2408.12590 • Published Aug 22 • 33 upvotes
To Code, or Not To Code? Exploring Impact of Code in Pre-training • arXiv:2408.10914 • Published Aug 20 • 40 upvotes
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering • arXiv:2408.09174 • Published Aug 17 • 51 upvotes
Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification • arXiv:2408.11237 • Published Aug 20 • 4 upvotes
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models • arXiv:2408.11318 • Published Aug 21 • 54 upvotes
LLM Pruning and Distillation in Practice: The Minitron Approach • arXiv:2408.11796 • Published Aug 21 • 53 upvotes
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications • arXiv:2408.11878 • Published Aug 20 • 49 upvotes
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation • arXiv:2408.12528 • Published Aug 22 • 50 upvotes
LongVILA: Scaling Long-Context Visual Language Models for Long Videos • arXiv:2408.10188 • Published Aug 19 • 51 upvotes
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations • arXiv:2408.08459 • Published Aug 15 • 44 upvotes
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning • arXiv:2408.07931 • Published Aug 15 • 18 upvotes
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models • arXiv:2408.08872 • Published Aug 16 • 96 upvotes
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts • arXiv:2408.08274 • Published Aug 15 • 11 upvotes
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization • arXiv:2401.18079 • Published Jan 31 • 7 upvotes
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents • arXiv:2408.07060 • Published Aug 13 • 39 upvotes
OpenResearcher: Unleashing AI for Accelerated Scientific Research • arXiv:2408.06941 • Published Aug 13 • 29 upvotes
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs • arXiv:2408.07055 • Published Aug 13 • 65 upvotes
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models • arXiv:2408.02718 • Published Aug 5 • 60 upvotes
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation • arXiv:2408.02545 • Published Aug 5 • 32 upvotes