Raja Biswas

rbiswasfc

AI & ML interests

NLP, Generative AI

Recent Activity

liked a dataset 1 day ago

qihoo360/Light-R1-SFTData

rbiswasfc's activity

upvoted a paper 2 days ago

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published 4 days ago • 89

upvoted an article 2 days ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • 3 days ago • 242

upvoted a collection 2 days ago

Gemma 3 Release

9 items • Updated about 20 hours ago • 235

upvoted 3 articles 3 days ago

Open R1: Update #3

Article • 3 days ago • 214

HuggingFace, IISc partner to supercharge model building on India's diverse languages

Article • 16 days ago • 14

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • 11 days ago • 65

upvoted 2 papers 4 days ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 8 days ago • 79

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published 7 days ago • 72

upvoted 2 articles 20 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7 • 70

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 198

upvoted 2 collections 24 days ago

SimpleRL

The collection for the Project "Simple Reinforcement Learning for Reasoning" • 2 items • Updated 24 days ago • 5

CodeI/O

Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated 30 days ago • 6

upvoted a paper 26 days ago

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 84

upvoted an article 26 days ago

How NuminaMath Won the 1st AIMO Progress Prize

Article • Jul 11, 2024 • 118

upvoted a collection 26 days ago

NuminaMath

Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10 • 76

upvoted an article 29 days ago

1 Billion Classifications

Article • 30 days ago • 42

upvoted 4 papers about 1 month ago

Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

Paper • 2502.03544 • Published Feb 5 • 43

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 124

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 60

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 142