Naman Anand

naman5a

AI & ML interests

RAG, LLMs

Recent Activity

upvoted an article about 1 month ago

How to train a new language model from scratch using Transformers and Tokenizers

upvoted an article about 1 month ago

Introducing HELMET

upvoted an article about 1 month ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Organizations

upvoted 4 articles about 1 month ago

Article

How to train a new language model from scratch using Transformers and Tokenizers

Feb 14, 2020 • 43

Article

Introducing HELMET

and 6 others • Apr 16 • 34

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

and 3 others • Feb 4 • 167

Article

Finally, a Replacement for BERT: Introducing ModernBERT

and 14 others • Dec 19, 2024 • 671

upvoted a paper about 2 months ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 87

liked a model 2 months ago

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • 0.6B • Updated Jun 26 • 708k • 1.25k

liked a model 3 months ago

ds4sd/SmolDocling-256M-preview

Image-Text-to-Text • 0.3B • Updated May 16 • 112k • 1.5k

upvoted a collection 3 months ago

GLM-4-0414

Collection

GLM-4-0414 series model • 8 items • Updated 29 days ago • 130

liked a model 3 months ago

nari-labs/Dia-1.6B

Text-to-Speech • Updated Jun 1 • 63k • 2.66k

upvoted a paper 3 months ago

AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference

Paper • 2504.10326 • Published Apr 14 • 26

liked a Space 4 months ago

Llama-4-Maverick-03-26-Experimental Battles

🔥

Browse and compare model conversation outcomes

upvoted 2 articles 4 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7 • 195

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

and 8 others • Mar 24 • 19

upvoted a collection 4 months ago

💫StarVector Models

Collection

StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 96

upvoted a paper 4 months ago

Cube: A Roblox View of 3D Intelligence

Paper • 2503.15475 • Published Mar 19 • 30

upvoted 2 articles 4 months ago

Article

From Files to Chunks: Improving Hugging Face Storage Efficiency

and 1 other • Nov 20, 2024 • 63

Article

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

and 3 others • Feb 12 • 70

upvoted an article 5 months ago

Article

Mixture of Experts Explained

and 5 others • Dec 11, 2023 • 778

liked 2 models 5 months ago

amd/Instella-3B

Text Generation • 3B • Updated Jun 25 • 89 • 36

Zyphra/Zonos-v0.1-hybrid

Text-to-Speech • Updated Jun 3 • 27.9k • 1.09k
