Nilabhra Roy Chowdhury

nilabhra

nilabhra's activity

liked a model 14 days ago

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation • Updated 4 days ago • 18.1k • 263

authored a paper 18 days ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published 19 days ago • 76

upvoted a paper 18 days ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published 19 days ago • 76

authored a paper 22 days ago

A Refined Analysis of Massive Activations in LLMs

Paper • 2503.22329 • Published 25 days ago • 14

upvoted a paper 22 days ago

A Refined Analysis of Massive Activations in LLMs

Paper • 2503.22329 • Published 25 days ago • 14

authored 2 papers 28 days ago

Variance Control via Weight Rescaling in LLM Pre-training

Paper • 2503.17500 • Published Mar 21 • 5

Falcon2-11B Technical Report

Paper • 2407.14885 • Published Jul 20, 2024

upvoted a paper 28 days ago

Variance Control via Weight Rescaling in LLM Pre-training

Paper • 2503.17500 • Published Mar 21 • 5

liked a Space 2 months ago

The Ultra-Scale Playbook 🌌

The ultimate guide to training LLM on large GPU Clusters • 2.5k

liked a model 4 months ago

tiiuae/Falcon3-Mamba-7B-Instruct

Text Generation • Updated Jan 2 • 10.5k • 28

New activity in tiiuae/falcon-7b-instruct 7 months ago

add chat template to tokenizer_config.json

#111 opened over 1 year ago by epignatelli

liked a model 8 months ago

tiiuae/falcon-mamba-7b

Text Generation • Updated Dec 17, 2024 • 29.6k • 232

New activity in tiiuae/falcon-11B 10 months ago

Some weights of FalconForCausalLM were not initialized from the model checkpoint at tiiuae/falcon-11B and are newly initialized

#10 opened 10 months ago by TillFetzer

liked a model 11 months ago

tiiuae/visper

Updated Jun 5, 2024 • 9

New activity in tiiuae/visper 11 months ago

Update README.md

#1 opened 11 months ago by reach-vb

updated a model 11 months ago

tiiuae/visper

Updated Jun 5, 2024 • 9

published an article 11 months ago

Article

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

and 9 others • May 24, 2024 • 25

New activity in tiiuae/falcon-11B-vlm 11 months ago

Update blogpost link

#1 opened 11 months ago by ZennyKenny

New activity in tiiuae/falcon-11B 11 months ago

Template

#7 opened 11 months ago by rhaymison

liked a model 11 months ago

tiiuae/falcon-11B-vlm

Image-Text-to-Text • Updated Jun 12, 2024 • 353 • 46