Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment • Paper • arXiv:2405.03594
DBRX Collection • DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items
Sparse Foundational Llama 2 Models Collection • Sparse pre-trained and fine-tuned Llama 2 models made by Neural Magic and Cerebras. • 27 items
Cerebras LLaVA Collection • Cerebras implementation and training recipes for multimodal LLaVA models. • 4 items