n3rdium (Ishan Gajbhiye)

upvoted a paper 2 months ago

video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Paper • 2502.11775 • Published Feb 17 • 8

upvoted a paper 6 months ago

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

Paper • 2410.13863 • Published Oct 17, 2024 • 38

upvoted a collection 7 months ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 598

upvoted 3 papers 8 months ago

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Paper • 2408.13257 • Published Aug 23, 2024 • 27

Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 27

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Paper • 2408.09702 • Published Aug 19, 2024 • 11

upvoted 2 papers 11 months ago

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 56

Semantica: An Adaptable Image-Conditioned Diffusion Model

Paper • 2405.14857 • Published May 23, 2024 • 11

upvoted a collection 11 months ago

RecurrentGemma Release

Collection

8 items • Updated 20 days ago • 40

upvoted a collection 12 months ago

Top Mini LLM

Collection

Collection of top mini LLMs • 5 items • Updated Oct 1, 2024 • 16

upvoted an article 12 months ago

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Article • May 3, 2024 • 13

upvoted a paper about 1 year ago

EdgeFusion: On-Device Text-to-Image Generation

Paper • 2404.11925 • Published Apr 18, 2024 • 23

upvoted 2 collections about 1 year ago

xLAM models

Collection

xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 21 items • Updated 5 days ago • 49

Qwen1.5

Collection

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 209

upvoted 6 papers about 1 year ago

Machine Unlearning for Image-to-Image Generative Models

Paper • 2402.00351 • Published Feb 1, 2024 • 13

AToM: Amortized Text-to-Mesh using 2D Diffusion

Paper • 2402.00867 • Published Feb 1, 2024 • 11

SymbolicAI: A framework for logic-based approaches combining generative models and solvers

Paper • 2402.00854 • Published Feb 1, 2024 • 20

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24

Efficient Exploration for LLMs

Paper • 2402.00396 • Published Feb 1, 2024 • 23

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Paper • 2402.00769 • Published Feb 1, 2024 • 23
