Krishna Kaasyap

KrishnaKaasyap

AI & ML interests

Test Time Training, Multimodal & Inter-Modality Transfer Learning, Mechanistic Interpretability, Evolutionary Model Merging, Swarm Intelligence of multiple models with different architectures and different algorithms, MuZero approach to general tasks

Recent Activity

upvoted a collection 6 days ago

Llama 4

liked a model 16 days ago

Qwen/Qwen2.5-Omni-7B

liked a model 27 days ago

CohereForAI/c4ai-command-a-03-2025

KrishnaKaasyap's activity

upvoted a collection 6 days ago

Llama 4

Collection

Llama 4 release • 10 items • Updated 6 days ago • 414

upvoted 3 collections 4 months ago

upvoted a collection 6 months ago

Llama-3.1-Nemotron-70B

Collection

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 14 hours ago • 155

upvoted a paper 8 months ago

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 52

upvoted 3 collections 8 months ago

Jamba 1.5

Collection

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Mar 6 • 87

Magnum v2 123b

Collection

3 items • Updated Aug 21, 2024 • 6

DeepSeek-V2

Collection

8 items • Updated Jan 3 • 28

upvoted an article 8 months ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23, 2024 • 232

upvoted a paper 8 months ago

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9, 2024 • 49

upvoted a collection 8 months ago

Llama-3.1 Quantization

Collection

Neural Magic quantized Llama-3.1 models • 22 items • Updated Nov 22, 2024 • 44

upvoted a collection 9 months ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 658

upvoted a paper 9 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 61

upvoted 2 collections 10 months ago

SSMs

Collection

A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated about 14 hours ago • 27

Nemotron 4 340B

Collection

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 14 hours ago • 162