NK

NeuralKartMocker

AI & ML interests

Gen AI, GAN, LLMs, NLP, Gen Music

NeuralKartMocker's activity

upvoted an article 25 days ago

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Article • Published Apr 5, 2023 • 33


upvoted 18 papers about 1 month ago

Video Action Differencing

Paper • 2503.07860 • Published Mar 10 • 32

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 41

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published Mar 10 • 97

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Paper • 2503.07365 • Published Mar 10 • 56

AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning

Paper • 2503.07608 • Published Mar 10 • 20

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

Paper • 2503.07027 • Published Mar 10 • 28

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published Mar 5 • 227

Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning

Paper • 2503.07002 • Published Mar 10 • 39

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 62

Identifying Sensitive Weights via Post-quantization Integral

Paper • 2503.01901 • Published Feb 28 • 7

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published Mar 6 • 22

LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation

Paper • 2503.02972 • Published Mar 4 • 23

EgoLife: Towards Egocentric Life Assistant

Paper • 2503.03803 • Published Mar 5 • 41

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 107

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 93

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6 • 69

Remasking Discrete Diffusion Models with Inference-Time Scaling

Paper • 2503.00307 • Published Mar 1 • 9

TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Paper • 2502.19400 • Published Feb 26 • 48