MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated 3 days ago • 14
Training and Inference Efficiency of Encoder-Decoder Speech Models Paper • 2503.05931 • Published 17 days ago • 2
Cosmos Transfer1 Collection World Foundation Model for Domain Transfer • 5 items • Updated 4 days ago • 11
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • 13 days ago • 342
Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! • 18 days ago • 45
Jamba 1.6 Collection The AI21 Jamba family consists of hybrid SSM-Transformer foundation models that outperform open-model competitors on quality and speed. • 2 items • Updated 18 days ago • 11
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 20 days ago • 68
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated 22 days ago • 38
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality • 21 days ago • 69
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 22 days ago • 26