Vaibhav Srivastav

reach-vb

reach_vb

Vaibhavs10

AI & ML interests

TTS + LM performance prediction

Articles

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

23 days ago

• 51

reach-vb's activity

upvoted an article about 5 hours ago

Article

Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers

Feb 1, 2022

• 2

upvoted an article 2 days ago

Article

Fine-Tune Whisper with 🤗 Transformers

Nov 3, 2022

• 32

upvoted 21 papers 2 days ago

Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion

Paper • 2405.09874 • Published 7 days ago • 14

TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

Paper • 2405.10315 • Published 7 days ago • 9

Toon3D: Seeing Cartoons from a New Perspective

Paper • 2405.10320 • Published 7 days ago • 18

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published 7 days ago • 21

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Paper • 2405.10314 • Published 7 days ago • 34

Many-Shot In-Context Learning in Multimodal Foundation Models

Paper • 2405.09798 • Published 8 days ago • 24

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published 8 days ago • 65

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published 8 days ago • 83

Dynamic data sampler for cross-language transfer learning in large language models

Paper • 2405.10626 • Published 6 days ago • 3

Observational Scaling Laws and the Predictability of Language Model Performance

Paper • 2405.10938 • Published 6 days ago • 9

Grounded 3D-LLM with Referent Tokens

Paper • 2405.10370 • Published 7 days ago • 6

Layer-Condensed KV Cache for Efficient Inference of Large Language Models

Paper • 2405.10637 • Published 6 days ago • 14

INDUS: Effective and Efficient Language Models for Scientific Applications

Paper • 2405.10725 • Published 6 days ago • 15

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Paper • 2405.11582 • Published 4 days ago • 9

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

Paper • 2405.11252 • Published 5 days ago • 10

Octo: An Open-Source Generalist Robot Policy

Paper • 2405.12213 • Published 3 days ago • 21

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Paper • 2405.11157 • Published 6 days ago • 17

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published 3 days ago • 19

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published 4 days ago • 28

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published 3 days ago • 35

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published 4 days ago • 42

upvoted 2 collections 9 days ago

PaliGemma Release

Collection

Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 6 days ago • 97

SFR-Instruct-LLaMA-3-8B-R

Collection

3 items • Updated 10 days ago • 13

upvoted a collection 11 days ago

Yi-1.5 (2024/05)

Collection

10 items • Updated 4 days ago • 70

upvoted a collection 17 days ago

Granite Code Models

Collection

A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 14 items • Updated 1 day ago • 126

upvoted an article 21 days ago

Article

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

23 days ago

• 51

upvoted 2 collections 30 days ago

OpenELM Pretrained Models

Collection

4 items • Updated 30 days ago • 36

OpenELM Instruct Models

Collection

4 items • Updated Apr 12 • 96

upvoted a paper about 1 month ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 235

upvoted 5 collections about 1 month ago

WizardLM

Collection

0 items • Updated 15 days ago • 97

OBELICS 📚🔍

Collection

Collection gathering artifacts related to OBELICS • 4 items • Updated Apr 15 • 5

🐶 IDEFICS 🐶

Collection

Collection assembling all the models and spaces related to IDEFICS • 6 items • Updated Apr 15 • 7

From screenshots to HTML

Collection

WebSight is a dataset of 823,000 HTML/CSS codes representing synthetically generated English websites, each accompanied by a corresponding screenshot. • 4 items • Updated Apr 15 • 15

Idefics2 🐶

Collection

Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 17 days ago • 80

upvoted a paper about 1 month ago

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Paper • 2404.07839 • Published Apr 11 • 39

upvoted an article about 1 month ago

Article

CodeGemma - an official Google release for code LLMs

Apr 9

• 96

upvoted a paper about 1 month ago

MuPT: A Generative Symbolic Music Pretrained Transformer

Paper • 2404.06393 • Published Apr 9 • 14

upvoted a collection about 2 months ago

Gemma 1.1 GGUFs

Collection

4 items • Updated Apr 6 • 1

upvoted a paper 2 months ago

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 45

upvoted 2 collections 2 months ago

VILA: On Pre-training for Visual Language Models

Collection

10 items • Updated 17 days ago • 26

StarChat2 15B

Collection

Model, datasets, and demo for StarChat2 15B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12 • 11

upvoted 2 papers 3 months ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 173

Nemotron-4 15B Technical Report

Paper • 2402.16819 • Published Feb 26 • 40

upvoted 2 collections 3 months ago

OpenMath

Collection

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Feb 19 • 28

Canary

Collection

A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 1 item • Updated Feb 19 • 14

upvoted 3 collections 4 months ago

Qwen1.5

Collection

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 11 days ago • 173

OLMo Suite

Collection

Artifacts for the first set of OLMo models. • 12 items • Updated 8 days ago • 36

MAGNeT

Collection

Masked Audio Generation using a Single Non-Autoregressive Transformer • 9 items • Updated Apr 4 • 30

upvoted a paper 5 months ago

Seamless: Multilingual Expressive and Streaming Speech Translation

Paper • 2312.05187 • Published Dec 8, 2023 • 8

upvoted a collection 5 months ago

Apple MLX-compatible 7B LLMs on the 🤗 Hub

Collection

This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx • 8 items • Updated 16 days ago • 9

upvoted 3 collections 6 months ago

Distil-Whisper Models

Collection

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 34

Seamless Communication

Collection

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 125

faster-whisper

Collection

Collection of faster-whisper models. • 10 items • Updated Sep 7, 2023 • 18

upvoted a paper 7 months ago

Controllable Music Production with Diffusion Models and Guidance Gradients

Paper • 2311.00613 • Published Nov 1, 2023 • 23

upvoted 4 collections 8 months ago

Audio Codecs Embeddings 🎙️

Collection

A collection of codec and embedding models supported in 🤗 Transformers. • 2 items • Updated Sep 16, 2023 • 1

Text to Music 🎧

Collection

A collection of music generation models supported in 🤗 Transformers and 🧨 Diffusers • 5 items • Updated Sep 16, 2023 • 2

Audio Classification 🔊

Collection

A collection of audio classification models supported in 🤗 Transformers • 3 items • Updated Sep 16, 2023 • 3

Text to Speech 🗣️

Collection

A collection of TTS models supported in 🤗 Transformers. • 4 items • Updated Sep 16, 2023 • 5

Articles

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

CodeGemma - an official Google release for code LLMs

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

AI Watermarking 101: Tools and Techniques

Deploy MusicGen in no time with Inference Endpoints

Jupyter X Hugging Face

Swift Diffusers: Fast Stable Diffusion for Mac
