Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Paper • 2503.03983 • Published 9 days ago • 22
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks Paper • 2501.08326 • Published Jan 14 • 32
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models Paper • 2412.01822 • Published Dec 2, 2024 • 15
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 17
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data Paper • 2410.02056 • Published Oct 2, 2024 • 6
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 58
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Paper • 2306.11698 • Published Jun 20, 2023 • 12
nvidia/stt_en_fastconformer_transducer_xlarge Automatic Speech Recognition • Updated 15 days ago • 40 • 24
nvidia/stt_ua_fastconformer_hybrid_large_pc Automatic Speech Recognition • Updated 24 days ago • 232 • 3
nvidia/stt_en_fastconformer_transducer_large Automatic Speech Recognition • Updated 15 days ago • 1.47k • 7