Parakeet-tdt_ctc-1.1b 🦜 (running on T4): Generate text transcripts with timestamps from audio or video
LLaVa-Interleave (Collection): LLaVa models that extend the model's capabilities to multi-image, multi-frame (video), and multi-patch (single-image) scenarios. • 3 items • Updated Jul 10, 2024