microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition β’ Updated 13 days ago β’ 618k β’ 1.31k
HuggingFaceTB/SmolVLM2-500M-Video-Instruct Image-Text-to-Text β’ Updated 14 days ago β’ 10.9k β’ 54
openai/clip-vit-large-patch14 Zero-Shot Image Classification β’ Updated Sep 15, 2023 β’ 51.3M β’ 1.71k
Runtime error 269 269 Edit Video By Editing Text β Audio-based video editing using AI-generated transcription