myshell-ai/OpenVoice
Text-to-Speech
A collection of Audio, Video and Visual LLMs.
GOT-OCR (from UCAS, Beijing)
VLMEvalKit Eval Results in video understanding benchmark
Talk to Fixie.ai's Ultravox with WebRTC ⚡️
Diffusion-based image restoration model
Prompt with Images in flux[dev]
A community project to create an image preferences dataset.
PaliGemma2 LoRA finetuned on VQAv2
VLMEvalKit Evaluation Results Collection
Fantasy story generator