sagar007
posted an update Aug 26
📣 New Project Alert: Phi 3.5 Multimodal AI Demo 🎉
Excited to share my latest project that combines the power of Phi 3.5 text and vision models with text-to-speech capabilities!
🔑 Key Features:
1️⃣ Phi 3.5 Text Model for dynamic conversations
2️⃣ Phi 3.5 Vision Model for advanced image analysis
3️⃣ Text-to-Speech integration for an audio dimension
🛠️ Tech Stack:

Transformers
Gradio
PyTorch
Flash Attention 2
Parler TTS

This project demonstrates the potential of integrating multiple AI models to create a more comprehensive and interactive user experience. It's a step towards more natural and versatile AI assistants.
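For anyone curious how the three pieces could fit together, here is a minimal sketch of the routing idea. It is not the demo's actual code: `respond` and the `fake_*` functions are hypothetical stand-ins, and a real app would wrap the Phi 3.5 text model, the Phi 3.5 vision model, and Parler TTS behind the same three callables.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not taken from the demo):
#   text model:   prompt -> reply
#   vision model: (prompt, image bytes) -> reply
#   tts:          reply text -> audio bytes
TextModel = Callable[[str], str]
VisionModel = Callable[[str, bytes], str]
SpeechModel = Callable[[str], bytes]


def respond(
    prompt: str,
    image: Optional[bytes],
    text_model: TextModel,
    vision_model: VisionModel,
    tts: SpeechModel,
) -> Tuple[str, bytes]:
    """Use the vision model when an image is supplied, otherwise the
    text model, then turn the reply into audio."""
    if image is not None:
        reply = vision_model(prompt, image)
    else:
        reply = text_model(prompt)
    return reply, tts(reply)


# Stub models so the sketch runs without downloading any weights.
def fake_text(prompt: str) -> str:
    return f"text:{prompt}"


def fake_vision(prompt: str, image: bytes) -> str:
    return f"vision:{prompt}:{len(image)}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    print(respond("hello", None, fake_text, fake_vision, fake_tts))
    print(respond("describe this", b"abc", fake_text, fake_vision, fake_tts))
```

Keeping the models behind plain callables means each one can be swapped or mocked independently, which is handy when wiring the function into a Gradio interface.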
👉 Check out the demo and let me know your thoughts! How would you extend this project?
🔗 Demo Link: sagar007/Multimodal_App
#MultimodalAI #PhiModel #MachineLearning #AIDemo