---
title: CineGen AI Ultra+
emoji: 🌟
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: AI-powered cinematic pre-production
---
# CineGen AI Ultra+ 🎬✨

**Visionary Cinematic Pre-Production Powered by AI**
CineGen AI Ultra+ is a Streamlit web application designed to accelerate the creative pre-production process for films, animations, and other visual storytelling projects. It leverages the power of Large Language Models (Google's Gemini) and other AI tools to transform a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art/video clips, and a narrated animatic.
## Features
- **AI Creative Director:** Input a core story idea, genre, and mood.
- **Cinematic Treatment Generation:**
  - Gemini generates a detailed multi-scene treatment.
  - Each scene includes:
    - Title, Emotional Beat, Setting Description
    - Characters Involved, Character Focus Moment
    - Key Plot Beat, Suggested Dialogue Hook
  - Proactive Director's Suggestions (감독 - Gamdok/Director): Visual Style, Camera Work, Sound Design.
  - Asset Generation Aids: Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries.
- **Visual Asset Generation:**
  - Image Generation: Utilizes DALL-E 3 (via the OpenAI API) to generate concept art for scenes based on derived prompts.
  - Stock Footage Fallback: Uses the Pexels API for relevant stock images/videos if AI generation is disabled or fails.
  - Video Clip Generation (Placeholder): Integrated structure for text-to-video / image-to-video generation using the RunwayML API (requires the user to implement the actual API calls in `core/visual_engine.py`). The placeholder generates dummy video clips.
- **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals.
- **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
- **AI-Powered Narration:**
  - Gemini crafts a narration script based on the generated treatment.
  - The ElevenLabs API synthesizes the narration into natural-sounding audio.
  - Customizable voice ID and narration style.
- **Iterative Refinement:**
  - Edit scene treatments and regenerate them with AI assistance.
  - Refine DALL-E prompts based on feedback and regenerate visuals.
- **Cinematic Animatic Assembly:**
  - Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
  - Customizable per-scene duration for pacing control.
  - Ken Burns effect on still images and text overlays for scene context (see the sketch after this list).
- **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables.
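
To make the assembly step concrete, here is a minimal MoviePy sketch of the idea (assuming the MoviePy 1.x API; the function and its parameters are illustrative, not the actual code in `core/visual_engine.py`):

```python
from moviepy.editor import AudioFileClip, ImageClip, concatenate_videoclips

def assemble_animatic(image_paths, durations, narration_path, out_path="animatic.mp4"):
    """Turn per-scene stills plus a narration track into an .mp4 animatic."""
    clips = []
    for path, duration in zip(image_paths, durations):
        clip = ImageClip(path).set_duration(duration)
        # Simple Ken Burns-style move: zoom in slowly over the clip's duration.
        clip = clip.resize(lambda t: 1.0 + 0.04 * t)
        clips.append(clip)
    video = concatenate_videoclips(clips, method="compose")
    video = video.set_audio(AudioFileClip(narration_path))
    video.write_videofile(out_path, fps=24)
    return out_path
```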
## Project Structure
```
CineGenAI/
├── .streamlit/
│   └── secrets.toml          # API keys and configuration (DO NOT COMMIT if public)
├── assets/
│   └── fonts/
│       └── arial.ttf         # Example font file (ensure it's available or update the path)
├── core/
│   ├── __init__.py
│   ├── gemini_handler.py     # Manages interactions with the Gemini API
│   ├── visual_engine.py      # Image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
│   └── prompt_engineering.py # Crafts detailed prompts for Gemini
├── temp_cinegen_media/       # Temporary directory for generated media (add to .gitignore)
├── app.py                    # Main Streamlit application script
├── Dockerfile                # For containerizing the application
├── Dockerfile.test           # (Optional) For testing
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .gitattributes            # For Git LFS if handling large font files
```
## Setup and Installation
1. **Clone the repository:**

   ```bash
   git clone <repository_url>
   cd CineGenAI
   ```
2. **Create a virtual environment (recommended):**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
3. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

   Note: MoviePy might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.

4. **Set up API keys.** You need API keys for the following services:
   - Google Gemini API
   - OpenAI API (for DALL-E)
   - ElevenLabs API (and optionally a specific Voice ID)
   - Pexels API
   - RunwayML API (if implementing full video generation)
   Store these keys securely. You have two primary options:

   **Streamlit secrets (recommended for Hugging Face Spaces / Streamlit Cloud):** Create a `.streamlit/secrets.toml` file (make sure it is in your `.gitignore` if your repository is public!) with the following format:

   ```toml
   GEMINI_API_KEY = "your_gemini_api_key"
   OPENAI_API_KEY = "your_openai_api_key"
   ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
   PEXELS_API_KEY = "your_pexels_api_key"
   ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id"  # e.g., "Rachel" or a custom ID
   RUNWAY_API_KEY = "your_runwayml_api_key"
   ```
   **Environment variables (for local development):** Set the variables directly in your terminal or in a `.env` file (using a library such as `python-dotenv`, which is not included by default). The application falls back to these if Streamlit secrets are not found:

   ```bash
   export GEMINI_API_KEY="your_gemini_api_key"
   export OPENAI_API_KEY="your_openai_api_key"
   # ... and so on for the other keys
   ```
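
   For reference, this secrets-then-environment lookup order can be expressed with a small helper like the one below (a hypothetical sketch; the app's actual loading code may differ):

   ```python
   import os

   import streamlit as st

   def get_api_key(name: str):
       """Return a key from Streamlit secrets, falling back to environment variables."""
       # st.secrets behaves like a dict, but merely touching it can raise if no
       # secrets file exists at all, so guard the lookup.
       try:
           if name in st.secrets:
               return st.secrets[name]
       except Exception:
           pass
       return os.environ.get(name)

   GEMINI_API_KEY = get_api_key("GEMINI_API_KEY")
   ```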
5. **Font:** Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place the font in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).

6. **RunwayML implementation (important):** The current RunwayML integration in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a placeholder. You will need to:

   - Install the official RunwayML SDK, if available.
   - Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.

   The placeholder currently creates a dummy video clip using MoviePy, along the lines of the sketch below.
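
   A rough illustration of such a placeholder (the function name, resolution, and colors here are illustrative, not the exact code in `core/visual_engine.py`):

   ```python
   # Hypothetical stand-in for real video generation: render a flat dummy
   # clip with MoviePy instead of calling the RunwayML API.
   from moviepy.editor import ColorClip

   def generate_video_clip_placeholder(out_path: str, duration: float = 4.0) -> str:
       clip = ColorClip(size=(1280, 720), color=(20, 20, 30), duration=duration)
       clip.write_videofile(out_path, fps=24, logger=None)
       return out_path
   ```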
## Running the Application
**Local development:**

```bash
streamlit run app.py
```

The application should open in your web browser.
**Using Docker (optional):**

- Build the Docker image:

  ```bash
  docker build -t cinegen-ai .
  ```

- Run the container (ensure API keys are passed as environment variables, or handled via mounted secrets, if they are not baked into the image for local testing):

  ```bash
  docker run -p 8501:8501 \
    -e GEMINI_API_KEY="your_key" \
    -e OPENAI_API_KEY="your_key" \
    # ... other env vars ...
    cinegen-ai
  ```

  Access the app at http://localhost:8501.
## How to Use
- **Input Creative Seed:** Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar.
- **Generate Treatment:** Click "🌟 Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
- **Review & Refine:**
  - Examine each scene's details, including AI-generated visuals (or placeholders).
  - Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
  - Adjust the per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
- **Fine-Tuning (Sidebar):**
  - Define characters with visual descriptions.
  - Apply global style overrides.
  - Set the narrator voice ID and narration style.
- **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
- **View & Download:** The generated animatic video will appear, and you can download it.
## Key Technologies
- Python
- Streamlit: Web application framework.
- Google Gemini API: For core text generation (treatment, narration script, prompt refinement).
- OpenAI API (DALL-E 3): For AI image generation.
- ElevenLabs API: For text-to-speech narration.
- Pexels API: For stock image/video fallbacks.
- RunwayML API (Placeholder): For AI video clip generation.
- MoviePy: For video processing and animatic assembly.
- Pillow (PIL): For image manipulation.
## Future Enhancements / To-Do
- Implement full, robust RunwayML API integration.
- Option to upload custom seed images for image-to-video generation.
- More sophisticated control over Ken Burns effect (pan direction, zoom intensity).
- Allow users to upload their own audio for narration or background music.
- Advanced shot list generation and export.
- Integration with other AI video/image models.
- User accounts and project saving.
- More granular error handling and user feedback in the UI.
- Refine JSON cleaning from Gemini to be even more robust.
## Contributing
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.
## License
This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].
This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.