Commit 60efa5a · Initial deployment without models
Files changed:
- .dockerignore +25 -0
- .env.example +1 -0
- .gitignore +16 -0
- Dockerfile +25 -0
- README.md +226 -0
- analysis/__init__.py +0 -0
- analysis/audio_analyser.py +165 -0
- analysis/llm_analyser.py +108 -0
- analysis/prompt.py +70 -0
- analysis/video_analyser.py +183 -0
- extraction/__init__.py +0 -0
- extraction/media_extractor.py +122 -0
- extraction/timeline_generator.py +38 -0
- main.py +438 -0
- models/.gitkeep +0 -0
- models/__init__.py +0 -0
- models/audio_model/.gitkeep +0 -0
- models/download_model.py +33 -0
- models/load_models.py +49 -0
- models/video_model/.gitkeep +0 -0
- pipeline.py +35 -0
- requirements.txt +35 -0
- uploads/.gitkeep +0 -0
.dockerignore
ADDED
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
*.so
*.egg
*.egg-info/
dist/
build/
*.log
.git/
.gitignore
uploads/*
!uploads/.gitkeep
*.mp4
*.avi
*.mov
venv/
env/
.env
.vscode/
.idea/
*.pkl
*.pth
.env.example
ADDED
GOOGLE_API_KEY=your_gemini_api_key
.gitignore
ADDED
__pycache__/
*.pyc
.env
myenv/
models/video_model/*
models/audio_model/*
!models/.gitkeep
!models/video_model/.gitkeep
!models/audio_model/.gitkeep
uploads/*
!uploads/.gitkeep
*.bin
*.pt
*.pth
Dockerfile
ADDED
FROM python:3.10-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    ffmpeg \
    libsm6 \
    libxext6 \
    libxrender-dev \
    libgomp1 \
    git \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN mkdir -p uploads models/video_model models/audio_model

RUN python models/download_model.py

EXPOSE 7860

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md
ADDED
# DeepDefend

> **Multi-Modal Deepfake Detection System**
> Detect AI-generated deepfakes in videos using computer vision and audio analysis

[Python 3.10+](https://www.python.org/downloads/)
[FastAPI](https://fastapi.tiangolo.com)

## Overview

DeepDefend is a comprehensive deepfake detection system that combines **video frame analysis** and **audio analysis** to identify AI-generated synthetic media. Using machine learning models and AI-powered evidence fusion, it provides detailed, interval-by-interval analysis with explainable results.

### Why DeepDefend?

- **Multi-Modal Analysis**: Combines video and audio detection for higher accuracy
- **AI-Powered Fusion**: Uses an LLM to generate human-readable reports
- **Interval Breakdown**: Shows exactly which parts of the video are suspicious
- **REST API**: Easy integration with any frontend or application

## Features

### Core Detection Capabilities

- **Video Analysis**
  - Frame-by-frame deepfake detection using pre-trained models
  - Face detection and region-specific analysis
  - Suspicious region identification (eyes, mouth, face boundaries)
  - Confidence scoring per frame

- **Audio Analysis**
  - Voice synthesis detection
  - Spectrogram analysis for audio artifacts
  - Frequency pattern recognition
  - Audio splicing detection

- **AI-Powered Reporting**
  - LLM-based evidence fusion (Google Gemini)
  - Natural language explanation of findings
  - Verdict with confidence percentage
  - Timestamped suspicious intervals

### Processing Pipeline

```
Video Input
     ↓
┌───────────────────┐
│ Media Extraction  │ → Extract frames (5 per interval)
│                   │ → Extract audio chunks
└────────┬──────────┘
         │
         ├──────────────────────┬──────────────────────┐
         ▼                      ▼                      ▼
┌─────────────────┐   ┌─────────────────┐   ┌────────────────┐
│ Video Analysis  │   │ Audio Analysis  │   │ Timeline Gen   │
│ • Face detect   │   │ • Spectrogram   │   │ • 2s intervals │
│ • Region scan   │   │ • Voice synth   │   │ • Metadata     │
│ • Fake score    │   │ • Artifacts     │   │                │
└────────┬────────┘   └────────┬────────┘   └────────┬───────┘
         │                     │                     │
         └──────────────┬──────┴─────────────────────┘
                        ▼
           ┌──────────────────────────┐
           │    LLM Fusion Engine     │
           │ • Combine evidence       │
           │ • Generate verdict       │
           │ • Natural language report│
           └────────────┬─────────────┘
                        ▼
                  Final Report
                 (JSON Response)
```

## Demo

### Live Demo
**API**: [https://deepdefend-api.hf.space](https://deepdefend-api.hf.space)
**Docs**: [https://deepdefend-api.hf.space/docs](https://deepdefend-api.hf.space/docs)

### Example Analysis

<details>
<summary>Click to see sample output</summary>

```json
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "overall_scores": {
    "overall_video_score": 0.823,
    "overall_audio_score": 0.756,
    "overall_combined_score": 0.789
  },
  "detailed_analysis": "This video shows strong indicators of deepfake manipulation...",
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ],
  "total_intervals_analyzed": 15,
  "video_info": {
    "duration": 12.498711111111112,
    "fps": 29.923085402583734,
    "total_frames": 374,
    "file_size_mb": 31.36
  },
  "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc",
  "timestamp": "2025-10-10T23:34:35.724916"
}
```
</details>

## Installation

### Prerequisites

- Python 3.10 or higher
- FFmpeg installed on your system
- Google Gemini API key

### Local Setup

1. **Clone the repository**
```bash
git clone https://github.com/yourusername/deepdefend.git
cd deepdefend
```

2. **Create virtual environment**
```bash
python -m venv venv

# On Linux/Mac
source venv/bin/activate

# On Windows
venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```

4. **Download ML models**
```bash
python models/download_model.py
```
*This will download ~2GB of models from Hugging Face.*

5. **Configure environment**
```bash
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
```

6. **Run the server**
```bash
uvicorn main:app --reload
```

The API will be available at `http://127.0.0.1:8000`.

### Docker Setup

```bash
# Build image
docker build -t deepdefend .

# Run container (the image listens on port 7860)
docker run -p 7860:7860 -e GOOGLE_API_KEY=your_key deepdefend
```

## Tech Stack

### Backend
- **Framework**: FastAPI 0.109.0
- **Server**: Uvicorn
- **ML Framework**: PyTorch 2.3.1
- **Transformers**: Hugging Face Transformers 4.36.2

### ML Models
- **Video Detection**: [dima806/deepfake_vs_real_image_detection](https://huggingface.co/dima806/deepfake_vs_real_image_detection)
- **Audio Detection**: [mo-thecreator/Deepfake-audio-detection](https://huggingface.co/mo-thecreator/Deepfake-audio-detection)
- **LLM Fusion**: Google Gemini 2.5 Flash

### Processing
- **Computer Vision**: OpenCV, Pillow
- **Audio Processing**: Librosa, SoundFile
- **Video Processing**: FFmpeg

### Deployment
- **Container**: Docker
- **Platforms**: Hugging Face Spaces

## Project Structure

```
deepdefend/
│
├── extraction/
│   ├── media_extractor.py        # Frame & audio extraction
│   └── timeline_generator.py     # Timeline creation
│
├── analysis/
│   ├── video_analyser.py         # Video deepfake detection
│   ├── audio_analyser.py         # Audio deepfake detection
│   ├── llm_analyser.py           # LLM-based fusion
│   └── prompt.py                 # LLM prompts
│
├── models/
│   ├── download_model.py         # Model downloader
│   ├── load_models.py            # Model loader
│   ├── video_model/              # (Downloaded)
│   └── audio_model/              # (Downloaded)
│
├── main.py                       # FastAPI application
├── pipeline.py                   # Main detection pipeline
├── requirements.txt              # Python dependencies
├── Dockerfile                    # Container configuration
├── .gitignore
└── README.md
```
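As a quick illustration of the API the README documents, here is a minimal client sketch. It assumes the server from the Local Setup section is running on port 8000 and that the `requests` package is installed (it is not shown in this commit), so treat it as a usage sketch rather than part of the repository:

```python
# Hypothetical client for POST /api/analyze (defined in main.py below).
# "sample.mp4" is a placeholder path; any supported video format works.
import requests

with open("sample.mp4", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8000/api/analyze",
        params={"interval_duration": 2.0},              # allowed range 1.0-5.0
        files={"file": ("sample.mp4", f, "video/mp4")},
        timeout=600,                                     # analysis can take minutes
    )

resp.raise_for_status()
report = resp.json()
print(report["verdict"], report["confidence"])
for interval in report["suspicious_intervals"]:
    print(interval["interval"], interval["video_score"], interval["audio_score"])
```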
analysis/__init__.py
ADDED
(empty file)
analysis/audio_analyser.py
ADDED
import torch
import librosa
import numpy as np
from typing import Dict, List
from models.load_models import model_loader

class AudioAnalyzer:
    """Analyzes audio chunks for deepfake detection"""

    def __init__(self):
        self.model, self.processor = model_loader.load_audio_model()
        self.device = model_loader.get_device()

    def predict_deepfake(self, audio: np.ndarray, sample_rate: int) -> Dict:
        """Predict if audio chunk is deepfake"""

        min_length = sample_rate * 1
        if len(audio) < min_length:
            audio = np.pad(audio, (0, min_length - len(audio)))

        inputs = self.processor(
            audio,
            sampling_rate=sample_rate,
            return_tensors="pt",
            padding=True
        )

        if self.device == "cuda":
            inputs = {k: v.cuda() for k, v in inputs.items()}

        with torch.no_grad():
            outputs = self.model(**inputs)
            logits = outputs.logits
            probs = torch.nn.functional.softmax(logits, dim=-1)

        fake_prob = probs[0][1].item() if probs.shape[1] > 1 else probs[0][0].item()
        confidence = max(probs[0]).item()

        return {
            'fake_score': round(fake_prob, 3),
            'confidence': round(confidence, 3),
            'label': 'fake' if fake_prob > 0.5 else 'real'
        }

    def analyze_spectrogram(self, audio: np.ndarray, sample_rate: int, fake_score: float) -> Dict:
        """Analyze audio with adaptive thresholds based on fake_score"""
        try:
            spectral_centroid = librosa.feature.spectral_centroid(y=audio, sr=sample_rate)[0]
            spectral_rolloff = librosa.feature.spectral_rolloff(y=audio, sr=sample_rate)[0]
            zero_crossing_rate = librosa.feature.zero_crossing_rate(audio)[0]
            mfcc = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=13)

            suspicious_regions = self._identify_audio_anomalies(
                spectral_centroid, spectral_rolloff, zero_crossing_rate, mfcc, fake_score
            )

            return {
                'regions': suspicious_regions,
                'spectral_features': {
                    'avg_spectral_centroid': round(float(np.mean(spectral_centroid)), 2),
                    'avg_spectral_rolloff': round(float(np.mean(spectral_rolloff)), 2),
                    'avg_zero_crossing_rate': round(float(np.mean(zero_crossing_rate)), 3),
                    'mfcc_variance': round(float(np.var(mfcc)), 3)
                }
            }

        except Exception as e:
            if fake_score > 0.6:
                return {
                    'regions': ['voice_synthesis_detected', 'audio_artifacts'],
                    'spectral_features': {}
                }
            else:
                return {
                    'regions': ['no_suspicious_patterns'],
                    'spectral_features': {}
                }

    def _identify_audio_anomalies(self, spectral_centroid: np.ndarray, spectral_rolloff: np.ndarray, zero_crossing: np.ndarray, mfcc: np.ndarray, fake_score: float) -> List[str]:
        suspicious_regions = []

        if fake_score > 0.7:
            pitch_low, pitch_high = 200, 6000
            mfcc_threshold = 25
            zcr_low, zcr_high = 0.02, 0.25
            rolloff_threshold = 3000
            centroid_jump = 800
        elif fake_score > 0.5:
            pitch_low, pitch_high = 250, 5500
            mfcc_threshold = 28
            zcr_low, zcr_high = 0.025, 0.22
            rolloff_threshold = 2700
            centroid_jump = 900
        else:
            pitch_low, pitch_high = 300, 5000
            mfcc_threshold = 30
            zcr_low, zcr_high = 0.03, 0.20
            rolloff_threshold = 2500
            centroid_jump = 1000

        pitch_variance = np.var(spectral_centroid)
        if pitch_variance < pitch_low:
            suspicious_regions.append('monotone_voice')
        elif pitch_variance > pitch_high:
            suspicious_regions.append('erratic_pitch')

        mfcc_var = np.var(mfcc)
        if mfcc_var < mfcc_threshold:
            suspicious_regions.append('voice_synthesis_artifacts')

        zcr_mean = np.mean(zero_crossing)
        if zcr_mean > zcr_high:
            suspicious_regions.append('high_frequency_noise')
        elif zcr_mean < zcr_low:
            suspicious_regions.append('overly_smooth_audio')

        rolloff_std = np.std(spectral_rolloff)
        if rolloff_std > rolloff_threshold:
            suspicious_regions.append('spectral_artifacts')

        centroid_diff = np.diff(spectral_centroid)
        if len(centroid_diff) > 0 and np.max(np.abs(centroid_diff)) > centroid_jump:
            suspicious_regions.append('audio_splicing')

        if np.std(spectral_centroid) < 50:
            suspicious_regions.append('unnatural_consistency')

        if fake_score > 0.6 and len(suspicious_regions) == 0:
            suspicious_regions.append('general_audio_manipulation')

        return suspicious_regions if suspicious_regions else ['no_suspicious_patterns']

    def analyze_interval(self, interval_data: Dict) -> Dict:
        """Analyze audio for a single interval"""
        audio_data = interval_data['audio_data']

        if not audio_data or not audio_data.get('has_audio', False):
            return {
                'interval_id': interval_data['interval_id'],
                'interval': interval_data['interval'],
                'fake_score': 0.0,
                'confidence': 0.0,
                'suspicious_regions': ['no_audio'],
                'has_audio': False,
                'spectral_features': {}
            }

        audio = audio_data['audio']
        sample_rate = audio_data['sample_rate']

        prediction = self.predict_deepfake(audio, sample_rate)

        spectrogram_analysis = self.analyze_spectrogram(
            audio, sample_rate, prediction['fake_score']
        )

        return {
            'interval_id': interval_data['interval_id'],
            'interval': interval_data['interval'],
            'fake_score': prediction['fake_score'],
            'confidence': prediction['confidence'],
            'suspicious_regions': spectrogram_analysis['regions'],
            'has_audio': True,
            'spectral_features': spectrogram_analysis['spectral_features']
        }
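For reference, a minimal sketch of driving `AudioAnalyzer` on its own, outside the pipeline. It assumes the audio model has already been fetched with `models/download_model.py`, and `clip.wav` is a placeholder path:

```python
# Standalone use of AudioAnalyzer on a short clip (illustrative only).
import librosa

from analysis.audio_analyser import AudioAnalyzer

# Load at 16 kHz mono, matching what MediaExtractor feeds the analyzer.
audio, sr = librosa.load("clip.wav", sr=16000, mono=True)

analyzer = AudioAnalyzer()
prediction = analyzer.predict_deepfake(audio, sr)      # {'fake_score', 'confidence', 'label'}
spectro = analyzer.analyze_spectrogram(audio, sr, prediction["fake_score"])
print(prediction)
print(spectro["regions"])
```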
analysis/llm_analyser.py
ADDED
from typing import List, Dict
from langchain_google_genai import ChatGoogleGenerativeAI
from analysis.prompt import _create_analysis_prompt
from dotenv import load_dotenv
import re
load_dotenv()

class LLMFusion:
    """Fuses video and audio analysis results using LLM to generate human-readable report"""

    def __init__(self):
        self.llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0)

    def prepare_analysis_json(self, timeline: List[Dict]) -> Dict:
        analysis_data = {
            'total_intervals': len(timeline),
            'intervals': []
        }

        for interval in timeline:
            interval_summary = {
                'interval_id': interval['interval_id'],
                'time_range': interval['interval'],
                'video_analysis': interval.get('video_results', {}),
                'audio_analysis': interval.get('audio_results', {})
            }
            analysis_data['intervals'].append(interval_summary)

        return analysis_data

    def calculate_overall_scores(self, timeline: List[Dict]) -> Dict:
        """Calculate overall video and audio fake scores"""
        video_scores = []
        audio_scores = []

        for interval in timeline:
            if interval.get('video_results') and 'fake_score' in interval['video_results']:
                video_scores.append(interval['video_results']['fake_score'])

            if interval.get('audio_results') and 'fake_score' in interval['audio_results']:
                audio_scores.append(interval['audio_results']['fake_score'])

        overall_video = round(sum(video_scores) / len(video_scores), 3) if len(video_scores) > 0 else 0.0
        overall_audio = round(sum(audio_scores) / len(audio_scores), 3) if len(audio_scores) > 0 else 0.0

        if overall_video > 0 and overall_audio > 0:
            overall_combined = round((overall_video + overall_audio) / 2, 3)
        elif overall_video > 0:
            overall_combined = overall_video
        elif overall_audio > 0:
            overall_combined = overall_audio
        else:
            overall_combined = 0.0

        return {
            'overall_video_score': overall_video,
            'overall_audio_score': overall_audio,
            'overall_combined_score': overall_combined
        }

    def generate_report(self, timeline: List[Dict], video_info: Dict) -> Dict:
        analysis_json = self.prepare_analysis_json(timeline)
        overall_scores = self.calculate_overall_scores(timeline)

        prompt = _create_analysis_prompt(analysis_json, overall_scores, video_info)

        try:
            response = self.llm.invoke(prompt)
            llm_response = response.content
        except Exception as e:
            print(f"LLM failed: {e}")
            llm_response = "Analysis failed."

        report = self._structure_report(llm_response, overall_scores, analysis_json)
        return report

    def _structure_report(self, llm_response: str, overall_scores: Dict, analysis_json: Dict) -> Dict:
        """Extract structured information from LLM response"""

        verdict = "DEEPFAKE" if overall_scores['overall_combined_score'] > 0.5 else "REAL"

        confidence = 75.0
        conf_match = re.search(r'(\d+)\s*%', llm_response)
        if conf_match:
            confidence = float(conf_match.group(1))

        suspicious_intervals = []
        for interval_data in analysis_json['intervals']:
            video_score = interval_data.get('video_analysis', {}).get('fake_score', 0)
            audio_score = interval_data.get('audio_analysis', {}).get('fake_score', 0)

            if video_score > 0.6 or audio_score > 0.6:
                suspicious_intervals.append({
                    'interval': interval_data['time_range'],
                    'video_score': video_score,
                    'audio_score': audio_score,
                    'video_regions': interval_data.get('video_analysis', {}).get('suspicious_regions', []),
                    'audio_regions': interval_data.get('audio_analysis', {}).get('suspicious_regions', [])
                })

        return {
            'verdict': verdict,
            'confidence': confidence,
            'overall_scores': overall_scores,
            'detailed_analysis': llm_response,
            'suspicious_intervals': suspicious_intervals,
            'total_intervals_analyzed': analysis_json['total_intervals']
        }
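A toy illustration of how `calculate_overall_scores` averages the per-interval results it is given; the scores below are fabricated. Constructing `LLMFusion` assumes `GOOGLE_API_KEY` is configured, since `__init__` builds the Gemini client:

```python
# Fabricated two-interval timeline, just to show the averaging behaviour.
from analysis.llm_analyser import LLMFusion

timeline = [
    {"interval_id": 0, "interval": "0.0-2.0",
     "video_results": {"fake_score": 0.8}, "audio_results": {"fake_score": 0.7}},
    {"interval_id": 1, "interval": "2.0-4.0",
     "video_results": {"fake_score": 0.3}, "audio_results": {"fake_score": 0.3}},
]

fusion = LLMFusion()
print(fusion.calculate_overall_scores(timeline))
# {'overall_video_score': 0.55, 'overall_audio_score': 0.5, 'overall_combined_score': 0.525}
```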
analysis/prompt.py
ADDED
import json
from typing import List, Dict

def _create_analysis_prompt(analysis_json: Dict, overall_scores: Dict, video_info: Dict) -> str:
    """Create prompt for LLM with proper score interpretation"""

    prompt = f"""You are a deepfake detection expert. Analyze the following video analysis results and generate a human-readable report.

VIDEO INFORMATION:
- Duration: {video_info['duration']:.2f} seconds
- Total Intervals Analyzed: {analysis_json['total_intervals']}

OVERALL SCORES (CRITICAL - READ CAREFULLY):
- Video Deepfake Score: {overall_scores['overall_video_score']}
- Audio Deepfake Score: {overall_scores['overall_audio_score']}
- Averaged Combined Score: {overall_scores['overall_combined_score']}

SCORE INTERPRETATION GUIDE:
- Scores range from 0.0 to 1.0
- 0.0 - 0.3: LIKELY REAL (low probability of manipulation)
- 0.3 - 0.5: POSSIBLY REAL (some minor artifacts, but probably authentic)
- 0.5 - 0.7: POSSIBLY FAKE (suspicious patterns detected)
- 0.7 - 1.0: LIKELY FAKE (high probability of deepfake)

IMPORTANT: The numerical scores are the PRIMARY evidence. Suspicious regions are secondary indicators that provide detail about WHERE issues were detected, but should NOT override low scores.

INTERVAL-BY-INTERVAL ANALYSIS:
{json.dumps(analysis_json['intervals'], indent=2)}

ANALYSIS RULES:
1. If average score < 0.5, you should lean towards "REAL" verdict unless there is overwhelming contradictory evidence
2. If average score > 0.5, you should lean towards "DEEPFAKE" verdict
3. Suspicious regions (like "monotone_voice" or "eyes") only matter if the scores also indicate manipulation
4. A low score with suspicious regions = detection system being cautious, likely still REAL
5. Base your confidence on how far the scores are from 0.5 threshold

TASK:
Based on the analysis above, provide:

1. **VERDICT**: State clearly if this is "REAL" or "DEEPFAKE"
   - Must align with the overall scores
   - If avg score < 0.5, verdict should typically be REAL
   - If avg score > 0.5, verdict should typically be DEEPFAKE

2. **CONFIDENCE**: Your confidence level (0-100%)
   - Base this on how definitive the scores are
   - Score near 0.0 or 1.0 = high confidence
   - Score near 0.5 = low confidence

3. **KEY FINDINGS**: Summarize the most important patterns found
   - Focus on intervals with scores > 0.6 (those are actually suspicious)
   - Mention if scores are consistently low (indicates authentic content)

4. **SUSPICIOUS INTERVALS**: Only list intervals where fake_score > 0.6
   - If no intervals exceed 0.6, state "No highly suspicious intervals detected"

5. **EVIDENCE SUMMARY**:
   - Video evidence: Mention specific facial regions only if video score > 0.5
   - Audio evidence: Mention audio patterns only if audio score > 0.5
   - If scores are low, acknowledge the content appears authentic

6. **EXPLANATION**: In 2-3 sentences, explain your verdict
   - Reference the numerical scores explicitly
   - Explain in simple terms what the scores mean for this video

CRITICAL REMINDER: Your verdict MUST be consistent with the numerical scores. Do not declare something a deepfake if the scores indicate it's real (< 0.5).

Format your response as a clear, structured analysis that a non-technical person could understand."""

    return prompt
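The prompt builder is a pure function, so it can be sanity-checked with made-up inputs; none of the values below come from a real analysis:

```python
# Print the start of the prompt produced for a tiny fabricated result set.
from analysis.prompt import _create_analysis_prompt

analysis_json = {
    "total_intervals": 1,
    "intervals": [{
        "interval_id": 0, "time_range": "0.0-2.0",
        "video_analysis": {"fake_score": 0.12}, "audio_analysis": {"fake_score": 0.08},
    }],
}
overall_scores = {"overall_video_score": 0.12, "overall_audio_score": 0.08,
                  "overall_combined_score": 0.1}
video_info = {"duration": 2.0}

print(_create_analysis_prompt(analysis_json, overall_scores, video_info)[:400])
```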
analysis/video_analyser.py
ADDED
import cv2
import torch
import numpy as np
from PIL import Image
from typing import List, Dict
from collections import Counter
from models.load_models import model_loader

class VideoAnalyzer:
    """Simple, reliable video analyzer for hackathon demo"""

    def __init__(self):
        self.model, self.processor = model_loader.load_video_model()
        self.device = model_loader.get_device()

        self.face_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
        )

    def detect_face(self, frame: np.ndarray) -> Dict:
        """Detect face in frame"""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = self.face_cascade.detectMultiScale(gray, 1.3, 5)

        if len(faces) > 0:
            x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
            face_crop = frame[y:y+h, x:x+w]

            return {
                'detected': True,
                'bbox': {'x': int(x), 'y': int(y), 'w': int(w), 'h': int(h)},
                'face_crop': face_crop
            }

        return {'detected': False, 'bbox': None, 'face_crop': None}

    def predict_deepfake(self, frame: np.ndarray) -> Dict:
        """Predict if frame is deepfake"""
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        pil_img = Image.fromarray(frame_rgb)

        inputs = self.processor(images=pil_img, return_tensors="pt")

        if self.device == "cuda":
            inputs = {k: v.cuda() for k, v in inputs.items()}

        with torch.no_grad():
            outputs = self.model(**inputs)
            logits = outputs.logits
            probs = torch.nn.functional.softmax(logits, dim=-1)

        fake_prob = probs[0][1].item() if probs.shape[1] > 1 else probs[0][0].item()
        confidence = max(probs[0]).item()

        return {
            'fake_score': round(fake_prob, 3),
            'confidence': round(confidence, 3),
            'label': 'fake' if fake_prob > 0.5 else 'real'
        }

    def detect_suspicious_regions(self, face: np.ndarray, fake_score: float) -> List[str]:
        try:
            gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
            h, w = gray.shape

            suspicious_regions = []

            regions = {
                'eyes': (int(h*0.25), int(h*0.45), int(w*0.15), int(w*0.85)),
                'nose': (int(h*0.40), int(h*0.65), int(w*0.35), int(w*0.65)),
                'mouth': (int(h*0.60), int(h*0.80), int(w*0.30), int(w*0.70)),
                'forehead': (int(h*0.08), int(h*0.28), int(w*0.25), int(w*0.75)),
                'cheeks': (int(h*0.45), int(h*0.70), int(w*0.15), int(w*0.85)),
                'chin': (int(h*0.75), int(h*0.95), int(w*0.30), int(w*0.70))
            }

            for region_name, (y1, y2, x1, x2) in regions.items():
                region = gray[y1:y2, x1:x2]

                if region.size == 0:
                    continue

                suspicious = False

                variance = np.var(region)
                if variance < 200 or variance > 8000:
                    suspicious = True

                edges = cv2.Canny(region, 50, 150)
                edge_density = np.sum(edges > 0) / edges.size
                if edge_density < 0.05:
                    suspicious = True

                if fake_score > 0.7 and variance < 400:
                    suspicious = True

                if suspicious:
                    suspicious_regions.append(region_name)

            left_half = gray[:, :w//2]
            right_half = np.fliplr(gray[:, w//2:])

            min_width = min(left_half.shape[1], right_half.shape[1])
            left_half = left_half[:, :min_width]
            right_half = right_half[:, :min_width]

            symmetry_diff = np.mean(np.abs(left_half.astype(float) - right_half.astype(float)))

            if symmetry_diff < 10:
                suspicious_regions.append('unnatural_symmetry')

            return suspicious_regions if suspicious_regions else ['none']

        except Exception as e:
            print(f"Region detection error: {e}")
            return ['analysis_error']

    def analyze_interval(self, interval_data: Dict) -> Dict:
        """Analyze all frames in an interval"""
        frames_data = interval_data['video_data']

        if not frames_data:
            return {
                'interval_id': interval_data['interval_id'],
                'interval': interval_data['interval'],
                'fake_score': 0.0,
                'confidence': 0.0,
                'suspicious_regions': [],
                'face_detected': False,
                'frame_results': []
            }

        frame_results = []
        total_fake_score = 0
        faces_detected = 0
        all_regions = []

        for frame_data in frames_data:
            frame = frame_data['frame']
            timestamp = frame_data['timestamp']

            face_info = self.detect_face(frame)

            if face_info['detected']:
                faces_detected += 1
                pred = self.predict_deepfake(face_info['face_crop'])
                regions = self.detect_suspicious_regions(face_info['face_crop'], pred['fake_score'])
            else:
                pred = self.predict_deepfake(frame)
                regions = ['no_face_detected']

            total_fake_score += pred['fake_score']
            all_regions.extend(regions)

            frame_results.append({
                'timestamp': timestamp,
                'fake_score': pred['fake_score'],
                'confidence': pred['confidence'],
                'face_detected': face_info['detected'],
                'regions': regions
            })

        avg_fake_score = total_fake_score / len(frames_data)

        region_counts = Counter(all_regions)
        threshold = len(frames_data) * 0.5

        consistent_regions = [
            region for region, count in region_counts.items()
            if count >= threshold and region not in ['none', 'no_face_detected', 'analysis_error']
        ]

        return {
            'interval_id': interval_data['interval_id'],
            'interval': interval_data['interval'],
            'fake_score': round(avg_fake_score, 3),
            'confidence': round(np.mean([f['confidence'] for f in frame_results]), 3),
            'suspicious_regions': consistent_regions if consistent_regions else list(set(all_regions)),
            'face_detected': faces_detected > 0,
            'frame_count': len(frames_data),
            'frames_with_faces': faces_detected,
            'frame_results': frame_results
        }
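Similarly, a minimal sketch of scoring a single image with `VideoAnalyzer` outside the pipeline. It assumes the video model has been downloaded and `frame.jpg` is a placeholder for any image OpenCV can read:

```python
# Standalone single-frame scoring (illustrative only).
import cv2

from analysis.video_analyser import VideoAnalyzer

frame = cv2.imread("frame.jpg")                        # BGR array, as the class expects

analyzer = VideoAnalyzer()
face = analyzer.detect_face(frame)
target = face["face_crop"] if face["detected"] else frame

pred = analyzer.predict_deepfake(target)               # {'fake_score', 'confidence', 'label'}
print(pred)
if face["detected"]:
    print(analyzer.detect_suspicious_regions(face["face_crop"], pred["fake_score"]))
```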
extraction/__init__.py
ADDED
(empty file)
extraction/media_extractor.py
ADDED
import os
import cv2
import librosa
import subprocess
import numpy as np
from typing import List, Dict, Tuple
from extraction.timeline_generator import TimelineGenerator

class MediaExtractor:

    def __init__(self, frames_per_interval: int = 5):
        self.frames_per_interval = frames_per_interval

    def get_video_info(self, video_path: str) -> Dict:

        cap = cv2.VideoCapture(video_path)

        if not cap.isOpened():
            raise ValueError(f"Cannot open video: {video_path}")

        fps = cap.get(cv2.CAP_PROP_FPS)
        total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        duration = total_frames / fps if fps > 0 else 0

        cap.release()

        return {
            'fps': fps,
            'total_frames': total_frames,
            'duration': duration
        }

    def extract_frames(self, video_path: str, timeline: List[Dict]) -> List[Dict]:

        cap = cv2.VideoCapture(video_path)

        if not cap.isOpened():
            raise ValueError(f"Cannot open video: {video_path}")

        for interval in timeline:

            sample_times = np.linspace(
                interval['start'],
                interval['end'],
                self.frames_per_interval,
                endpoint=False
            )

            for sample_time in sample_times:
                cap.set(cv2.CAP_PROP_POS_MSEC, sample_time * 1000)
                ret, frame = cap.read()

                if ret:
                    interval['video_data'].append({
                        'frame': frame,
                        'timestamp': round(sample_time, 2)
                    })

        cap.release()

        return timeline

    def extract_audio(self, video_path: str, timeline: List[Dict]) -> List[Dict]:

        temp_audio = "temp_audio.wav"
        command = [
            'ffmpeg', '-i', video_path,
            '-vn', '-acodec', 'pcm_s16le',
            '-ar', '16000', '-ac', '1',
            '-y', temp_audio
        ]

        try:
            subprocess.run(command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=True)
            has_audio = os.path.exists(temp_audio) and os.path.getsize(temp_audio) > 0
        except subprocess.CalledProcessError:
            has_audio = False

        if not has_audio:
            print("Warning: No audio track detected in video")
            for interval in timeline:
                interval['audio_data'] = {
                    'audio': np.zeros(16000 * 2),
                    'sample_rate': 16000,
                    'has_audio': False
                }
            return timeline

        audio, sr = librosa.load(temp_audio, sr=16000, mono=True)

        for interval in timeline:
            start_sample = int(interval['start'] * sr)
            end_sample = int(interval['end'] * sr)
            end_sample = min(end_sample, len(audio))
            audio_chunk = audio[start_sample:end_sample]

            if len(audio_chunk) < sr * 0.5:
                audio_chunk = np.pad(audio_chunk, (0, int(sr * 0.5) - len(audio_chunk)))

            interval['audio_data'] = {
                'audio': audio_chunk,
                'sample_rate': sr,
                'has_audio': True
            }

        if os.path.exists(temp_audio):
            os.remove(temp_audio)

        return timeline


    def extract_all(self, video_path: str, interval_duration: float = 2.0) -> Tuple[List[Dict], Dict]:

        video_info = self.get_video_info(video_path)

        timeline_gen = TimelineGenerator(interval_duration)
        timeline = timeline_gen.create_timeline(video_info['duration'])

        timeline = self.extract_frames(video_path, timeline)
        timeline = self.extract_audio(video_path, timeline)

        return timeline, video_info
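A sketch of running the extractor standalone, which mirrors the first stage of the pipeline. It assumes FFmpeg is on the PATH and `sample.mp4` is a placeholder path:

```python
# Extract frames and audio chunks for every 2-second interval of a video.
from extraction.media_extractor import MediaExtractor

extractor = MediaExtractor(frames_per_interval=5)
timeline, video_info = extractor.extract_all("sample.mp4", interval_duration=2.0)

print(video_info)        # {'fps': ..., 'total_frames': ..., 'duration': ...}
first = timeline[0]
print(first["interval"], len(first["video_data"]), first["audio_data"]["has_audio"])
```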
extraction/timeline_generator.py
ADDED
import numpy as np
from typing import List, Dict

class TimelineGenerator:

    def __init__(self, interval_duration: float = 2.0):
        self.interval_duration = interval_duration

    def create_timeline(self, video_duration: float) -> List[Dict]:

        num_intervals = int(np.ceil(video_duration / self.interval_duration))

        timeline = []
        for i in range(num_intervals):
            start_time = i * self.interval_duration
            end_time = min((i + 1) * self.interval_duration, video_duration)

            timeline.append({
                'interval_id': i,
                'start': round(start_time, 2),
                'end': round(end_time, 2),
                'interval': f"{start_time:.1f}-{end_time:.1f}",
                'duration': round(end_time - start_time, 2),
                'video_data': [],
                'audio_data': None,
                'video_results': None,
                'audio_results': None
            })

        return timeline

    def get_interval_for_timestamp(self, timeline: List[Dict], timestamp: float) -> Dict:

        for interval in timeline:
            if interval['start'] <= timestamp < interval['end']:
                return interval

        return timeline[-1]
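For a concrete sense of the timeline structure the rest of the system fills in, this is what `create_timeline` produces for a 5-second video at the default 2-second interval:

```python
from extraction.timeline_generator import TimelineGenerator

gen = TimelineGenerator(interval_duration=2.0)
for item in gen.create_timeline(5.0):
    print(item["interval_id"], item["interval"], item["duration"])
# 0 0.0-2.0 2.0
# 1 2.0-4.0 2.0
# 2 4.0-5.0 1.0
```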
main.py
ADDED
|
@@ -0,0 +1,438 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from fastapi import FastAPI, File, UploadFile, HTTPException, Query
|
| 2 |
+
from fastapi.middleware.cors import CORSMiddleware
|
| 3 |
+
from fastapi.responses import JSONResponse
|
| 4 |
+
from pydantic import BaseModel
|
| 5 |
+
from typing import List
|
| 6 |
+
from contextlib import asynccontextmanager
|
| 7 |
+
import os
|
| 8 |
+
import uuid
|
| 9 |
+
import shutil
|
| 10 |
+
import json
|
| 11 |
+
from pathlib import Path
|
| 12 |
+
from datetime import datetime
|
| 13 |
+
from pipeline import DeepfakeDetectionPipeline
|
| 14 |
+
|
| 15 |
+
analysis_history = []
|
| 16 |
+
MAX_HISTORY = 10
|
| 17 |
+
|
| 18 |
+
@asynccontextmanager
|
| 19 |
+
async def lifespan(app: FastAPI):
|
| 20 |
+
yield
|
| 21 |
+
|
| 22 |
+
app = FastAPI(
|
| 23 |
+
title="DeepDefend API",
|
| 24 |
+
description="Advanced Deepfake Detection System with Multi-Modal Analysis",
|
| 25 |
+
version="1.0.0",
|
| 26 |
+
lifespan=lifespan
|
| 27 |
+
)
|
| 28 |
+
|
| 29 |
+
app.add_middleware(
|
| 30 |
+
CORSMiddleware,
|
| 31 |
+
allow_origins=["*"],
|
| 32 |
+
allow_credentials=True,
|
| 33 |
+
allow_methods=["*"],
|
| 34 |
+
allow_headers=["*"],
|
| 35 |
+
)
|
| 36 |
+
|
| 37 |
+
UPLOAD_DIR = Path("uploads")
|
| 38 |
+
UPLOAD_DIR.mkdir(exist_ok=True)
|
| 39 |
+
|
| 40 |
+
pipeline = None
|
| 41 |
+
|
| 42 |
+
def get_pipeline():
|
| 43 |
+
global pipeline
|
| 44 |
+
if pipeline is None:
|
| 45 |
+
print("Loading DeepDefend Pipeline...")
|
| 46 |
+
pipeline = DeepfakeDetectionPipeline()
|
| 47 |
+
return pipeline
|
| 48 |
+
|
| 49 |
+
class AnalysisResult(BaseModel):
|
| 50 |
+
verdict: str
|
| 51 |
+
confidence: float
|
| 52 |
+
overall_scores: dict
|
| 53 |
+
detailed_analysis: str
|
| 54 |
+
suspicious_intervals: list
|
| 55 |
+
total_intervals_analyzed: int
|
| 56 |
+
video_info: dict
|
| 57 |
+
analysis_id: str
|
| 58 |
+
timestamp: str
|
| 59 |
+
|
| 60 |
+
class HistoryItem(BaseModel):
|
| 61 |
+
analysis_id: str
|
| 62 |
+
filename: str
|
| 63 |
+
verdict: str
|
| 64 |
+
confidence: float
|
| 65 |
+
timestamp: str
|
| 66 |
+
video_duration: float
|
| 67 |
+
|
| 68 |
+
class StatsResponse(BaseModel):
|
| 69 |
+
total_analyses: int
|
| 70 |
+
deepfakes_detected: int
|
| 71 |
+
real_videos: int
|
| 72 |
+
avg_confidence: float
|
| 73 |
+
avg_video_score: float
|
| 74 |
+
avg_audio_score: float
|
| 75 |
+
|
| 76 |
+
class IntervalDetail(BaseModel):
|
| 77 |
+
interval_id: int
|
| 78 |
+
time_range: str
|
| 79 |
+
video_score: float
|
| 80 |
+
audio_score: float
|
| 81 |
+
verdict: str
|
| 82 |
+
suspicious_regions: dict
|
| 83 |
+
|
| 84 |
+
def add_to_history(analysis_data: dict):
|
| 85 |
+
"""Add analysis to history"""
|
| 86 |
+
history_item = {
|
| 87 |
+
"analysis_id": analysis_data["analysis_id"],
|
| 88 |
+
"filename": analysis_data["filename"],
|
| 89 |
+
"verdict": analysis_data["verdict"],
|
| 90 |
+
"confidence": analysis_data["confidence"],
|
| 91 |
+
"timestamp": analysis_data["timestamp"],
|
| 92 |
+
"video_duration": analysis_data["video_info"]["duration"],
|
| 93 |
+
"overall_scores": analysis_data["overall_scores"]
|
| 94 |
+
}
|
| 95 |
+
|
| 96 |
+
analysis_history.insert(0, history_item)
|
| 97 |
+
|
| 98 |
+
if len(analysis_history) > MAX_HISTORY:
|
| 99 |
+
analysis_history.pop()
|
| 100 |
+
|
| 101 |
+
@app.get("/")
|
| 102 |
+
async def root():
|
| 103 |
+
return {
|
| 104 |
+
"service": "DeepDefend API",
|
| 105 |
+
"version": "1.0.0",
|
| 106 |
+
"status": "online",
|
| 107 |
+
"description": "Advanced Multi-Modal Deepfake Detection",
|
| 108 |
+
"features": [
|
| 109 |
+
"Video frame-by-frame analysis",
|
| 110 |
+
"Audio deepfake detection",
|
| 111 |
+
"AI-powered evidence fusion",
|
| 112 |
+
"Frame-level heatmap generation",
|
| 113 |
+
"Interval breakdown analysis",
|
| 114 |
+
"Analysis history tracking"
|
| 115 |
+
],
|
| 116 |
+
"endpoints": {
|
| 117 |
+
"analyze": "POST /api/analyze",
|
| 118 |
+
"history": "GET /api/history",
|
| 119 |
+
"stats": "GET /api/stats",
|
| 120 |
+
"intervals": "GET /api/intervals/{analysis_id}",
|
| 121 |
+
"compare": "GET /api/compare",
|
| 122 |
+
"health": "GET /api/health"
|
| 123 |
+
}
|
| 124 |
+
}
|
| 125 |
+
|
| 126 |
+
@app.get("/api/health")
|
| 127 |
+
async def health():
|
| 128 |
+
"""Health check with system info"""
|
| 129 |
+
return {
|
| 130 |
+
"status": "healthy",
|
| 131 |
+
"pipeline_loaded": pipeline is not None,
|
| 132 |
+
"total_analyses": len(analysis_history),
|
| 133 |
+
"storage_used_mb": sum(
|
| 134 |
+
f.stat().st_size for f in UPLOAD_DIR.glob('*') if f.is_file()
|
| 135 |
+
) / (1024 * 1024) if UPLOAD_DIR.exists() else 0,
|
| 136 |
+
"timestamp": datetime.now().isoformat()
|
| 137 |
+
}
|
| 138 |
+
|
| 139 |
+
@app.post("/api/analyze", response_model=AnalysisResult)
|
| 140 |
+
async def analyze_video(
|
| 141 |
+
file: UploadFile = File(...),
|
| 142 |
+
interval_duration: float = Query(default=2.0, ge=1.0, le=5.0)
|
| 143 |
+
):
|
| 144 |
+
"""
|
| 145 |
+
Upload and analyze video for deepfakes
|
| 146 |
+
|
| 147 |
+
Returns complete analysis with:
|
| 148 |
+
- Overall verdict and confidence
|
| 149 |
+
- Video/audio scores
|
| 150 |
+
- Suspicious intervals
|
| 151 |
+
- AI-generated detailed analysis
|
| 152 |
+
"""
|
| 153 |
+
|
| 154 |
+
allowed_extensions = ['.mp4', '.avi', '.mov', '.mkv', '.webm']
|
| 155 |
+
file_ext = os.path.splitext(file.filename)[1].lower()
|
| 156 |
+
|
| 157 |
+
if file_ext not in allowed_extensions:
|
| 158 |
+
raise HTTPException(
|
| 159 |
+
status_code=400,
|
| 160 |
+
detail=f"Invalid file type. Allowed: {', '.join(allowed_extensions)}"
|
| 161 |
+
)
|
| 162 |
+
|
| 163 |
+
file.file.seek(0, 2)
|
| 164 |
+
file_size = file.file.tell()
|
| 165 |
+
file.file.seek(0)
|
| 166 |
+
|
| 167 |
+
if file_size > 250 * 1024 * 1024:
|
| 168 |
+
raise HTTPException(status_code=400, detail="File too large. Max: 250MB")
|
| 169 |
+
|
| 170 |
+
if file_size < 100 * 1024:
|
| 171 |
+
raise HTTPException(status_code=400, detail="File too small. Min: 100KB")
|
| 172 |
+
|
| 173 |
+
analysis_id = str(uuid.uuid4())
|
| 174 |
+
video_path = UPLOAD_DIR / f"{analysis_id}{file_ext}"
|
| 175 |
+
|
| 176 |
+
try:
|
| 177 |
+
with open(video_path, "wb") as buffer:
|
| 178 |
+
shutil.copyfileobj(file.file, buffer)
|
| 179 |
+
|
| 180 |
+
pipe = get_pipeline()
|
| 181 |
+
|
| 182 |
+
print(f"\nAnalyzing: {file.filename}")
|
| 183 |
+
results = pipe.analyze_video(str(video_path), interval_duration)
|
| 184 |
+
|
| 185 |
+
final_report = results['final_report']
|
| 186 |
+
video_info = results['video_info']
|
| 187 |
+
|
| 188 |
+
analysis_data = {
|
| 189 |
+
"analysis_id": analysis_id,
|
| 190 |
+
"filename": file.filename,
|
| 191 |
+
"verdict": final_report['verdict'],
|
| 192 |
+
"confidence": final_report['confidence'],
|
| 193 |
+
"overall_scores": final_report['overall_scores'],
|
| 194 |
+
"detailed_analysis": final_report['detailed_analysis'],
|
| 195 |
+
"suspicious_intervals": final_report['suspicious_intervals'],
|
| 196 |
+
"total_intervals_analyzed": final_report['total_intervals_analyzed'],
|
| 197 |
+
"video_info": {
|
| 198 |
+
"duration": video_info['duration'],
|
| 199 |
+
"fps": video_info['fps'],
|
| 200 |
+
"total_frames": video_info['total_frames'],
|
| 201 |
+
"file_size_mb": round(file_size / (1024 * 1024), 2)
|
| 202 |
+
},
|
| 203 |
+
"timestamp": datetime.now().isoformat()
|
| 204 |
+
}
|
| 205 |
+
|
| 206 |
+
add_to_history(analysis_data)
|
| 207 |
+
|
| 208 |
+
interval_data = {
|
| 209 |
+
'analysis_id': analysis_id,
|
| 210 |
+
'timeline': [
|
| 211 |
+
{
|
| 212 |
+
'interval_id': interval['interval_id'],
|
| 213 |
+
'interval': interval['interval'],
|
| 214 |
+
'start': interval['start'],
|
| 215 |
+
'end': interval['end'],
|
| 216 |
+
'video_results': interval.get('video_results'),
|
| 217 |
+
'audio_results': interval.get('audio_results')
|
| 218 |
+
}
|
| 219 |
+
for interval in results.get('timeline', [])
|
| 220 |
+
]
|
| 221 |
+
}
|
| 222 |
+
|
| 223 |
+
results_path = UPLOAD_DIR / f"{analysis_id}_results.json"
|
| 224 |
+
with open(results_path, 'w') as f:
|
| 225 |
+
json.dump(interval_data, f, indent=2)
|
| 226 |
+
|
| 227 |
+
return AnalysisResult(**analysis_data)
|
| 228 |
+
|
| 229 |
+
except Exception as e:
|
| 230 |
+
print(f"Error: {e}")
|
| 231 |
+
raise HTTPException(status_code=500, detail=f"Analysis failed: {str(e)}")
|
| 232 |
+
|
| 233 |
+
finally:
|
| 234 |
+
if video_path.exists():
|
| 235 |
+
os.remove(video_path)
|
| 236 |
+
|
@app.get("/api/history", response_model=List[HistoryItem])
async def get_history(limit: int = Query(default=10, ge=1, le=50)):
    """Get recent analysis history"""
    return [
        HistoryItem(
            analysis_id=item["analysis_id"],
            filename=item["filename"],
            verdict=item["verdict"],
            confidence=item["confidence"],
            timestamp=item["timestamp"],
            video_duration=item["video_duration"]
        )
        for item in analysis_history[:limit]
    ]

@app.get("/api/stats", response_model=StatsResponse)
async def get_stats():
    """Get overall statistics"""

    if not analysis_history:
        return StatsResponse(
            total_analyses=0,
            deepfakes_detected=0,
            real_videos=0,
            avg_confidence=0.0,
            avg_video_score=0.0,
            avg_audio_score=0.0
        )

    deepfakes = sum(1 for item in analysis_history if item["verdict"] == "DEEPFAKE")
    real = len(analysis_history) - deepfakes

    avg_confidence = sum(item["confidence"] for item in analysis_history) / len(analysis_history)
    avg_video = sum(item["overall_scores"]["overall_video_score"] for item in analysis_history) / len(analysis_history)
    avg_audio = sum(item["overall_scores"]["overall_audio_score"] for item in analysis_history) / len(analysis_history)

    return StatsResponse(
        total_analyses=len(analysis_history),
        deepfakes_detected=deepfakes,
        real_videos=real,
        avg_confidence=round(avg_confidence, 2),
        avg_video_score=round(avg_video, 3),
        avg_audio_score=round(avg_audio, 3)
    )

@app.get("/api/intervals/{analysis_id}")
async def get_interval_details(analysis_id: str):
    """Get detailed interval-by-interval breakdown"""

    results_path = UPLOAD_DIR / f"{analysis_id}_results.json"

    if not results_path.exists():
        raise HTTPException(status_code=404, detail="Analysis not found")

    with open(results_path, 'r') as f:
        interval_data = json.load(f)

    timeline = interval_data.get('timeline', [])

    intervals = []
    for interval in timeline:
        # video_results / audio_results may have been stored as null; fall back to {}
        video_res = interval.get('video_results') or {}
        audio_res = interval.get('audio_results') or {}

        avg_score = (video_res.get('fake_score', 0) + audio_res.get('fake_score', 0)) / 2

        intervals.append({
            "interval_id": interval['interval_id'],
            "time_range": interval['interval'],
            "start": interval['start'],
            "end": interval['end'],
            "video_score": video_res.get('fake_score', 0),
            "audio_score": audio_res.get('fake_score', 0),
            "combined_score": round(avg_score, 3),
            "verdict": "SUSPICIOUS" if avg_score > 0.6 else "NORMAL",
            "suspicious_regions": {
                "video": video_res.get('suspicious_regions', []),
                "audio": audio_res.get('suspicious_regions', [])
            },
            "has_face": video_res.get('face_detected', False),
            "has_audio": audio_res.get('has_audio', False)
        })

    return {
        "analysis_id": analysis_id,
        "total_intervals": len(intervals),
        "intervals": intervals
    }

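# Worked example of the scoring rule above (illustrative numbers, not real output):
#   video fake_score = 0.80, audio fake_score = 0.50
#   combined_score   = (0.80 + 0.50) / 2 = 0.65  ->  0.65 > 0.6, verdict "SUSPICIOUS"
# If either channel's results lack a fake_score (e.g. no audio track), that channel
# contributes 0, which pulls the combined score toward the "NORMAL" side.
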
@app.get("/api/compare")
async def compare_scores():
    """Compare video vs audio detection rates"""

    if not analysis_history:
        return {
            "message": "No analysis data available",
            "comparison": None
        }

    video_higher = 0
    audio_higher = 0
    equal = 0

    for item in analysis_history:
        scores = item["overall_scores"]
        v_score = scores["overall_video_score"]
        a_score = scores["overall_audio_score"]

        if v_score > a_score:
            video_higher += 1
        elif a_score > v_score:
            audio_higher += 1
        else:
            equal += 1

    return {
        "total_analyses": len(analysis_history),
        "comparison": {
            "video_better_detection": video_higher,
            "audio_better_detection": audio_higher,
            "equal_detection": equal
        },
        "percentages": {
            "video_dominant": round((video_higher / len(analysis_history)) * 100, 1),
            "audio_dominant": round((audio_higher / len(analysis_history)) * 100, 1),
            "balanced": round((equal / len(analysis_history)) * 100, 1)
        }
    }

@app.get("/api/recent-verdict")
async def get_recent_verdict_distribution(limit: int = Query(default=20, ge=5, le=50)):
    """Get verdict distribution for recent analyses"""

    recent = analysis_history[:limit]

    if not recent:
        # keep the same keys as the non-empty response so clients see one shape
        return {
            "total": 0,
            "deepfakes": 0,
            "real": 0,
            "deepfake_rate": 0.0,
            "confidence_distribution": {}
        }

    deepfakes = sum(1 for item in recent if item["verdict"] == "DEEPFAKE")
    real = len(recent) - deepfakes

    distribution = {
        "very_confident": 0,
        "confident": 0,
        "moderate": 0,
        "low": 0
    }

    for item in recent:
        conf = item["confidence"]
        if conf >= 80:
            distribution["very_confident"] += 1
        elif conf >= 60:
            distribution["confident"] += 1
        elif conf >= 40:
            distribution["moderate"] += 1
        else:
            distribution["low"] += 1

    return {
        "total": len(recent),
        "deepfakes": deepfakes,
        "real": real,
        "deepfake_rate": round((deepfakes / len(recent)) * 100, 1),
        "confidence_distribution": distribution
    }

@app.delete("/api/clear-history")
async def clear_history():
    """Clear analysis history (for demo reset)"""
    global analysis_history

    count = len(analysis_history)
    analysis_history.clear()

    for file in UPLOAD_DIR.glob("*_results.json"):
        os.remove(file)

    return {
        "message": "History cleared",
        "items_removed": count
    }

@app.exception_handler(HTTPException)
async def http_exception_handler(request, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"error": exc.detail, "status_code": exc.status_code}
    )

@app.exception_handler(Exception)
async def global_exception_handler(request, exc):
    print(f"Error: {exc}")
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error", "detail": str(exc)}
    )
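For reference, a minimal client sketch that exercises the endpoints above (not part of the commit). It assumes the service is reachable on port 7860, that the upload route declared earlier in main.py is /api/analyze, and that interval_duration is accepted as a query parameter; adjust those assumptions and the placeholder file name to your setup.

# Hypothetical client sketch -- route path and parameter names for the upload
# endpoint are assumptions; the /api/intervals and /api/stats routes are as above.
import requests

BASE = "http://localhost:7860"

# Upload a video for analysis.
with open("sample.mp4", "rb") as f:  # placeholder file name
    resp = requests.post(
        f"{BASE}/api/analyze",
        files={"file": ("sample.mp4", f, "video/mp4")},
        params={"interval_duration": 2.0},
        timeout=600,
    )
resp.raise_for_status()
report = resp.json()
print(report["verdict"], report["confidence"])

# Fetch the interval-by-interval breakdown saved by the analyze endpoint.
intervals = requests.get(f"{BASE}/api/intervals/{report['analysis_id']}").json()
for item in intervals["intervals"]:
    print(item["time_range"], item["combined_score"], item["verdict"])

# Aggregate statistics over the in-memory history.
print(requests.get(f"{BASE}/api/stats").json())
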
models/.gitkeep
ADDED
File without changes

models/__init__.py
ADDED
File without changes

models/audio_model/.gitkeep
ADDED
File without changes

models/download_model.py
ADDED
@@ -0,0 +1,33 @@
from transformers import AutoModelForImageClassification, AutoImageProcessor
from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
import os

def download_models():

    os.makedirs("./models/video_model", exist_ok=True)
    os.makedirs("./models/audio_model", exist_ok=True)

    print("Downloading video deepfake detection model...")
    video_model_name = "dima806/deepfake_vs_real_image_detection"

    video_model = AutoModelForImageClassification.from_pretrained(video_model_name)
    video_processor = AutoImageProcessor.from_pretrained(video_model_name)

    video_model.save_pretrained("./models/video_model")
    video_processor.save_pretrained("./models/video_model")
    print("Video model saved to ./models/video_model")

    print("\nDownloading audio deepfake detection model...")
    audio_model_name = "mo-thecreator/Deepfake-audio-detection"

    audio_model = AutoModelForAudioClassification.from_pretrained(audio_model_name)
    audio_processor = AutoFeatureExtractor.from_pretrained(audio_model_name)

    audio_model.save_pretrained("./models/audio_model")
    audio_processor.save_pretrained("./models/audio_model")
    print("Audio model saved to ./models/audio_model")

    print("\nAll models downloaded successfully!")

if __name__ == "__main__":
    download_models()
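A quick sanity check for the download step (a sketch, not part of the commit): run the script from the repository root so the relative ./models/... paths resolve, then reload the snapshots with local_files_only=True to confirm nothing needs to be fetched from the Hub.

# Hypothetical verification sketch -- run after `python models/download_model.py`
# from the repository root so the relative paths match.
from transformers import AutoModelForImageClassification, AutoModelForAudioClassification

# local_files_only=True fails loudly if a snapshot is incomplete,
# instead of silently re-downloading from the Hugging Face Hub.
video_model = AutoModelForImageClassification.from_pretrained(
    "./models/video_model", local_files_only=True
)
audio_model = AutoModelForAudioClassification.from_pretrained(
    "./models/audio_model", local_files_only=True
)

print("video labels:", video_model.config.id2label)
print("audio labels:", audio_model.config.id2label)
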
models/load_models.py
ADDED
@@ -0,0 +1,49 @@
from transformers import AutoModelForImageClassification, AutoImageProcessor
from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
import torch

class ModelLoader:

    _instance = None
    _video_model = None
    _video_processor = None
    _audio_model = None
    _audio_processor = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super(ModelLoader, cls).__new__(cls)
        return cls._instance

    def load_video_model(self):
        if self._video_model is None:
            self._video_model = AutoModelForImageClassification.from_pretrained("./models/video_model")
            self._video_processor = AutoImageProcessor.from_pretrained("./models/video_model")

            self._video_model.eval()

            if torch.cuda.is_available():
                self._video_model = self._video_model.cuda()

            print("Video model loaded!")

        return self._video_model, self._video_processor

    def load_audio_model(self):
        if self._audio_model is None:
            self._audio_model = AutoModelForAudioClassification.from_pretrained("./models/audio_model")
            self._audio_processor = AutoFeatureExtractor.from_pretrained("./models/audio_model")

            self._audio_model.eval()

            if torch.cuda.is_available():
                self._audio_model = self._audio_model.cuda()

            print("Audio model loaded!")

        return self._audio_model, self._audio_processor

    def get_device(self):
        return "cuda" if torch.cuda.is_available() else "cpu"

model_loader = ModelLoader()
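Usage sketch for the singleton loader (illustrative only, not part of the commit): the first call loads and caches the model and processor, later calls return the cached objects. The frame path below is a placeholder.

# Hypothetical usage sketch for ModelLoader -- "frame_000.jpg" is a placeholder.
import torch
from PIL import Image
from models.load_models import model_loader

model, processor = model_loader.load_video_model()  # loaded once, cached afterwards
device = model_loader.get_device()

image = Image.open("frame_000.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt").to(device)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]

for idx, p in enumerate(probs.tolist()):
    print(model.config.id2label[idx], round(p, 3))
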
models/video_model/.gitkeep
ADDED
File without changes

pipeline.py
ADDED
@@ -0,0 +1,35 @@
from typing import Dict
from extraction.media_extractor import MediaExtractor
from analysis.video_analyser import VideoAnalyzer
from analysis.audio_analyser import AudioAnalyzer
from analysis.llm_analyser import LLMFusion

class DeepfakeDetectionPipeline:
    """Complete deepfake detection pipeline"""

    def __init__(self):
        self.media_extractor = MediaExtractor(frames_per_interval=5)
        self.video_analyzer = VideoAnalyzer()
        self.audio_analyzer = AudioAnalyzer()
        self.llm_fusion = LLMFusion()

    def analyze_video(self, video_path: str, interval_duration: float = 2.0) -> Dict:

        timeline, video_info = self.media_extractor.extract_all(video_path, interval_duration)

        for interval in timeline:
            video_results = self.video_analyzer.analyze_interval(interval)
            interval['video_results'] = video_results

        for interval in timeline:
            audio_results = self.audio_analyzer.analyze_interval(interval)
            interval['audio_results'] = audio_results

        final_report = self.llm_fusion.generate_report(timeline, video_info)

        return {
            'video_info': video_info,
            'timeline': timeline,
            'final_report': final_report,
            'summary': final_report  # reuse the report already generated instead of invoking the LLM a second time
        }
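Minimal end-to-end sketch (assuming the models have been downloaded to ./models and the Gemini API key required by LLMFusion is configured in the environment; "sample.mp4" is a placeholder path):

# Hypothetical usage sketch -- "sample.mp4" is a placeholder.
from pipeline import DeepfakeDetectionPipeline

pipe = DeepfakeDetectionPipeline()
results = pipe.analyze_video("sample.mp4", interval_duration=2.0)

report = results["final_report"]
print("Verdict:", report["verdict"])
print("Confidence:", report["confidence"])
print("Intervals analysed:", report["total_intervals_analyzed"])

for interval in results["timeline"]:
    print(interval["interval"], interval["video_results"], interval["audio_results"])
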
requirements.txt
ADDED
@@ -0,0 +1,35 @@
# FastAPI and Server
fastapi==0.109.0
uvicorn[standard]==0.27.0
python-multipart==0.0.6
pydantic==2.5.3

# Core ML Framework
torch==2.3.1
torchvision==0.18.1
torchaudio==2.3.1
transformers==4.36.2

# Computer Vision
opencv-python-headless==4.9.0.80  # Changed: headless for Docker
Pillow==10.2.0

# Audio Processing
librosa==0.10.1
soundfile==0.12.1
ffmpeg-python==0.2.0
audioread==3.0.1  # Added: required for librosa

# LLM Integration
langchain==0.1.0
langchain-google-genai==0.0.6
google-generativeai==0.3.2

# Data Processing
numpy==1.24.3  # Compatible with librosa
pandas==2.0.3
scipy==1.11.4

# Utilities
requests==2.31.0
python-dotenv==1.0.0
uploads/.gitkeep
ADDED
File without changes