MohamedMotaz committed on
Commit
f0f1aaa
1 Parent(s): 4a490bc

Deployment

3_RetinaFace_and_HSEmotion.ipynb ADDED
@@ -0,0 +1,439 @@
+ {
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "source": [
+ "# Video and Image Emotion Annotation\n",
+ "\n",
+ "This notebook detects faces and annotates the recognized emotions in both videos and images. It uses two deep learning models: RetinaFace for face detection and HSEmotionRecognizer for emotion recognition. The goal is to enrich media content by automatically labeling facial expressions with their emotional states.\n",
+ "\n",
+ "## Components\n",
+ "\n",
+ "### Face Detection with RetinaFace\n",
+ "The `detect_faces` function runs the RetinaFace model on a video frame or image and returns the facial bounding boxes, giving precise coordinates for the downstream steps.\n",
+ "\n",
+ "### Emotion Recognition with HSEmotionRecognizer\n",
+ "The HSEmotionRecognizer model, initialized as `recognizer`, predicts the emotional state of each extracted face region.\n",
+ "\n",
+ "### Annotation and Visualization\n",
+ "The `annotate_frame` function annotates every detected face with its recognized emotion: it draws a bounding box around the face and labels it with the predicted emotional state.\n",
+ "\n",
+ "### Processing Pipeline\n",
+ "Video processing:\n",
+ "- `process_video_frames`: iterates through the frames of a video, applying face detection and emotion annotation, and saves the processed frames to a temporary video file.\n",
+ "- `add_audio_to_video`: copies the audio track of the original video onto the processed frames, producing the final annotated video.\n",
+ "- `process_video`: combines frame processing and audio addition into a single entry point for video tasks.\n",
+ "\n",
+ "Image processing:\n",
+ "- `process_image`: handles a single image by detecting faces, annotating emotions, and combining the input and annotated images side by side for visualization.\n",
+ "\n",
+ "## Usage\n",
+ "\n",
+ "- Video processing: provide the path to a video file (`.mp4`, `.avi`, `.mov`, `.mkv`) to analyze and annotate facial expressions over its full duration.\n",
+ "- Image processing: for a static image (`.jpg`, `.jpeg`, `.png`), the script detects faces, predicts emotions, and displays the original and annotated images side by side.\n",
+ "\n",
+ "(A standalone sketch of the single-image flow appears after the notebook listing below.)\n"
+ ],
+ "metadata": {
+ "id": "uVIVXD0L9CLi"
+ }
+ },
51
+ {
52
+ "cell_type": "markdown",
53
+ "source": [
54
+ "## Setup\n",
55
+ "install the required libraries:"
56
+ ],
57
+ "metadata": {
58
+ "id": "H5vMPITJIVyT"
59
+ }
60
+ },
61
+ {
62
+ "cell_type": "code",
63
+ "execution_count": 1,
64
+ "metadata": {
65
+ "colab": {
66
+ "base_uri": "https://localhost:8080/"
67
+ },
68
+ "id": "XfbBa45h4i-Q",
69
+ "outputId": "897c0f6c-2622-4143-beff-0916c33b62cb"
70
+ },
71
+ "outputs": [
72
+ {
73
+ "output_type": "stream",
74
+ "name": "stdout",
75
+ "text": [
76
+ "Collecting retina-face\n",
77
+ " Downloading retina_face-0.0.17-py3-none-any.whl (25 kB)\n",
78
+ "Collecting hsemotion\n",
79
+ " Downloading hsemotion-0.3.0.tar.gz (8.0 kB)\n",
80
+ " Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
81
+ "Requirement already satisfied: moviepy in /usr/local/lib/python3.10/dist-packages (1.0.3)\n",
82
+ "Requirement already satisfied: numpy>=1.14.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (1.25.2)\n",
83
+ "Requirement already satisfied: gdown>=3.10.1 in /usr/local/lib/python3.10/dist-packages (from retina-face) (5.1.0)\n",
84
+ "Requirement already satisfied: Pillow>=5.2.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (9.4.0)\n",
85
+ "Requirement already satisfied: opencv-python>=3.4.4 in /usr/local/lib/python3.10/dist-packages (from retina-face) (4.8.0.76)\n",
86
+ "Requirement already satisfied: tensorflow>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (2.15.0)\n",
87
+ "Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from hsemotion) (2.3.0+cu121)\n",
88
+ "Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from hsemotion) (0.18.0+cu121)\n",
89
+ "Collecting timm (from hsemotion)\n",
90
+ " Downloading timm-1.0.3-py3-none-any.whl (2.3 MB)\n",
91
+ "\u001b[2K \u001b[90m━━━━��━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.3/2.3 MB\u001b[0m \u001b[31m12.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
92
+ "\u001b[?25hRequirement already satisfied: decorator<5.0,>=4.0.2 in /usr/local/lib/python3.10/dist-packages (from moviepy) (4.4.2)\n",
93
+ "Requirement already satisfied: tqdm<5.0,>=4.11.2 in /usr/local/lib/python3.10/dist-packages (from moviepy) (4.66.4)\n",
94
+ "Requirement already satisfied: requests<3.0,>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from moviepy) (2.31.0)\n",
95
+ "Requirement already satisfied: proglog<=1.0.0 in /usr/local/lib/python3.10/dist-packages (from moviepy) (0.1.10)\n",
96
+ "Requirement already satisfied: imageio<3.0,>=2.5 in /usr/local/lib/python3.10/dist-packages (from moviepy) (2.31.6)\n",
97
+ "Requirement already satisfied: imageio-ffmpeg>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from moviepy) (0.5.1)\n",
98
+ "Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from gdown>=3.10.1->retina-face) (4.12.3)\n",
99
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from gdown>=3.10.1->retina-face) (3.14.0)\n",
100
+ "Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from imageio-ffmpeg>=0.2.0->moviepy) (67.7.2)\n",
101
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (3.3.2)\n",
102
+ "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (3.7)\n",
103
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (2.0.7)\n",
104
+ "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (2024.6.2)\n",
105
+ "Requirement already satisfied: absl-py>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.4.0)\n",
106
+ "Requirement already satisfied: astunparse>=1.6.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.6.3)\n",
107
+ "Requirement already satisfied: flatbuffers>=23.5.26 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (24.3.25)\n",
108
+ "Requirement already satisfied: gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.5.4)\n",
109
+ "Requirement already satisfied: google-pasta>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.2.0)\n",
110
+ "Requirement already satisfied: h5py>=2.9.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.9.0)\n",
111
+ "Requirement already satisfied: libclang>=13.0.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (18.1.1)\n",
112
+ "Requirement already satisfied: ml-dtypes~=0.2.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.2.0)\n",
113
+ "Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.3.0)\n",
114
+ "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (24.1)\n",
115
+ "Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.20.3)\n",
116
+ "Requirement already satisfied: six>=1.12.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.16.0)\n",
117
+ "Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.4.0)\n",
118
+ "Requirement already satisfied: typing-extensions>=3.6.6 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (4.12.2)\n",
119
+ "Requirement already satisfied: wrapt<1.15,>=1.11.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.14.1)\n",
120
+ "Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.37.0)\n",
121
+ "Requirement already satisfied: grpcio<2.0,>=1.24.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.64.1)\n",
122
+ "Requirement already satisfied: tensorboard<2.16,>=2.15 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.2)\n",
123
+ "Requirement already satisfied: tensorflow-estimator<2.16,>=2.15.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.0)\n",
124
+ "Requirement already satisfied: keras<2.16,>=2.15.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.0)\n",
125
+ "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (6.0.1)\n",
126
+ "Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (0.23.3)\n",
127
+ "Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (0.4.3)\n",
128
+ "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (1.12.1)\n",
129
+ "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (3.3)\n",
130
+ "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (3.1.4)\n",
131
+ "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (2023.6.0)\n",
132
+ "Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch->hsemotion)\n",
133
+ " Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)\n",
134
+ "Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch->hsemotion)\n",
135
+ " Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)\n",
136
+ "Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch->hsemotion)\n",
137
+ " Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)\n",
138
+ "Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch->hsemotion)\n",
139
+ " Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)\n",
140
+ "Collecting nvidia-cublas-cu12==12.1.3.1 (from torch->hsemotion)\n",
141
+ " Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)\n",
142
+ "Collecting nvidia-cufft-cu12==11.0.2.54 (from torch->hsemotion)\n",
143
+ " Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)\n",
144
+ "Collecting nvidia-curand-cu12==10.3.2.106 (from torch->hsemotion)\n",
145
+ " Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)\n",
146
+ "Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch->hsemotion)\n",
147
+ " Using cached nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)\n",
148
+ "Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch->hsemotion)\n",
149
+ " Using cached nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)\n",
150
+ "Collecting nvidia-nccl-cu12==2.20.5 (from torch->hsemotion)\n",
151
+ " Using cached nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)\n",
152
+ "Collecting nvidia-nvtx-cu12==12.1.105 (from torch->hsemotion)\n",
153
+ " Using cached nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)\n",
154
+ "Requirement already satisfied: triton==2.3.0 in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (2.3.0)\n",
155
+ "Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch->hsemotion)\n",
156
+ " Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl (21.3 MB)\n",
157
+ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.3/21.3 MB\u001b[0m \u001b[31m51.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
158
+ "\u001b[?25hRequirement already satisfied: wheel<1.0,>=0.23.0 in /usr/local/lib/python3.10/dist-packages (from astunparse>=1.6.0->tensorflow>=1.9.0->retina-face) (0.43.0)\n",
159
+ "Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (2.27.0)\n",
160
+ "Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (1.2.0)\n",
161
+ "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.6)\n",
162
+ "Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.7.2)\n",
163
+ "Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.0.3)\n",
164
+ "Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->gdown>=3.10.1->retina-face) (2.5)\n",
165
+ "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->hsemotion) (2.1.5)\n",
166
+ "Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (1.7.1)\n",
167
+ "Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->hsemotion) (1.3.0)\n",
168
+ "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (5.3.3)\n",
169
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.4.0)\n",
170
+ "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (4.9)\n",
171
+ "Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (1.3.1)\n",
172
+ "Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.6.0)\n",
173
+ "Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.10/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.2.2)\n",
174
+ "Building wheels for collected packages: hsemotion\n",
175
+ " Building wheel for hsemotion (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
176
+ " Created wheel for hsemotion: filename=hsemotion-0.3.0-py3-none-any.whl size=11244 sha256=bd2bf1b0a08fe9b4e58666996092c52273fc8a8fac2f39229f53350e81e2c12b\n",
177
+ " Stored in directory: /root/.cache/pip/wheels/38/88/e0/3b365122443c2ec55f3e058f2b7ad59df7b5e302c457c4539a\n",
178
+ "Successfully built hsemotion\n",
179
+ "Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12, timm, retina-face, hsemotion\n",
180
+ "Successfully installed hsemotion-0.3.0 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 retina-face-0.0.17 timm-1.0.3\n"
181
+ ]
182
+ }
183
+ ],
184
+ "source": [
185
+ "! pip install retina-face hsemotion moviepy"
186
+ ]
187
+ },
188
+ {
189
+ "cell_type": "code",
190
+ "source": [
191
+ "from moviepy.editor import VideoFileClip, concatenate_videoclips\n",
192
+ "from retinaface import RetinaFace\n",
193
+ "from hsemotion.facial_emotions import HSEmotionRecognizer\n",
194
+ "import cv2\n",
195
+ "import numpy as np\n",
196
+ "import os\n",
197
+ "from google.colab.patches import cv2_imshow # Import cv2_imshow for Colab"
198
+ ],
199
+ "metadata": {
200
+ "id": "Ab06qU5l4p5G"
201
+ },
202
+ "execution_count": 2,
203
+ "outputs": []
204
+ },
205
+ {
206
+ "cell_type": "code",
207
+ "source": [
208
+ "## Initialize recognizer\n",
209
+ "\n",
210
+ "recognizer = HSEmotionRecognizer(model_name='enet_b0_8_best_vgaf', device='cpu')\n",
211
+ "\n",
212
+ "## Face Detection Function\n",
213
+ "\n",
214
+ "def detect_faces(frame):\n",
215
+ " \"\"\" Detect faces in the frame using RetinaFace \"\"\"\n",
216
+ " faces = RetinaFace.detect_faces(frame)\n",
217
+ " if isinstance(faces, dict):\n",
218
+ " face_list = []\n",
219
+ " for key in faces.keys():\n",
220
+ " face = faces[key]\n",
221
+ " facial_area = face['facial_area']\n",
222
+ " face_dict = {\n",
223
+ " 'box': (facial_area[0], facial_area[1], facial_area[2] - facial_area[0], facial_area[3] - facial_area[1])\n",
224
+ " }\n",
225
+ " face_list.append(face_dict)\n",
226
+ " return face_list\n",
227
+ " return []\n",
228
+ "\n",
229
+ "## Annotation Function\n",
230
+ "\n",
231
+ "def annotate_frame(frame, faces):\n",
232
+ " \"\"\" Annotate the frame with recognized emotions using global recognizer \"\"\"\n",
233
+ " for face in faces:\n",
234
+ " x, y, w, h = face['box']\n",
235
+ " face_image = frame[y:y+h, x:x+w] # Extract face region from frame\n",
236
+ " emotion = classify_emotions(face_image)\n",
237
+ " cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)\n",
238
+ " cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)\n",
239
+ "\n",
240
+ "## Emotion Classification Function\n",
241
+ "\n",
242
+ "def classify_emotions(face_image):\n",
243
+ " \"\"\" Classify emotions for the given face image using global recognizer \"\"\"\n",
244
+ " results = recognizer.predict_emotions(face_image)\n",
245
+ " if results:\n",
246
+ " emotion = results[0] # Get the most likely emotion\n",
247
+ " else:\n",
248
+ " emotion = 'Unknown'\n",
249
+ " return emotion\n",
250
+ "\n",
251
+ "## Process Video Frames\n",
252
+ "\n",
253
+ "def process_video_frames(video_path, temp_output_path, frame_skip=5):\n",
254
+ " # Load the video\n",
255
+ " video_clip = VideoFileClip(video_path)\n",
256
+ " fps = video_clip.fps\n",
257
+ "\n",
258
+ " # Initialize output video writer\n",
259
+ " out = cv2.VideoWriter(temp_output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (int(video_clip.size[0]), int(video_clip.size[1])))\n",
260
+ "\n",
261
+ " # Iterate through frames, detect faces, and annotate emotions\n",
262
+ " frame_count = 0\n",
263
+ " for frame in video_clip.iter_frames():\n",
264
+ " if frame_count % frame_skip == 0: # Process every nth frame\n",
265
+ " faces = detect_faces(frame)\n",
266
+ " annotate_frame(frame, faces)\n",
267
+ " frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR) # Convert RGB to BGR for OpenCV\n",
268
+ " out.write(frame)\n",
269
+ " frame_count += 1\n",
270
+ "\n",
271
+ " # Release resources and cleanup\n",
272
+ " out.release()\n",
273
+ " cv2.destroyAllWindows()\n",
274
+ " video_clip.close()\n",
275
+ "\n",
276
+ "## Add Audio to Processed Video\n",
277
+ "\n",
278
+ "def add_audio_to_video(original_video_path, processed_video_path, output_path):\n",
279
+ " try:\n",
280
+ " original_clip = VideoFileClip(original_video_path)\n",
281
+ " processed_clip = VideoFileClip(processed_video_path)\n",
282
+ " final_clip = processed_clip.set_audio(original_clip.audio)\n",
283
+ " final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')\n",
284
+ " except Exception as e:\n",
285
+ " print(f\"Error while combining with audio: {e}\")\n",
286
+ " finally:\n",
287
+ " original_clip.close()\n",
288
+ " processed_clip.close()\n",
289
+ "\n",
290
+ "## Process Video\n",
291
+ "\n",
292
+ "def process_video(video_path, output_path):\n",
293
+ " temp_output_path = 'temp_output_video.mp4'\n",
294
+ "\n",
295
+ " # Process video frames and save to a temporary file\n",
296
+ " process_video_frames(video_path, temp_output_path, frame_skip=5) # Adjust frame_skip as needed\n",
297
+ "\n",
298
+ " # Add audio to the processed video\n",
299
+ " add_audio_to_video(video_path, temp_output_path, output_path)\n",
300
+ "\n",
301
+ "## Process Image\n",
302
+ "\n",
303
+ "def process_image(input_path, output_path):\n",
304
+ " # Step 1: Read input image\n",
305
+ " image = cv2.imread(input_path)\n",
306
+ " if image is None:\n",
307
+ " print(f\"Error: Unable to read image at '{input_path}'\")\n",
308
+ " return\n",
309
+ "\n",
310
+ " # Step 2: Detect faces and annotate emotions\n",
311
+ " faces = detect_faces(image)\n",
312
+ " annotate_frame(image, faces)\n",
313
+ "\n",
314
+ " # Step 3: Write annotated image to output path\n",
315
+ " cv2.imwrite(output_path, image)\n",
316
+ "\n",
317
+ " # Step 4: Combine input and output images horizontally\n",
318
+ " input_image = cv2.imread(input_path)\n",
319
+ " combined_image = cv2.hconcat([input_image, image])\n",
320
+ "\n",
321
+ " # Step 5: Save or display the combined image\n",
322
+ " cv2.imwrite(output_path, combined_image)\n",
323
+ " cv2_imshow(combined_image) # Display combined image in Colab\n",
324
+ "\n"
325
+ ],
326
+ "metadata": {
327
+ "colab": {
328
+ "base_uri": "https://localhost:8080/"
329
+ },
330
+ "id": "eSjUuQBG7Opw",
331
+ "outputId": "bb32e267-67f0-4797-f5ad-72e6ff5ef869"
332
+ },
333
+ "execution_count": 15,
334
+ "outputs": [
335
+ {
336
+ "output_type": "stream",
337
+ "name": "stdout",
338
+ "text": [
339
+ "/root/.hsemotion/enet_b0_8_best_vgaf.pt Compose(\n",
340
+ " Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=True)\n",
341
+ " ToTensor()\n",
342
+ " Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n",
343
+ ")\n"
344
+ ]
345
+ }
346
+ ]
347
+ },
348
+ {
349
+ "cell_type": "markdown",
350
+ "source": [
351
+ "# Time to process the video or image\n",
352
+ "**NOTE : You can use your own data by changing the path**"
353
+ ],
354
+ "metadata": {
355
+ "id": "fQ77JezWHh9y"
356
+ }
357
+ },
358
+ {
359
+ "cell_type": "code",
360
+ "source": [
361
+ "if __name__ == \"__main__\":\n",
362
+ " input_path = '/content/رياكشن عبلة كامل تبكي.mp4' # Update with your video or image path\n",
363
+ " output_path = '/content/رياكشن عبلة كامل تبكي out.mp4' # Update with the desired output path\n",
364
+ "\n",
365
+ " if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):\n",
366
+ " process_video(input_path, output_path)\n",
367
+ " elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):\n",
368
+ " process_image(input_path, output_path)\n",
369
+ " else:\n",
370
+ " print(\"Unsupported file format. Please provide a video or image file.\")"
371
+ ],
372
+ "metadata": {
373
+ "colab": {
374
+ "base_uri": "https://localhost:8080/"
375
+ },
376
+ "id": "HiWL0XT5AONd",
377
+ "outputId": "060281bb-0269-4369-ecce-32f472a3cf0a"
378
+ },
379
+ "execution_count": 16,
380
+ "outputs": [
381
+ {
382
+ "output_type": "stream",
383
+ "name": "stdout",
384
+ "text": [
385
+ "Moviepy - Building video /content/رياكشن عبلة كامل تبكي out.mp4.\n",
386
+ "MoviePy - Writing audio in رياكشن عبلة كامل تبكي outTEMP_MPY_wvf_snd.mp4\n"
387
+ ]
388
+ },
389
+ {
390
+ "output_type": "stream",
391
+ "name": "stderr",
392
+ "text": []
393
+ },
394
+ {
395
+ "output_type": "stream",
396
+ "name": "stdout",
397
+ "text": [
398
+ "MoviePy - Done.\n",
399
+ "Moviepy - Writing video /content/رياكشن عبلة كامل تبكي out.mp4\n",
400
+ "\n"
401
+ ]
402
+ },
403
+ {
404
+ "output_type": "stream",
405
+ "name": "stderr",
406
+ "text": []
407
+ },
408
+ {
409
+ "output_type": "stream",
410
+ "name": "stdout",
411
+ "text": [
412
+ "Moviepy - Done !\n",
413
+ "Moviepy - video ready /content/رياكشن عبلة كامل تبكي out.mp4\n"
414
+ ]
415
+ }
416
+ ]
417
+ },
418
+ {
419
+ "cell_type": "code",
420
+ "source": [
421
+ "if __name__ == \"__main__\":\n",
422
+ " input_path = '/content/mn (2).jpeg' # Update with your video or image path\n",
423
+ " output_path = '/content/mn (2)-out.jpeg' # Update with the desired output path\n",
424
+ "\n",
425
+ " if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):\n",
426
+ " process_video(input_path, output_path)\n",
427
+ " elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):\n",
428
+ " process_image(input_path, output_path)\n",
429
+ " else:\n",
430
+ " print(\"Unsupported file format. Please provide a video or image file.\")"
431
+ ],
432
+ "metadata": {
433
+ "id": "mdllZ7085ZlK"
434
+ },
435
+ "execution_count": null,
436
+ "outputs": []
437
+ }
438
+ ]
439
+ }
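For reference, here is a minimal standalone sketch of the single-image flow the notebook describes (detect faces, classify the emotion, draw the annotation). It mirrors the cells above, uses the same model name, and assumes placeholder paths (`example.jpg`, `example_annotated.jpg`) that you should adapt to your data.

```python
# Condensed sketch of the notebook's single-image flow (placeholder paths).
import cv2
from retinaface import RetinaFace
from hsemotion.facial_emotions import HSEmotionRecognizer

recognizer = HSEmotionRecognizer(model_name='enet_b0_8_best_vgaf', device='cpu')

image = cv2.imread('example.jpg')           # placeholder input path
faces = RetinaFace.detect_faces(image)

if isinstance(faces, dict):                 # RetinaFace returns a dict only when faces are found
    for face in faces.values():
        x1, y1, x2, y2 = face['facial_area']
        # As in the notebook, the first element of the prediction is the emotion label
        emotion = recognizer.predict_emotions(image[y1:y2, x1:x2])[0]
        cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)
        cv2.putText(image, emotion, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)

cv2.imwrite('example_annotated.jpg', image)
```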
app.py ADDED
@@ -0,0 +1,22 @@
+ import gradio as gr
+ from face_emotion_pipeline import process_video, process_image
+
+ def process_file(file, is_video):
+     # Gradio may pass either a file path string or a tempfile-like object with .name
+     input_path = file if isinstance(file, str) else file.name
+     output_path = "output." + ("mp4" if is_video else "png")
+     if is_video:
+         process_video(input_path, output_path)
+     else:
+         process_image(input_path, output_path)
+     return output_path
+
+ # gr.File / gr.Checkbox replace the deprecated gr.inputs.* / gr.outputs.* API
+ iface = gr.Interface(
+     fn=process_file,
+     inputs=[gr.File(label="Upload File"), gr.Checkbox(label="Is Video?")],
+     outputs=gr.File(label="Processed File"),
+     title="Face Emotion Detection",
+     description="Upload an image or video to detect and annotate emotions."
+ )
+
+ if __name__ == "__main__":
+     iface.launch()
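Assuming the requirements are installed, the callback can be smoke-tested locally without launching the browser UI. `FakeUpload` below is a hypothetical stand-in for the uploaded-file object Gradio hands to `process_file`, and `sample.jpg` is a placeholder path.

```python
# Hypothetical smoke test for the Gradio callback (no UI launched).
from collections import namedtuple
from app import process_file

FakeUpload = namedtuple("FakeUpload", ["name"])   # mimics an upload exposing .name
print(process_file(FakeUpload(name="sample.jpg"), is_video=False))  # -> "output.png"
```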
face_emotion_pipeline.py ADDED
@@ -0,0 +1,81 @@
1
+ import cv2
2
+ from moviepy.editor import VideoFileClip
3
+ from retinaface import RetinaFace
4
+ from hsemotion.facial_emotions import HSEmotionRecognizer
5
+
6
+ recognizer = HSEmotionRecognizer(model_name='enet_b0_8_best_vgaf', device='cpu')
7
+
8
+ def detect_faces(frame):
9
+ faces = RetinaFace.detect_faces(frame)
10
+ if isinstance(faces, dict):
11
+ face_list = []
12
+ for key in faces.keys():
13
+ face = faces[key]
14
+ facial_area = face['facial_area']
15
+ face_dict = {
16
+ 'box': (facial_area[0], facial_area[1], facial_area[2] - facial_area[0], facial_area[3] - facial_area[1])
17
+ }
18
+ face_list.append(face_dict)
19
+ return face_list
20
+ return []
21
+
22
+ def annotate_frame(frame, faces):
23
+ for face in faces:
24
+ x, y, w, h = face['box']
25
+ face_image = frame[y:y+h, x:x+w]
26
+ emotion = classify_emotions(face_image)
27
+ cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
28
+ cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)
29
+
30
+ def classify_emotions(face_image):
31
+ results = recognizer.predict_emotions(face_image)
32
+ if results:
33
+ emotion = results[0]
34
+ else:
35
+ emotion = 'Unknown'
36
+ return emotion
37
+
38
+ def process_video_frames(video_path, temp_output_path, frame_skip=5):
39
+ video_clip = VideoFileClip(video_path)
40
+ fps = video_clip.fps
41
+ out = cv2.VideoWriter(temp_output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (int(video_clip.size[0]), int(video_clip.size[1])))
42
+ frame_count = 0
43
+ for frame in video_clip.iter_frames():
44
+ if frame_count % frame_skip == 0:
45
+ faces = detect_faces(frame)
46
+ annotate_frame(frame, faces)
47
+ frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
48
+ out.write(frame)
49
+ frame_count += 1
50
+ out.release()
51
+ cv2.destroyAllWindows()
52
+ video_clip.close()
53
+
54
+ def add_audio_to_video(original_video_path, processed_video_path, output_path):
+     original_clip = None
+     processed_clip = None
+     try:
+         original_clip = VideoFileClip(original_video_path)
+         processed_clip = VideoFileClip(processed_video_path)
+         final_clip = processed_clip.set_audio(original_clip.audio)
+         final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')
+     except Exception as e:
+         print(f"Error while combining with audio: {e}")
+     finally:
+         # Guard against clips that were never opened (e.g. if loading failed)
+         if original_clip is not None:
+             original_clip.close()
+         if processed_clip is not None:
+             processed_clip.close()
65
+
66
+ def process_video(video_path, output_path):
67
+ temp_output_path = 'temp_output_video.mp4'
68
+ process_video_frames(video_path, temp_output_path, frame_skip=5)
69
+ add_audio_to_video(video_path, temp_output_path, output_path)
70
+
71
+ def process_image(input_path, output_path):
+     # Read the input image; bail out early if the path is invalid
+     image = cv2.imread(input_path)
+     if image is None:
+         print(f"Error: Unable to read image at '{input_path}'")
+         return
+     # Detect faces and draw the emotion annotations in place
+     faces = detect_faces(image)
+     annotate_frame(image, faces)
+     # Re-read the untouched original and save it side by side with the annotated copy
+     input_image = cv2.imread(input_path)
+     combined_image = cv2.hconcat([input_image, image])
+     cv2.imwrite(output_path, combined_image)
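A brief usage sketch of the pipeline module, assuming it is imported from the same directory; the file names are placeholders.

```python
# Minimal usage sketch for face_emotion_pipeline (placeholder file names).
from face_emotion_pipeline import process_image, process_video

process_image("group_photo.jpg", "group_photo_annotated.jpg")  # writes a side-by-side comparison
process_video("interview.mp4", "interview_annotated.mp4")      # annotated video with original audio
```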
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ opencv-python-headless
+ moviepy
+ hsemotion
+ gradio
+ retina-face