{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# Video and Image Emotion Annotation\n",
        "\n",
        "This script facilitates the detection of faces and annotation of recognized emotions in both videos and images. It utilizes state-of-the-art deep learning models for face detection and emotion recognition, namely RetinaFace and HSEmotionRecognizer, respectively. The goal is to enhance media content understanding by automatically labeling facial expressions with emotional states.\n",
        "Components:\n",
        "\n",
        "## Face Detection using RetinaFace:\n",
        "     The detect_faces function leverages the RetinaFace model to identify faces within a given frame of video or image data. It retrieves facial bounding boxes, providing precise coordinates for subsequent processing.\n",
        "\n",
        "## Emotion Recognition with HSEmotionRecognizer:\n",
        "     The HSEmotionRecognizer model, initialized as recognizer, interprets emotional states from extracted face regions. It predicts emotions based on learned features from the provided face images.\n",
        "\n",
        "## Annotation and Visualization:\n",
        "     The annotate_frame function annotates each detected face with its recognized emotion. It draws bounding boxes around faces and labels them with the predicted emotional state, enhancing visual understanding of the content.\n",
        "\n",
        "## Processing Pipeline:\n",
        "        Video Processing:\n",
        "        process_video_frames: Iterates through frames of a video, applying face detection and emotion annotation. It saves the processed frames into a temporary video file.\n",
        "        add_audio_to_video: Incorporates audio from the original video back into the processed frames, creating a final annotated video output.\n",
        "        process_video: Integrates frame processing and audio addition into a cohesive function for video processing tasks.\n",
        "        Image Processing:\n",
        "        process_image: Handles single images by detecting faces, annotating emotions, and optionally combining input and annotated images for visualization.\n",
        "\n",
        "# Usage:\n",
        "\n",
        "    Video Processing: Provide paths to video files (*.mp4, *.avi, *.mov, *.mkv) to analyze and annotate facial expressions throughout the video duration.\n",
        "    Image Processing: For static images (*.jpg, *.jpeg, *.png), the script detects faces, predicts emotions, and optionally displays the original and annotated images side by side.\n"
      ],
      "metadata": {
        "id": "uVIVXD0L9CLi"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Setup\n",
        "install the required libraries:"
      ],
      "metadata": {
        "id": "H5vMPITJIVyT"
      }
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "XfbBa45h4i-Q",
        "outputId": "897c0f6c-2622-4143-beff-0916c33b62cb"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Collecting retina-face\n",
            "  Downloading retina_face-0.0.17-py3-none-any.whl (25 kB)\n",
            "Collecting hsemotion\n",
            "  Downloading hsemotion-0.3.0.tar.gz (8.0 kB)\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Requirement already satisfied: moviepy in /usr/local/lib/python3.10/dist-packages (1.0.3)\n",
            "Requirement already satisfied: numpy>=1.14.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (1.25.2)\n",
            "Requirement already satisfied: gdown>=3.10.1 in /usr/local/lib/python3.10/dist-packages (from retina-face) (5.1.0)\n",
            "Requirement already satisfied: Pillow>=5.2.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (9.4.0)\n",
            "Requirement already satisfied: opencv-python>=3.4.4 in /usr/local/lib/python3.10/dist-packages (from retina-face) (4.8.0.76)\n",
            "Requirement already satisfied: tensorflow>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from retina-face) (2.15.0)\n",
            "Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from hsemotion) (2.3.0+cu121)\n",
            "Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from hsemotion) (0.18.0+cu121)\n",
            "Collecting timm (from hsemotion)\n",
            "  Downloading timm-1.0.3-py3-none-any.whl (2.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.3/2.3 MB\u001b[0m \u001b[31m12.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: decorator<5.0,>=4.0.2 in /usr/local/lib/python3.10/dist-packages (from moviepy) (4.4.2)\n",
            "Requirement already satisfied: tqdm<5.0,>=4.11.2 in /usr/local/lib/python3.10/dist-packages (from moviepy) (4.66.4)\n",
            "Requirement already satisfied: requests<3.0,>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from moviepy) (2.31.0)\n",
            "Requirement already satisfied: proglog<=1.0.0 in /usr/local/lib/python3.10/dist-packages (from moviepy) (0.1.10)\n",
            "Requirement already satisfied: imageio<3.0,>=2.5 in /usr/local/lib/python3.10/dist-packages (from moviepy) (2.31.6)\n",
            "Requirement already satisfied: imageio-ffmpeg>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from moviepy) (0.5.1)\n",
            "Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from gdown>=3.10.1->retina-face) (4.12.3)\n",
            "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from gdown>=3.10.1->retina-face) (3.14.0)\n",
            "Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from imageio-ffmpeg>=0.2.0->moviepy) (67.7.2)\n",
            "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (3.3.2)\n",
            "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (3.7)\n",
            "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (2.0.7)\n",
            "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (2024.6.2)\n",
            "Requirement already satisfied: absl-py>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.4.0)\n",
            "Requirement already satisfied: astunparse>=1.6.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.6.3)\n",
            "Requirement already satisfied: flatbuffers>=23.5.26 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (24.3.25)\n",
            "Requirement already satisfied: gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.5.4)\n",
            "Requirement already satisfied: google-pasta>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.2.0)\n",
            "Requirement already satisfied: h5py>=2.9.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.9.0)\n",
            "Requirement already satisfied: libclang>=13.0.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (18.1.1)\n",
            "Requirement already satisfied: ml-dtypes~=0.2.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.2.0)\n",
            "Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.3.0)\n",
            "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (24.1)\n",
            "Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (3.20.3)\n",
            "Requirement already satisfied: six>=1.12.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.16.0)\n",
            "Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.4.0)\n",
            "Requirement already satisfied: typing-extensions>=3.6.6 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (4.12.2)\n",
            "Requirement already satisfied: wrapt<1.15,>=1.11.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.14.1)\n",
            "Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (0.37.0)\n",
            "Requirement already satisfied: grpcio<2.0,>=1.24.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (1.64.1)\n",
            "Requirement already satisfied: tensorboard<2.16,>=2.15 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.2)\n",
            "Requirement already satisfied: tensorflow-estimator<2.16,>=2.15.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.0)\n",
            "Requirement already satisfied: keras<2.16,>=2.15.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow>=1.9.0->retina-face) (2.15.0)\n",
            "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (6.0.1)\n",
            "Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (0.23.3)\n",
            "Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from timm->hsemotion) (0.4.3)\n",
            "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (1.12.1)\n",
            "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (3.3)\n",
            "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (3.1.4)\n",
            "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (2023.6.0)\n",
            "Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch->hsemotion)\n",
            "  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)\n",
            "Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch->hsemotion)\n",
            "  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)\n",
            "Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch->hsemotion)\n",
            "  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)\n",
            "Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch->hsemotion)\n",
            "  Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)\n",
            "Collecting nvidia-cublas-cu12==12.1.3.1 (from torch->hsemotion)\n",
            "  Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)\n",
            "Collecting nvidia-cufft-cu12==11.0.2.54 (from torch->hsemotion)\n",
            "  Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)\n",
            "Collecting nvidia-curand-cu12==10.3.2.106 (from torch->hsemotion)\n",
            "  Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)\n",
            "Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch->hsemotion)\n",
            "  Using cached nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)\n",
            "Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch->hsemotion)\n",
            "  Using cached nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)\n",
            "Collecting nvidia-nccl-cu12==2.20.5 (from torch->hsemotion)\n",
            "  Using cached nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)\n",
            "Collecting nvidia-nvtx-cu12==12.1.105 (from torch->hsemotion)\n",
            "  Using cached nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)\n",
            "Requirement already satisfied: triton==2.3.0 in /usr/local/lib/python3.10/dist-packages (from torch->hsemotion) (2.3.0)\n",
            "Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch->hsemotion)\n",
            "  Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl (21.3 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.3/21.3 MB\u001b[0m \u001b[31m51.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: wheel<1.0,>=0.23.0 in /usr/local/lib/python3.10/dist-packages (from astunparse>=1.6.0->tensorflow>=1.9.0->retina-face) (0.43.0)\n",
            "Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (2.27.0)\n",
            "Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (1.2.0)\n",
            "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.6)\n",
            "Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.7.2)\n",
            "Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.0.3)\n",
            "Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->gdown>=3.10.1->retina-face) (2.5)\n",
            "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->hsemotion) (2.1.5)\n",
            "Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.10/dist-packages (from requests<3.0,>=2.8.1->moviepy) (1.7.1)\n",
            "Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->hsemotion) (1.3.0)\n",
            "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (5.3.3)\n",
            "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.4.0)\n",
            "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (4.9)\n",
            "Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (1.3.1)\n",
            "Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (0.6.0)\n",
            "Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.10/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow>=1.9.0->retina-face) (3.2.2)\n",
            "Building wheels for collected packages: hsemotion\n",
            "  Building wheel for hsemotion (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for hsemotion: filename=hsemotion-0.3.0-py3-none-any.whl size=11244 sha256=bd2bf1b0a08fe9b4e58666996092c52273fc8a8fac2f39229f53350e81e2c12b\n",
            "  Stored in directory: /root/.cache/pip/wheels/38/88/e0/3b365122443c2ec55f3e058f2b7ad59df7b5e302c457c4539a\n",
            "Successfully built hsemotion\n",
            "Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12, timm, retina-face, hsemotion\n",
            "Successfully installed hsemotion-0.3.0 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 retina-face-0.0.17 timm-1.0.3\n"
          ]
        }
      ],
      "source": [
        "! pip install retina-face hsemotion moviepy"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "from moviepy.editor import VideoFileClip, concatenate_videoclips\n",
        "from retinaface import RetinaFace\n",
        "from hsemotion.facial_emotions import HSEmotionRecognizer\n",
        "import cv2\n",
        "import numpy as np\n",
        "import os\n",
        "from google.colab.patches import cv2_imshow  # Import cv2_imshow for Colab"
      ],
      "metadata": {
        "id": "Ab06qU5l4p5G"
      },
      "execution_count": 2,
      "outputs": []
    },
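    {
      "cell_type": "markdown",
      "source": [
        "### Optional: pick a device\n",
        "\n",
        "The recognizer in the next cell is created on the CPU. If your Colab runtime has a GPU, you could pass `device='cuda'` to `HSEmotionRecognizer` instead. A minimal sketch (assuming `torch` is available, which the install log above confirms as an hsemotion dependency; `device` is a hypothetical helper variable):"
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "import torch\n",
        "\n",
        "# Hypothetical helper: choose 'cuda' when a GPU is available, else fall back to CPU.\n",
        "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
        "print(device)  # pass this as device=... to HSEmotionRecognizer if desired"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },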
    {
      "cell_type": "code",
      "source": [
        "## Initialize recognizer\n",
        "\n",
        "recognizer = HSEmotionRecognizer(model_name='enet_b0_8_best_vgaf', device='cpu')\n",
        "\n",
        "## Face Detection Function\n",
        "\n",
        "def detect_faces(frame):\n",
        "    \"\"\" Detect faces in the frame using RetinaFace \"\"\"\n",
        "    faces = RetinaFace.detect_faces(frame)\n",
        "    if isinstance(faces, dict):\n",
        "        face_list = []\n",
        "        for key in faces.keys():\n",
        "            face = faces[key]\n",
        "            facial_area = face['facial_area']\n",
        "            face_dict = {\n",
        "                'box': (facial_area[0], facial_area[1], facial_area[2] - facial_area[0], facial_area[3] - facial_area[1])\n",
        "            }\n",
        "            face_list.append(face_dict)\n",
        "        return face_list\n",
        "    return []\n",
        "\n",
        "## Annotation Function\n",
        "\n",
        "def annotate_frame(frame, faces):\n",
        "    \"\"\" Annotate the frame with recognized emotions using global recognizer \"\"\"\n",
        "    for face in faces:\n",
        "        x, y, w, h = face['box']\n",
        "        face_image = frame[y:y+h, x:x+w]  # Extract face region from frame\n",
        "        emotion = classify_emotions(face_image)\n",
        "        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)\n",
        "        cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)\n",
        "\n",
        "## Emotion Classification Function\n",
        "\n",
        "def classify_emotions(face_image):\n",
        "    \"\"\" Classify emotions for the given face image using global recognizer \"\"\"\n",
        "    results = recognizer.predict_emotions(face_image)\n",
        "    if results:\n",
        "        emotion = results[0]  # Get the most likely emotion\n",
        "    else:\n",
        "        emotion = 'Unknown'\n",
        "    return emotion\n",
        "\n",
        "## Process Video Frames\n",
        "\n",
        "def process_video_frames(video_path, temp_output_path, frame_skip=5):\n",
        "    # Load the video\n",
        "    video_clip = VideoFileClip(video_path)\n",
        "    fps = video_clip.fps\n",
        "\n",
        "    # Initialize output video writer\n",
        "    out = cv2.VideoWriter(temp_output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (int(video_clip.size[0]), int(video_clip.size[1])))\n",
        "\n",
        "    # Iterate through frames, detect faces, and annotate emotions\n",
        "    frame_count = 0\n",
        "    for frame in video_clip.iter_frames():\n",
        "        if frame_count % frame_skip == 0:  # Process every nth frame\n",
        "            faces = detect_faces(frame)\n",
        "            annotate_frame(frame, faces)\n",
        "        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  # Convert RGB to BGR for OpenCV\n",
        "        out.write(frame)\n",
        "        frame_count += 1\n",
        "\n",
        "    # Release resources and cleanup\n",
        "    out.release()\n",
        "    cv2.destroyAllWindows()\n",
        "    video_clip.close()\n",
        "\n",
        "## Add Audio to Processed Video\n",
        "\n",
        "def add_audio_to_video(original_video_path, processed_video_path, output_path):\n",
        "    try:\n",
        "        original_clip = VideoFileClip(original_video_path)\n",
        "        processed_clip = VideoFileClip(processed_video_path)\n",
        "        final_clip = processed_clip.set_audio(original_clip.audio)\n",
        "        final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')\n",
        "    except Exception as e:\n",
        "        print(f\"Error while combining with audio: {e}\")\n",
        "    finally:\n",
        "        original_clip.close()\n",
        "        processed_clip.close()\n",
        "\n",
        "## Process Video\n",
        "\n",
        "def process_video(video_path, output_path):\n",
        "    temp_output_path = 'temp_output_video.mp4'\n",
        "\n",
        "    # Process video frames and save to a temporary file\n",
        "    process_video_frames(video_path, temp_output_path, frame_skip=5)  # Adjust frame_skip as needed\n",
        "\n",
        "    # Add audio to the processed video\n",
        "    add_audio_to_video(video_path, temp_output_path, output_path)\n",
        "\n",
        "## Process Image\n",
        "\n",
        "def process_image(input_path, output_path):\n",
        "    # Step 1: Read input image\n",
        "    image = cv2.imread(input_path)\n",
        "    if image is None:\n",
        "        print(f\"Error: Unable to read image at '{input_path}'\")\n",
        "        return\n",
        "\n",
        "    # Step 2: Detect faces and annotate emotions\n",
        "    faces = detect_faces(image)\n",
        "    annotate_frame(image, faces)\n",
        "\n",
        "    # Step 3: Write annotated image to output path\n",
        "    cv2.imwrite(output_path, image)\n",
        "\n",
        "    # Step 4: Combine input and output images horizontally\n",
        "    input_image = cv2.imread(input_path)\n",
        "    combined_image = cv2.hconcat([input_image, image])\n",
        "\n",
        "    # Step 5: Save or display the combined image\n",
        "    cv2.imwrite(output_path, combined_image)\n",
        "    cv2_imshow(combined_image)  # Display combined image in Colab\n",
        "\n"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "eSjUuQBG7Opw",
        "outputId": "bb32e267-67f0-4797-f5ad-72e6ff5ef869"
      },
      "execution_count": 15,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "/root/.hsemotion/enet_b0_8_best_vgaf.pt Compose(\n",
            "    Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=True)\n",
            "    ToTensor()\n",
            "    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n",
            ")\n"
          ]
        }
      ]
    },
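    {
      "cell_type": "markdown",
      "source": [
        "### Optional: quick sanity check\n",
        "\n",
        "A minimal smoke test of the helpers above, using a synthetic frame (`blank` is a hypothetical stand-in for illustration, not part of the pipeline): a uniform gray image contains no faces, so `detect_faces` should return an empty list."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Hypothetical smoke test: a blank gray frame should contain no faces,\n",
        "# so detect_faces is expected to return an empty list.\n",
        "blank = np.full((480, 640, 3), 128, dtype=np.uint8)\n",
        "print(detect_faces(blank))  # expected: []"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },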
    {
      "cell_type": "markdown",
      "source": [
        "# Time to process the video or image\n",
        "**NOTE : You can use your own data by changing the path**"
      ],
      "metadata": {
        "id": "fQ77JezWHh9y"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "if __name__ == \"__main__\":\n",
        "    input_path = '/content/Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ.mp4'  # Update with your video or image path\n",
        "    output_path = '/content/Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ out.mp4'  # Update with the desired output path\n",
        "\n",
        "    if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):\n",
        "        process_video(input_path, output_path)\n",
        "    elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):\n",
        "        process_image(input_path, output_path)\n",
        "    else:\n",
        "        print(\"Unsupported file format. Please provide a video or image file.\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "HiWL0XT5AONd",
        "outputId": "060281bb-0269-4369-ecce-32f472a3cf0a"
      },
      "execution_count": 16,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Moviepy - Building video /content/Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ out.mp4.\n",
            "MoviePy - Writing audio in Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ outTEMP_MPY_wvf_snd.mp4\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": []
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "MoviePy - Done.\n",
            "Moviepy - Writing video /content/Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ out.mp4\n",
            "\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": []
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Moviepy - Done !\n",
            "Moviepy - video ready /content/Ψ±ΩŠΨ§ΩƒΨ΄Ω† ΨΉΨ¨Ω„Ψ© ΩƒΨ§Ω…Ω„ ΨͺΨ¨ΩƒΩŠ out.mp4\n"
          ]
        }
      ]
    },
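    {
      "cell_type": "markdown",
      "source": [
        "### Optional: download the result\n",
        "\n",
        "If you are running in Colab, you can pull the annotated video down to your machine with the snippet below (it assumes `output_path` from the cell above is still defined):"
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "from google.colab import files\n",
        "\n",
        "files.download(output_path)  # downloads the annotated video from the Colab VM"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },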
    {
      "cell_type": "code",
      "source": [
        "if __name__ == \"__main__\":\n",
        "    input_path = '/content/mn (2).jpeg'  # Update with your video or image path\n",
        "    output_path = '/content/mn (2)-out.jpeg'  # Update with the desired output path\n",
        "\n",
        "    if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):\n",
        "        process_video(input_path, output_path)\n",
        "    elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):\n",
        "        process_image(input_path, output_path)\n",
        "    else:\n",
        "        print(\"Unsupported file format. Please provide a video or image file.\")"
      ],
      "metadata": {
        "id": "mdllZ7085ZlK"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}