---
language:
- en
library_name: keras
pipeline_tag: image-classification
---

# Model Card for Emotion Classification

This model classifies facial expressions into one of seven emotion categories: anger, happiness, sadness, fear, surprise, disgust, and neutral.

## Model Details

Dataset:

| Split | Happy  | Angry | Disgust | Sad   | Neutral | Fear  | Surprise |
|-------|--------|-------|---------|-------|---------|-------|----------|
| Train | 14,379 | 7,988 | 872     | 9,768 | 9,947   | 8,200 | 6,376    |
| Test  | 3,599  | 1,918 | 222     | 2,386 | 2,449   | 2,042 | 1,628    |
| Val   | 2,880  | 1,600 | 172     | 1,954 | 1,990   | 1,640 | 1,628    |

Model:

1. Transfer learning using MobileNetV2 with two additional Dense layers and an output layer with a softmax activation function (a minimal sketch of this setup appears after the Preprocessing section below).
2. Class weights were used to adjust for class imbalance.
3. Total params: 3,675,823
4. Trainable params: 136,839
5. Accuracy: 0.823 | Precision: 0.825 | Recall: 0.823 | F1: 0.821

## Room for Improvement

This model was created with extremely limited hardware-acceleration (GPU) resources. It is therefore highly likely that evaluation metrics surpassing the 95% mark can be achieved in the following ways:

1. MobileNetV2 was used for its fast inference and low latency, but with more resources a more suitable base model might be found.
2. Data augmentation, to better correct for class imbalance.
3. Learning rate decay, to keep training for longer (with a lower LR) after nearing a local minimum (approx. 60 epochs).
4. Error analysis.

## Uses

Cannot be used for commercial purposes in the EU.

### Direct Use

Combine with the OpenCV Haar cascade classifier for face detection, as in the script below.

## How to Get Started with the Model

Use the script below to run the model locally on your device's camera:

```python
import cv2
import numpy as np
import tensorflow as tf


def display_emotion(frame, model):
    """Detect faces in a frame, classify each one, and annotate the frame."""
    font = cv2.FONT_HERSHEY_SIMPLEX
    text_color = (0, 0, 255)
    class_labels = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)

    if len(faces) == 0:
        print("Face not detected...")
        return frame

    for (x, y, w, h) in faces:
        face_roi = frame[y:y+h, x:x+w]

        # MobileNetV2 expects 224x224 inputs; add a batch dimension for predict().
        resized_image = cv2.resize(face_roi, (224, 224))
        final_image = np.expand_dims(resized_image, axis=0)

        predictions = model.predict(final_image)
        predicted_label = class_labels[np.argmax(predictions)]

        # Green box around the detected face
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
        # Black background rectangle behind the label text
        cv2.rectangle(frame, (x, y), (x+w, y-25), (0, 0, 0), -1)
        cv2.putText(frame, predicted_label, (x, y-10), font, 0.7, text_color, 2)

    return frame


def main():
    model = tf.keras.models.load_model('emotion_detection.keras')

    # Try an external camera first, then fall back to the default one.
    cap = cv2.VideoCapture(1)
    if not cap.isOpened():
        cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        raise IOError("Cannot open webcam")

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame = display_emotion(frame, model)
        cv2.imshow('Facial Expression Recognition', frame)
        if cv2.waitKey(2) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```

#### Preprocessing

MobileNetV2 receives image inputs of size (224, 224).
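The script above passes raw pixel values straight to `model.predict`. If the saved model does not already include a preprocessing layer, inputs may additionally need MobileNetV2's [-1, 1] scaling, and possibly a BGR-to-RGB conversion, since OpenCV reads frames as BGR. A minimal sketch under those assumptions:

```python
import cv2
import numpy as np
import tensorflow as tf

# Hypothetical face crop taken from the script above (OpenCV frames are BGR).
face_roi = np.zeros((180, 160, 3), dtype=np.uint8)

# Resize to MobileNetV2's expected 224x224 input and add a batch axis.
resized = cv2.resize(face_roi, (224, 224))
rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)  # assumption: training used RGB
batch = np.expand_dims(rgb.astype(np.float32), axis=0)

# Assumption: [-1, 1] scaling is needed only if it was not already baked
# into the exported model as a preprocessing layer.
batch = tf.keras.applications.mobilenet_v2.preprocess_input(batch)
```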
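For reference, here is a minimal sketch of the transfer-learning setup described under Model Details: a frozen MobileNetV2 base, two additional Dense layers, a 7-way softmax head, and class weights to counter the imbalance. The Dense widths, optimizer, loss, and class-weight formula are assumptions; the card does not specify them.

```python
import tensorflow as tf

# Frozen MobileNetV2 base (assumption: include_top=False with global average
# pooling, as is typical for transfer learning).
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),   # width is an assumption
    tf.keras.layers.Dense(64, activation='relu'),    # width is an assumption
    tf.keras.layers.Dense(7, activation='softmax'),  # seven emotion classes
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# One common choice of class weights: inversely proportional to the training
# counts listed above, so underrepresented classes (e.g. disgust) contribute
# more to the loss.
train_counts = {'angry': 7988, 'disgust': 872, 'fear': 8200, 'happy': 14379,
                'neutral': 9947, 'sad': 9768, 'surprise': 6376}
total = sum(train_counts.values())
class_weight = {i: total / (len(train_counts) * n)
                for i, n in enumerate(train_counts.values())}

# Hypothetical training call; train_ds and val_ds are stand-ins for the
# actual datasets, which the card does not publish loaders for.
# model.fit(train_ds, validation_data=val_ds, epochs=60, class_weight=class_weight)
```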
#### Speeds, Sizes, Times

- Latency (local demo, no GPU): 39 ms/step

## Model Card Authors

Ronny Nehme