# Model Card for Emotion Detection
This model classifies facial expressions into one of seven categories: angry, disgust, fear, happy, neutral, sad, and surprise.
## Model Details

### Dataset

| Split | Happy  | Angry | Disgust | Sad   | Neutral | Fear  | Surprise |
|-------|--------|-------|---------|-------|---------|-------|----------|
| Train | 14,379 | 7,988 | 872     | 9,768 | 9,947   | 8,200 | 6,376    |
| Test  | 3,599  | 1,918 | 222     | 2,386 | 2,449   | 2,042 | 1,628    |
| Val   | 2,880  | 1,600 | 172     | 1,954 | 1,990   | 1,640 | 1,628    |
### Model

- Transfer learning using MobileNetV2 with two additional Dense layers and a softmax output layer.
- Class weights were applied during training to adjust for class imbalance (see the sketch after this list).
- Total params: 3,675,823
- Trainable params: 136,839
- Accuracy: 0.823 | Precision: 0.825 | Recall: 0.823 | F1: 0.821
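
A minimal Keras sketch of this setup, assuming a frozen MobileNetV2 base and hypothetical head sizes (128 and 64 units; the exact sizes are not listed here, so parameter counts will differ). The class weights are derived from the training counts in the table above:

```python
import numpy as np
import tensorflow as tf

# Training counts from the dataset table above, in alphabetical label order.
train_counts = {'angry': 7988, 'disgust': 872, 'fear': 8200, 'happy': 14379,
                'neutral': 9947, 'sad': 9768, 'surprise': 6376}
counts = np.array(list(train_counts.values()), dtype=np.float32)

# Inverse-frequency class weights: total / (n_classes * count).
class_weight = {i: counts.sum() / (len(counts) * c)
                for i, c in enumerate(counts)}

# Frozen MobileNetV2 base with a small classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling='avg')
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation='relu'),  # hypothetical size
    tf.keras.layers.Dense(64, activation='relu'),   # hypothetical size
    tf.keras.layers.Dense(7, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# model.fit(train_ds, validation_data=val_ds, class_weight=class_weight)
```

Weighting by inverse class frequency makes the rare disgust class contribute roughly as much to the loss as the common happy class during training.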
### Room for Improvement

This model was created with extremely limited hardware-acceleration (GPU) resources. It is therefore highly likely that evaluation metrics surpassing the 95% mark could be achieved in the following ways:
- MobileNetV2 was used for its fast inference and low latency, but with more resources a more suitable base model could likely be found.
- Data augmentation to further correct for class imbalance.
- Learning rate decay to train for longer (at a lower learning rate) after nearing a local minimum (approximately 60 epochs); see the sketch after this list.
- Error analysis to identify systematic misclassifications.
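
As one possible realization of the learning-rate point, a sketch using Keras's `ReduceLROnPlateau` callback; the schedule values here are illustrative, not the ones used in training:

```python
import tensorflow as tf

# Illustrative values only: halve the learning rate when validation loss
# plateaus, so training can continue productively past ~60 epochs.
lr_decay = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6)

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=10, restore_best_weights=True)

# model.fit(train_ds, validation_data=val_ds, epochs=120,
#           callbacks=[lr_decay, early_stop])
```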
## Uses
Cannot be used for commercial purposes in the EU.
### Direct Use
Combine with the OpenCV Haar cascade classifier for face detection.
## How to Get Started with the Model
Use the script below to run the model locally against your device's camera:
```python
import cv2
import numpy as np
import tensorflow as tf

# Load the Haar cascade once rather than on every frame.
FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)

CLASS_LABELS = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']


def display_emotion(frame, model):
    font = cv2.FONT_HERSHEY_SIMPLEX
    text_color = (0, 0, 255)

    # Detect faces on the grayscale frame.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

    if len(faces) == 0:
        print("Face not detected...")

    for (x, y, w, h) in faces:
        # Crop the detected face and resize it to the model's input shape.
        face_roi = frame[y:y+h, x:x+w]
        resized_image = cv2.resize(face_roi, (224, 224))
        final_image = np.expand_dims(resized_image, axis=0)

        # NOTE: replicate any pixel preprocessing used at training time here.
        predictions = model.predict(final_image)
        predicted_label = CLASS_LABELS[np.argmax(predictions)]

        # Draw the bounding box, a filled label background, and the label.
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)  # Green square
        cv2.rectangle(frame, (x, y), (x+w, y-25), (0, 0, 0), -1)  # Black background
        cv2.putText(frame, predicted_label, (x, y-10), font, 0.7, text_color, 2)

    return frame


def main():
    model = tf.keras.models.load_model('emotion_detection.keras')

    # Try an external camera first, then fall back to the default one.
    cap = cv2.VideoCapture(1)
    if not cap.isOpened():
        cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        raise IOError("Cannot open webcam")

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame = display_emotion(frame, model)
        cv2.imshow('Facial Expression Recognition', frame)
        # Press 'q' to quit.
        if cv2.waitKey(2) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```
## Preprocessing
MobileNetV2 receives image inputs of size (224, 224).
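
A minimal sketch of preparing a single still image to match that input shape (the file name is hypothetical, and any pixel preprocessing applied at training time should be replicated):

```python
import cv2
import numpy as np

img = cv2.imread('face.jpg')        # hypothetical input image
img = cv2.resize(img, (224, 224))   # MobileNetV2 input size
batch = np.expand_dims(img, axis=0) # shape (1, 224, 224, 3)
# predictions = model.predict(batch)
```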
## Speeds, Sizes, Times
Latency (local demo, no GPU): 39 ms/step
## Model Card Authors
Ronny Nehme