EfficientNet-B5 Pose Estimation

A 2D human pose estimation model trained at DeKUT-DSAIL using the MMPose framework. It predicts 17 COCO keypoints from a single cropped person image.

Property Value
Backbone EfficientNet-B5
Attention Neck None
Parameters ~40 M
Input Size 192 × 256 (width × height)
Output Heatmaps (17, 64, 48)

Evaluation Results

Evaluated on COCO 2017 val with OKS-based metrics, in a top-down setting using ground-truth bounding boxes.

Metric Score
COCO AP 0.713
COCO AR 0.748
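
For readers unfamiliar with OKS (Object Keypoint Similarity), the sketch below shows how it is computed for one person under the standard COCO keypoint evaluation definition. The function name `oks` and its argument layout are illustrative and not part of this repository; COCO AP is then obtained by thresholding OKS at values from 0.50 to 0.95 and averaging precision.

```python
import math

# Per-keypoint falloff constants used by the COCO keypoint evaluation,
# listed in COCO keypoint order (nose, eyes, ears, shoulders, ...).
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, visibility, area):
    """Object Keypoint Similarity for one person (illustrative helper).

    pred, gt:    17 (x, y) pairs in pixels
    visibility:  17 COCO visibility flags (0 = not labeled)
    area:        ground-truth object area in square pixels
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, COCO_SIGMAS):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2   # squared pixel distance
        k2 = (2 * sigma) ** 2                  # per-keypoint constant, squared
        total += math.exp(-d2 / (2 * area * k2))
        labeled += 1
    return total / labeled if labeled else 0.0
```

A perfect prediction yields an OKS of 1.0, and the score decays towards 0 as keypoints drift away from the ground truth relative to the person's size.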

Repository Files

model.safetensors      # Model weights (safetensors format)
model.py               # Self-contained PoseEstimator inference helper
requirements.txt       # Python dependencies
pose.jpg               # Example test image
README.md              # This model card

Quick Start

Step 1: Clone the repository

git clone https://huggingface.co/DeKUT-DSAIL/efficientnet_b5_coco_256x192
cd efficientnet_b5_coco_256x192

Step 2: Create a virtual environment

Linux / macOS

python -m venv venv
source venv/bin/activate

Windows (Command Prompt)

python -m venv venv
venv\Scripts\activate.bat

Windows (PowerShell)

python -m venv venv
venv\Scripts\Activate.ps1

Step 3: Install dependencies

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt

GPU users: replace the CPU index URL above with the one that matches your CUDA version. See pytorch.org/get-started.

Step 4: Run inference

import cv2
from model import PoseEstimator

estimator = PoseEstimator("DeKUT-DSAIL/efficientnet_b5_coco_256x192")

image = cv2.imread("pose.jpg")
keypoints, scores = estimator.predict(image)

print("Keypoints shape:", keypoints.shape)  # (N, 17, 2)
print("Scores shape:   ", scores.shape)     # (N, 17, 1)

annotated = estimator.visualize(image, keypoints, scores, score_threshold=0.3)
cv2.imwrite("output.jpg", annotated)
print("Saved output.jpg")

Input / Output Specification

Property Value
Input size (1, 3, 256, 192), RGB, channel-first
Normalisation Mean [0.485, 0.456, 0.406] / Std [0.229, 0.224, 0.225]
Output Heatmaps (N, 17, 64, 48)
Keypoints COCO 17-joint format
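
The specification above can be sketched in NumPy as follows. This is a minimal illustration, not the code in `model.py`: the helper names `preprocess` and `decode_heatmaps` are hypothetical, the mean/std values are assumed to apply to pixel values scaled to [0, 1], and the decoding is a plain argmax, whereas `PoseEstimator` may use a more refined sub-pixel scheme.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb_crop):
    """Turn a (256, 192, 3) uint8 RGB crop into a (1, 3, 256, 192) float32 batch."""
    x = rgb_crop.astype(np.float32) / 255.0  # assumption: mean/std are on a [0, 1] scale
    x = (x - MEAN) / STD                     # per-channel normalisation
    return x.transpose(2, 0, 1)[None]        # HWC -> CHW, then add the batch axis


def decode_heatmaps(heatmaps):
    """Argmax-decode (N, 17, 64, 48) heatmaps into input-pixel coordinates.

    Returns keypoints of shape (N, 17, 2) as (x, y) and scores of shape (N, 17, 1).
    """
    n, k, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=2)                # flat index of each heatmap peak
    scores = flat.max(axis=2)[..., None]     # peak value used as confidence
    xs, ys = idx % w, idx // w               # back to column / row
    stride = 256 / h                         # 4.0 for a 64-row heatmap
    keypoints = np.stack([xs, ys], axis=-1).astype(np.float32) * stride
    return keypoints, scores
```

Coordinates returned this way are relative to the 192 × 256 crop; mapping them back to the original image requires the bounding box used for cropping.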

COCO 17 Keypoints

Index Name Index Name
0 nose 9 left_wrist
1 left_eye 10 right_wrist
2 right_eye 11 left_hip
3 left_ear 12 right_hip
4 right_ear 13 left_knee
5 left_shoulder 14 right_knee
6 right_shoulder 15 left_ankle
7 left_elbow 16 right_ankle
8 right_elbow
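
The index table above can be turned into a lookup for labelling model output. The sketch below is illustrative: `name_keypoints` is a hypothetical helper, not part of `model.py`, and it expects the scores for one person as a flat sequence of 17 values.

```python
# Keypoint names in COCO index order, as listed in the table above.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(person_keypoints, person_scores, score_threshold=0.3):
    """Map one person's 17 (x, y) pairs to {name: (x, y)}, dropping low scores."""
    named = {}
    for name, (x, y), score in zip(COCO_KEYPOINTS, person_keypoints, person_scores):
        if float(score) >= score_threshold:
            named[name] = (float(x), float(y))
    return named
```

With the arrays from Step 4, the first detected person could be labelled with `name_keypoints(keypoints[0], scores[0, :, 0])`.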

Training Details

Trained using MMPose on the following datasets:

Dataset Link
COCO 2017 cocodataset.org
MPII Human Pose mpii.is.tue.mpg.de
CrowdPose GitHub
OCHuman GitHub

Parameter Value
Optimizer AdamW
Learning rate 1 × 10⁻³
LR schedule Multi-step decay
Batch size 64
Epochs 210
Input size 192 × 256 (width × height)
Loss MSE on heatmaps + knowledge distillation loss
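
To illustrate what multi-step decay means for the learning rate over the 210 epochs, here is a small sketch. The base rate comes from the table above, but the milestone epochs and decay factor are hypothetical placeholders: this model card does not state them.

```python
def multi_step_lr(epoch, base_lr=1e-3, milestones=(170, 200), gamma=0.1):
    """Learning rate at a given epoch under multi-step decay.

    milestones and gamma are hypothetical placeholders; the actual
    values used for this model are not documented here.
    """
    drops = sum(1 for m in milestones if epoch >= m)  # milestones already passed
    return base_lr * gamma ** drops
```

Under these assumed values the rate stays at 1 × 10⁻³ until the first milestone and is multiplied by `gamma` at each milestone thereafter.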

Architecture

Input Image (3, 256, 192)
        │
        ▼
  EfficientNet-B5 Backbone
        │
        ▼
  HeatmapHead (3× deconv + 1×1 conv)
        │
        ▼
  Output Heatmaps (17, 64, 48)
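
The heatmap resolution follows from the shapes in the diagram. The sketch below reproduces the arithmetic under two assumptions that the diagram does not state explicitly: the backbone downsamples by a total factor of 32 (the usual figure for EfficientNet), and each deconvolution layer doubles the spatial size. The function name `heatmap_size` is illustrative.

```python
def heatmap_size(input_hw=(256, 192), backbone_stride=32, num_deconv=3):
    """Expected heatmap (height, width) for a given input size.

    backbone_stride=32 and a 2x upsampling per deconv layer are
    assumptions, not values taken from the model card.
    """
    h, w = input_hw
    feat_h, feat_w = h // backbone_stride, w // backbone_stride  # backbone output
    up = 2 ** num_deconv                                         # total upsampling
    return feat_h * up, feat_w * up


print(heatmap_size())  # (64, 48)
```

This gives an overall input-to-heatmap stride of 4, which is why heatmap coordinates are multiplied by 4 to recover positions in the 192 × 256 crop.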

Developed by

DeKUT-DSAIL (Dedan Kimathi University of Technology)

  • Framework: PyTorch / MMPose
  • Model type: 2D Human Pose Estimation
  • Task: Keypoint Detection
  • License: Apache 2.0