EfficientNet-B5 Pose Estimation

A 2D human pose estimation model trained at DeKUT-DSAIL using the MMPose framework. It predicts 17 COCO keypoints from a single cropped person image.

Property	Value
Backbone	EfficientNet-B5
Attention Neck	None
Parameters	~40 M
Input Size	192 × 256 (width × height)
Output	Heatmaps (17, 64, 48)

Evaluation Results

Evaluated on the COCO 2017 val set using OKS-based metrics (top-down, ground-truth bounding boxes).

Metric	Score
COCO AP	0.713
COCO AR	0.748
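
For readers unfamiliar with OKS (Object Keypoint Similarity), the sketch below computes it for a single predicted pose against a single ground-truth pose. COCO AP averages precision over OKS thresholds from 0.50 to 0.95. The function name `oks` and its argument layout are illustrative and not part of this repository; the reported scores come from the standard COCO evaluation tooling, not from this snippet.

```python
import math

# Per-keypoint sigmas used by the COCO keypoint evaluation, in COCO keypoint order.
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, visible, area):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt: sequences of 17 (x, y) pairs in pixels.
    visible:  sequence of 17 flags; only labelled keypoints (flag > 0) count.
    area:     ground-truth object area in pixels squared (the scale term s^2).
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, COCO_SIGMAS):
        if v <= 0:
            continue  # unlabelled keypoints are ignored
        d2 = (px - gx) ** 2 + (py - gy) ** 2   # squared pixel distance
        k2 = (2.0 * sigma) ** 2                # per-keypoint tolerance
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    return total / count if count else 0.0
```

A perfect prediction scores 1.0, and the score decays towards 0 as keypoints drift away from the ground truth relative to the object's size.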

Repository Files

model.safetensors      # Model weights (safetensors format)
model.py               # Self-contained PoseEstimator inference helper
requirements.txt       # Python dependencies
pose.jpg               # Example test image
README.md              # This model card

Quick Start

Step 1 — Clone the repository

git clone https://huggingface.co/DeKUT-DSAIL/efficientnet_b5_coco_256x192
cd efficientnet_b5_coco_256x192

Step 2 — Create a virtual environment

Linux / macOS

python -m venv venv
source venv/bin/activate

Windows (Command Prompt)

python -m venv venv
venv\Scripts\activate.bat

Windows (PowerShell)

python -m venv venv
venv\Scripts\Activate.ps1

Step 3 — Install dependencies

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt

GPU users: Replace the CPU index URL above with the one that matches your CUDA version. See pytorch.org/get-started.

Step 4 — Run inference

import cv2
from model import PoseEstimator

estimator = PoseEstimator("DeKUT-DSAIL/efficientnet_b5_coco_256x192")

image = cv2.imread("pose.jpg")
keypoints, scores = estimator.predict(image)

print("Keypoints shape:", keypoints.shape)  # (N, 17, 2)
print("Scores shape:   ", scores.shape)     # (N, 17, 1)

annotated = estimator.visualize(image, keypoints, scores, score_threshold=0.3)
cv2.imwrite("output.jpg", annotated)
print("Saved output.jpg")

Input / Output Specification

Property	Value
Input size	`(1, 3, 256, 192)` — RGB, channel-first
Normalisation	Mean `[0.485, 0.456, 0.406]` / Std `[0.229, 0.224, 0.225]`
Output	Heatmaps `(N, 17, 64, 48)`
Keypoints	COCO 17-joint format
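
The specification above can be illustrated with a small NumPy sketch of the two ends of the pipeline: normalising an already-cropped, already-resized RGB image, and decoding heatmaps into coordinates with a plain argmax. This is not the code in `model.py`, which may resize, decode, or refine coordinates differently. The function names are hypothetical, and the sketch assumes the mean and std apply to pixel values scaled to [0, 1].

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(crop_rgb):
    """Turn an RGB uint8 crop of shape (256, 192, 3) into a (1, 3, 256, 192) float array."""
    x = crop_rgb.astype(np.float32) / 255.0   # assumption: mean/std are on a [0, 1] scale
    x = (x - MEAN) / STD                      # per-channel normalisation
    return x.transpose(2, 0, 1)[None]         # HWC -> CHW, then add a batch axis


def decode_heatmaps(heatmaps, stride=4):
    """Argmax decoding: (N, 17, 64, 48) heatmaps -> (N, 17, 2) xy coords and (N, 17, 1) scores."""
    n, k, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=2)                 # flat index of each heatmap's peak
    scores = flat.max(axis=2)[..., None]      # peak value used as confidence
    xs, ys = idx % w, idx // w
    # stride 4 maps the 64 x 48 heatmap grid back to the 256 x 192 input crop
    coords = np.stack([xs, ys], axis=-1).astype(np.float32) * stride
    return coords, scores
```

The decoded coordinates are in the frame of the 256 × 192 crop; mapping them back to the original image requires the bounding box used for cropping.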

COCO 17 Keypoints

Index	Name	Index	Name
0	nose	9	left_wrist
1	left_eye	10	right_wrist
2	right_eye	11	left_hip
3	left_ear	12	right_hip
4	right_ear	13	left_knee
5	left_shoulder	14	right_knee
6	right_shoulder	15	left_ankle
7	left_elbow	16	right_ankle
8	right_elbow
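
The index table above translates directly into a lookup list. The sketch below also derives the left/right index pairs, which are commonly needed for flip augmentation, and shows one way to attach names to a single person's predictions. The helper `named_keypoints` is hypothetical and expects flat per-joint scores, for example `scores[0, :, 0]` from the Quick Start output.

```python
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Name -> index lookup, matching the table above.
INDEX = {name: i for i, name in enumerate(COCO_KEYPOINTS)}

# (left, right) index pairs, derived from the names.
FLIP_PAIRS = [
    (INDEX[name], INDEX["right_" + name[len("left_"):]])
    for name in COCO_KEYPOINTS
    if name.startswith("left_")
]


def named_keypoints(person_keypoints, person_scores, score_threshold=0.3):
    """Map one person's 17 (x, y) pairs to joint names, dropping low-confidence joints."""
    return {
        name: (float(xy[0]), float(xy[1]))
        for name, xy, score in zip(COCO_KEYPOINTS, person_keypoints, person_scores)
        if float(score) >= score_threshold
    }
```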

Training Details

Trained using MMPose on the following datasets:

Dataset	Link
COCO 2017	cocodataset.org
MPII Human Pose	mpii.is.tue.mpg.de
CrowdPose	GitHub
OCHuman	GitHub

Parameter	Value
Optimizer	AdamW
Learning rate	1 × 10⁻³
LR schedule	Multi-step decay
Batch size	64
Epochs	210
Input size	192 × 256 (width × height)
Loss	MSE on heatmaps + knowledge distillation loss
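
To make the loss row concrete, here is a minimal NumPy sketch of a heatmap regression objective with a distillation term. The exact distillation formulation, its weight, and the target Gaussian width are not stated in this card, so the sketch assumes an MSE between student and teacher heatmaps, a hypothetical `kd_weight` of 0.5, and a hypothetical `sigma` of 2.0. It should be read as an illustration of the general idea, not as the training code.

```python
import numpy as np


def gaussian_heatmap(x, y, height=64, width=48, sigma=2.0):
    """Target heatmap with a unit-height Gaussian centred on (x, y) in heatmap pixels."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2)).astype(np.float32)


def combined_loss(student, target, teacher, kd_weight=0.5):
    """MSE against ground-truth targets plus a weighted MSE against teacher heatmaps.

    The distillation term and kd_weight are assumptions made for illustration.
    """
    mse = np.mean((student - target) ** 2)   # supervised heatmap loss
    kd = np.mean((student - teacher) ** 2)   # assumed distillation term
    return float(mse + kd_weight * kd)
```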

Architecture

Input Image (3, 256, 192)
        │
        ▼
  EfficientNet-B5 Backbone
        │
        ▼
  HeatmapHead (3× deconv + 1×1 conv)
        │
        ▼
  Output Heatmaps (17, 64, 48)
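
The shapes in the diagram can be checked with a few lines of arithmetic. The sketch assumes the backbone's final feature map has an overall stride of 32 and that each deconvolution layer uses kernel 4, stride 2, padding 1, which doubles the spatial size. Those layer settings are a common default and are not confirmed by this card; the function names are illustrative.

```python
def deconv_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def heatmap_size(height=256, width=192, backbone_stride=32, num_deconv=3):
    """Spatial size of the heatmaps produced from a (height, width) input."""
    # Backbone: 256 x 192 -> 8 x 6 under the assumed stride of 32
    h, w = height // backbone_stride, width // backbone_stride
    # Head: each deconv doubles the size, 8 x 6 -> 16 x 12 -> 32 x 24 -> 64 x 48
    for _ in range(num_deconv):
        h, w = deconv_out(h), deconv_out(w)
    return h, w


print(heatmap_size())  # (64, 48)
```

The final 1×1 convolution changes only the channel count (to 17), so the spatial size stays at 64 × 48.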

Developed by

DeKUT-DSAIL — Dedan Kimathi University of Technology

Framework: PyTorch / MMPose
Model type: 2D Human Pose Estimation
Task: Keypoint Detection
License: Apache 2.0

Model size: 39M params
Tensor type: F32