EfficientNet-B5 Pose Estimation
A 2D human pose estimation model trained at DeKUT-DSAIL using the MMPose framework. Predicts 17 COCO keypoints from a single cropped person image.
| Property | Value |
|---|---|
| Backbone | EfficientNet-B5 |
| Attention Neck | None |
| Parameters | ~40 M |
| Input Size | 192 × 256 (width × height) |
| Output | Heatmaps (17, 64, 48) |
Evaluation Results
Evaluated on the COCO 2017 val set with OKS-based metrics, in a top-down setting that uses ground-truth (GT) bounding boxes.
| Metric | Score |
|---|---|
| COCO AP | 0.713 |
| COCO AR | 0.748 |
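
For readers unfamiliar with the metric, AP and AR are built on object keypoint similarity (OKS) between a predicted pose and a ground-truth pose. The snippet below is a minimal sketch of OKS for a single person, following the publicly documented COCO keypoint evaluation definition. The function name `oks` and its argument layout are illustrative assumptions; the scores in the table come from the standard evaluation tooling, not from this code.

```python
import math

# Per-keypoint standard deviations used by the COCO keypoint evaluation,
# in COCO order (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, visibility, area):
    """Object keypoint similarity between one predicted and one ground-truth pose.

    pred, gt:   sequences of 17 (x, y) pairs in pixels
    visibility: sequence of 17 COCO visibility flags (0 means not labelled)
    area:       ground-truth object area in square pixels
    """
    total, labelled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, COCO_SIGMAS):
        if v <= 0:
            continue  # unlabelled keypoints are excluded from the average
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2 + 1e-12))
        labelled += 1
    return total / labelled if labelled else 0.0
```

AP is then the precision averaged over OKS thresholds from 0.50 to 0.95 in steps of 0.05, and AR is the corresponding averaged recall.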
Repository Files
model.safetensors # Model weights (safetensors format)
model.py # Self-contained PoseEstimator inference helper
requirements.txt # Python dependencies
pose.jpg # Example test image
README.md # This model card
Quick Start
Step 1: Clone the repository
git clone https://huggingface.co/DeKUT-DSAIL/efficientnet_b5_coco_256x192
cd efficientnet_b5_coco_256x192
Step 2: Create a virtual environment
Linux / macOS
python -m venv venv
source venv/bin/activate
Windows (Command Prompt)
python -m venv venv
venv\Scripts\activate.bat
Windows (PowerShell)
python -m venv venv
venv\Scripts\Activate.ps1
Step 3: Install dependencies
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
GPU users: replace the PyTorch index URL above with the one that matches your CUDA version. See pytorch.org/get-started.
Step 4: Run inference
import cv2
from model import PoseEstimator
estimator = PoseEstimator("DeKUT-DSAIL/efficientnet_b5_coco_256x192")
image = cv2.imread("pose.jpg")
keypoints, scores = estimator.predict(image)
print("Keypoints shape:", keypoints.shape) # (N, 17, 2)
print("Scores shape: ", scores.shape) # (N, 17, 1)
annotated = estimator.visualize(image, keypoints, scores, score_threshold=0.3)
cv2.imwrite("output.jpg", annotated)
print("Saved output.jpg")
Input / Output Specification
| Property | Value |
|---|---|
| Input size | (1, 3, 256, 192), RGB, channel-first |
| Normalisation | Mean [0.485, 0.456, 0.406] / Std [0.229, 0.224, 0.225] |
| Output | Heatmaps (N, 17, 64, 48) |
| Keypoints | COCO 17-joint format |
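
The specification above can be sketched as a small preprocessing function. This is an illustration only: it assumes the person crop has already been resized to 192 × 256 (for example with `cv2.resize`), that pixel values are scaled to [0, 1] before the listed mean and standard deviation are applied, and that the image is in OpenCV's BGR order. The helper name `to_model_input` is hypothetical, and `model.py` may implement these steps differently.

```python
import numpy as np

# Normalisation constants from the specification table (RGB order).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_model_input(bgr_crop):
    """Convert a 256x192 BGR uint8 crop into a (1, 3, 256, 192) float32 array."""
    if bgr_crop.shape != (256, 192, 3):
        raise ValueError(f"expected shape (256, 192, 3), got {bgr_crop.shape}")
    # BGR -> RGB, then scale to [0, 1] (assumed scaling, see lead-in)
    rgb = bgr_crop[:, :, ::-1].astype(np.float32) / 255.0
    normalised = (rgb - MEAN) / STD        # per-channel normalisation
    chw = normalised.transpose(2, 0, 1)    # height-width-channel -> channel-first
    return np.ascontiguousarray(chw[np.newaxis])  # add the batch dimension
```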
COCO 17 Keypoints
| Index | Name | Index | Name |
|---|---|---|---|
| 0 | nose | 9 | left_wrist |
| 1 | left_eye | 10 | right_wrist |
| 2 | right_eye | 11 | left_hip |
| 3 | left_ear | 12 | right_hip |
| 4 | right_ear | 13 | left_knee |
| 5 | left_shoulder | 14 | right_knee |
| 6 | right_shoulder | 15 | left_ankle |
| 7 | left_elbow | 16 | right_ankle |
| 8 | right_elbow | | |
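
The index table above translates directly into a lookup list. The sketch below labels one person's keypoints by name and drops those under a score threshold. The helper `name_keypoints` is hypothetical and not part of `model.py`, and it expects the 17 scores as a flat sequence.

```python
# Keypoint names in COCO index order, as listed in the table.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(person_keypoints, person_scores, score_threshold=0.3):
    """Map one person's 17 (x, y) keypoints to {name: (x, y, score)}.

    Keypoints whose score is below score_threshold are omitted.
    """
    named = {}
    for name, (x, y), score in zip(COCO_KEYPOINT_NAMES, person_keypoints, person_scores):
        score = float(score)
        if score >= score_threshold:
            named[name] = (float(x), float(y), score)
    return named
```

With the Quick Start outputs, a call might look like `name_keypoints(keypoints[0], scores[0].reshape(-1))`, assuming the array shapes noted in the inference example.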
Training Details
Trained using MMPose on the following datasets:
| Dataset | Link |
|---|---|
| COCO 2017 | cocodataset.org |
| MPII Human Pose | mpii.is.tue.mpg.de |
| CrowdPose | GitHub |
| OCHuman | GitHub |
| Parameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning rate | 1 × 10⁻³ |
| LR schedule | Multi-step decay |
| Batch size | 64 |
| Epochs | 210 |
| Input size | 192 × 256 |
| Loss | MSE on heatmaps + knowledge distillation loss |
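
To illustrate what a multi-step decay schedule does, the sketch below computes the learning rate at a given epoch. The base rate of 1 × 10⁻³ comes from the table; the milestones (170, 200) and the decay factor 0.1 are hypothetical values chosen for illustration, since the actual ones are not stated here.

```python
def multistep_lr(epoch, base_lr=1e-3, milestones=(170, 200), gamma=0.1):
    """Learning rate at a given epoch under multi-step decay.

    The rate is multiplied by gamma once for every milestone that has
    been reached. milestones and gamma are illustrative assumptions.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed
```

Under these assumed values the rate would stay at 1 × 10⁻³ until epoch 170, drop to 1 × 10⁻⁴, and drop again to 1 × 10⁻⁵ at epoch 200 for the remainder of the 210 epochs.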
Architecture
Input Image (3, 256, 192)
│
▼
EfficientNet-B5 Backbone
│
│
▼
HeatmapHead (3× deconv + 1×1 conv)
│
▼
Output Heatmaps (17, 64, 48)
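
The final step, turning heatmaps into coordinates, can be sketched as a plain argmax decode. This is an assumption-laden illustration rather than the decoding used by `PoseEstimator`, which may add sub-pixel refinement and map coordinates back to the original image. The function name `decode_heatmaps` is hypothetical, and the stride of 4 follows from 256 / 64 = 192 / 48 = 4.

```python
import numpy as np


def decode_heatmaps(heatmaps, stride=4):
    """Argmax-decode heatmaps of shape (17, 64, 48) into input-pixel coordinates.

    Returns (keypoints, scores) with shapes (17, 2) and (17,), where each
    keypoint is (x, y) in the 192 x 256 input crop.
    """
    num_joints, height, width = heatmaps.shape
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)                    # peak location per joint
    scores = flat[np.arange(num_joints), idx]    # peak value as confidence
    ys, xs = np.unravel_index(idx, (height, width))
    keypoints = np.stack([xs, ys], axis=1).astype(np.float32) * stride
    return keypoints, scores
```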
Developed by
DeKUT-DSAIL, Dedan Kimathi University of Technology
- Framework: PyTorch / MMPose
- Model type: 2D Human Pose Estimation
- Task: Keypoint Detection
- License: Apache 2.0