I3D Kinetics-600
This model is a fine-tuned version of the Inflated 3D ConvNet (I3D) model for action recognition, pretrained on the Kinetics-600 dataset.
Model Description
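The I3D (Inflated 3D ConvNet) model is designed for video classification. It inflates the 2D convolution kernels of an image classification network into 3D by adding a temporal dimension, so the model can capture spatiotemporal features across video frames rather than from each frame in isolation.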
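As a rough illustration of what "inflating" a kernel means, the sketch below repeats a 2D kernel along a new time axis and rescales it by 1/T, so that a clip made of identical frames produces the same response as the 2D filter applied to a single frame. The function name inflate_kernel and the toy shapes are assumptions made for this example; they are not taken from this model's code.

```python
import numpy as np


def inflate_kernel(kernel_2d: np.ndarray, time_depth: int) -> np.ndarray:
    """Inflate a (H, W) 2D kernel into a (T, H, W) 3D kernel."""
    if time_depth < 1:
        raise ValueError("time_depth must be at least 1")
    # Repeat the kernel along a new leading time axis, then divide by T
    # so that summing over time recovers the original 2D kernel.
    stacked = np.repeat(kernel_2d[np.newaxis, ...], time_depth, axis=0)
    return stacked / time_depth


# Toy 3x3 edge filter standing in for a pretrained 2D kernel (hypothetical values).
kernel_2d = np.array([[1.0, 0.0, -1.0],
                      [2.0, 0.0, -2.0],
                      [1.0, 0.0, -1.0]])
kernel_3d = inflate_kernel(kernel_2d, time_depth=3)

# A "static" clip: the same 3x3 patch repeated over 3 frames.
patch = np.arange(9, dtype=float).reshape(3, 3)
static_clip = np.stack([patch] * 3)

# On a static clip, the 3D response matches the 2D response.
response_2d = float(np.sum(kernel_2d * patch))
response_3d = float(np.sum(kernel_3d * static_clip))
```

The 1/T rescaling is what lets the inflated network start from the behaviour of the 2D network on static input before it learns temporal patterns.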
Intended Uses
The model can be used for action recognition in videos. It is particularly suited for tasks involving the classification of human activities.
Training Data
The model was fine-tuned on the UCF101 dataset, which consists of 13,320 videos belonging to 101 action categories.
Performance
The model achieves a self-reported top-1 accuracy of 90% and a top-5 accuracy of 95% on the UCF101 test set.
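For readers unfamiliar with the two metrics, here is a minimal sketch of how top-k accuracy is usually computed: a prediction counts as correct when the true class is among the k highest-scoring classes. The function top_k_accuracy and the toy scores are hypothetical and only illustrate the definition; they are not the evaluation code behind the numbers above.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for 4 clips over 3 classes (made-up values).
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at k=1
    [0.3, 0.5, 0.2],  # true class 0: wrong at k=1, correct at k=2
    [0.1, 0.3, 0.6],  # true class 0: wrong at k=1 and k=2
    [0.2, 0.1, 0.7],  # true class 2: correct at k=1
]
labels = [0, 0, 0, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Because the top-k set always contains the top-1 prediction, top-k accuracy can never be lower than top-1 accuracy on the same data.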
Example Usage
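from transformers import pipeline

# Load the video-classification pipeline with this model
classifier = pipeline("video-classification", model="Mouwiya/i3d-kinetics-600")

# Path to a local video file (placeholder)
video_path = "path_to_your_video.mp4"

# Classify the video; the result is a list of dicts with "label" and "score" keys
results = classifier(video_path)
print(results)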
Evaluation results
- Accuracy on UCF101 (self-reported): 0.900
- Top-5 Accuracy on UCF101 (self-reported): 0.950