maxf-coder/task_image_classifier
EfficientNet-B4 trained on 14 activity categories for the image-to-prompt pipeline.
| Metric | Value |
|---|---|
| Test samples | {test_samples} |
| Top-1 accuracy | {top1} |
| Top-3 accuracy | {top3} |
| Macro F1 | {macro_f1} |
| Weighted F1 | {weighted_f1} |

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| {class_rows} |
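
The card does not say how these metrics were computed. As a rough sketch of the standard definitions, the plain-Python helpers below (all hypothetical names, not taken from the repository) show how top-k accuracy, per-class precision/recall/F1, and the macro and weighted F1 averages relate to one another. Classes with no predictions or no samples are scored as 0.0 here, which is an assumption.

```python
from collections import Counter


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def per_class_report(preds, labels, num_classes):
    """Return {class_index: (precision, recall, f1, support)}."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, label in zip(preds, labels):
        if pred == label:
            tp[label] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[label] += 1  # true class gains a false negative
    report = {}
    for c in range(num_classes):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = (precision, recall, f1, tp[c] + fn[c])
    return report


def macro_and_weighted_f1(report):
    """Macro F1 averages classes equally; weighted F1 weights by support."""
    total = sum(support for _, _, _, support in report.values())
    macro = sum(f1 for _, _, f1, _ in report.values()) / len(report)
    weighted = sum(f1 * support for _, _, f1, support in report.values()) / total
    return macro, weighted
```

Macro F1 is the more sensitive of the two averages when some of the 14 classes have little support, since every class counts equally regardless of size.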
```python
import torch
import timm
from PIL import Image
from torchvision import transforms

# Build the architecture without pretrained weights, then load the fine-tuned checkpoint.
model = timm.create_model("efficientnet_b4", pretrained=False, num_classes=14)
model.load_state_dict(torch.load("efficientnet_b4_classifier.pth", map_location="cpu"))
model.eval()

# Resize to 380x380 and normalize with the ImageNet mean and standard deviation.
transform = transforms.Compose([
    transforms.Resize((380, 380)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open("photo.jpg").convert("RGB")
tensor = transform(img).unsqueeze(0)  # add a batch dimension

with torch.no_grad():
    logits = model(tensor)
pred = logits.argmax(1).item()  # index of the predicted class
```
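
The inference snippet above returns only the top-1 class index. Because the card also reports top-3 accuracy, it may be useful to look at the three most likely classes with their probabilities. The helper below is a minimal sketch. `top_k_predictions` is a hypothetical name, and it assumes `logits` has shape `(1, num_classes)`, as produced by a single-image batch.

```python
import torch


def top_k_predictions(logits: torch.Tensor, k: int = 3):
    """Return [(class_index, probability), ...] for a single-image batch."""
    probs = torch.softmax(logits, dim=1)[0]  # convert logits to probabilities
    values, indices = probs.topk(k)          # sorted from most to least likely
    return [(int(i), float(v)) for i, v in zip(indices, values)]
```

Calling `top_k_predictions(logits)` after the forward pass gives the indices in descending order of probability. Mapping them to class names requires the label list used at training time, which this card does not include.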
Training runs in two phases: 5 epochs with the backbone frozen (classifier head only), followed by 20 epochs with the last 2 blocks unfrozen. The optimizer is AdamW with a cosine-annealed learning rate, and training uses mixed precision (AMP). See `train_classifier.py` for details.
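
The two-phase schedule described above can be sketched as follows. This is not the code from `train_classifier.py`; it is an illustration under several assumptions: the learning rates (`1e-3`, `1e-4`) are placeholders, `TinyNet` is a hypothetical stand-in that mimics the `blocks` and `classifier` attributes exposed by timm's EfficientNet, and AMP is enabled only when `device` is `"cuda"`.

```python
import torch
from torch import nn


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Enable or disable gradients for every parameter in a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def run_phase(model, batches, epochs, lr, device="cpu"):
    """Train only the parameters that currently require gradients."""
    use_amp = device == "cuda"  # assumption: AMP only on GPU in this sketch
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(params, lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    # Newer PyTorch releases also offer torch.amp.GradScaler.
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    criterion = nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(epochs):
        for images, labels in batches:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            with torch.autocast(device_type=device, enabled=use_amp):
                loss = criterion(model(images), labels)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
        scheduler.step()  # one cosine-annealing step per epoch


class TinyNet(nn.Module):
    """Hypothetical stand-in with `blocks` and `classifier` attributes."""

    def __init__(self, num_classes: int = 14):
        super().__init__()
        self.blocks = nn.Sequential(*[nn.Linear(8, 8) for _ in range(4)])
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.classifier(self.blocks(x))


def train_two_phase(model, batches, head_epochs=5, finetune_epochs=20, device="cpu"):
    # Phase 1: freeze everything, then train the classifier head only.
    set_trainable(model, False)
    set_trainable(model.classifier, True)
    run_phase(model, batches, head_epochs, lr=1e-3, device=device)
    # Phase 2: additionally unfreeze the last two blocks and fine-tune.
    for block in list(model.blocks)[-2:]:
        set_trainable(block, True)
    run_phase(model, batches, finetune_epochs, lr=1e-4, device=device)
```

Rebuilding the optimizer and scheduler at the start of each phase keeps the cosine schedule aligned with that phase's epoch count and ensures newly unfrozen parameters are actually optimized.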