DRAGON NAS Classifier

Two-class image classifier trained on the datasets listed in configs/alpaga_datasets.json.

Method

  • Backbone : DINOv2-base (facebook/dinov2-base) - frozen, 86M params
  • Head : found by Neural Architecture Search (DRAGON + Mutant-UCB)
  • NAS objective : -(0.70*MacroF1(Horama/Horama_WOW) + 0.30*MacroF1(_other))
  • Data augmentation :
    • Image-level : ? augmented views/image
    • Feature-level : Mixup (alpha=0.0)
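The NAS objective listed above folds two macro-F1 scores into a single value to minimize. The following is a minimal sketch of that arithmetic, assuming the two terms are macro-F1 scores computed on two separate evaluation groups (the Horama/Horama_WOW group and the remaining `_other` data). The function names and the toy labels are hypothetical and are not taken from the training code.

```python
def macro_f1(y_true, y_pred, n_classes=2):
    """Unweighted mean of per-class F1 (F1 is taken as 0 when undefined)."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def nas_objective(f1_primary, f1_other, w_primary=0.70, w_other=0.30):
    """Negated weighted sum, so a lower value means a better head."""
    return -(w_primary * f1_primary + w_other * f1_other)


# Toy labels (hypothetical), encoded as class indices 0 and 1.
primary_true, primary_pred = [0, 0, 1, 1], [0, 1, 1, 1]
other_true, other_pred = [0, 1, 1, 0], [0, 1, 1, 0]

f1_primary = macro_f1(primary_true, primary_pred)
f1_other = macro_f1(other_true, other_pred)
loss = nas_objective(f1_primary, f1_other)
```

Because the sum is negated, a search procedure that minimizes `loss` favors heads with a higher macro-F1, and the primary group counts for 70% of the score.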

Results

Split          Accuracy  Loss    Macro-F1  Weighted F1  Recall  Kappa
validation     61.72%    0.6286  57.90%    59.41%       61.72%  0.1875
test           63.64%    0.5904  55.13%    57.47%       63.64%  0.2009
test_thoiry    87.50%    0.3123  46.67%    93.33%       87.50%  N/A
test_combined  64.52%    0.5801  58.26%    59.53%       64.52%  0.2494
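Macro-F1 and weighted F1 can diverge sharply when a split holds few or no samples of one class, which is the pattern visible in the test_thoiry row. The sketch below works through the arithmetic on a hypothetical split. The labels are invented for illustration and are not the actual test_thoiry data, whose composition is not stated in this card.

```python
def per_class_f1(y_true, y_pred, classes):
    """Return {class: (f1, support)}; F1 is taken as 0 when undefined."""
    out = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        out[c] = (f1, tp + fn)  # support = number of true samples of class c
    return out


# Hypothetical split: 8 images, all of class 0, 7 of them classified correctly.
y_true = [0] * 8
y_pred = [0] * 7 + [1]

stats = per_class_f1(y_true, y_pred, classes=[0, 1])
n = len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
macro_f1 = sum(f1 for f1, _ in stats.values()) / len(stats)
weighted_f1 = sum(f1 * support for f1, support in stats.values()) / n
```

With these invented labels, accuracy is 87.50%, macro-F1 is 46.67% and weighted F1 is 93.33%: the absent class contributes an F1 of 0 to the macro average but carries no weight in the weighted one. This is only one composition that yields such values.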

Model variants

Variant                     Description                          Metric
best_nas                    NAS search weights (before retrain)  -
best_retrain (recommended)  Best validation loss                 0.6396

Classes (2)

  • alpaga (alpaca)
  • vigogne (vicuña)

Architecture

Best DRAGON Architecture

  • Nodes : 2
  • Operations : [['add', 'Identity', 'Identity'], ['add', 'Dropout', 0.3804138167058462, 'ELU']]
  • LR : 0.000778
  • WD : 0.006691
  • Classes : 2
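Read as a plain PyTorch module, the two-node architecture above is small. The following is a rough sketch, assuming each node is encoded as [combiner, operation, hyperparameters, activation], that the 'add' combiner is a no-op when a node has a single input, and that a final linear layer maps the 768-dimensional DINOv2-base features to the 2 classes. `build_head`, `FEATURE_DIM` and `NUM_CLASSES` are hypothetical names; the actual DRAGON implementation may structure the head differently.

```python
import torch
from torch import nn

FEATURE_DIM = 768  # hidden size of the DINOv2-base CLS features
NUM_CLASSES = 2


def build_head(dropout_p: float = 0.3804138167058462) -> nn.Sequential:
    """Assumed reading of the searched head as a sequential stack."""
    return nn.Sequential(
        nn.Identity(),                        # node 0: Identity op, Identity activation
        nn.Dropout(p=dropout_p),              # node 1: Dropout operation
        nn.ELU(),                             # node 1: ELU activation
        nn.Linear(FEATURE_DIM, NUM_CLASSES),  # assumed output layer
    )


head = build_head().eval()  # eval mode disables dropout
with torch.no_grad():
    # Random stand-in for a batch of 4 backbone feature vectors.
    logits = head(torch.randn(4, FEATURE_DIM))
```

Under this reading the head has a single trainable layer, so nearly all of the model's capacity sits in the frozen backbone.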

Architecture diagram

graph TD
    subgraph BACKBONE ["Backbone (frozen)"]
    IMG[/"Image"/] --> ENCODER["Encoder"]
    ENCODER --> CLS["Features"]
    end
    subgraph HEAD ["Classification head (DRAGON NAS)"]
    N0["['add', 'Identity', 'Identity'] [add]"]
    N1["['add', 'Dropout', 0.3804138167058462, 'ELU'] | ELU [add]"]
    OUT_MLP["Linear -> 2 classes"]
    end
    subgraph OUTPUT ["Output"]
    SOFTMAX["Softmax"] --> PRED[/"Prediction<br/>2 classes"/]
    end
    CLS --> N0
    N0 --> N1
    N1 --> OUT_MLP
    OUT_MLP --> SOFTMAX
    style BACKBONE fill:#f0f0f0,stroke:#666
    style HEAD fill:#e8f4fd,stroke:#1a73e8
    style OUTPUT fill:#e8fde8,stroke:#1a8c1a

Usage (ONNX)

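import numpy as np
import onnxruntime as ort
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import Dinov2Model

# Frozen DINOv2 backbone, switched to eval mode for inference.
backbone = Dinov2Model.from_pretrained("facebook/dinov2-base").eval()
transform = T.Compose([
    T.Resize(518, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(518),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# "image.jpg" is a placeholder path; replace it with your own image.
image = Image.open("image.jpg").convert("RGB")

# Take the CLS token without tracking gradients, so that .numpy() works.
with torch.no_grad():
    features = backbone(transform(image).unsqueeze(0)).last_hidden_state[:, 0]

# Run the ONNX classification head on the (1, 768) feature vector.
session = ort.InferenceSession("model_head.onnx")
logits = session.run(None, {"features": features.numpy()})[0]
pred = np.argmax(logits, axis=1)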
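To turn the head's output into a readable label with a confidence score, a softmax and a class-name lookup can follow the usage snippet. This is a minimal sketch under two assumptions: the ONNX head returns raw logits (as the variable name in the usage snippet suggests; skip the softmax if the exported graph already applies it), and the class indices follow the order of the class list in this card. `CLASS_NAMES` and the stand-in `logits` values are hypothetical.

```python
import numpy as np

# Assumed index order, following the class list in this card.
CLASS_NAMES = ["alpaga", "vigogne"]


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Stand-in for the array returned by session.run(...) in the usage snippet.
logits = np.array([[0.2, 1.4]], dtype=np.float32)

probs = softmax(logits)
pred = probs.argmax(axis=1)
labels = [CLASS_NAMES[i] for i in pred]
confidences = probs.max(axis=1)
```

It is worth checking the index order against the training configuration before relying on the label names.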