DRAGON NAS Classifier
Image classifier with 2 classes, trained on the datasets defined in configs/alpaga_datasets.json.
Method
- Backbone : DINOv2-base (facebook/dinov2-base) - frozen, 86M params
- Head : found by Neural Architecture Search (DRAGON + Mutant-UCB)
- NAS objective : -(0.70*MacroF1(Horama/Horama_WOW) + 0.30*MacroF1(_other))
- Data augmentation :
  - Image-level : ? augmented views/image
  - Feature-level : Mixup (alpha=0.0)
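
The NAS objective listed above can be read as a weighted sum of per-group macro-F1 scores, negated so that a minimizing search favors higher F1. The following is a minimal sketch of that reading, not the project's code: the function names `macro_f1` and `nas_objective`, the toy labels, and the assumption that the search minimizes this value are hypothetical. Only the 0.70/0.30 weights and the two groups (Horama/Horama_WOW and _other) come from the description.

```python
def macro_f1(y_true, y_pred, n_classes=2):
    """Unweighted mean of per-class F1 (a class with no hits scores 0)."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def nas_objective(f1_main, f1_other, w_main=0.70, w_other=0.30):
    """Negated weighted macro-F1; lower is better for a minimizer."""
    return -(w_main * f1_main + w_other * f1_other)


# Toy labels (hypothetical), one pair of lists per dataset group
main_true, main_pred = [0, 0, 1, 1], [0, 1, 1, 1]
other_true, other_pred = [0, 1, 0, 1], [0, 1, 0, 1]

loss = nas_objective(macro_f1(main_true, main_pred),
                     macro_f1(other_true, other_pred))
```

Weighting the main group at 0.70 means an architecture that improves Horama/Horama_WOW macro-F1 by one point lowers the objective more than twice as much as the same gain on the other group.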
Results
| Split | Accuracy | Loss | Macro-F1 | F1w (weighted) | Recall | Kappa |
|---|---|---|---|---|---|---|
| validation | 61.72% | 0.6286 | 57.90% | 59.41% | 61.72% | 0.1875 |
| test | 63.64% | 0.5904 | 55.13% | 57.47% | 63.64% | 0.2009 |
| test_thoiry | 87.50% | 0.3123 | 46.67% | 93.33% | 87.50% | N/A |
| test_combined | 64.52% | 0.5801 | 58.26% | 59.53% | 64.52% | 0.2494 |
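
The test_thoiry row shows a large gap between Macro-F1 (46.67%) and F1w (93.33%). Those figures are consistent with a split whose samples all belong to a single class, although the source does not state this. A minimal sketch under that assumption, using hypothetical labels (8 samples of class 0, 7 of them predicted correctly), reproduces the same pattern:

```python
from collections import Counter

# Hypothetical split: every true label is class 0, one sample is misclassified
y_true = [0] * 8
y_pred = [0] * 7 + [1]


def f1_for_class(c):
    """F1 of class c from true/false positives and false negatives."""
    tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
    fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
    fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


per_class = {c: f1_for_class(c) for c in (0, 1)}
support = Counter(y_true)  # number of true samples per class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro_f1 = sum(per_class.values()) / len(per_class)
weighted_f1 = sum(per_class[c] * support[c] for c in per_class) / len(y_true)

print(f"{accuracy:.2%} {macro_f1:.2%} {weighted_f1:.2%}")
# prints "87.50% 46.67% 93.33%"
```

Macro-F1 averages over both classes, so the absent class contributes an F1 of 0 and halves the score, while the weighted variant ignores it because its support is 0.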
Model variants
| Variant | Description | Metric |
|---|---|---|
| best_nas | NAS search weights (before retrain) | - |
| best_retrain (recommended) | Best validation loss | 0.6396 |
Classes (2)
Architecture
Best DRAGON Architecture
- Nodes : 2
- Operations : [['add', 'Identity', 'Identity'], ['add', 'Dropout', 0.3804138167058462, 'ELU']]
- LR : 0.000778
- WD : 0.006691
- Classes : 2
Architecture
graph TD
subgraph BACKBONE ["Backbone (frozen)"]
IMG[/"Image"/] --> ENCODER["Encoder"]
ENCODER --> CLS["Features"]
end
subgraph HEAD ["Classification head (DRAGON NAS)"]
N0["['add', 'Identity', 'Identity'] [add]"]
N1["['add', 'Dropout', 0.3804138167058462, 'ELU'] | ELU [add]"]
OUT_MLP["Linear -> 2 classes"]
end
subgraph OUTPUT ["Output"]
SOFTMAX["Softmax"] --> PRED[/"Prediction<br/>2 classes"/]
end
CLS --> N0
N0 --> N1
N1 --> OUT_MLP
OUT_MLP --> SOFTMAX
style BACKBONE fill:#f0f0f0,stroke:#666
style HEAD fill:#e8f4fd,stroke:#1a73e8
style OUTPUT fill:#e8fde8,stroke:#1a8c1a
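
For readers who prefer code to the diagram, the head can be approximated in PyTorch as below. This is a sketch under assumptions, not the exported model: it assumes each `add` node with a single input simply passes that input through, that node 1 applies Dropout followed by ELU, and that the input is the 768-dimensional CLS embedding of DINOv2-base. The weights here are randomly initialized, so the outputs are meaningless until trained weights are loaded.

```python
import torch
from torch import nn

FEATURE_DIM = 768  # hidden size of DINOv2-base (CLS token embedding)
NUM_CLASSES = 2

# Assumed reading of the two NAS nodes followed by the output layer
head = nn.Sequential(
    nn.Identity(),                        # node 0: ['add', 'Identity', 'Identity']
    nn.Dropout(p=0.3804138167058462),     # node 1: Dropout ...
    nn.ELU(),                             # ... followed by the ELU activation
    nn.Linear(FEATURE_DIM, NUM_CLASSES),  # "Linear -> 2 classes"
)
head.eval()  # disables dropout for inference

with torch.no_grad():
    features = torch.randn(4, FEATURE_DIM)  # stand-in for backbone features
    probs = torch.softmax(head(features), dim=1)
```

With the backbone frozen, only the final linear layer of this head carries trainable parameters (768 * 2 weights plus 2 biases), which is small next to the 86M-parameter encoder.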
Usage (ONNX)
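import numpy as np
import onnxruntime as ort
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import Dinov2Model

# Frozen DINOv2-base backbone, used only as a feature extractor
backbone = Dinov2Model.from_pretrained("facebook/dinov2-base").eval()

# Preprocessing: bicubic resize, center crop, ImageNet normalization
transform = T.Compose([
    T.Resize(518, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(518),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# "image.jpg" is a placeholder path; replace it with your own image
image = Image.open("image.jpg").convert("RGB")

# Extract the CLS token embedding; no_grad is required before calling .numpy()
with torch.no_grad():
    features = backbone(transform(image).unsqueeze(0)).last_hidden_state[:, 0]

# Run the NAS-found classification head exported to ONNX
session = ort.InferenceSession("model_head.onnx")
logits = session.run(None, {"features": features.numpy()})[0]
pred = np.argmax(logits, axis=1)  # predicted class index per image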
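
To turn the head's output into class probabilities and labels, a softmax can be applied afterwards. The sketch below assumes the ONNX head returns raw logits of shape (batch, 2), as the variable name in the usage code suggests. The logits shown and the names in `CLASS_NAMES` are placeholders, since the real class labels are not listed in this card.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical logits for two images; in practice use the ONNX output
logits = np.array([[2.0, 0.0], [-1.0, 1.0]], dtype=np.float32)
CLASS_NAMES = ["class_0", "class_1"]  # placeholders, not the real labels

probs = softmax(logits)
pred = probs.argmax(axis=1)
labels = [CLASS_NAMES[i] for i in pred]
```

The argmax of the probabilities is the same as the argmax of the logits, so the softmax only matters when a confidence score or a decision threshold is needed.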