How to use josemedina/KD-Teachers with Keras:
# Available backend options are: "jax", "torch", "tensorflow".
import os
os.environ["KERAS_BACKEND"] = "jax"
import keras
model = keras.saving.load_model("hf://josemedina/KD-Teachers")
Pre-trained teacher models used for Knowledge Distillation with Mixup augmentation on CIFAR-10 and CIFAR-100. These checkpoints are the teacher component of the KD-Mixup framework.
All models were initialized from ImageNet pre-trained weights and fine-tuned on CIFAR-10 and CIFAR-100 using SGD with momentum, ReduceLROnPlateau learning-rate scheduling, and mixed-precision (float16) training.
| Model | Backbone | Pre-training |
|---|---|---|
| `best_resnet152v2` | ResNet-152 V2 | ImageNet |
| `best_convnexttiny` | ConvNeXt-Tiny | ImageNet |
| `best_convnextlarge` | ConvNeXt-Large | ImageNet |
| `best_vitbase` | ViT-B/16 | ImageNet |
Results on CIFAR-10:
| Model | Accuracy | Confidence | ECE |
|---|---|---|---|
| ConvNeXt-Large | 0.9856 | 0.9789 | 0.0070 |
| ViT-B/16 | 0.9850 | 0.9927 | 0.0085 |
| ResNet-152 V2 | 0.9698 | 0.9838 | 0.0152 |
| ConvNeXt-Tiny | 0.9672 | 0.9744 | 0.0104 |
Results on CIFAR-100:
| Model | Accuracy | Confidence | ECE |
|---|---|---|---|
| ConvNeXt-Large | 0.9217 | 0.9189 | 0.0049 |
| ViT-B/16 | 0.9151 | 0.9272 | 0.0174 |
| ResNet-152 V2 | 0.8257 | 0.8873 | 0.0619 |
| ConvNeXt-Tiny | 0.8196 | 0.7908 | 0.0288 |
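The Accuracy, Confidence, and ECE columns above can be computed from per-sample predictions roughly as follows. This is a minimal sketch, not the authors' evaluation code: it assumes ECE is the common equal-width-bin expected calibration error and that Confidence is the mean top-class probability. The source does not state the bin count, so `n_bins=15` is an assumption, and the function name `expected_calibration_error` is hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 15,  # assumption: the bin count behind the tables is not stated
) -> float:
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hits = [0] * n_bins        # number of correct predictions per bin
    counts = [0] * n_bins      # number of predictions per bin
    for conf, ok in zip(confidences, correct):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); a confidence of 1.0 goes to the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        conf_sum[b] += conf
        hits[b] += int(ok)
        counts[b] += 1

    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue  # empty bins contribute nothing
        bin_accuracy = hits[b] / counts[b]
        bin_confidence = conf_sum[b] / counts[b]
        ece += (counts[b] / total) * abs(bin_accuracy - bin_confidence)
    return ece


# Toy inputs: top-class probability per sample and whether the prediction was right.
confidences = [0.95, 0.95, 0.65, 0.65]
correct = [True, True, True, False]

accuracy = sum(correct) / len(correct)
mean_confidence = sum(confidences) / len(confidences)
ece = expected_calibration_error(confidences, correct, n_bins=10)
print(f"accuracy={accuracy:.4f} confidence={mean_confidence:.4f} ece={ece:.4f}")
```

A lower ECE means the reported confidence tracks the actual accuracy more closely, which matters for teachers because their soft outputs are what the student learns from.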
cifar10/
├── best_resnet152v2.keras
├── best_convnexttiny.keras
├── best_convnextlarge.keras
└── best_vitbase.keras
cifar100/
├── best_resnet152v2.keras
├── best_convnexttiny.keras
├── best_convnextlarge.keras
└── best_vitbase.keras
Download a checkpoint and place it in your local checkpoints/teachers/{dataset}/ folder:
from huggingface_hub import hf_hub_download

# Downloads the file into the local Hugging Face cache and returns its path
path = hf_hub_download(
    repo_id="josemedina/KD-Teachers",
    filename="cifar100/best_resnet152v2.keras",
)
Then load it with Keras:
import tensorflow as tf
model = tf.keras.models.load_model(path)
The expected checkpoint path for the KD-Mixup training script is:
checkpoints/teachers/{dataset}/best_{teacher_name}.keras
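The path convention above can be turned into a small helper that copies a downloaded file into place. This is a sketch using only the standard library; the helper names (`repo_filename`, `local_checkpoint_path`, `install_checkpoint`) and the `root` parameter are hypothetical and are not part of the KD-Mixup code.

```python
import shutil
from pathlib import Path

# Dataset folders and teacher names as they appear in this repository's file listing.
DATASETS = ("cifar10", "cifar100")
TEACHERS = ("resnet152v2", "convnexttiny", "convnextlarge", "vitbase")


def _validate(dataset: str, teacher_name: str) -> None:
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}, expected one of {DATASETS}")
    if teacher_name not in TEACHERS:
        raise ValueError(f"unknown teacher {teacher_name!r}, expected one of {TEACHERS}")


def repo_filename(dataset: str, teacher_name: str) -> str:
    """File path inside the Hub repo, usable as the `filename` argument of hf_hub_download."""
    _validate(dataset, teacher_name)
    return f"{dataset}/best_{teacher_name}.keras"


def local_checkpoint_path(dataset: str, teacher_name: str, root: str = ".") -> Path:
    """Location where the training script expects the checkpoint, relative to `root`."""
    _validate(dataset, teacher_name)
    return Path(root) / "checkpoints" / "teachers" / dataset / f"best_{teacher_name}.keras"


def install_checkpoint(downloaded: str, dataset: str, teacher_name: str, root: str = ".") -> Path:
    """Copy a downloaded checkpoint file into the expected folder layout."""
    target = local_checkpoint_path(dataset, teacher_name, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(downloaded, target)
    return target


print(repo_filename("cifar100", "resnet152v2"))
print(local_checkpoint_path("cifar100", "resnet152v2", root="proj").as_posix())
```

With a `path` returned by `hf_hub_download`, calling `install_checkpoint(path, "cifar100", "resnet152v2")` would place the file where the training script looks for it, assuming the script is run from the same working directory.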
Standard ImageNet per-channel mean and standard deviation: ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

If you use these checkpoints in your research, please cite:
@misc{medina2025kdmixup,
author = {Medina, Jos{\'e} and Hadachi, Amnir and Honeine, Paul and Bensrhair, Abdelaziz},
title = {Beyond Dark Knowledge: Mixup-Based Knowledge Distillation Under Vicinal Teacher Distributions},
year = {2025},
publisher = {University of Tartu},
url = {https://github.com/JoseLMedinaC/KD-Mixup}
}