KD-Teachers

Pre-trained teacher models used for Knowledge Distillation with Mixup augmentation on CIFAR-10 and CIFAR-100. These checkpoints are the teacher component of the KD-Mixup framework.

All models were initialized from ImageNet pre-trained weights and fine-tuned separately on CIFAR-10 and CIFAR-100 using SGD with momentum, ReduceLROnPlateau learning-rate scheduling, and mixed-precision (float16) training.

Models

Model	Backbone	Pre-training
`best_resnet152v2`	ResNet-152 V2	ImageNet
`best_convnexttiny`	ConvNeXt-Tiny	ImageNet
`best_convnextlarge`	ConvNeXt-Large	ImageNet
`best_vitbase`	ViT-B/16	ImageNet

Performance

CIFAR-10

Model	Accuracy	Confidence	ECE
ConvNeXt-Large	0.9856	0.9789	0.0070
ViT-B/16	0.9850	0.9927	0.0085
ResNet-152 V2	0.9698	0.9838	0.0152
ConvNeXt-Tiny	0.9672	0.9744	0.0104

CIFAR-100

Model	Accuracy	Confidence	ECE
ConvNeXt-Large	0.9217	0.9189	0.0049
ViT-B/16	0.9151	0.9272	0.0174
ResNet-152 V2	0.8257	0.8873	0.0619
ConvNeXt-Tiny	0.8196	0.7908	0.0288
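
The tables above do not say how Confidence and ECE were computed. As a rough sketch, the following assumes the common definitions: Confidence is the mean maximum softmax probability, and ECE is the weighted average gap between accuracy and confidence over equal-width confidence bins. The function name `calibration_summary` and the default of 15 bins are assumptions for illustration, not values taken from the KD-Mixup code.

```python
import numpy as np


def calibration_summary(probs, labels, n_bins=15):
    """Return (accuracy, mean confidence, ECE) for softmax outputs.

    probs:  array of shape (n_samples, n_classes), rows sum to 1.
    labels: array of shape (n_samples,) with integer class ids.
    n_bins: number of equal-width confidence bins (assumed, not from the source).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    confidences = probs.max(axis=1)      # top-1 probability per sample
    predictions = probs.argmax(axis=1)   # top-1 class per sample
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between accuracy and confidence inside this bin,
            # weighted by the fraction of samples that fall in the bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap

    return float(correct.mean()), float(confidences.mean()), float(ece)
```

With a loaded teacher, `probs` would be the softmax output of `model.predict(...)` on the test set; results may differ from the tables if the original evaluation used a different bin count or binning scheme.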

File Structure

cifar10/
├── best_resnet152v2.keras
├── best_convnexttiny.keras
├── best_convnextlarge.keras
└── best_vitbase.keras
cifar100/
├── best_resnet152v2.keras
├── best_convnexttiny.keras
├── best_convnextlarge.keras
└── best_vitbase.keras

Usage

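Download a checkpoint with hf_hub_download, which stores the file in the local Hugging Face cache and returns its path (copy it into your local checkpoints/teachers/{dataset}/ folder if you plan to run the KD-Mixup training script):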

from huggingface_hub import hf_hub_download

# Returns the local path of the cached file
path = hf_hub_download(
    repo_id="josemedina/KD-Teachers",
    filename="cifar100/best_resnet152v2.keras",  # {dataset}/best_{teacher_name}.keras
)

Then load it with Keras:

import tensorflow as tf

model = tf.keras.models.load_model(path)

The expected checkpoint path for the KD-Mixup training script is:

checkpoints/teachers/{dataset}/best_{teacher_name}.keras
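
Because hf_hub_download returns a path inside the Hugging Face cache rather than in your project, the file still has to be copied to the location above. A minimal sketch of that step follows; the helper names `teacher_checkpoint_path` and `install_checkpoint` are hypothetical and not part of the KD-Mixup repository.

```python
import shutil
from pathlib import Path


def teacher_checkpoint_path(dataset, teacher_name, root="checkpoints/teachers"):
    """Build checkpoints/teachers/{dataset}/best_{teacher_name}.keras."""
    return Path(root) / dataset / f"best_{teacher_name}.keras"


def install_checkpoint(downloaded_path, dataset, teacher_name, root="checkpoints/teachers"):
    """Copy a downloaded checkpoint to the path the training script expects."""
    target = teacher_checkpoint_path(dataset, teacher_name, root)
    target.parent.mkdir(parents=True, exist_ok=True)  # create folders if missing
    shutil.copy2(downloaded_path, target)             # leave the cached file untouched
    return target
```

For example, `install_checkpoint(path, "cifar100", "resnet152v2")` would place the file downloaded in the Usage snippet at checkpoints/teachers/cifar100/best_resnet152v2.keras.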

Training Details

Input size: 224 × 224 × 3
Batch size: 250
Optimizer: SGD (momentum=0.9, lr=1e-4)
LR schedule: ReduceLROnPlateau (patience=3, factor=0.9, min_lr=1e-5)
Max epochs: 500 (best checkpoint selected by validation accuracy)
Augmentation: Random crop, horizontal flip
Precision: Mixed float16
ViT-B/16 normalization: ImageNet mean/std ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
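
To get a feel for the learning-rate schedule listed above, the short calculation below counts how many reductions it takes to go from lr=1e-4 to the min_lr=1e-5 floor with factor=0.9. It is plain arithmetic on the listed values, not the training code itself; the function name `reductions_to_floor` is hypothetical.

```python
def reductions_to_floor(lr, factor, min_lr):
    """Count multiplicative LR reductions until the floor min_lr is reached."""
    steps = 0
    while lr > min_lr:
        lr = max(lr * factor, min_lr)  # a reduction never goes below min_lr
        steps += 1
    return steps


# Values from the Training Details list
n_reductions = reductions_to_floor(lr=1e-4, factor=0.9, min_lr=1e-5)

# Assuming each reduction is triggered by `patience` consecutive epochs
# without improvement and no cooldown is configured, this is the minimum
# number of stalled epochs needed before the floor is hit.
patience = 3
min_stalled_epochs = n_reductions * patience
```

Since 0.9^21 ≈ 0.109 and 0.9^22 ≈ 0.098, the floor is reached on the 22nd reduction, which under the stated assumption corresponds to at least 66 non-improving epochs out of the 500-epoch budget.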

Citation

If you use these checkpoints in your research, please cite:

@misc{medina2025kdmixup,
  author    = {Medina, Jos{\'e} and Hadachi, Amnir and Honeine, Paul and Bensrhair, Abdelaziz},
  title     = {Beyond Dark Knowledge: Mixup-Based Knowledge Distillation Under Vicinal Teacher Distributions},
  year      = {2025},
  publisher = {University of Tartu},
  url       = {https://github.com/JoseLMedinaC/KD-Mixup}
}
