
# m2mKD

This repository contains the checkpoints for *m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers*.

## Released checkpoints

For usage instructions for the checkpoints listed below, please refer to our GitHub repository.

- `nac_scale_tinyimnet.pth` / `nac_scale_imnet.pth`: NAC models with a scale-free prior trained using m2mKD.
- `vmoe_base.pth`: V-MoE-Base model trained using m2mKD.
- `FT_huge`: a directory containing DeiT-Huge teacher modules for NAC model training.
- `nac_tinyimnet_students`: a directory containing NAC student modules for Tiny-ImageNet.

## Acknowledgement

Our implementation is mainly based on Deep-Incubation.

## Citation

If you use the checkpoints, please cite our paper:

```bibtex
@misc{lo2024m2mkd,
    title={m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers},
    author={Ka Man Lo and Yiming Liang and Wenyu Du and Yuantao Fan and Zili Wang and Wenhao Huang and Lei Ma and Jie Fu},
    year={2024},
    eprint={2402.16918},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```