---
license: cc-by-4.0
---

# LoRA-Ensemble: Uncertainty Modelling for Self-attention Networks

Michelle Halbheer, Dominik J. Mühlematter, Alexander Becker, Dominik Narnhofer, Helge Aasen, Konrad Schindler and Mehmet Ozgur Turkoglu - 2024

[[Paper on ArXiv]](https://arxiv.org/abs/2405.14438)

## Pre-trained models

This repository contains the pre-trained models corresponding to the code we released on [GitHub](https://github.com/prs-eth/LoRA-Ensemble/). How to use the models with our pipeline is described in the GitHub repository; a minimal loading sketch is also given below the CIFAR-100 table.

This repository contains only the models for our final experiments on the CIFAR-10, CIFAR-100 and HAM10000 datasets, not those for intermediate results. The models for the ESC-50 dataset cannot be published at this time: storing all models of the five-fold cross-validation would have required saving five models per epoch during training, which is infeasible on our infrastructure. We therefore only release models for CIFAR-10, CIFAR-100 and HAM10000.

## Base models

Alongside our pre-trained models, we release the base models our models are built on. This ensures the reproducibility of our results even if the weights distributed by `torchvision` and `timm` should change. A sketch of how the released base models could be compared against the current library weights is given at the end of this card.

## Citation

If you find our work useful or interesting, or use our code, please cite our paper as follows:

```latex
@misc{halbheer2024loraensemble,
  title  = {LoRA-Ensemble: Uncertainty Modelling for Self-attention Networks},
  author = {Halbheer, Michelle and M\"uhlematter, Dominik Jan and Becker, Alexander and Narnhofer, Dominik and Aasen, Helge and Schindler, Konrad and Turkoglu, Mehmet Ozgur},
  year   = {2024},
  note   = {arXiv:2405.14438}
}
```

## CIFAR-100

The table below shows the evaluation results obtained using different methods. Each method was trained five times with varying random seeds.

| Method (ViT) | Accuracy | ECE | Settings name* | Model weights* |
|----------------------|------------------------|-----------------------|-------------------|------------------------------|
| Single Network | \\(76.6\pm0.2\\) | \\(0.144\pm0.001\\) | CIFAR100_settings_explicit | Deep_Ensemble_ViT_base_32_1_members_CIFAR100_settings_explicit\.pt |
| Single Network with LoRA | \\(79.6\pm0.2\\) | \\(\textbf{0.014}\pm0.003\\) | CIFAR100_settings_LoRA | LoRA_Former_ViT_base_32_1_members_CIFAR100_settings_LoRA\.pt |
| MC Dropout | \\(77.1\pm0.5\\) | \\(0.055\pm0.002\\) | CIFAR100_settings_MCDropout | MCDropout_ViT_base_32_16_members_CIFAR100_settings_MCDropout\.pt |
| Explicit Ensemble | \\(\underline{79.8}\pm0.2\\) | \\(0.098\pm0.001\\) | CIFAR100_settings_explicit | Deep_Ensemble_ViT_base_32_16_members_CIFAR100_settings_explicit\.pt |
| LoRA-Ensemble | \\(\textbf{82.5}\pm0.1\\) | \\(\underline{0.035}\pm0.001\\) | CIFAR100_settings_LoRA | LoRA_Former_ViT_base_32_16_members_CIFAR100_settings_LoRA\.pt |

\* Settings and model names are followed by a number in the range 1-5 indicating the random seed used.
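As a minimal sketch of how one of the released checkpoints might be inspected, assuming the `.pt` files hold standard PyTorch state dicts. The file name below, including the `_1` seed suffix, is illustrative; the actual model classes and loading pipeline are defined in the GitHub repository.

```python
import torch

# Illustrative sketch only: we assume the released .pt file stores a plain
# PyTorch state dict (parameter name -> tensor). The exact checkpoint layout
# and the model classes it belongs to are defined in the GitHub repository.
path = "LoRA_Former_ViT_base_32_16_members_CIFAR100_settings_LoRA_1.pt"  # assumed seed suffix
state_dict = torch.load(path, map_location="cpu")

# Print a few parameter names and shapes as a quick sanity check.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```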
## HAM10000

The table below shows the evaluation results obtained using different methods. Each method was trained five times with varying random seeds.

| Method (ViT) | Accuracy | ECE | Settings name* | Model weights* |
|----------------------|------------------------|-----------------------|-------------------|------------------------------|
| Single Network | \\(84.3\pm0.5\\) | \\(0.136\pm0.006\\) | HAM10000_settings_explicit | Deep_Ensemble_ViT_base_32_1_members_HAM10000_settings_explicit\.pt |
| Single Network with LoRA | \\(83.2\pm0.7\\) | \\(0.085\pm0.004\\) | HAM10000_settings_LoRA | LoRA_Former_ViT_base_32_1_members_HAM10000_settings_LoRA\.pt |
| MC Dropout | \\(83.7\pm0.4\\) | \\(\underline{0.099}\pm0.007\\) | HAM10000_settings_MCDropout | MCDropout_ViT_base_32_16_members_HAM10000_settings_MCDropout\.pt |
| Explicit Ensemble | \\(\underline{85.7}\pm0.3\\) | \\(0.106\pm0.002\\) | HAM10000_settings_explicit | Deep_Ensemble_ViT_base_32_16_members_HAM10000_settings_explicit\.pt |
| LoRA-Ensemble | \\(\textbf{88.0}\pm0.2\\) | \\(\textbf{0.037}\pm0.002\\) | HAM10000_settings_LoRA | LoRA_Former_ViT_base_32_16_members_HAM10000_settings_LoRA\.pt |

\* Settings and model names are followed by a number in the range 1-5 indicating the random seed used.
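Finally, a hedged sketch of how the released base models (see the "Base models" section above) could be checked against the weights currently distributed by `timm`. The model identifier `vit_base_patch32_224` is an assumption based on the `ViT_base_32` tag in the file names above, and `base_model.pt` is a placeholder file name, not the name of a file in this repository.

```python
import timm
import torch

# Sketch under assumptions: "vit_base_patch32_224" is inferred from the
# "ViT_base_32" tag in the released file names, and "base_model.pt" is a
# placeholder for the released base-model checkpoint.
current = timm.create_model("vit_base_patch32_224", pretrained=True).state_dict()
released = torch.load("base_model.pt", map_location="cpu")

# Flag any parameter that drifted between the released snapshot and the
# weights timm serves today.
for name, tensor in released.items():
    if name not in current:
        print(f"{name}: missing from the current timm release")
    elif not torch.equal(tensor, current[name]):
        print(f"{name}: values differ from the current timm release")
```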