---
datasets:
- imagenet-1k
pipeline_tag: image-classification
---

## Model Architecture Details

### Architecture Overview

- **Architecture**: ViT Base

### Configuration

| Attribute            | Value          |
|----------------------|----------------|
| Patch Size           | 16             |
| Image Size           | 224            |
| Num Layers           | 2              |
| Attention Heads      | 4              |
| Objective Function   | CrossEntropy   |
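As a quick sanity check on the table, the patch size and image size determine how many patch tokens each image produces. The sketch below is illustrative only: the helper name `num_patches` is hypothetical and is not part of ViT-Prisma, and it assumes square, non-overlapping patches.

```python
# Values taken from the configuration table above.
IMAGE_SIZE = 224
PATCH_SIZE = 16


def num_patches(image_size: int, patch_size: int) -> int:
    """Count non-overlapping square patches in a square image (hypothetical helper)."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size  # patches along one edge
    return per_side * per_side


patches = num_patches(IMAGE_SIZE, PATCH_SIZE)
print(patches)  # 196 (a 14 x 14 grid)
```

If the model also prepends a class token, which this card does not state, the sequence length would be one more than this count.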

### Performance

- **Validation Accuracy (Top 1)**: 0.16
- **Validation Accuracy (Top 5)**: 0.34
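
For readers unfamiliar with the two metrics, here is a minimal sketch of how top-k accuracy is typically computed, assuming the unqualified accuracy figure above is top-1. It uses only the standard library and toy scores; the function `top_k_accuracy` is a hypothetical helper, not the evaluation code used for this model.

```python
from typing import Sequence


def top_k_accuracy(
    logits: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, label in zip(logits, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if label in top_k:
            correct += 1
    return correct / len(labels)


# Toy data: 4 samples, 3 classes (not real model outputs).
toy_logits = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 1, 2, 2]

top1 = top_k_accuracy(toy_logits, toy_labels, k=1)
top2 = top_k_accuracy(toy_logits, toy_labels, k=2)
print(top1, top2)  # 0.5 0.75
```

Top-k accuracy can never be lower than top-1 accuracy on the same data, which is consistent with the two figures reported here.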

### Additional Resources

The model was trained with the [ViT-Prisma](https://github.com/soniajoseph/ViT-Prisma) library.\
For detailed metrics, plots, and further analysis of the training run, see the [training report](https://wandb.ai/perceptual-alignment/Imagenet/reports/ViT-Small-Imagenet-training-report--Vmlldzo3MDk3MTM5).