Arno Onken committed
Commit da009e2
1 Parent(s): aa6df66

Update README and add initial model

Files changed (2):
  1. README.md +85 -1
  2. model.pt +3 -0
README.md CHANGED
@@ -1,3 +1,87 @@
  ---
- license: gpl-3.0
+ license: mit
+ datasets:
+ - mnist
+ metrics:
+ - accuracy
  ---
+ # Model Card for mnistvit
+
+ A vision transformer (ViT) trained on MNIST with a PyTorch-only implementation,
+ achieving 99.65% test set accuracy.
+
+ ## Model Details
+
+ ### Model Description
+
+ The model is a vision transformer, as described in the original paper by
+ Dosovitskiy et al. (ICLR 2021).
+
+ - **Developed by:** Arno Onken
+ - **Model type:** Vision Transformer
+ - **License:** MIT
+
+ ### Model Sources
+
+ - **Python Package Index:**
+   [https://pypi.org/project/mnistvit/](https://pypi.org/project/mnistvit/)
+ - **Paper:** [Dosovitskiy et al., ICLR 2021](https://openreview.net/forum?id=YicbFdNTTy)
+
+ ## Uses
+
+ The model is intended for learning about vision transformers. It is small and
+ trained on MNIST, a simple and well-understood dataset. Together with the
+ mnistvit package code, it can be used to explore the impact of various
+ hyperparameters.
+
+ ## How to Get Started with the Model
+
+ Install the mnistvit package, which provides the code for training and running
+ the model:
+
+ ```
+ pip install mnistvit
+ ```
+
+ Place the `model.pt` file from this repository in a directory of your choice and
+ run Python from that directory.
+
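+ The checkpoint can also be inspected directly from Python. The following is a
+ minimal sketch, assuming only that `model.pt` was written by `torch.save`;
+ whether it holds a full pickled module or a plain state dict is not specified
+ here, so the code simply prints what it finds:
+
+ ```
+ import torch
+
+ # weights_only=False permits unpickling arbitrary objects; if the file stores
+ # a pickled module, the mnistvit package must be installed so that unpickling
+ # can resolve the model classes.
+ checkpoint = torch.load("model.pt", map_location="cpu", weights_only=False)
+ print(type(checkpoint))
+ ```
+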
+ To evaluate the test set accuracy and loss of the model stored in `model.pt`:
+ ```
+ python -m mnistvit --use-accuracy --use-loss
+ ```
+
+ Individual images can be classified as well. To predict the class of a digit
+ image stored in a file `sample.jpg`:
+ ```
+ python -m mnistvit --image-file sample.jpg
+ ```
+
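+ For orientation, MNIST-style models consume 28x28 grayscale inputs. A generic
+ preparation sketch (not the mnistvit API; the package's actual preprocessing
+ lives in `mnistvit.preprocess` and may normalize differently) could look like:
+
+ ```
+ from PIL import Image
+ from torchvision import transforms
+
+ # Convert an arbitrary image file to an MNIST-shaped batch of one.
+ to_input = transforms.Compose([
+     transforms.Grayscale(),
+     transforms.Resize((28, 28)),
+     transforms.ToTensor(),
+ ])
+ batch = to_input(Image.open("sample.jpg")).unsqueeze(0)  # (1, 1, 28, 28)
+ ```
+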
+ ## Training Details
+
+ ### Training Data
+
+ This model was trained on the 60,000 training set images of the
+ [MNIST](https://huggingface.co/datasets/ylecun/mnist/) dataset. Data augmentation
+ was used in the form of random rotations, translations and scaling, as detailed
+ in the `mnistvit.preprocess` module.
+
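+ The exact augmentation parameters live in `mnistvit.preprocess`; a
+ representative torchvision pipeline (the ranges below are illustrative
+ placeholders, not the package's actual values) would be:
+
+ ```
+ from torchvision import transforms
+
+ # Random rotation, translation and scaling, followed by tensor conversion;
+ # the degree/translate/scale ranges are placeholders.
+ augment = transforms.Compose([
+     transforms.RandomAffine(degrees=15, translate=(0.1, 0.1),
+                             scale=(0.9, 1.1)),
+     transforms.ToTensor(),
+ ])
+ ```
+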
+ ### Training Procedure
+
+ - **Training regime:** fp32
+
+ Hyperparameters were obtained from an 80:20 training/validation split of the
+ original MNIST training set, running Ray Tune with Optuna as detailed in the
+ `mnistvit.tune` module. The resulting parameters were then set as the default
+ parameters in the `mnistvit.train` module.
+
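+ In outline, a Ray Tune search with Optuna follows the pattern below; the
+ objective function and search space here are stand-ins, not the ones defined in
+ `mnistvit.tune`:
+
+ ```
+ # Requires: pip install "ray[tune]" optuna
+ import math
+
+ from ray import tune
+ from ray.tune.search.optuna import OptunaSearch
+
+ def objective(config):
+     # Stand-in for a real training run; the actual objective trains a ViT
+     # and returns the validation accuracy for this hyperparameter choice.
+     accuracy = 1.0 - abs(math.log10(config["lr"]) + 3.0) / 10.0
+     return {"accuracy": accuracy}
+
+ tuner = tune.Tuner(
+     objective,
+     tune_config=tune.TuneConfig(search_alg=OptunaSearch(),
+                                 metric="accuracy", mode="max",
+                                 num_samples=20),
+     param_space={"lr": tune.loguniform(1e-5, 1e-2)},
+ )
+ results = tuner.fit()
+ print(results.get_best_result().config)
+ ```
+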
+ ## Evaluation
+
+ ### Testing Data
+
+ This model was evaluated on the 10,000 test set images of the
+ [MNIST](https://huggingface.co/datasets/ylecun/mnist/) dataset.
+
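+ The reported figures follow from a standard test-set pass. A generic sketch
+ (plain PyTorch, not the mnistvit evaluation code; it assumes the loaded model
+ maps (N, 1, 28, 28) inputs to (N, 10) logits and needs no extra input
+ normalization) is:
+
+ ```
+ import torch
+ import torch.nn.functional as F
+ from torchvision import datasets, transforms
+
+ model = torch.load("model.pt", map_location="cpu", weights_only=False)
+ model.eval()
+
+ test_set = datasets.MNIST(".", train=False, download=True,
+                           transform=transforms.ToTensor())
+ loader = torch.utils.data.DataLoader(test_set, batch_size=256)
+
+ correct, loss_sum = 0, 0.0
+ with torch.no_grad():
+     for images, labels in loader:
+         logits = model(images)
+         loss_sum += F.cross_entropy(logits, labels, reduction="sum").item()
+         correct += (logits.argmax(dim=1) == labels).sum().item()
+ print(f"accuracy: {correct / len(test_set):.4%}")
+ print(f"loss: {loss_sum / len(test_set):.3f}")
+ ```
+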
+ ### Results
+
+ Test set accuracy: 99.65%
+
+ Test set cross-entropy loss: 0.011
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cf45c0f6be01dba4df12f028f1a2a3013764c1ff00453d2fee52a92b6fac6527
+ size 44466002