## Model description

This model is a fine-tuned version of [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224), a Vision Transformer (ViT).

ViT is a transformer encoder model originally pre-trained on ImageNet-21k and fine-tuned on ImageNet 2012.
It was introduced in the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Dosovitskiy et al.
The model processes images as sequences of 16x16 patches, adding a [CLS] token for classification tasks, and uses absolute position embeddings. Pre-training enables the model to learn rich image representations, which can be leveraged for downstream tasks by adding a linear classifier on top of the [CLS] token. The weights were converted from the timm repository by Ross Wightman.
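The patch arithmetic above can be sketched in a few lines: for a 224x224 input split into 16x16 patches, the encoder sees 196 patch tokens plus the [CLS] token. The numbers follow directly from the model name `vit-base-patch16-224`.

```python
# Token-sequence length a ViT-Base/16 encoder sees for a 224x224 image.
image_size = 224   # input resolution (vit-base-patch16-224)
patch_size = 16    # each patch is 16x16 pixels
num_patches = (image_size // patch_size) ** 2  # 14 * 14 = 196 patch tokens
seq_len = num_patches + 1                      # +1 for the [CLS] token
print(num_patches, seq_len)  # 196 197
```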

This model is intended for the classification of X-ray images of the brain to diagnose brain tumors.

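Classification with this checkpoint follows the standard `transformers` image-classification interface. The sketch below only demonstrates the input/output shapes with a randomly initialized ViT of the same architecture, since this card does not state the checkpoint's repo id; the `"<this-repo-id>"` placeholder and the `id2label` names are assumptions for illustration, not taken from this card.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Same architecture as google/vit-base-patch16-224, but with a 2-class head.
# The label names here are assumed for illustration.
config = ViTConfig(
    image_size=224,
    patch_size=16,
    num_labels=2,
    id2label={0: "no_tumor", 1: "tumor"},
    label2id={"no_tumor": 0, "tumor": 1},
)
# Random weights for a shape check; for real predictions load the checkpoint with
# ViTForImageClassification.from_pretrained("<this-repo-id>") instead.
model = ViTForImageClassification(config)

pixel_values = torch.randn(1, 3, 224, 224)  # one normalized RGB image
with torch.no_grad():
    logits = model(pixel_values=pixel_values).logits
print(logits.shape)  # torch.Size([1, 2])
predicted = config.id2label[int(logits.argmax(-1))]
```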
## Training and evaluation data

The model was fine-tuned on the [Mahadih534/brain-tumor-dataset](https://huggingface.co/datasets/Mahadih534/brain-tumor-dataset) dataset, which contains 253 brain images. This dataset was originally created by Yousef Ghanem.

## Training procedure

The model was fine-tuned with the hyperparameters listed below.

### Training hyperparameters