umarzaib committed on
Commit 7bb8629
1 Parent(s): cc182bf

Upload 2 files


Title: AlexNet Image Classifier

Description:
This is an image classification model based on the AlexNet architecture, a convolutional neural network (CNN) with eight learned layers: five convolutional layers followed by three fully connected layers. The model has been pre-trained on the ImageNet dataset and classifies images into one of ten classes: Airplane, Car, Bird, Cat, Deer, Dog, Frog, Horse, Ship, or Truck. It takes an image as input and outputs the predicted class along with its probability.

Usage:
This model is suitable for computer vision applications such as object recognition, image tagging, and content-based image retrieval. It can be used to classify images in real time or in batch processing pipelines.

Input:
The model expects an RGB image as input, with dimensions (height, width, channels) and channels ordered Red, Green, Blue. Before inference, the image is resized so its shorter side is 256 pixels, center-cropped to 224×224, converted to a tensor, and normalized with the ImageNet channel means and standard deviations.
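For reference, the preprocessing used in the accompanying AlexNet.ipynb notebook can be written with torchvision as follows; this simply mirrors the notebook cell rather than introducing anything new:

    import torchvision.transforms as transforms

    # Resize, center-crop, convert to tensor, and normalize with the
    # ImageNet channel statistics, exactly as in AlexNet.ipynb.
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])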

Output:
The model outputs the predicted class label and its corresponding probability score. The class label is one of the ten categories listed above, and the probability score (a softmax over the ten classes) indicates the model's confidence in its prediction.
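A minimal inference sketch, following the notebook: it loads the checkpoint uploaded in this commit on the CPU, reuses the transform shown above, and prints the most probable class. The sample file name 'car.jpeg' comes from the notebook; the eval() call is an addition to disable dropout during inference.

    import torch
    from PIL import Image

    classes = ('Airplane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck')

    # Load the full model object stored in this repository onto the CPU
    # (newer PyTorch releases may require passing weights_only=False).
    model = torch.load('AlexNet_Model.pt', map_location=torch.device('cpu'))
    model.eval()  # disable dropout for deterministic predictions

    # 'car.jpeg' is the sample image from the notebook; substitute any local file.
    image = transform(Image.open('car.jpeg').convert('RGB')).unsqueeze(0)
    with torch.no_grad():
        probabilities = torch.nn.functional.softmax(model(image), dim=1)[0]

    top = probabilities.argmax().item()
    print(f"Class: {classes[top]}, Probability: {probabilities[top].item():.4f}")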

Model Training:
This model was trained with the PyTorch deep learning framework and uses the torchvision library for image preprocessing. It was trained on a large dataset of labeled images to learn rich feature representations for accurate classification, with its parameters updated through backpropagation and gradient descent optimization.
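The actual training script is not included in this commit. Purely as an illustration of the setup described above, a minimal PyTorch sketch might look like the following; the dummy data, learning rate, and batch size are placeholders, not the values used to train this model:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision.models import alexnet

    # Placeholder data standing in for a real labeled image dataset
    # preprocessed with the transform shown above.
    dummy_images = torch.randn(8, 3, 224, 224)
    dummy_labels = torch.randint(0, 10, (8,))
    train_loader = DataLoader(TensorDataset(dummy_images, dummy_labels), batch_size=4)

    model = alexnet(num_classes=10)    # fresh AlexNet with a 10-class head
    criterion = nn.CrossEntropyLoss()  # classification loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()    # backpropagation
        optimizer.step()   # gradient descent update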

Performance:
The model achieves high accuracy on benchmark datasets such as ImageNet, but its performance may vary with the complexity and diversity of the input images. Evaluate the model on your specific dataset to confirm it meets your requirements.
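One simple way to run such an evaluation is sketched below; `test_loader` is a hypothetical DataLoader over your own labeled, preprocessed images and is not provided by this repository:

    import torch

    # `test_loader` is a placeholder: a DataLoader yielding (images, labels)
    # batches preprocessed with the transform shown above.
    model = torch.load('AlexNet_Model.pt', map_location='cpu')
    model.eval()

    correct = total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            predicted = model(images).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    print(f"Accuracy: {correct / total:.2%}")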

Further Customization:
This model can be further fine-tuned or adapted to specific use cases by retraining it on domain-specific datasets. Transfer learning, for example reusing the convolutional features and retraining only the classifier head, can also improve performance on your target tasks, as sketched below.
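As one illustration (not the procedure used to produce this model), and assuming the checkpoint follows torchvision's AlexNet layout with `features` and `classifier` submodules, you could freeze the convolutional features and retrain only a new classification head:

    import torch
    import torch.nn as nn

    model = torch.load('AlexNet_Model.pt', map_location='cpu')

    # Freeze the convolutional feature extractor (assumes a torchvision-style
    # AlexNet with `features` and `classifier` submodules).
    for param in model.features.parameters():
        param.requires_grad = False

    # Replace the final fully connected layer with one sized for your own classes.
    num_classes = 5  # hypothetical number of target classes
    model.classifier[6] = nn.Linear(model.classifier[6].in_features, num_classes)

    # Pass only trainable parameters to the optimizer; training then proceeds
    # with the same kind of loop shown under "Model Training".
    optimizer = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=0.001, momentum=0.9
    )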

License:
This model is provided under an open-source license and can be freely used, modified, and distributed for both academic and commercial purposes. However, please refer to the original license terms of the underlying libraries and datasets used during training.

Authors:
This model was developed by Umar Zaib, leveraging the capabilities of deep learning frameworks and open-source libraries.

References:

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, 25, 1097–1105.
torchvision: https://pytorch.org/vision/stable/index.html

Files changed (2)
  1. AlexNet.ipynb +223 -0
  2. AlexNet_Model.pt +3 -0
AlexNet.ipynb ADDED
@@ -0,0 +1,223 @@
+ {
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "b60d911e-b970-4ac1-ac19-d8e7a3b8652e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import torch\n",
+ "\n",
+ "# Load the model onto the CPU\n",
+ "AlexNet_Model = torch.load('AlexNet_Model.pt', map_location=torch.device('cpu'))\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "2d154a7e-f3ba-4f52-8489-c45e9c736a9c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import torchvision.transforms as transforms\n",
+ "transform = transforms.Compose([\n",
+ " transforms.Resize(256),\n",
+ " transforms.CenterCrop(224),\n",
+ " transforms.ToTensor(),\n",
+ " transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),\n",
+ "])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "390a39ff-b95c-48bd-8556-2e6cc8fe8d40",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Image: car.jpeg\n",
+ "Class: Car, Probability: 1.0\n",
+ "Class: Truck, Probability: 1.9733740683203216e-11\n",
+ "Class: Ship, Probability: 1.0144151279612226e-16\n",
+ "Class: Airplane, Probability: 8.654194920607349e-20\n",
+ "Class: Frog, Probability: 5.849379056607012e-28\n",
+ "Class: Deer, Probability: 2.1034692384348658e-29\n",
+ "Class: Bird, Probability: 8.280908192503261e-30\n",
+ "Class: Horse, Probability: 3.741535700249233e-30\n",
+ "Class: Cat, Probability: 4.912921745335171e-31\n",
+ "Class: Dog, Probability: 1.9247921025898383e-33\n"
+ ]
+ }
+ ],
+ "source": [
+ "\n",
+ "import torch\n",
+ "from PIL import Image\n",
+ "\n",
+ "classes = ('Airplane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck')\n",
+ "# Define a function to preprocess the image\n",
+ "def preprocess_image(image_path):\n",
+ " image = Image.open(image_path)\n",
+ " image = transform(image).unsqueeze(0)\n",
+ " return image\n",
+ "\n",
+ "# Define the image paths\n",
+ "image_paths = [\n",
+ " \"car.jpeg\"\n",
+ "]\n",
+ "\n",
+ "# Preprocess the images\n",
+ "preprocessed_images = [preprocess_image(image_path) for image_path in image_paths]\n",
+ "\n",
+ "# Make predictions\n",
+ "with torch.no_grad():\n",
+ " predictions = AlexNet_Model(torch.cat(preprocessed_images, dim=0))\n",
+ "\n",
+ "# Get the predicted probabilities for each class\n",
+ "predicted_probabilities = torch.nn.functional.softmax(predictions, dim=1)\n",
+ "\n",
+ "# Print the probabilities for each class sorted by probability\n",
+ "for image_path, probabilities in zip(image_paths, predicted_probabilities):\n",
+ " print(f\"Image: {image_path}\")\n",
+ " sorted_indices = torch.argsort(probabilities, descending=True)\n",
+ " for idx in sorted_indices:\n",
+ " class_name = classes[idx]\n",
+ " prob = probabilities[idx].item()\n",
+ " print(f\"Class: {class_name}, Probability: {prob}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "e01098f6-0b9b-4991-a6d4-29f9fa146ca9",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Running on local URL: http://127.0.0.1:7868\n",
+ "\n",
+ "To create a public link, set `share=True` in `launch()`.\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div><iframe src=\"http://127.0.0.1:7868/\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/plain": []
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[('Car', 1.0), ('Truck', 1.9733740683203216e-11), ('Ship', 1.0144151279612226e-16), ('Airplane', 8.654194920607349e-20), ('Frog', 5.849379056607012e-28), ('Deer', 2.1034692384348658e-29), ('Bird', 8.280908192503261e-30), ('Horse', 3.741535700249233e-30), ('Cat', 4.912921745335171e-31), ('Dog', 1.9247921025898383e-33)]\n",
+ "[('Airplane', 0.5250793695449829), ('Truck', 0.3982219994068146), ('Car', 0.05421096831560135), ('Ship', 0.022192474454641342), ('Dog', 8.419439836870879e-05), ('Cat', 7.965308031998575e-05), ('Frog', 7.490476127713919e-05), ('Horse', 2.9290411475813016e-05), ('Bird', 2.588589813967701e-05), ('Deer', 1.2184905244794209e-06)]\n"
+ ]
+ }
+ ],
+ "source": [
+ "import torch\n",
+ "from PIL import Image\n",
+ "import torchvision.transforms as transforms\n",
+ "import gradio as gr\n",
+ "import numpy as np\n",
+ "\n",
+ "# Load the model onto the CPU\n",
+ "AlexNet_Model = torch.load('AlexNet_Model.pt', map_location=torch.device('cpu'))\n",
+ "\n",
+ "transform = transforms.Compose([\n",
+ " transforms.Resize(256),\n",
+ " transforms.CenterCrop(224),\n",
+ " transforms.ToTensor(),\n",
+ " transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),\n",
+ "])\n",
+ "\n",
+ "classes = ('Airplane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck')\n",
+ "\n",
+ "# Define a function to preprocess the image\n",
+ "def preprocess_image(image):\n",
+ " if isinstance(image, np.ndarray):\n",
+ " image = Image.fromarray(image)\n",
+ " image = transform(image).unsqueeze(0)\n",
+ " return image\n",
+ "\n",
+ "# Define the prediction function\n",
+ "def predict(image):\n",
+ " image = preprocess_image(image)\n",
+ " with torch.no_grad():\n",
+ " predictions = AlexNet_Model(image)\n",
+ "\n",
+ " # Get the predicted probabilities for each class\n",
+ " predicted_probabilities = torch.nn.functional.softmax(predictions, dim=1)\n",
+ " results = []\n",
+ " for probabilities in (predicted_probabilities):\n",
+ " sorted_indices = torch.argsort(probabilities, descending=True)\n",
+ " for idx in sorted_indices:\n",
+ " class_name = classes[idx]\n",
+ " prob = probabilities[idx].item()\n",
+ " #print(f\"Class: {class_name}, Probability: {prob}\")\n",
+ " results.append((class_name, prob))\n",
+ "\n",
+ "\n",
+ " return {class_name: prob for class_name, prob in results}\n",
+ "\n",
+ "# Create Gradio interface\n",
+ "iface = gr.Interface(predict, \n",
+ " inputs=\"image\", \n",
+ " outputs=\"label\", \n",
+ " title=\"AlexNet Image Classifier\",\n",
+ " description=\"Classify images into one of 10 classes: Airplane, Car, Bird, Cat, Deer, Dog, Frog, Horse, Ship, or Truck.\")\n",
+ "iface.launch()\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7f6dac97-96d4-45f2-b630-9ecb6b2ca159",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.0"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }
AlexNet_Model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1be679887d728be6cd3b1b49876b2d8379dfa49687193e252b67f3b164678f9f
+ size 177724574