# Project: Portfolio - Final Project

**Instructions for Students:**

Please carefully follow these steps to complete and submit your assignment:

1. **Completing the Assignment**: You are required to work on and complete all tasks in the provided assignment. Be disciplined and ensure that you thoroughly engage with each task.
 
2. **Creating a Google Drive Folder**: If you don't already have a folder for collecting assignments, you must create a new folder in your Google Drive. This will be a repository for all your completed assignment files, helping you keep your work organized and easy to access.
 
3. **Uploading Completed Assignment**: Upon completion of your assignment, make sure to upload all necessary files, including code, reports, and related documents, into the created Google Drive folder. Save the folder link in the 'Student Identity' section and also provide it as the last parameter in the `submit` function that has been provided.
 
4. **Sharing Folder Link**: You're required to share the link to your assignment Google Drive folder. This is crucial for the submission and evaluation of your assignment.
 
5. **Setting Permission to Public**: Please make sure your **Google Drive folder is set to public**. This allows your instructor to access your solutions and assess your work correctly.

Adhering to these procedures will facilitate a smooth assignment process for you and the reviewers.

**Description:**

Welcome to your final portfolio project assignment for AI Bootcamp. This is your chance to put all the skills and knowledge you've learned throughout the bootcamp into action by creating a real-world AI application.

You have the freedom to create any application or model, whether text-based, image-based, voice-based, or multimodal.

To get you started, here are some ideas:

1. **Sentiment Analysis Application:** Develop an application that can determine sentiment (positive, negative, neutral) from text data like reviews or social media posts. You can use Natural Language Processing (NLP) libraries like NLTK or TextBlob, or more advanced pre-trained models from the transformers library by Hugging Face, for your sentiment analysis model.

2. **Chatbot:** Design a chatbot serving a specific purpose such as customer service for a certain industry, a personal fitness coach, or a study helper. Tools like ChatterBot or Dialogflow can assist in designing conversational agents.

3. **Predictive Text Application:** Develop a model that suggests the next word or sentence similar to predictive text on smartphone keyboards. You could use the transformers library by Hugging Face, which includes pre-trained models like GPT-2.

4. **Image Classification Application:** Create a model to distinguish between different types of flowers or fruits. For this type of image classification task, pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be utilized.

5. **News Article Classifier:** Develop a text classification model that categorizes news articles into predefined categories. NLTK, spaCy, and scikit-learn are valuable libraries for text pre-processing, feature extraction, and building classification models.

6. **Recommendation System:** Create a simplified recommendation system. For instance, a book or movie recommender based on user preferences. Python's Surprise library can assist in building effective recommendation systems.

7. **Plant Disease Detection:** Develop a model to identify diseases in plants using leaf images. This project requires a good understanding of convolutional neural networks (CNNs) and image processing. PyTorch, TensorFlow, and OpenCV are all great tools to use.

8. **Facial Expression Recognition:** Develop a model to classify human facial expressions. This involves complex feature extraction and classification algorithms. You might want to leverage deep learning libraries like TensorFlow or PyTorch, along with OpenCV for processing facial images.

9. **Chest X-Ray Interpretation:** Develop a model to detect abnormalities in chest X-ray images. This task may require an understanding of specific features in such images. Again, TensorFlow and PyTorch for deep learning, and libraries like scikit-image or PIL for image processing, could be of use.

10. **Food Classification:** Develop a model to classify a variety of foods such as local Indonesian food. Pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be a good starting point.

11. **Traffic Sign Recognition:** Design a model to recognize different traffic signs. This project has real-world applicability in self-driving car technology. Once more, you might utilize PyTorch or TensorFlow for the deep learning aspect, and OpenCV for image processing tasks.

**Submission:**

Please upload both your model and application to Hugging Face or your own GitHub account for submission.

**Presentation:**

You are required to create a presentation to showcase your project, including the following details:

- The objective of your model.
- A comprehensive description of your model.
- The specific metrics used to measure your model's effectiveness.
- A brief overview of the dataset used, including its source, pre-processing steps, and any insights.
- An explanation of the methodology used in developing the model.
- A discussion on challenges faced, how they were handled, and your learnings from those.
- Suggestions for potential future improvements to the model.
- A functioning link to a demo of your model in action.

**Grading:**

Submissions will be manually graded, with a select few given the opportunity to present their projects in front of a panel of judges. This will provide valuable feedback, further enhancing your project and expanding your knowledge base.

Remember, consistent practice is the key to mastering these concepts. Apply your knowledge, ask questions when in doubt, and above all, enjoy the process. Best of luck to you all!


In [1]:
# @title #### Student Identity
student_id = "REAS0XP1" # @param {type:"string"}
name = "Mikael Kristiadi" # @param {type:"string"}
drive_link = "https://drive.google.com/drive/folders/1lNPe5vm0Tntbs6leXDh7LULhTOh2esOI?usp=drive_link" # @param {type:"string"}
assignment_id = "00_portfolio_project"

## Installation and Import `rggrader` Package

In [None]:
%pip install rggrader
from rggrader import submit_image
from rggrader import submit

## Working Space

In [3]:
# Write your code here
# Feel free to add new code block as needed
import os
import numpy as np
import torch
import glob
import torch.nn as nn
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.optim import Adam
import torchvision
import pathlib

In [4]:
!pip install split-folders
!pip install datasets torchvision
!pip install matplotlib

Collecting split-folders
 Downloading split_folders-0.5.1-py3-none-any.whl (8.4 kB)
Installing collected packages: split-folders
Successfully installed split-folders-0.5.1
Collecting datasets
 Downloading datasets-2.18.0-py3-none-any.whl (510 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
 Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Collecting xxhash (from datasets)
 Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
Collecting multiprocess (from datasets)
 Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)

In [8]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle
!chmod 600 ~/.kaggle/kaggle.json

In [9]:
!kaggle datasets download -d faldoae/padangfood

Downloading padangfood.zip to /content
 99% 113M/114M [00:07<00:00, 20.4MB/s]
100% 114M/114M [00:07<00:00, 15.8MB/s]


In [10]:
import zipfile
with zipfile.ZipFile('/content/padangfood.zip', 'r') as zip_ref:
    zip_ref.extractall('/content')

In [11]:
import splitfolders
splitfolders.ratio('/content/dataset_padang_food', output="output", seed=1337, ratio=(0.8, 0.2))

Copying files: 993 files [00:00, 2919.10 files/s]
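
To illustrate what a seeded 80/20 split does, here is a minimal sketch in plain Python. It is not the `splitfolders` implementation: `split_class_files` and the file names are hypothetical, and shuffling each class separately is an assumption about how such splits are commonly done.

```python
import random

def split_class_files(files, ratio=(0.8, 0.2), seed=1337):
    """Shuffle one class's files with a fixed seed and cut them into train/val."""
    files = sorted(files)              # sort first so the result does not depend on listing order
    rng = random.Random(seed)          # local RNG, leaves the global random state untouched
    rng.shuffle(files)
    cut = int(len(files) * ratio[0])   # number of files that go to the training split
    return files[:cut], files[cut:]

# Hypothetical file names for a single class folder
files = [f"ayam_pop_{i:03d}.jpg" for i in range(10)]
train, val = split_class_files(files)
print(len(train), len(val))  # 8 2
```

Because the seed is fixed, calling the helper again on the same files returns the same split, which keeps the validation set stable across notebook runs.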


**Data Transforms**

In [12]:
train_dataset_path = '/content/output/train'
test_dataset_path = '/content/output/val'

In [13]:
training_transforms = transforms.Compose([transforms.Resize([224,224]), transforms.ToTensor()])

In [14]:
training_dataset = torchvision.datasets.ImageFolder(root = train_dataset_path, transform = training_transforms)

In [15]:
training_loader = torch.utils.data.DataLoader(dataset = training_dataset, batch_size = 32, shuffle = False)

In [16]:
def get_mean_stdev(loader):
    # Accumulate per-channel statistics image by image
    mean = 0
    std = 0
    total_images_count = 0
    for images, _ in loader:
        image_count_in_a_batch = images.size(0)
        # Flatten height and width: (batch, channels, H*W)
        images = images.view(image_count_in_a_batch, images.size(1), -1)
        mean += images.mean(2).sum(0)
        std += images.std(2).sum(0)
        total_images_count += image_count_in_a_batch

    # Average the per-image means and standard deviations over the dataset
    mean /= total_images_count
    std /= total_images_count

    return mean, std

In [17]:
get_mean_stdev(training_loader)

(tensor([0.6207, 0.4703, 0.3137]), tensor([0.2067, 0.2334, 0.2579]))
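
Note that the function above averages the standard deviation of each image, which is a common approximation rather than the standard deviation over all pixels of the dataset. The sketch below, in plain Python with hypothetical helper names and made-up pixel values, shows how the two can differ; it uses the population formula, whereas `Tensor.std` applies Bessel's correction by default.

```python
import math

def pooled_mean_std(images):
    """Exact mean/std over every pixel of one channel, using running sums."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for pixels in images:
        n += len(pixels)
        total += sum(pixels)
        total_sq += sum(p * p for p in pixels)
    mean = total / n
    var = total_sq / n - mean * mean   # population variance: E[x^2] - E[x]^2
    return mean, math.sqrt(max(var, 0.0))

def averaged_per_image_std(images):
    """Average of per-image population stds, similar in spirit to the cell above."""
    stds = []
    for pixels in images:
        m = sum(pixels) / len(pixels)
        stds.append(math.sqrt(sum((p - m) ** 2 for p in pixels) / len(pixels)))
    return sum(stds) / len(stds)

# Two tiny one-channel "images": one uniformly dark, one uniformly bright
images = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
mean, std = pooled_mean_std(images)
print(mean, std)                        # 0.5 0.5
print(averaged_per_image_std(images))   # 0.0
```

For normalization purposes either estimate is usually adequate; the difference matters mostly when images vary strongly in overall brightness.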

In [18]:
mean = [0.6207, 0.4703, 0.3137]
std = [0.2067, 0.2334, 0.2579]

train_transforms = transforms.Compose([
 transforms.Resize((150,150)),
 transforms.RandomHorizontalFlip(),
 transforms.RandomRotation(10),
 transforms.ToTensor(),
 transforms.Normalize(torch.Tensor(mean), torch.Tensor(std))
])

test_transforms = transforms.Compose([
 transforms.Resize((150,150)),
 transforms.ToTensor(),
 transforms.Normalize(torch.Tensor(mean), torch.Tensor(std))
])
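
As a rough illustration of what `transforms.Normalize` does per channel, the following plain-Python sketch applies `(x - mean) / std` to a single RGB pixel and inverts it again, which is handy when displaying normalized images. The helper names are hypothetical; only the `mean` and `std` values come from the cell above.

```python
mean = [0.6207, 0.4703, 0.3137]
std = [0.2067, 0.2334, 0.2579]

def normalize_pixel(rgb):
    """Per-channel normalization: (x - mean) / std."""
    return [(x - m) / s for x, m, s in zip(rgb, mean, std)]

def denormalize_pixel(rgb):
    """Inverse operation: z * std + mean, e.g. before plotting an image."""
    return [z * s + m for z, m, s in zip(rgb, mean, std)]

pixel = [0.6207, 0.4703, 0.3137]   # a pixel exactly at the dataset mean
z = normalize_pixel(pixel)
print(z)  # [0.0, 0.0, 0.0]
```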

**Load Dataset**

In [19]:
train_loader = DataLoader(
 torchvision.datasets.ImageFolder(train_dataset_path, transform=train_transforms),
 batch_size=256, shuffle=True
)

test_loader = DataLoader(
 torchvision.datasets.ImageFolder(test_dataset_path, transform=test_transforms),
 batch_size=256, shuffle=False
)

**Categories**

In [20]:
root = pathlib.Path(train_dataset_path)
classes = sorted(j.name for j in root.iterdir() if j.is_dir())

In [21]:
print(classes)

['ayam_goreng', 'ayam_pop', 'daging_rendang', 'dendeng_batokok', 'gulai_ikan', 'gulai_tambusu', 'gulai_tunjang', 'telur_balado', 'telur_dadar']


**Model Definition**

In [22]:
class ConvNet(nn.Module):
    def __init__(self, num_classes=9):
        super(ConvNet, self).__init__()

        # Input shape: (batch, 3, 150, 150)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(num_features=12)
        self.relu1 = nn.ReLU()
        # Max pooling halves height and width: (batch, 12, 75, 75)
        self.pool = nn.MaxPool2d(kernel_size=2)

        self.conv2 = nn.Conv2d(in_channels=12, out_channels=20, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()

        self.conv3 = nn.Conv2d(in_channels=20, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.bn3 = nn.BatchNorm2d(num_features=32)
        self.relu3 = nn.ReLU()

        # Flattened feature map of shape (batch, 32, 75, 75) feeds the classifier
        self.fc = nn.Linear(in_features=32*75*75, out_features=num_classes)

    # Feed forward
    def forward(self, input):
        output = self.conv1(input)
        output = self.bn1(output)
        output = self.relu1(output)

        output = self.pool(output)

        output = self.conv2(output)
        output = self.relu2(output)

        output = self.conv3(output)
        output = self.bn3(output)
        output = self.relu3(output)

        output = output.view(-1, 32*75*75)

        output = self.fc(output)

        return output

In [23]:
model = ConvNet(num_classes=9)
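
The `32*75*75` input size of the final linear layer follows from the 150×150 input and the layer settings. The sketch below reproduces that arithmetic in plain Python with the standard output-size formula (dilation 1, floor rounding); the helper names are hypothetical.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

def maxpool2d_out(size, kernel_size, stride=None):
    """Spatial output size of max pooling; stride defaults to the kernel size."""
    stride = stride or kernel_size
    return (size - kernel_size) // stride + 1

size = 150
size = conv2d_out(size, kernel_size=3, stride=1, padding=1)   # conv1 keeps 150
size = maxpool2d_out(size, kernel_size=2)                     # pool halves to 75
size = conv2d_out(size, kernel_size=3, stride=1, padding=1)   # conv2 keeps 75
size = conv2d_out(size, kernel_size=3, stride=1, padding=1)   # conv3 keeps 75
flat_features = 32 * size * size                              # 32 channels after conv3
print(size, flat_features)  # 75 180000
```

If the resize in the transforms changes from 150×150, the `in_features` value and the `view` call in `forward` need to change with it.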

**Optimizer**

In [24]:
optimizer=Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
loss_function = nn.CrossEntropyLoss()
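
For intuition about the loss values printed during training, here is a minimal plain-Python sketch of cross-entropy on raw logits for a single sample. It mirrors the usual definition (negative log-softmax of the target class) and is not PyTorch's implementation; the function name and logits are made up for illustration.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

# A model that scores all 9 classes equally pays ln(9) per sample
uniform_loss = cross_entropy([0.0] * 9, target=3)
print(round(uniform_loss, 4))  # 2.1972

# A confident, correct prediction yields a loss close to zero
confident_loss = cross_entropy([10.0, 0.0, 0.0], target=0)
print(confident_loss < 0.001)  # True
```

A loss near 2.2 on this 9-class problem therefore corresponds roughly to uninformative, uniform predictions.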

**Define Train and Test Size**

In [25]:
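# Note: this pattern counts only files ending in '.jpg'; images with other
# extensions are still loaded by ImageFolder but are not counted here.
train_count = len(glob.glob(train_dataset_path+'/**/*.jpg'))
test_count = len(glob.glob(test_dataset_path+'/**/*.jpg'))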

In [26]:
print(train_count, test_count)

767 199
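
The counts above add up to 966, while `splitfolders` copied 993 files and the training log reports 790 and 203 images. One possible explanation, which the notebook does not confirm, is that some images use an extension other than `.jpg`. The sketch below shows a count that accepts several extensions; `IMAGE_EXTENSIONS` and `count_images` are hypothetical, and the files are created in a temporary directory purely for demonstration.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset contains
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def count_images(root):
    """Count files under root whose extension is in IMAGE_EXTENSIONS (case-insensitive)."""
    return sum(
        1
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

with tempfile.TemporaryDirectory() as tmp:
    class_dir = Path(tmp) / "ayam_pop"
    class_dir.mkdir()
    for name in ["a.jpg", "b.jpeg", "c.png", "notes.txt"]:
        (class_dir / name).touch()

    jpg_only = len(list(Path(tmp).glob("*/*.jpg")))   # same idea as the glob pattern above
    all_images = count_images(tmp)

print(jpg_only, all_images)  # 1 3
```

In the notebook itself, `len(train_loader.dataset)` and `len(test_loader.dataset)` give the number of samples the loaders actually see.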


**Model Training**

In [32]:
num_epoch = 20
def train_nn(model, train_loader, test_loader, loss_function, optimizer, num_epoch):
    best_acc = 0

    for epoch in range(num_epoch):
        print('Epoch number %d ' % (epoch + 1))
        model.train()
        running_loss = 0.0
        running_correct = 0.0
        total = 0

        for i, (images, labels) in enumerate(train_loader):
            total += labels.size(0)

            optimizer.zero_grad()

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

            loss = loss_function(outputs, labels)
            loss.backward()

            optimizer.step()

            running_loss += loss.item()
            running_correct += (labels==predicted).sum().item()

        epoch_loss = running_loss/len(train_loader)
        epoch_acc = 100.00 * running_correct / total

        print(' -Training dataset. Got %d out of %d images correctly (%.3f%%). Epoch loss: %.3f'
              % (running_correct, total, epoch_acc, epoch_loss))

        test_dataset_acc = evaluate_model_on_test_set(model, test_loader)

        if(test_dataset_acc > best_acc):
            best_acc = test_dataset_acc
            save_checkpoint(model, epoch, optimizer, best_acc)

    print('Finished')
    return

In [33]:
def evaluate_model_on_test_set(model, test_loader):
    model.eval()
    predicted_correctly_on_epoch = 0
    total = 0

    with torch.no_grad():
        for i, (images, labels) in enumerate(test_loader):
            total += labels.size(0)

            outputs = model(images)

            _, predicted = torch.max(outputs.data, 1)

            predicted_correctly_on_epoch += (predicted == labels).sum().item()

    epoch_acc = 100.00 * predicted_correctly_on_epoch / total
    print(' -Testing dataset. Got %d out of %d images correctly (%.3f%%)'
          % (predicted_correctly_on_epoch, total, epoch_acc))

    return epoch_acc
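
Overall accuracy can hide classes the model handles poorly. As a complement, the sketch below builds a confusion matrix and per-class recall from plain lists of labels and predictions. The helper names and the example labels are made up for illustration; in the notebook the lists could be collected inside the evaluation loop.

```python
def confusion_matrix(labels, predictions, num_classes):
    """matrix[true][predicted] counts how often each pair occurs."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for y, p in zip(labels, predictions):
        matrix[y][p] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls

# Hypothetical labels and predictions for a 3-class problem
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, predictions, num_classes=3)
print(cm)                    # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_recall(cm))  # [0.5, 1.0, 0.5]
```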

In [34]:
def save_checkpoint(model, epoch, optimizer, best_acc):
    state = {
        'epoch': epoch + 1,
        'model': model.state_dict(),
        'best_accuracy': best_acc,
        # Call state_dict() so the optimizer state itself is saved, not the bound method
        'optimizer': optimizer.state_dict()
    }
    torch.save(state, 'best_model_checkpoint.pth.tar')

In [35]:
import torchvision.models as models
import torch.nn as nn
import torch.optim as optim

model = ConvNet(num_classes=9)
loss_function = nn.CrossEntropyLoss()

optimizer=Adam(model.parameters(), lr=0.001, weight_decay=0.0001)

In [52]:
train_nn(model, train_loader, test_loader, loss_function, optimizer, 30)

Epoch number 1 
 -Training dataset. Got 600 out of 790 images correctly (75.949%). Epoch loss: 1.600
 -Testing dataset. Got 133 out of 203 images correctly (65.517%)
Epoch number 2 
 -Training dataset. Got 601 out of 790 images correctly (76.076%). Epoch loss: 1.671
 -Testing dataset. Got 131 out of 203 images correctly (64.532%)
Epoch number 3 
 -Training dataset. Got 601 out of 790 images correctly (76.076%). Epoch loss: 1.908
 -Testing dataset. Got 130 out of 203 images correctly (64.039%)
Epoch number 4 
 -Training dataset. Got 600 out of 790 images correctly (75.949%). Epoch loss: 2.209
 -Testing dataset. Got 130 out of 203 images correctly (64.039%)
Epoch number 5 
 -Training dataset. Got 590 out of 790 images correctly (74.684%). Epoch loss: 1.751
 -Testing dataset. Got 131 out of 203 images correctly (64.532%)
Epoch number 6 
 -Training dataset. Got 600 out of 790 images correctly (75.949%). Epoch loss: 1.693
 -Testing dataset. Got 133 out of 203 images correctly (65.517%)
Epoc

**Saving Model and Checkpoint**

In [53]:
checkpoint = torch.load('/content/best_model_checkpoint.pth.tar')

In [54]:
model = ConvNet(num_classes=9)
model.load_state_dict(checkpoint['model'])

torch.save(model, 'best_model.pth')

**Testing**

In [55]:
root = pathlib.Path(train_dataset_path)
classes = sorted(j.name for j in root.iterdir() if j.is_dir())
print(classes)

['ayam_goreng', 'ayam_pop', 'daging_rendang', 'dendeng_batokok', 'gulai_ikan', 'gulai_tambusu', 'gulai_tunjang', 'telur_balado', 'telur_dadar']


In [56]:
Image_transforms = transforms.Compose([
 transforms.Resize((150,150)),
 transforms.ToTensor(),
 transforms.Normalize(torch.Tensor(mean), torch.Tensor(std))
])

In [57]:
import PIL.Image as Image

def classify(model, Image_transforms, Image_path, classes):
    # Load the image as RGB (PNG files may carry an alpha channel) and apply the image transforms
    image = Image.open(Image_path).convert('RGB')
    image = Image_transforms(image)

    # Add a batch dimension to the image tensor
    image = image.unsqueeze(0)

    # Use evaluation mode so BatchNorm relies on its running statistics,
    # and make the prediction without tracking gradients
    model.eval()
    with torch.no_grad():
        output = model(image)

    # Get the predicted class index
    _, predicted = torch.max(output.data, 1)

    # Print the predicted class index and the corresponding class label
    for index, class_label in enumerate(classes):
        if index == predicted.item():
            print(f'Predicted class index: {index}, Class: {class_label}')
            break
    else:
        print(f"Predicted class index {predicted.item()} is out of range for the classes list.")

In [58]:
classify(model, Image_transforms, '/content/ayam pop.png', classes)

Predicted class index: 1, Class: ayam_pop
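
The prediction above reports only the top class. To show how confident the model is, the raw outputs can be turned into probabilities with a softmax and ranked. The sketch below does this in plain Python; the logits are made-up values, not outputs of the trained model, and `top_k` is a hypothetical helper.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, classes, k=3):
    """Return the k most probable (class, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(classes, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

classes = ['ayam_goreng', 'ayam_pop', 'daging_rendang', 'dendeng_batokok', 'gulai_ikan',
           'gulai_tambusu', 'gulai_tunjang', 'telur_balado', 'telur_dadar']
# Hypothetical logits for one image
logits = [0.1, 2.5, 0.3, -1.0, 0.0, 0.2, -0.5, 1.2, 0.4]

for name, prob in top_k(logits, classes):
    print(f"{name}: {prob:.3f}")
```

With a PyTorch model, the same idea would apply to one row of the model output converted to a Python list.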


## Submit Notebook

In [None]:
portfolio_link = ""
presentation_link = "https://www.canva.com/design/DAGAEAMyzxU/Nley3agoNuaG1DNuZNvGHQ/edit?utm_content=DAGAEAMyzxU&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton"

question_id = "01_portfolio_link"
submit(student_id, name, assignment_id, str(portfolio_link), question_id, drive_link)

question_id = "02_presentation_link"
submit(student_id, name, assignment_id, str(presentation_link), question_id, drive_link)

# FIN