---
license: apache-2.0
tags:
- anime
---

A ViT model trained to classify anime images into four categories: head_only, upperbody, knee_level, and fullbody.

+ head_only

![head_only_example2.jpg](https://cdn-uploads.huggingface.co/production/uploads/63891deed68e37abd59e883f/0fbMrqDA8_PKrm9UG2cR3.jpeg)

+ upperbody

![upperbody_example2.jpg](https://cdn-uploads.huggingface.co/production/uploads/63891deed68e37abd59e883f/rxEXJhuJLHrulgHyaRy61.jpeg)

+ knee_level

![knee_level_example2.jpg](https://cdn-uploads.huggingface.co/production/uploads/63891deed68e37abd59e883f/o63VCshR6u1d_p2myxum9.jpeg)

+ fullbody

![fullbody_example2.jpg](https://cdn-uploads.huggingface.co/production/uploads/63891deed68e37abd59e883f/UQ4UKrko4qcubo0ueM0wq.jpeg)

```
import os

import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

# load the image processor and the fine-tuned classifier from the Hub
model_name_or_path = 'lrzjason/anime_portrait_vit'
image_processor = ViTImageProcessor.from_pretrained(model_name_or_path)
model = ViTForImageClassification.from_pretrained(model_name_or_path)

input_dir = '/path/to/dir'
file = 'example.jpg'
# convert to RGB so that RGBA or grayscale files also work
image = Image.open(os.path.join(input_dir, file)).convert('RGB')
inputs = image_processor(image, return_tensors="pt")

# inference only, so gradients are not needed
with torch.no_grad():
    logits = model(**inputs).logits

# the model predicts one of the four categories listed above
predicted_label = logits.argmax(-1).item()
print(f'predicted_label: {model.config.id2label[predicted_label]}')
```

The model was trained using this dataset: https://huggingface.co/datasets/animelover/genshin-impact-images
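
The inference snippet above prints only the top category. If a confidence for each category is also useful, the logits can be passed through a softmax. The following is a minimal sketch in plain Python, not part of the original model card: the `ID2LABEL` order and the example logits are made up for illustration, and `softmax` and `rank_labels` are hypothetical helper names.

```python
import math

# Hypothetical id-to-label mapping for illustration only; in practice use
# model.config.id2label, whose order may differ from this one.
ID2LABEL = {0: "head_only", 1: "upperbody", 2: "knee_level", 3: "fullbody"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up logits standing in for the model output of a single image.
example_logits = [0.2, 3.1, 1.4, -0.7]
for label, prob in rank_labels(example_logits, ID2LABEL):
    print(f"{label}: {prob:.3f}")
```

With the real model, the made-up values would be replaced by `logits[0].tolist()` and `model.config.id2label` from the snippet above.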