This is an implementation of Google's Vision Transformer (ViT), large variant with patch size 32, used to classify music into genres. The dataset used is the GTZAN dataset, which provides mel spectrograms of many songs.
metadata
license: apache-2.0
datasets:
  - ghermoso/egtzan_plus
metrics:
  - accuracy
library_name: transformers
pipeline_tag: image-classification
tags:
  - ViT
  - music
  - CV
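
The metadata above declares `library_name: transformers` and `pipeline_tag: image-classification`, so the model can presumably be loaded through the standard `transformers` image-classification pipeline and fed a mel-spectrogram image. The following is a minimal sketch under that assumption, not the author's own inference code. The model ID and image path are hypothetical placeholders, and `top_genre` and `classify_spectrogram` are illustrative helper names that do not come from this repository.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def top_genre(predictions: List[Prediction]) -> str:
    """Return the label with the highest score.

    Expects the list-of-dicts format returned by the transformers
    image-classification pipeline: [{"label": ..., "score": ...}, ...].
    """
    best = max(predictions, key=lambda p: p["score"])
    return str(best["label"])


def classify_spectrogram(image_path: str, model_id: str, top_k: int = 5) -> List[Prediction]:
    """Classify one mel-spectrogram image file.

    model_id is a placeholder for this model's Hub ID (assumption:
    the checkpoint is published on the Hugging Face Hub).
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return classifier(image_path, top_k=top_k)


# Example usage (hypothetical model ID and file name):
# preds = classify_spectrogram("blues_00000.png", "your-username/your-vit-music-model")
# print(top_genre(preds))
```

The input is expected to be a spectrogram image, not raw audio, so an audio clip would first need to be converted to a mel spectrogram in the same way as the training data.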