This is an implementation of Google's Vision Transformer (ViT) Large with a patch size of 32, used to classify music into genres. The dataset used is GTZAN, which provides mel spectrograms of many songs.
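As a rough illustration of what "patch 32" means for a spectrogram image, the sketch below counts how many tokens a ViT encoder would see for a square input. The function name `vit_sequence_length` is hypothetical, and the 224 and 384 pixel resolutions are assumptions (this card does not state the input size used); the extra classification token follows the standard ViT design.

```python
def vit_sequence_length(image_size: int, patch_size: int = 32,
                        add_cls_token: bool = True) -> int:
    """Return the number of tokens a ViT sees for a square image.

    Hypothetical helper for illustration only; it is not part of this
    model's code or of the transformers library.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("image_size and patch_size must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    # The image is cut into a grid of non-overlapping square patches.
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2

    # Standard ViT prepends one learnable classification token.
    return num_patches + (1 if add_cls_token else 0)


# Assumed resolutions; the card does not say which one was used.
for size in (224, 384):
    print(size, vit_sequence_length(size))
```

With a patch size of 32, each RGB patch flattens to 32 × 32 × 3 = 3072 values before the linear projection, so a larger patch size trades spatial detail in the spectrogram for a much shorter token sequence.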


---
license: apache-2.0
datasets:
- ghermoso/egtzan_plus
metrics:
- accuracy
library_name: transformers
pipeline_tag: image-classification
tags:
- ViT
- music
- CV
---