
The dataset contains roughly 2000 hours of speech and vocals.

Supported languages (English or Spanish? Whoever moves first):

~800 hrs of English (with a vast variety of speakers and every emotion)

~200 hrs of Spanish

~42 hrs of French

~188 hrs of Russian

~70 hrs of Arabic

~140 hrs of Japanese

~70 hrs of Chinese (Mandarin)

~80 hrs of Korean

~30 hrs of Hindi

~53 hrs of Indonesian

~30 hrs of Tagalog

~40 hrs of Portuguese

~35 hrs of German

~190 hrs of singing (all languages)

Common language (I don't remember how much data there was)
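As a quick sanity check, the listed figures can be added up and compared with the stated total of roughly 2000 hours. The sketch below assumes every figure is in hours (only the English entry states the unit) and treats 2000 as a round reference value. The dictionary and variable names are illustrative only and are not part of the model or its tooling.

```python
# Approximate hours per category, copied from the list above.
# Assumption: every figure is in hours, as the English entry states.
hours = {
    "English": 800,
    "Spanish": 200,
    "French": 42,
    "Russian": 188,
    "Arabic": 70,
    "Japanese": 140,
    "Chinese (Mandarin)": 70,
    "Korean": 80,
    "Hindi": 30,
    "Indonesian": 53,
    "Tagalog": 30,
    "Portuguese": 40,
    "German": 35,
    "Singing (all languages)": 190,
}

listed_total = sum(hours.values())
stated_total = 2000  # the card says "~2000 hours", so this is only a rough reference
unaccounted = stated_total - listed_total

print(f"Listed total: {listed_total} hrs")
print(f"Difference from the stated ~{stated_total} hrs: {unaccounted} hrs")

# Share of each category within the listed total, largest first.
for name, h in sorted(hours.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<24} {h:>4} hrs  {h / listed_total:6.1%}")
```

The listed categories add up to 1968 hours, which leaves about 32 hours relative to the rounded total. That remainder may correspond to the entry with the unrecorded amount, but this is a guess, since all figures are approximate.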

Type: big-base for finetuning

Batch: 2-40-80

Precision: fp32

Sampling frequency: 32k, 40k

Total step count: 371406

Hardware used:

1x H100, 4x L40S

Expected release date: 22 July


The old post was deleted.