Data Studio

AI & ML interests

Datasets for machine learning.

Please feel free to contact me if needed: nguyenminh.quan0663@gmail.com (Minh Quan).

Data Information:

OCR

Vietnamese documents with 14 different types of noise: more than 2 million samples (line level).
Vietnamese documents with noise: more than 100 thousand samples (word level).

Text-to-Speech

More than 3,000 hours of Vietnamese male and female voices.