Nano-Speech Dataset

A compact, general-purpose speech dataset designed for lightweight ASR and speech processing tasks.

Dataset Description

Domain: General speech
Format: Compressed ZIP archive
Size: ~25 GB
Access: Public (no authentication required)

Contents

The dataset is provided as a single archive:

nano_speech.zip: speech audio files, ready for use in training and evaluation pipelines
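
The card does not describe the archive's internal layout, so it can help to inspect the ZIP before unpacking roughly 25 GB. The following is a minimal sketch using only Python's standard library; the helper name `count_by_extension` is hypothetical and not part of the dataset.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(zip_path):
    """Count archive members by lowercase file extension, skipping directories."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # ZIP member names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(none)"] += 1
    return counts
```

Calling `count_by_extension("nano_speech.zip")` on a locally downloaded copy (the path is an assumption) returns a `Counter` mapping each extension to its file count. This reads only the archive's index, not the audio data.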

Usage

from huggingface_hub import snapshot_download

# Downloads every file in the repository and returns the local snapshot path.
local_dir = snapshot_download("edwixx/nano-speech")

Alternatively, load it directly with 🤗 Datasets, provided the repository defines a loading configuration:

from datasets import load_dataset
dataset = load_dataset("edwixx/nano-speech")
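
Since the repository holds a single archive, one option is to fetch only that file and unpack it locally. The sketch below assumes `nano_speech.zip` sits at the repository root and that the default repository type applies; this card confirms neither. `fetch_archive`, `extract_archive`, and the target directory are hypothetical names used for illustration.

```python
import zipfile
from pathlib import Path


def fetch_archive(repo_id="edwixx/nano-speech", filename="nano_speech.zip"):
    """Download one file from the Hub and return its local cache path."""
    # Imported lazily so the extraction helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    # Assumption: the archive is at the repository root under this name.
    return hf_hub_download(repo_id=repo_id, filename=filename)


def extract_archive(zip_path, target_dir):
    """Extract the archive into target_dir and return all files found under it."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(p for p in target.rglob("*") if p.is_file())
```

For example, `extract_archive(fetch_archive(), "data/nano_speech")` would download the archive and unpack it into `data/nano_speech`. Extraction needs free disk space for the unpacked files in addition to the cached ZIP.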

Intended Uses

Automatic Speech Recognition (ASR) model training
Speech embedding and representation learning
Evaluating compact speech models on diverse audio
Lightweight / on-device speech processing research

License

This dataset is released under the MIT License.
