---
license: apache-2.0
metrics:
- accuracy
language:
- en
- zh
- ko
- ja
- de
- fr
- es
- pt
- vi
- tr
- it
- ru
- id
tags:
- keras
- tensorflow
- audio-classification
libraries: TensorBoard
widget:
- example_title: English Sample
  src: >-
    https://huggingface.co/SpeechFlow/spoken_language_identification/blob/main/test_audios/english.wav
pipeline_tag: audio-classification
library_name: transformers
---

# Spoken_language_identification

## Model description

This is a spoken language recognition model trained on a private dataset using TensorFlow.
The model uses a CRNN-Attention architecture, which has previously been used to extract utterance-level feature representations.

The model was trained on single-channel recordings sampled at 16 kHz with 16-bit signed integer PCM encoding.
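
Input audio that differs from this format may need converting first. As a minimal sketch, the following checks whether a WAV file matches the format described above, using only Python's standard `wave` module. The function name `check_wav_format` and its parameters are illustrative assumptions and are not part of this model's code.

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between a WAV file and the expected format.

    Hypothetical helper: defaults mirror the training format
    (16 kHz, single channel, 16-bit PCM = 2 bytes per sample).
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected {sample_width * 8}-bit")
    return problems
```

An empty list means the file already matches. Note that `librosa.load(path, sr=16000)` resamples to 16 kHz and downmixes to mono by default, so a check like this is mainly useful as a diagnostic.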

The model can classify a speech utterance according to the language spoken.
It covers 13 languages:

- Chinese
- English
- French
- German
- Indonesian
- Italian
- Japanese
- Korean
- Portuguese
- Russian
- Spanish
- Turkish
- Vietnamese

## Intended uses & Limitations

#### How to use

```python

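import librosa
from huggingface_hub import from_pretrained_keras
# Project-specific module (not part of huggingface_hub); it must be importable locally.
from featurizers.speech_featurizers import TFSpeechFeaturizer

# Download and load the Keras model from the Hugging Face Hub.
model = from_pretrained_keras("SpeechFlow/spoken_language_identification")

wav_path = "english.wav"  # placeholder: replace with the path to your audio file
# Load the audio as a 16 kHz mono waveform.
signal, _ = librosa.load(wav_path, sr=16000)

# Run inference on the raw waveform and print the prediction.
output, prob = model.predict_pb(signal)
print(output)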

```