|
# Model Card for Respeecher/ukrainian-data2vec |
|
|
|
This model can be used as a feature extractor for Ukrainian-language audio data.
|
|
|
It can also serve as a backbone for downstream tasks such as ASR or audio classification.
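For downstream use, a common pattern is to pool the encoder's frame-level features over time and feed them to a small task head. The sketch below is a hypothetical classification head only; the data2vec encoder itself is not loaded here, and a dummy tensor stands in for its output. `hidden_size=768` is an assumption based on the standard base-sized data2vec-audio configuration.

```python
import torch
import torch.nn as nn

# Assumed hidden size of a base-sized data2vec-audio encoder (not verified
# against this specific checkpoint); num_classes is an arbitrary example.
hidden_size, num_classes = 768, 5

# Hypothetical downstream head: mean-pool frame features, then classify.
head = nn.Linear(hidden_size, num_classes)

# Dummy stand-in for the encoder's last_hidden_state: (batch, frames, hidden_size)
features = torch.randn(2, 49, hidden_size)

logits = head(features.mean(dim=1))  # pool over the time axis, then project
print(logits.shape)  # torch.Size([2, 5])
```

In practice the head would be trained (or fine-tuned jointly with the encoder) on labeled Ukrainian audio for the task at hand.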
|
|
|
### How to Get Started with the Model |
|
|
|
```python
from transformers import AutoProcessor, Data2VecAudioModel
import torch
from datasets import load_dataset, Audio

dataset = load_dataset("mozilla-foundation/common_voice_11_0", "uk", split="validation")
# resample to the 16 kHz rate the model expects
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
sampling_rate = dataset.features["audio"].sampling_rate

processor = AutoProcessor.from_pretrained("Respeecher/ukrainian-data2vec")
model = Data2VecAudioModel.from_pretrained("Respeecher/ukrainian-data2vec")

# the audio file is decoded on the fly
inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

last_hidden_states = outputs.last_hidden_state
list(last_hidden_states.shape)
```
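The time dimension of `last_hidden_state` is much shorter than the raw waveform: assuming the standard wav2vec 2.0-style convolutional feature extractor used by data2vec-audio models (total stride of 320 samples, i.e. one frame every 20 ms at 16 kHz), the frame count can be estimated as a quick sketch:

```python
# Rough output frame count for a wav2vec 2.0-style conv feature extractor.
# The total stride of 320 samples is an assumption about this checkpoint's
# architecture, shared by standard data2vec-audio models.
def approx_num_frames(num_samples: int, stride: int = 320) -> int:
    return num_samples // stride

# 5 seconds of 16 kHz audio -> roughly 250 feature frames
print(approx_num_frames(5 * 16_000))  # 250
```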
|
|