Testing the Model with an Example Wav File

#1
by krugjo - opened

Hey,

Does anyone know how to use this model with an example WAV file? I want to use an audio file in the /content/ directory of my Colab notebook

-- path = 'example.wav'

and thought I could just replace the line

-- process_func(signal, sampling_rate)

with

-- process_func(path, sampling_rate)

But sadly it's not that easy. Can anyone help?

audEERING GmbH org
edited Apr 19, 2023

Hi,

please have a look at the following tutorial:

https://github.com/audeering/w2v2-how-to

I have now also added a link in the model card.

cheers
Johannes

Thanks for the fast reply, I looked into the tutorial and it helped a lot!

I loaded the model as in the tutorial and tried to evaluate it on a few wav files. Most of the time I got promising results, but a few times the values of arousal, dominance and valence were shooting over the boundary of 1.0. As far as I understand, these values should be in the range 0 to 1, or have I maybe misread the paper?

import os
import librosa
import audonnx
import audinterface

# Load the model
model_root = 'model'
model = audonnx.load(model_root)

# Define the input signal
input_file = '/content/Pulp Fiction Best Scene - Does He Look Like a Bitch.mp4.39.wav'
signal, sampling_rate = librosa.load(input_file, sr=16000, mono=True)

# Create an interface to process the signal
interface = audinterface.Feature(
    model.labels('logits'),
    process_func=model,
    process_func_args={'outputs': 'logits'},
    sampling_rate=sampling_rate,
    resample=True,
    verbose=True,
)

# Process the signal using the interface
output = interface.process_signal(signal, sampling_rate)
print(output)

Results:
Arousal: 1.078532
Dominance: 1.043147
Valence: -0.137843

I used a rather aggressive line by Samuel L. Jackson from the movie Pulp Fiction.

I used librosa to downsample the audio to a sample rate of 16000 Hz. Could this be the reason for the deviating results?

audEERING GmbH org

It can indeed happen that in rare cases you will observe values slightly outside the expected range of [0..1]. And as you mention, your example is indeed quite extreme. If your application expects [0..1], simply clip the values to fit the interval.
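
For anyone landing here later, a minimal sketch of that clipping step in plain Python. The numbers are the ones reported above in this thread; the helper name clip_unit and the dictionary layout are only illustrative assumptions, not part of the model or the tutorial.

```python
# Values reported earlier in this thread for the Pulp Fiction clip.
raw = {"arousal": 1.078532, "dominance": 1.043147, "valence": -0.137843}


def clip_unit(value, low=0.0, high=1.0):
    """Clamp a single prediction to the closed interval [low, high].

    Hypothetical helper for illustration; not part of audonnx or audinterface.
    """
    return max(low, min(high, value))


# Apply the clamp to every dimension.
clipped = {name: clip_unit(value) for name, value in raw.items()}
print(clipped)  # {'arousal': 1.0, 'dominance': 1.0, 'valence': 0.0}
```

If the predictions are held in a NumPy array or a pandas DataFrame, numpy.clip or DataFrame.clip with bounds 0 and 1 should do the same thing in one call.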

audmax changed discussion status to closed
