Speaker Embedding

#64

by bertrand-fournel - opened Jan 10

Jan 10

Hi! Is it possible to perform speaker embedding with Whisper? For example, encode a few seconds of audio from one speaker into a vector, encode a second audio file from another speaker, and get the "distance" (cosine similarity, for example) between the two voices (or between two recordings of the same speaker)? Thank you (excuse my English).

jacov911

Jan 13

Use pyannote.

Ilianos

Feb 6

e.g. with https://github.com/m-bain/whisperX
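
For the comparison step described in the question, here is a minimal sketch of cosine similarity between two speaker embeddings. It assumes you already have one fixed-length vector per audio clip from a speaker-embedding model (for instance one of the tools suggested above). The vectors below are made-up placeholders, not real model output, and the function and variable names are only illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder embeddings (assumption: real ones would come from a
# speaker-embedding model and would have many more dimensions).
speaker_a_clip1 = [0.9, 0.1, 0.3]
speaker_a_clip2 = [0.8, 0.2, 0.35]
speaker_b_clip1 = [-0.2, 0.9, 0.1]

same = cosine_similarity(speaker_a_clip1, speaker_a_clip2)
different = cosine_similarity(speaker_a_clip1, speaker_b_clip1)

print(f"same speaker:      {same:.3f}")
print(f"different speaker: {different:.3f}")
```

A value close to 1 suggests the two clips are likely from the same speaker. Where to put the decision threshold depends on the embedding model and has to be tuned on your own data.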
