---
license: apache-2.0
language:
- en
tags:
- hearing loss
- challenge
- signal processing
- source separation
- lyrics intelligibility
- audio
- audio-to-audio
widget:
- src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
  example_title: Test
pipeline_tag: audio-to-audio
---
# Cadenza Challenge: CAD2-Task1

A causal ConvTasNet model that separates vocals (lyrics) from accompaniment, used by the CAD2-Task1 baseline system.

## Parameters

* Architecture: ConvTasNet (Kaituo Xu) with multichannel support (Alexandre Défossez).
* Parameters:
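  * B: 256 (bottleneck channels)
  * C: 2 (number of sources: vocals and accompaniment)
  * H: 512 (channels in the convolutional blocks)
  * L: 20 (encoder filter length, in samples)
  * N: 256 (number of encoder filters)
  * P: 3 (kernel size of the convolutional blocks)
  * R: 4 (number of repeats)
  * X: 10 (convolutional blocks per repeat)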
  * audio_channels: 2
  * causal: true
  * mask_nonlinear: relu
  * norm_type: cLN
* Training:
  * sample_rate: 44100
  * samples_per_track: 64
  * segment: 4.0
  * aggregate: 1
  * batch_size: 4
  * early_stop: true
  * epochs: 200
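
As a rough guide to what these values imply, the sketch below derives the training segment length in samples, the number of encoder frames per segment, and the temporal receptive field of the separator. It assumes the standard Conv-TasNet layout, which this card does not state explicitly: an encoder stride of `L // 2`, a dilation of `2**x` for block `x` that restarts in every repeat, and no padding at the encoder. The variable names are illustrative only.

```python
# Hyperparameters copied from the list above.
SAMPLE_RATE = 44100
SEGMENT_SECONDS = 4.0
L = 20  # encoder filter length in samples
P = 3   # kernel size of the convolutional blocks
X = 10  # convolutional blocks per repeat
R = 4   # number of repeats

# Assumption: 50% overlap between encoder frames (stride = L // 2).
stride = L // 2

# Length of one training segment, in samples and in encoder frames.
segment_samples = int(SEGMENT_SECONDS * SAMPLE_RATE)
num_frames = (segment_samples - L) // stride + 1

# Assumption: block x uses dilation 2**x, and the pattern restarts in each repeat.
# Each block widens the receptive field by (P - 1) * dilation frames.
receptive_field_frames = 1 + R * sum((P - 1) * 2**x for x in range(X))

# Convert the receptive field from encoder frames back to audio samples.
receptive_field_samples = (receptive_field_frames - 1) * stride + L
receptive_field_seconds = receptive_field_samples / SAMPLE_RATE

print(segment_samples, num_frames)
print(receptive_field_frames, receptive_field_samples, round(receptive_field_seconds, 3))
```

Under these assumptions a 4-second segment is 176400 samples, and the separator sees roughly 1.86 seconds of context; because the model is causal, that context lies entirely in the past.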


## Dataset
The model was trained on the training split of the MUSDB18-HQ dataset.

## How to use

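```python
# `tasnet` is a local module; it is assumed here to be the one shipped
# with the CAD2-Task1 baseline code and must be on the Python path.
from tasnet import ConvTasNetStereo

# Download the pretrained weights from the Hugging Face Hub and keep them on CPU.
model = ConvTasNetStereo.from_pretrained(
    "cadenzachallenge/ConvTasNet_LyricsSeparation_Causal"
).cpu()
```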

## Results 

| Track | Vocals SDR (dB) | Accompaniment SDR (dB) |
|:------|:---------------:|:----------------------:|
| Al James - Schoolboy Facination | 5.733 | 8.049 | 
| AM Contra - Heart Peripheral | 5.887 | 12.691 | 
| Angels In Amplifiers - I'm Alright | 5.901 | 9.124 | 
| Arise - Run Run Run | 5.208 | 14.868 | 
| Ben Carrigan - We'll Talk About It All Tonight | 2.676 | 9.919 | 
| BKS - Bulldozer | 1.523 | 11.488 | 
| BKS - Too Much | 7.005 | 11.087 | 
| Bobby Nobody - Stitch Up | 6.518 | 11.303 | 
| Buitraker - Revo X | 4.242 | 13.763 | 
| Carlos Gonzalez - A Place For Us | 3.882 | 7.57 | 
| Cristina Vane - So Easy | 7.477 | 12.126 | 
| Detsky Sad - Walkie Talkie | 6.214 | 9.47 | 
| Enda Reilly - Cur An Long Ag Seol | 7.329 | 11.51 | 
| Forkupines - Semantics | 4.556 | 11.228 | 
| Georgia Wonder - Siren | 3.165 | 7.622 | 
| Girls Under Glass - We Feel Alright | 3.176 | 11.677 | 
| Hollow Ground - Ill Fate | 5.67 | 14.987 | 
| James Elder & Mark M Thompson - The English Actor | 4.014 | 8.834 | 
| Juliet's Rescue - Heartbeats | 5.317 | 13.101 | 
| Little Chicago's Finest - My Own | 4.409 | 5.378 | 
| Louis Cressy Band - Good Time | 5.903 | 10.918 | 
| Lyndsey Ollard - Catching Up | 7.812 | 10.793 | 
| M.E.R.C. Music - Knockout | 5.663 | 7.815 | 
| Moosmusic - Big Dummy Shake | 7.081 | 12.772 | 
| Motor Tapes - Shore | 1.745 | 8.775 | 
| Mu - Too Bright | 5.518 | 12.242 | 
| Nerve 9 - Pray For The Rain | 5.685 | 11.674 | 
| PR - Happy Daze | -2.89 | 37.274 | 
| PR - Oh No | 0 | 8.987 | 
| Punkdisco - Oral Hygiene | 5.044 | 16.173 | 
| Raft Monk - Tiring | 2.119 | 8.977 | 
| Sambasevam Shanmugam - Kaathaadi | 7.51 | 9.801 | 
| Secretariat - Borderline | 5.068 | 9.195 | 
| Secretariat - Over The Top | 6.278 | 13.556 | 
| Side Effects Project - Sing With Me | 9.637 | 11.222 | 
| Signe Jakobsen - What Have You Done To Me | 6.884 | 9.656 | 
| Skelpolu - Resurrection | 0.053 | 8.272 | 
| Speak Softly - Broken Man | 3.743 | 13.497 | 
| Speak Softly - Like Horses | 4.339 | 7.233 | 
| The Doppler Shift - Atrophy | 2.47 | 12.58 | 
| The Easton Ellises - Falcon 69 | 2.507 | 8.137 | 
| The Easton Ellises (Baumi) - SDRNR | 1.463 | 8.136 | 
| The Long Wait - Dark Horses | 4.784 | 10.964 | 
| The Mountaineering Club - Mallory | 9.015 | 13.26 | 
| The Sunshine Garcia Band - For I Am The Moon | 8.341 | 12.1 | 
| Timboz - Pony | 2.698 | 12.415 | 
| Tom McKenzie - Directions | 7.305 | 15.07 | 
| Triviul feat. The Fiend - Widow | 6.409 | 7.938 | 
| We Fell From The Sky - Not You | 3.661 | 11.403 | 
| Zeno - Signs | 5.291 | 10.178 | 
| **Total (median over frames, median over tracks)** | **5.249** | **11.155** |
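
The aggregation named in the last row can be sketched as follows. This is a minimal illustration, not the evaluation code used for the table: `aggregate_sdr` is a hypothetical helper, the frame-level values in `example` are made up, and skipping NaN frames is an assumption. The second part recomputes the vocals total from the per-track values listed in the table.

```python
import math
from statistics import median


def aggregate_sdr(frames_by_track):
    """Median over frames within each track, then median over tracks."""
    track_scores = []
    for frames in frames_by_track.values():
        # Assumption: frames with an undefined SDR (NaN) are skipped.
        valid = [value for value in frames if not math.isnan(value)]
        if valid:
            track_scores.append(median(valid))
    return median(track_scores)


# Hypothetical frame-level SDR values, for illustration only.
example = {
    "track_a": [1.0, 3.0, 2.0],
    "track_b": [4.0, float("nan"), 6.0],
}
example_total = aggregate_sdr(example)  # per-track medians are 2.0 and 5.0

# Per-track vocals SDR copied from the table above, in table order.
vocals_sdr = [
    5.733, 5.887, 5.901, 5.208, 2.676, 1.523, 7.005, 6.518, 4.242, 3.882,
    7.477, 6.214, 7.329, 4.556, 3.165, 3.176, 5.67, 4.014, 5.317, 4.409,
    5.903, 7.812, 5.663, 7.081, 1.745, 5.518, 5.685, -2.89, 0, 5.044,
    2.119, 7.51, 5.068, 6.278, 9.637, 6.884, 0.053, 3.743, 4.339, 2.47,
    2.507, 1.463, 4.784, 9.015, 8.341, 2.698, 7.305, 6.409, 3.661, 5.291,
]
vocals_total = median(vocals_sdr)

print(example_total, vocals_total)
```

With 50 tracks, the median is the mean of the 25th and 26th sorted values (5.208 and 5.291), which gives about 5.2495 and agrees with the 5.249 reported in the table.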