rpinto committed on
Commit
b76dc29
1 Parent(s): 7a5f483

Update generation_config.json

Correct the alignment heads after analyzing the cross-attention weights with DTW, averaged over 20 samples from "librispeech". Tests showed better timestamp alignments than "whisper-small.en", especially on shorter samples (8-10 seconds).
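For readers unfamiliar with the technique named above, the following is a minimal toy sketch of how alignment heads are typically used: the cross-attention matrices of the selected heads are averaged, and DTW is run on the negated result to find a monotonic token-to-frame path. It is not the script used for this commit. The function names (`dtw_path`, `average_heads`, `align`) and the tiny attention matrices are hypothetical, chosen only for illustration.

```python
import math


def dtw_path(cost):
    """Return the minimum-cost monotonic path through a cost matrix.

    cost[i][j] is the cost of aligning text token i with audio frame j.
    """
    n, m = len(cost), len(cost[0])
    # acc[i][j] is the cheapest cumulative cost of reaching cell (i-1, j-1).
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev

    # Backtrack from the last cell to the first, always taking the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def average_heads(head_weights):
    """Average several heads' (tokens x frames) attention matrices element-wise."""
    count = len(head_weights)
    n, m = len(head_weights[0]), len(head_weights[0][0])
    return [
        [sum(h[i][j] for h in head_weights) / count for j in range(m)]
        for i in range(n)
    ]


def align(head_weights):
    """Average the heads, negate (high attention = low cost), and run DTW."""
    avg = average_heads(head_weights)
    cost = [[-w for w in row] for row in avg]
    return dtw_path(cost)


# Hypothetical toy data: 2 heads, 3 text tokens, 5 audio frames.
head_a = [
    [0.80, 0.10, 0.05, 0.03, 0.02],
    [0.10, 0.60, 0.20, 0.05, 0.05],
    [0.02, 0.03, 0.05, 0.30, 0.60],
]
head_b = [
    [0.70, 0.20, 0.05, 0.03, 0.02],
    [0.05, 0.50, 0.35, 0.05, 0.05],
    [0.02, 0.03, 0.10, 0.35, 0.50],
]
path = align([head_a, head_b])
print(path)  # list of (token_index, frame_index) pairs
```

Heads whose attention tracks the audio monotonically produce sharp, low-cost paths, which is why the choice of heads affects timestamp quality.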

Files changed (1)
  1. generation_config.json +1 -78
generation_config.json CHANGED
@@ -1,82 +1,5 @@
 {
-  "alignment_heads": [
-    [
-      6,
-      6
-    ],
-    [
-      7,
-      0
-    ],
-    [
-      7,
-      3
-    ],
-    [
-      7,
-      8
-    ],
-    [
-      8,
-      2
-    ],
-    [
-      8,
-      5
-    ],
-    [
-      8,
-      7
-    ],
-    [
-      9,
-      0
-    ],
-    [
-      9,
-      4
-    ],
-    [
-      9,
-      8
-    ],
-    [
-      9,
-      10
-    ],
-    [
-      10,
-      0
-    ],
-    [
-      10,
-      1
-    ],
-    [
-      10,
-      2
-    ],
-    [
-      10,
-      3
-    ],
-    [
-      10,
-      6
-    ],
-    [
-      10,
-      11
-    ],
-    [
-      11,
-      2
-    ],
-    [
-      11,
-      4
-    ]
-  ],
+  "alignment_heads": [[1, 7], [2, 0], [2, 3], [2, 4], [3, 2], [3, 4], [3, 5], [3, 10]],
   "begin_suppress_tokens": [
     220,
     50256
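
Each entry in "alignment_heads" is read as a zero-indexed [decoder layer, attention head] pair, so the new values only make sense for a decoder with at least 4 layers and 11 heads. A minimal sketch of a sanity check before writing such a list into a config follows. The helper name `set_alignment_heads` and the dimensions `num_layers=4` and `num_heads=12` are assumptions for illustration, since the commit does not state the model's size.

```python
import json

# The head list introduced by this commit.
NEW_HEADS = [[1, 7], [2, 0], [2, 3], [2, 4], [3, 2], [3, 4], [3, 5], [3, 10]]


def set_alignment_heads(config, heads, num_layers, num_heads):
    """Return a copy of config with alignment_heads set, after bounds-checking each pair."""
    for layer, head in heads:
        if not (0 <= layer < num_layers and 0 <= head < num_heads):
            raise ValueError(f"head [{layer}, {head}] is out of range")
    updated = dict(config)
    updated["alignment_heads"] = [list(pair) for pair in heads]
    return updated


# Only the fields visible in the diff; a real generation_config.json has more.
config = {"begin_suppress_tokens": [220, 50256]}

# num_layers and num_heads are hypothetical values, not taken from the commit.
updated = set_alignment_heads(config, NEW_HEADS, num_layers=4, num_heads=12)
print(json.dumps(updated, indent=2))
```

The old list, which references layers 6 through 11, would fail this check under the assumed dimensions.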