mnabihali committed on
Commit
a251204
·
verified ·
1 Parent(s): b4cd64d

Update README.md

  1. README.md +44 -85
README.md CHANGED
@@ -14,120 +14,79 @@ library_name: transformers
14
 
15
  <img src="./EE.gif" align="center" width="70%">
16
 
17
- # Model Card for Model ID
18
-
19
- <!-- Provide a quick summary of what the model is/does. -->
20
-
21
- This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
22
-
23
  ## Model Details
24
 
25
  ### Model Description
26
 
27
- <!-- Provide a longer summary of what this model is. -->
28
-
29
-
30
-
31
- - **Developed by:** [More Information Needed]
32
- - **Funded by [optional]:** [More Information Needed]
33
- - **Shared by [optional]:** [More Information Needed]
34
- - **Model type:** [More Information Needed]
35
- - **Language(s) (NLP):** [More Information Needed]
36
- - **License:** [More Information Needed]
37
- - **Finetuned from model [optional]:** [More Information Needed]
38
-
39
- ### Model Sources [optional]
40
-
41
- <!-- Provide the basic links for the model. -->
42
-
43
- - **Repository:** [More Information Needed]
44
- - **Paper [optional]:** [More Information Needed]
45
- - **Demo [optional]:** [More Information Needed]
46
 
47
- ## Uses
48
 
49
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
50
-
51
- ### Direct Use
52
-
53
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
54
-
55
- [More Information Needed]
56
 
57
  ### Downstream Use [optional]
58
 
59
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
60
-
61
- [More Information Needed]
62
-
63
- ### Out-of-Scope Use
64
-
65
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
66
-
67
- [More Information Needed]
68
-
69
-
70
- ## How to Get Started with the Model
71
-
72
- Use the code below to get started with the model.
73
-
74
- [More Information Needed]
75
 
76
  ## Training Details
77
 
78
  ### Training Data
79
 
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
 
84
  ### Training Procedure
85
 
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
 
 
 
 
92
 
93
  #### Training Hyperparameters
94
 
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
-
97
- #### Speeds, Sizes, Times [optional]
98
-
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
-
101
- [More Information Needed]
102
 
103
  ## Evaluation
104
 
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
-
107
- ### Testing Data, Factors & Metrics
108
 
109
- #### Testing Data
110
 
111
- <!-- This should link to a Dataset Card if possible. -->
112
-
113
- [More Information Needed]
114
-
115
-
116
- #### Metrics
117
-
118
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
119
-
120
- [More Information Needed]
121
 
122
  ### Results
123
 
124
- [More Information Needed]
125
-
126
- #### Summary
127
-
128
-
129
-
130
- ## Citation [optional]
 
131
 
132
  ## Citation
133
 
 
14
 
15
  <img src="./EE.gif" align="center" width="70%">
16
 
 
 
 
 
 
 
17
  ## Model Details
18
 
19
  ### Model Description
20
 
21
+ A Wav2Vec 2.0 model trained with an early-exit (EE) pipeline.
22
 
 
23
 
24
+ - **Developed by:** SpeechTek unit, Fondazione Bruno Kessler
25
+ - **Model type:** Wav2Vec 2.0
26
+ - **Language(s) (NLP):** English
27
+ - **Finetuned from model:** facebook/wav2vec2-base-960h
28
+ - **Repository:** https://github.com/augustgw/wav2vec2-ee
29
+ - **Paper:** Training early-exit architectures for automatic speech recognition: Fine-tuning pre-trained models or training from scratch
 
30
 
31
  ### Downstream Use [optional]
32
 
33
+ The model is intended for computationally efficient automatic speech recognition (ASR).
34
 
35
  ## Training Details
36
 
37
  ### Training Data
38
 
39
+ The model is trained using the LibriSpeech-960h dataset.
 
 
40
 
41
  ### Training Procedure
42
 
43
+ ### Basic training
 
 
 
 
44
 
45
+ - Fine-tuning with only EE loss: `finetune_ee.py`
46
+ - Fine-tuning a model without early exits: `finetune_non-ee.py`
47
+ - Change `model_config = Wav2Vec2Config(num_hidden_layers=X)` to set the number of layers in the encoder. E.g., for a 4-layer encoder: `model_config = Wav2Vec2Config(num_hidden_layers=4)`
48
 
49
  #### Training Hyperparameters
50
 
51
+ `training_args = TrainingArguments(
52
+ output_dir="./wav2vec2-ee/checkpoints/",
53
+ evaluation_strategy="no",
54
+ #eval_steps=1000,
55
+ save_strategy="epoch",
56
+ #eval_accumulation_steps=10,
57
+ learning_rate=1e-4,
58
+ per_device_train_batch_size=16,
59
+ per_device_eval_batch_size=1,
60
+ num_train_epochs=100,
61
+ weight_decay=0.01,
62
+ push_to_hub=False,
63
+ report_to='wandb',
64
+ logging_strategy='steps',
65
+ logging_steps=1000,
66
+ dataloader_num_workers=1,
67
+ ignore_data_skip=True,)
68
+ `
69
 
70
  ## Evaluation
71
 
72
+ The evaluation scripts create files in the indicated output directory. `wer_results.txt` contains the layerwise WERs on the test sets indicated in the evaluation script. The remaining files contain the layerwise transcriptions of each item in each test set.
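
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between hypothesis and reference, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. It is not taken from `eval.py`, which may use a different library or text normalization, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Illustrative only: the repository's evaluation scripts may compute WER
    differently (e.g. with extra text normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word in a four-word reference.
    print(word_error_rate("the cat sat down", "the cat sat town"))
```

Multiplying the returned fraction by 100 gives a percentage, which is how WER is usually reported.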
 
 
73
 
74
+ ### Basic evaluation
75
 
76
+ - Normal evaluation: `eval.py path/to/model/checkpoint path/to/output/directory`
77
+ - For safetensors checkpoints saved by newer versions of Hugging Face Transformers, see note in `eval.py`
78
+ - Evaluation for models without early exits (evaluates only output of final layer): `eval_non-ee.py path/to/model/checkpoint path/to/output/directory`
 
 
 
 
 
 
 
79
 
80
  ### Results
81
 
82
+ | Exit | Test-Clean WER (%) | Dev-Clean WER (%) |
83
+ |--------|------------|-----------|
84
+ | Exit(1)| 19.14 | 19.06 |
85
+ | Exit(2)| 8.26 | 8.01 |
86
+ | Exit(3)| 5.93 | 5.57 |
87
+ | Exit(4)| 4.74 | 4.48 |
88
+ | Exit(5)| 3.98 | 3.79 |
89
+ | Exit(6)| 3.95 | 3.69 |
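
One way to read the table is as an accuracy/compute trade-off: a deployment can pick the shallowest exit whose error is still acceptable. The sketch below illustrates that reading, assuming the table values are WER percentages. The numbers are copied from the table; the helper names `shallowest_exit_within` and `gap_to_final` are hypothetical and not part of the repository.

```python
# Values copied from the results table: exit -> (test-clean, dev-clean).
# Assumed to be WER percentages.
WER_BY_EXIT = {
    1: (19.14, 19.06),
    2: (8.26, 8.01),
    3: (5.93, 5.57),
    4: (4.74, 4.48),
    5: (3.98, 3.79),
    6: (3.95, 3.69),
}


def shallowest_exit_within(budget, split=0):
    """Return the lowest-numbered exit whose WER is <= budget, or None.

    split: 0 selects test-clean, 1 selects dev-clean.
    """
    for exit_index in sorted(WER_BY_EXIT):
        if WER_BY_EXIT[exit_index][split] <= budget:
            return exit_index
    return None


def gap_to_final(exit_index, split=0):
    """Absolute WER difference between an exit and the final exit."""
    final_wer = WER_BY_EXIT[max(WER_BY_EXIT)][split]
    return round(WER_BY_EXIT[exit_index][split] - final_wer, 2)


if __name__ == "__main__":
    print(shallowest_exit_within(5.0))  # shallowest exit at or below 5.0 on test-clean
    print(gap_to_final(5))              # how far exit 5 is from exit 6 on test-clean
```

With a budget of 5.0 on test-clean this selects exit 4, and exit 5 is within 0.03 absolute WER of the final exit, which suggests the last layer contributes little on that set.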
90
 
91
  ## Citation
92