khanhld committed on
Commit
c73f4cb
1 Parent(s): bed9786

Upload README.md

Files changed (1)
  1. README.md +54 -0
README.md ADDED
# FINETUNE WAV2VEC 2.0 FOR SPEECH RECOGNITION
## Table of contents
1. [Documentation](#documentation)
2. [Installation](#installation)
3. [Usage](#usage)
4. [Logs and Visualization](#logs)

<a name = "documentation" ></a>
## Documentation
If you need a simple way to fine-tune the Wav2vec 2.0 model for speech recognition on your own datasets, you have come to the right place.
<br/>
All documentation related to this repo can be found here:
- [Wav2vec2ForCTC](https://huggingface.co/docs/transformers/model_doc/wav2vec2#transformers.Wav2Vec2ForCTC)
- [Tutorial](https://huggingface.co/blog/fine-tune-wav2vec2-english)
- [Code reference](https://github.com/huggingface/transformers/blob/main/examples/pytorch/speech-recognition/run_speech_recognition_ctc.py)
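
For orientation, here is a minimal sketch (not part of this repo) of loading a pretrained checkpoint with Wav2Vec2ForCTC and running greedy CTC decoding on a single audio file. The checkpoint name, the placeholder audio path, and the use of librosa for loading are assumptions made for this example only:
```
import torch
import librosa
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# illustrative pretrained checkpoint; substitute the model you fine-tune from
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# load one audio file resampled to 16 kHz (placeholder path)
speech, _ = librosa.load("path/to/audio.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# greedy CTC decoding: take the most likely token at each frame
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```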

<a name = "installation" ></a>
## Installation
```
pip install -r requirements.txt
```

<a name = "usage" ></a>
## Usage
1. Prepare your dataset
    - Your dataset can be in <b>.txt</b> or <b>.csv</b> format.
    - The <b>path</b> and <b>transcript</b> columns are compulsory. The <b>path</b> column contains the paths to your stored audio files; depending on where your dataset is located, these can be either absolute or relative paths. The <b>transcript</b> column contains the corresponding transcript for each audio file.
    - Check out our [data_example.csv](dataset/data_example.csv) file for more information; a quick validation sketch is also provided after this list.
2. Configure the config.toml file
3. Run
    - Start training:
      ```
      python train.py -c config.toml
      ```
    - Resume training from the latest checkpoint:
      ```
      python train.py -c config.toml -r
      ```
    - Load a specific model and start training:
      ```
      python train.py -c config.toml -p path/to/your/model.tar
      ```
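
Before training, it can help to sanity-check your .csv against the requirements from step 1. The snippet below is an illustrative sketch, not part of this repo; it assumes pandas is available and uses the example path dataset/data_example.csv, so adjust both to your setup:
```
import os
import pandas as pd

# illustrative path; point this at your own dataset file
csv_path = "dataset/data_example.csv"
df = pd.read_csv(csv_path)

# the README requires exactly these two columns
missing = {"path", "transcript"} - set(df.columns)
if missing:
    raise ValueError(f"missing required column(s): {missing}")

# flag rows whose audio file cannot be found
# (relative paths are resolved against the current working directory)
not_found = [p for p in df["path"] if not os.path.isfile(p)]
print(f"{len(df)} rows, {len(not_found)} audio files not found")

# flag empty transcripts
empty = df["transcript"].isna() | (df["transcript"].astype(str).str.strip() == "")
print(f"{int(empty.sum())} empty transcripts")
```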

<a name = "logs" ></a>
## Logs and Visualization
Logs generated during training are stored, and you can visualize them with TensorBoard by running one of these commands:
```
# replace <name> with the name you set in config.toml
tensorboard --logdir ~/saved/<name>

# or bind to a specific port, e.g. 8080
tensorboard --logdir ~/saved/<name> --port 8080
```