KevinGeng committed
Commit cd9d8c6 • 1 Parent(s): a131248

fix README configuration

Files changed (1)
  1. README.md +2 -117
README.md CHANGED
@@ -1,86 +1,3 @@
- # Laronix Data Collection
-
- This repository contains information about the Laronix data collection process, which involves collecting parallel data from AVA users. The dataset consists of two main sessions: scripted data and conversational data.
-
- ## Dataset
-
- The dataset is organized as follows:
-
- ### 1. Scripted Data
-
- The scripted data session includes 200 sentences collected from 5 articles. The references for both the audio and text versions of these sentences have already been uploaded or will be uploaded to the Laronix Recording system. (Ask [Kevin](mailto:kevin@laronix.com) for these files.) The distribution of sentences from each article is as follows:
-
- - Arthur the Rat: 56 sentences
- - Cinder: 19 sentences
- - Rainbow: 26 sentences
- - Sentences: 59 sentences
- - VCTK: 40 sentences
-
- ### 2. Conversational Data
20
-
21
- The conversational data session focuses on natural conversations and involves the following components:
22
-
23
- #### a. Q&A
24
-
25
- In this component, a set of 50 sentences will be provided, consisting of questions and answers. During the recording, the partner will ask the questions (Q), and the patient will provide the answers (A). Both the questions and answers will be recorded.
26
-
27
- #### b. Freestyle
28
-
29
- The patients will have the freedom to talk about a given topic. They will be asked to respond with 5 to 10 sentences. The structure for this component can be referenced from the [IELTS speaking test](https://www.ieltsbuddy.com/IELTS-speaking-questions-with-answers.html).
30
-
31
- ## Data Inclusion Criteria
32
-
33
- + No hearing loss or history of active cancer.
34
- + 6 weeks of practice with AVA.
35
-
36
- ## Document for Laronix Recording System
37
-
38
- The Laronix recording system is designed for data collection from potential users of the AVA Device, which replaces their voice cord.
39
-
40
- ### Input:
41
-
42
- - Audio signal
43
- - Reference ID
44
- - Reference text
45
- - Reference Phoneme per minute
46
-
47
- ### Output:
48
-
49
- - wav_pause_plot: Wave signal plot with pauses detected by VAD algorithm (SNR = 40dB)
50
- - Predicted Mean Opinion Score: Score estimating data quality on the MOS scale using an ML prediction model (1-5)
51
- - Hypotheses: Text predicted by Automatic Speech Recognition model (wav2vev2.0 + CTC)
52
- - WER: Word Error Rate (lower is better)
53
- - Predicted Phonemes
54
- - PPM: Phonemes per minute
55
- - Message: Feedback from the system
56
-
57
- ## User Instruction
58
-
59
- Please follow the instructions provided at the top of the APP page.
60
-
61
- ```
62
- - Laronix_AUTOMOS
63
- - data
64
- - Template
65
- - ref_wav/
66
- - 1.wav
67
- - 2.wav
68
- - ...
69
- - ref_txt.txt
70
- - ref.csv # audio prosody features reference <generate by script>
71
- - exp
72
- - Template
73
- - Audio_to_evaluate # RAW WAV DATA
74
- - log.csv # Recording log
75
- - output # wav.file <generate by script>
76
- - model
77
- - epoch=3-step=7459.ckpt # MOS estimate model
78
- - wav2vec_small.pt # WER model
79
- - local
80
- - get_ref_PPM.py # script for generating data/<ref_dir>/ref.csv
81
- - post_processing.py # script for generating exp/<ref_dir>/output/*.wav
82
- ```
83
-
84
  ---
  title: Laronix Automos
  emoji: 🏃
@@ -90,37 +7,5 @@ sdk: gradio
  sdk_version: 3.40.0
  app_file: app.py
  license: afl-3.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
-
- # Laronix_AutoMOS
-
- ## Usage:
- ### Step 1: Prepare data and text
- `<todo>`
- ### Step 2: Preprocessing
- ```
- ## Generating *.csv, voiced/unvoiced plot (optional) and config (optional)
- python local/get_ref_PPM.py --ref_txt <ref_text> \
-     --ref_wavs <ref_wavs> \
-     --output_dir <output_dir> \
-     --to_config <True/False> \
-     --UV_flag <True/False> \
-     --UV_thre <UV_thre>
- ```
- ### Step 3: Launch recording session:
-
- ```
- ## Start app.py
- python app.py <config.yaml>
- ```
- + **Find the log output below and click the URL to start**
- ```
- Launch examples
- Running on local URL: http://127.0.0.1:7860/
- ...
- (Logs...)
- ...
- Running on public URL: https://87abe771e93229da.gradio.app
- ```
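
For readers of the removed text, the WER and PPM outputs it lists can be illustrated with a short sketch. This is not the code used by the Laronix recording system: the function names `word_error_rate` and `phonemes_per_minute` are hypothetical, WER is taken here as the word-level edit distance divided by the number of reference words, and PPM as the phoneme count scaled to one minute of audio.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def phonemes_per_minute(num_phonemes: int, duration_seconds: float) -> float:
    """Speaking rate: phoneme count scaled to a 60-second window."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return num_phonemes * 60.0 / duration_seconds
```

Under these assumptions, a hypothesis with one substituted and one dropped word against a six-word reference gives a WER of 2/6, and 30 phonemes spoken in 10 seconds give 180 PPM.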
  ---
  title: Laronix Automos
  emoji: 🏃

  sdk_version: 3.40.0
  app_file: app.py
  license: afl-3.0
+ pinned: False
+ ---
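
After this change the README begins with the YAML block, which is presumably the point of the fix, since the Spaces configuration is read from front matter at the top of `README.md`. A minimal sketch for checking that layout follows. It uses only the standard library, `split_front_matter` is a hypothetical helper, and it handles only simple `key: value` lines rather than full YAML. The sample text uses only the fields visible in this diff, not the complete block.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Return (config, body) for a README that starts with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("README does not start with a '---' front matter block")
    # Find the line that closes the block.
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        raise ValueError("front matter block is never closed")
    config = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        config[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return config, body


# Fields taken from the diff above; the elided middle lines are left out.
readme = """---
title: Laronix Automos
sdk: gradio
sdk_version: 3.40.0
app_file: app.py
license: afl-3.0
pinned: False
---
"""
config, body = split_front_matter(readme)
```

A README laid out like the parent revision, with a heading before the `---` line, makes this helper raise `ValueError`.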