superb-hidden-set committed on
Commit
1ff2c02
•
1 Parent(s): c7d9dc5

move model interface functions description from website to here

  1. README.md +61 -3
README.md CHANGED
@@ -19,13 +19,71 @@ If you are not feasible to submit the pre-trained model, please [fill this form]
 
  ## Quickstart
 
- ### 1. Create an account and organization on the Hugging Face Hub
 
  First, create an account on the Hugging Face Hub; you can sign up [here](https://huggingface.co/join) if you haven't already. Next, create a new organization and invite the SUPERB Hidden Set Committee to join. You will upload your model to a repository under this organization so that its members can access the model, which is not publicly available.
 
  * [superb-hidden-set](https://huggingface.co/superb-hidden-set)
 
- ### 2. Create a template repository on your machine
 
  The next step is to create a template repository on your local machine that contains various files and a CLI to help you validate and submit your pretrained models. The Hugging Face Hub uses [Git Large File Storage (LFS)](https://git-lfs.github.com) to manage large files, so first install it if you don't have it already. For example, on macOS you can run:
 
@@ -72,7 +130,7 @@ my-superb-submission
  └── model.pt <- Your model weights
  ```
 
- ### 3. Install the dependencies
 
  The final step is to install the project's dependencies:
 
 
 
  ## Quickstart
 
+ ### 1. Add model interfaces
+
+ #### forward
+
+ Extract features from waveforms.
+
+ - **Input:** A list of waveforms sampled at 16000 Hz
+
+ ```python
+ import torch
+
+ SAMPLE_RATE = 16000
+ BATCH_SIZE = 8
+ EXAMPLE_SEC = 10
+ # `upstream` is your model; keep it on the same device as the inputs
+ wavs = [torch.randn(SAMPLE_RATE * EXAMPLE_SEC).cuda() for _ in range(BATCH_SIZE)]
+ results = upstream(wavs)
+ ```
+
+ - **Output:** A dictionary with a key for each task. If a task-specific key is not present, a "hidden_states" key must be provided as the default. The value for each key is **a list** of padded sequences, all of the same shape **(batch_size, max_sequence_length_of_batch, hidden_size)**, so that a weighted sum over the list works. You may preprocess the upstream's raw hidden states, including upsampling and downsampling. However, all the values must come from **a single upstream model**:
+
+ ```python
+ assert isinstance(results, dict)
+ tasks = ["PR", "SID", "ER", "ASR", "ASV", "SD", "QbE", "ST", "SS", "SE"]
+ for task in tasks:
+     # Fall back to the default "hidden_states" entry when the task has no key
+     hidden_states = results[task] if task in results else results["hidden_states"]
+     assert isinstance(hidden_states, list)
+
+     for state in hidden_states:
+         assert isinstance(state, torch.Tensor)
+         assert state.dim() == 3, "(batch_size, max_sequence_length_of_batch, hidden_size)"
+         assert state.shape == hidden_states[0].shape
+ ```
+
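The same-shape requirement exists so that the tensors in the list can be combined element-wise. The sketch below shows one common form of weighted sum, with softmax-normalized scalar weights over the list. The function name `weighted_sum` and the softmax normalization are assumptions made for illustration; the README does not state how the benchmark computes it.

```python
import torch


def weighted_sum(hidden_states, weights):
    """Combine a list of same-shape tensors with softmax-normalized weights.

    Hypothetical helper for illustration only.
    """
    # (num_states, batch_size, max_len, hidden_size)
    stacked = torch.stack(hidden_states, dim=0)
    # One scalar weight per list entry, normalized to sum to 1
    norm_weights = torch.softmax(weights, dim=0)
    # Broadcast the weights over the batch, time, and feature dimensions
    return (norm_weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)


# Two fake "layers" of shape (batch_size=2, max_len=5, hidden_size=4)
hidden_states = [torch.ones(2, 5, 4), 3 * torch.ones(2, 5, 4)]
weights = torch.zeros(len(hidden_states))  # equal weights after softmax
features = weighted_sum(hidden_states, weights)
```

`torch.stack` fails if the tensors differ in shape, which is why every entry in the list must be padded to the same `(batch_size, max_sequence_length_of_batch, hidden_size)`.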
+ #### get_downsample_rates
+
+ Provide the downsample rate **from 16000 Hz waveforms** for each task's representation in the dict. For the standard 10 ms stride representation, the downsample rate is 160.
+
+ ```python
+ SAMPLE_RATE = 16000
+ MSEC_PER_SEC = 1000
+ STRIDE_MSEC = 10
+ # Integer division keeps the result an int, as the interface requires
+ downsample_rate = SAMPLE_RATE * STRIDE_MSEC // MSEC_PER_SEC  # 160
+ ```
+
+ The downsample rate will be used to:
+
+ 1. Calculate the valid representation length of each utterance in the output padded representation.
+ 2. Prepare the training materials according to the representation's downsample rate for frame-level tasks, e.g. SD, SE, and SS.
+
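As a rough illustration of the first use, the valid (non-padding) length of each utterance can be estimated from its waveform length and the downsample rate. This is a minimal sketch: the helper name `valid_lengths` is hypothetical, and floor division is an assumption, since the README does not state how the evaluation code rounds.

```python
SAMPLE_RATE = 16000


def valid_lengths(wav_lengths, downsample_rate):
    """Estimate the number of non-padding frames per utterance.

    Assumes floor division; the actual evaluation code may round differently.
    """
    return [length // downsample_rate for length in wav_lengths]


# A 1-second utterance and one slightly longer than 2 seconds
wav_lengths = [SAMPLE_RATE * 1, SAMPLE_RATE * 2 + 80]
lengths = valid_lengths(wav_lengths, downsample_rate=160)
```

Frames beyond these lengths in the padded representation correspond to padding and would be ignored.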
+ - **Input:** the task key (str)
+ - **Output:** the downsample rate (int) of the representation for that task
+
+ ```python
+ for task in tasks:
+     assert isinstance(task, str)
+     downsample_rate = upstream.get_downsample_rates(task)
+     assert isinstance(downsample_rate, int)
+     print(f"The upstream's representation for {task}"
+           f" has the downsample rate of {downsample_rate}.")
+ ```
+
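To make the two interfaces concrete, here is a minimal sketch of an upstream that satisfies the checks above. The class `ToyUpstream`, its layers, and its sizes are hypothetical and chosen only for illustration; a real submission would wrap your own pretrained model. It runs on CPU so that it can be tried without a GPU.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

SAMPLE_RATE = 16000


class ToyUpstream(nn.Module):
    """Hypothetical upstream: a strided convolution followed by a linear layer."""

    def __init__(self, hidden_size: int = 32, stride: int = 160):
        super().__init__()
        self.stride = stride
        # kernel_size == stride, so each frame covers exactly `stride` samples
        self.conv = nn.Conv1d(1, hidden_size, kernel_size=stride, stride=stride)
        self.linear = nn.Linear(hidden_size, hidden_size)

    def get_downsample_rates(self, task: str) -> int:
        # This sketch uses the same frame rate for every task
        return self.stride

    def forward(self, wavs):
        # Pad variable-length waveforms into (batch_size, max_samples)
        padded = pad_sequence(wavs, batch_first=True)
        # (batch_size, 1, max_samples) -> (batch_size, hidden_size, frames)
        conv_out = self.conv(padded.unsqueeze(1))
        layer1 = conv_out.transpose(1, 2)  # (batch_size, frames, hidden_size)
        layer2 = self.linear(layer1)       # same shape as layer1
        # No task-specific keys, so only the default key is provided
        return {"hidden_states": [layer1, layer2]}


upstream = ToyUpstream()
wavs = [torch.randn(SAMPLE_RATE * sec) for sec in (1, 2)]
with torch.no_grad():
    results = upstream(wavs)
```

Here both list entries have shape `(2, 200, 32)`: two utterances, 200 frames for the longer 2-second waveform at a downsample rate of 160, and a hidden size of 32.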
+ ### 2. Create an account and organization on the Hugging Face Hub
 
  First, create an account on the Hugging Face Hub; you can sign up [here](https://huggingface.co/join) if you haven't already. Next, create a new organization and invite the SUPERB Hidden Set Committee to join. You will upload your model to a repository under this organization so that its members can access the model, which is not publicly available.
 
  * [superb-hidden-set](https://huggingface.co/superb-hidden-set)
 
+ ### 3. Create a template repository on your machine
 
  The next step is to create a template repository on your local machine that contains various files and a CLI to help you validate and submit your pretrained models. The Hugging Face Hub uses [Git Large File Storage (LFS)](https://git-lfs.github.com) to manage large files, so first install it if you don't have it already. For example, on macOS you can run:
 
 
  └── model.pt <- Your model weights
  ```
 
+ ### 4. Install the dependencies
 
  The final step is to install the project's dependencies: