Commit 3cf4f08 (parent 65c3084) by drubinstein: Update README.md
---
language:
- en
thumbnail: https://user-images.githubusercontent.com/213293/167478083-de988de2-9137-4325-8a5f-ceeb51233753.png
tags:
- audio
- music
- lightweight
- midi
- transcription
- pitch-detection
- polyphonic
license: apache-2.0
datasets:
- guitarset
- iKala
- Maestro
- MedleyDBPitch
- Slakh
metrics:
- mir_eval.transcription
---

![Basic Pitch Logo](https://user-images.githubusercontent.com/213293/167478083-de988de2-9137-4325-8a5f-ceeb51233753.png)

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/basic-pitch)
![Supported Platforms](https://img.shields.io/badge/platforms-macOS%20%7C%20Windows%20%7C%20Linux-green)

Basic Pitch is a Python library for Automatic Music Transcription (AMT), using a lightweight neural network developed by [Spotify's Audio Intelligence Lab](https://research.atspotify.com/audio-intelligence/). It's small, easy to use, `pip install`-able, and `npm install`-able via its [sibling repo](https://github.com/spotify/basic-pitch-ts).

Basic Pitch may be simple, but it's far from "basic"! `basic-pitch` is efficient and easy to use, and its multipitch support, ability to generalize across instruments, and note accuracy compete with much larger and more resource-hungry AMT systems.

Provide a compatible audio file and basic-pitch will generate a MIDI file, complete with pitch bends. Basic Pitch is instrument-agnostic and supports polyphonic instruments, so you can freely enjoy transcription of all your favorite music, no matter what instrument is used. Basic Pitch works best on one instrument at a time.

### Research Paper
This library was released in conjunction with Spotify's publication at [ICASSP 2022](https://2022.ieeeicassp.org/). You can read more about this research in the paper, [A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation](https://arxiv.org/abs/2203.09893).

If you use this library in academic research, consider citing it:

```bibtex
@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
  author = {Bittner, Rachel M. and Bosch, Juan Jos\'e and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
  title = {A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation},
  booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
  address = {Singapore},
  year = 2022,
}
```

**Note that we have improved Basic Pitch beyond what was presented in this paper. Therefore, if you use the output of Basic Pitch in academic research, we recommend that you cite the version of the code that was used.**

### Demo
If, for whatever reason, you're not yet completely inspired, or you're just like so totally over the general vibe and stuff, check out our snappy demo website, [basicpitch.io](https://basicpitch.io), to experiment with our model on whatever music audio you provide!

## Installation

`basic-pitch` is available via PyPI. To install the current release:

```bash
pip install basic-pitch
```

To update Basic Pitch to the latest version, add `--upgrade` to the above command.

#### Compatible Environments:
- macOS, Windows, and Ubuntu operating systems
- Python versions 3.7, 3.8, 3.9, 3.10
- **For Mac M1 hardware, we currently only support Python 3.10. Otherwise, we suggest using a virtual machine.**

## Usage

### Model Prediction

#### Command Line Tool

This library offers a command line tool interface. A basic prediction command will generate and save a MIDI file transcription of the audio at `<input-audio-path>` to the `<output-directory>`:

```bash
basic-pitch <output-directory> <input-audio-path>
```

For example:

```bash
basic-pitch /output/directory/path /input/audio/path
```

To process more than one audio file at a time:

```bash
basic-pitch <output-directory> <input-audio-path-1> <input-audio-path-2> <input-audio-path-3>
```

Optionally, you may append any of the following flags to your prediction command to save additional formats of the prediction output to the `<output-directory>`:

- `--sonify-midi` to additionally save a `.wav` audio rendering of the MIDI file
- `--save-model-outputs` to additionally save raw model outputs as an NPZ file
- `--save-note-events` to additionally save the predicted note events as a CSV file

To see all available options, run:

```bash
basic-pitch --help
```

#### Programmatic

**predict()**

Import `basic-pitch` into your own Python code and run the [`predict`](basic_pitch/inference.py) function directly, providing an `<input-audio-path>` and returning the model's prediction results:

```python
from basic_pitch.inference import predict
from basic_pitch import ICASSP_2022_MODEL_PATH

model_output, midi_data, note_events = predict(<input-audio-path>)
```

- `model_output` is the raw model inference output
- `midi_data` is the transcribed MIDI data derived from the `model_output`
- `note_events` is a list of note events derived from the `model_output`
- `<minimum-frequency>` & `<maximum-frequency>` (*float*s) are optional arguments that set the minimum and maximum allowed note frequency, in Hz, returned by the model. Pitch events with frequencies outside of this range will be excluded from the prediction results.

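For reference, the frequency bounds relate to MIDI note numbers through the standard equal-temperament formula (A4 = MIDI 69 = 440 Hz). A small stdlib sketch of that conversion (the helper names are ours, not part of `basic-pitch`):

```python
import math

def midi_to_hz(midi_note: float) -> float:
    """Convert a MIDI note number to its frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

def hz_to_midi(frequency: float) -> float:
    """Inverse mapping: frequency in Hz to a (fractional) MIDI note number."""
    return 69 + 12 * math.log2(frequency / 440.0)
```

For example, `midi_to_hz(81)` is 880 Hz, one octave above A4.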
**predict() in a loop**

To run prediction within a loop, load the model yourself and provide `predict()` with the loaded model object for repeated prediction calls, in order to avoid redundant and sluggish model loading:

```python
import tensorflow as tf

from basic_pitch.inference import predict
from basic_pitch import ICASSP_2022_MODEL_PATH

basic_pitch_model = tf.saved_model.load(str(ICASSP_2022_MODEL_PATH))

for x in range(<number-of-audio-files>):
    ...
    model_output, midi_data, note_events = predict(
        <loop-x-input-audio-path>,
        basic_pitch_model,
    )
    ...
```

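One way to assemble the audio paths to feed such a loop, using only the standard library (the function name and extension set here are our own, not part of `basic-pitch`):

```python
from pathlib import Path

def collect_audio_paths(directory, extensions=(".wav", ".mp3", ".flac", ".ogg", ".m4a")):
    """Return a sorted list of audio file paths found directly under `directory`."""
    root = Path(directory)
    return sorted(p for p in root.iterdir() if p.suffix.lower() in extensions)
```

You could then iterate over `collect_audio_paths(<input-directory>)` and call `predict` on each path with the pre-loaded model.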
**predict_and_save()**

If you would like `basic-pitch` to orchestrate the generation and saving of our various supported output file types, you may use [`predict_and_save`](basic_pitch/inference.py) instead of using [`predict`](basic_pitch/inference.py) directly:

```python
from basic_pitch.inference import predict_and_save

predict_and_save(
    <input-audio-path-list>,
    <output-directory>,
    <save-midi>,
    <sonify-midi>,
    <save-model-outputs>,
    <save-notes>,
)
```

where:
- `<input-audio-path-list>` & `<output-directory>`
  - paths for `basic-pitch` to read from/write to, respectively.
- `<save-midi>`
  - *bool* to control generating and saving a MIDI file to the `<output-directory>`
- `<sonify-midi>`
  - *bool* to control saving a WAV audio rendering of the MIDI file to the `<output-directory>`
- `<save-model-outputs>`
  - *bool* to control saving the raw model output as an NPZ file to the `<output-directory>`
- `<save-notes>`
  - *bool* to control saving predicted note events as a CSV file to the `<output-directory>`

### Model Input

**Supported Audio Codecs**

`basic-pitch` accepts all sound files that are compatible with its version of [`librosa`](https://librosa.org/doc/latest/index.html), including:

- `.mp3`
- `.ogg`
- `.wav`
- `.flac`
- `.m4a`

**Mono Channel Audio Only**

While you may use stereo audio as an input to our model, at prediction time the channels of the input will be down-mixed to mono and then analyzed and transcribed.

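Conceptually, the down-mix is just a per-sample average across channels. A minimal sketch of that step, assuming NumPy and a `(n_samples, n_channels)` array layout (this is an illustration, not the library's internal code):

```python
import numpy as np

def downmix_to_mono(audio: np.ndarray) -> np.ndarray:
    """Average the channels of a (n_samples, n_channels) array; pass mono through."""
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        return audio  # already mono, nothing to do
    return audio.mean(axis=1)
```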
**File Size/Audio Length**

This model can process audio of any size or length, but processing of larger/longer audio files may be limited by your machine's available disk space. To process these files, we recommend streaming the audio, processing windows of audio at a time.

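Such windowed processing can be sketched as a simple index generator over the signal (the window and hop sizes here are illustrative choices, not the model's actual values):

```python
def window_indices(n_samples: int, window_size: int, hop_size: int):
    """Yield (start, end) sample index pairs that tile a signal of n_samples."""
    start = 0
    while start < n_samples:
        yield start, min(start + window_size, n_samples)
        start += hop_size
```

Each `(start, end)` pair can then be used to slice out a chunk of audio and transcribe it independently.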
**Sample Rate**

Input audio may be of any sample rate; however, all audio will be resampled to 22050 Hz before processing.

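Resampling changes only the sample count, not the duration. The arithmetic, as a stdlib sketch (the helper name is ours):

```python
def resampled_length(n_samples: int, orig_sr: int, target_sr: int = 22050) -> int:
    """Approximate number of samples after resampling from orig_sr to target_sr."""
    return round(n_samples * target_sr / orig_sr)
```

For example, one second of 44.1 kHz audio (44100 samples) becomes 22050 samples after resampling.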
### VST

Thanks to DamRsn for developing [NeuralNote](https://github.com/DamRsn/NeuralNote), a working VST version of basic-pitch!

## Contributing

Contributions to `basic-pitch` are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for details.

## Copyright and License
`basic-pitch` is Copyright 2022 Spotify AB.

This software is licensed under the Apache License, Version 2.0 (the "Apache License"). You may use this software only upon the condition that you accept all of the terms of the Apache License.

You may obtain a copy of the Apache License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the Apache License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the Apache License for the specific language governing permissions and limitations under the Apache License.