nianlong committed on
Commit 3256421 · verified · 1 Parent(s): cf250c0

Update README.md

Files changed (1)
  1. README.md +11 -12
README.md CHANGED
@@ -62,18 +62,17 @@ For more details, please refer to the GitHub repository: https://github.com/nian
  ## Citation
  When using our code or models for your work, please cite the following paper:
  ```
- @article {Gu2023.09.30.560270,
- author = {Nianlong Gu and Kanghwi Lee and Maris Basha and Sumit Kumar Ram and Guanghao You and Richard Hahnloser},
- title = {Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
- elocation-id = {2023.09.30.560270},
- year = {2023},
- doi = {10.1101/2023.09.30.560270},
- publisher = {Cold Spring Harbor Laboratory},
- abstract = {This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting.Competing Interest StatementThe authors have declared no competing interest.},
- URL = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270},
- eprint = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270.full.pdf},
- journal = {bioRxiv}
- }
+ @INPROCEEDINGS{10447620,
+ author={Gu, Nianlong and Lee, Kanghwi and Basha, Maris and Kumar Ram, Sumit and You, Guanghao and Hahnloser, Richard H. R.},
+ booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ title={Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
+ year={2024},
+ volume={},
+ number={},
+ pages={7505-7509},
+ keywords={Voice activity detection;Adaptation models;Animals;Transformers;Acoustics;Human voice;Spectrogram;Voice activity detection;audio segmentation;Transformer;Whisper},
+ doi={10.1109/ICASSP48485.2024.10447620}}
+
  ```
 
  ## Contact