nccratliri/whisperseg-large-ms-ct2


nianlong committed on Jul 21, 2024

Commit 3256421 · verified · 1 Parent(s): cf250c0

Update README.md

Files changed (1)

README.md +11 -12


@@ -62,18 +62,17 @@ For more details, please refer to the GitHub repository: https://github.com/nian
 ## Citation
 When using our code or models for your work, please cite the following paper:
 ```
-@article {Gu2023.09.30.560270,
-	author = {Nianlong Gu and Kanghwi Lee and Maris Basha and Sumit Kumar Ram and Guanghao You and Richard Hahnloser},
-	title = {Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
-	elocation-id = {2023.09.30.560270},
-	year = {2023},
-	doi = {10.1101/2023.09.30.560270},
-	publisher = {Cold Spring Harbor Laboratory},
-	abstract = {This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting.Competing Interest StatementThe authors have declared no competing interest.},
-	URL = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270},
-	eprint = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270.full.pdf},
-	journal = {bioRxiv}
-}
+@INPROCEEDINGS{10447620,
+  author={Gu, Nianlong and Lee, Kanghwi and Basha, Maris and Kumar Ram, Sumit and You, Guanghao and Hahnloser, Richard H. R.},
+  booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+  title={Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
+  year={2024},
+  volume={},
+  number={},
+  pages={7505-7509},
+  keywords={Voice activity detection;Adaptation models;Animals;Transformers;Acoustics;Human voice;Spectrogram;Voice activity detection;audio segmentation;Transformer;Whisper},
+  doi={10.1109/ICASSP48485.2024.10447620}}
 ```
 ## Contact