Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection

This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful th…

Publication status

published

External links

https://doi.org/10.1101/2023.09.30.560270

Journal / series

bioRxiv

Publisher

Cold Spring Harbor Laboratory

Subject

Voice Activity Detection; Audio segmentation; Transformer; Whisper

Organisational unit

03774 - Hahnloser, Richard H.R.

ETH Bibliography

yes
