Voice Activity Detection (VAD)

Voice Activity Detection (VAD) is a language-, domain-, and channel-independent technology that identifies which parts of an audio recording contain speech and which do not. It labels the speech and non-speech segments of the recording; these labels can then serve as a decision point for whether to process the recording with other technologies. In deployment, VAD is usually part of a rapid filtering process.

Typical use cases are:

  • detecting the presence or absence of human speech for voice processing,
  • filtering out non-speech parts of a recording,
  • filtering out recordings with too little net speech to be processed by other technologies,
  • voice-activated processes, etc.
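
The net-speech filtering use case can be illustrated with a minimal Python sketch. It assumes VAD output is available as a list of (start, end, label) tuples in seconds; that label format, the function names, and the 10-second threshold are hypothetical and chosen only for illustration, not taken from the product.

```python
from typing import List, Tuple

# Hypothetical label format: (start_seconds, end_seconds, label).
# The real VAD output format may differ.
Segment = Tuple[float, float, str]


def net_speech_seconds(segments: List[Segment]) -> float:
    """Sum the durations of all segments labeled as speech."""
    return sum(end - start for start, end, label in segments if label == "speech")


def has_enough_speech(segments: List[Segment], min_speech_s: float = 10.0) -> bool:
    """Decide whether a recording is worth passing to other technologies.

    min_speech_s is an illustrative threshold, not a recommended value.
    """
    return net_speech_seconds(segments) >= min_speech_s


# Example labels for a 16-second recording.
labels = [
    (0.0, 1.5, "non-speech"),
    (1.5, 9.0, "speech"),
    (9.0, 12.0, "non-speech"),
    (12.0, 16.0, "speech"),
]

print(net_speech_seconds(labels))  # 11.5
print(has_enough_speech(labels))   # True
```

In practice the threshold would depend on how much net speech the downstream technology requires.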

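The speed of Voice Activity Detection is 140 FtRT (faster than real time) per instance, i.e. one instance processes roughly 140 seconds of audio per second of processing time.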
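
As a rough capacity estimate, the speed figure can be turned into an expected processing time. The sketch below reads the figure as a faster-than-real-time factor and assumes throughput scales linearly with the number of instances; both are assumptions, and the function name is hypothetical.

```python
def processing_seconds(audio_seconds: float, ftrt: float = 140.0, instances: int = 1) -> float:
    """Estimate the time needed to run VAD over a given amount of audio.

    Assumes a faster-than-real-time factor of `ftrt` per instance and
    linear scaling across instances (an idealization).
    """
    return audio_seconds / (ftrt * instances)


# One hour of audio on a single instance: about 25.7 seconds.
print(round(processing_seconds(3600.0), 1))
```

Real throughput also depends on hardware, audio format, and I/O, so the result is best treated as an estimate only.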

