Phonexia Waveform Denoiser (DENOISER) performs automatic dereverberation (removal of echoes caused by sound reflections in a room) and automatic noise reduction of the speech signal. The data model is typically trained on various types of noise using the latest generation of neural-network-based algorithms. It mainly removes noises similar to those the software was trained on. Conversely, the software cannot remove unwanted speech or music in the background.
Denoiser removes noise from the recording and at the same time amplifies the speech signal, with the aim of:
- Better intelligibility for human listeners (recommended use),
- Better results with automatic speech recognition technologies (must be tested on customer data first).
- audio file (format details – see Speech Engine documentation); stream not supported,
- technology model name to be used for processing.
- audio file (WAV or RAW), together with an XML/JSON report (in SPE only).
Q: What do you recommend for deploying this technology?
It is advisable to use the technology after an acoustic quality check of the recordings. If the technical information indicates, for example, a low signal-to-noise ratio (SNR), it is advisable to route the recording directly to the Denoiser technology for automatic noise reduction. On the other hand, it is not appropriate to subsequently send an automatically reconstructed recording to STT or SID technologies.
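The routing step described above can be sketched as follows. This is only a minimal illustration, assuming the SNR value is already available from a prior quality check; the `Recording` class, the `route` function, and the 15 dB threshold are hypothetical and are not part of any Phonexia API or recommendation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical threshold in dB; tune it on your own data.
SNR_THRESHOLD_DB = 15.0


@dataclass
class Recording:
    path: str
    snr_db: float  # SNR reported by a prior acoustic quality check


def needs_denoising(rec: Recording, threshold_db: float = SNR_THRESHOLD_DB) -> bool:
    """Return True when the measured SNR is below the threshold."""
    return rec.snr_db < threshold_db


def route(
    recordings: List[Recording], threshold_db: float = SNR_THRESHOLD_DB
) -> Tuple[List[Recording], List[Recording]]:
    """Split recordings into (send to denoiser, keep as-is)."""
    to_denoise = [r for r in recordings if needs_denoising(r, threshold_db)]
    keep = [r for r in recordings if not needs_denoising(r, threshold_db)]
    return to_denoise, keep


if __name__ == "__main__":
    batch = [Recording("call_01.wav", 6.5), Recording("call_02.wav", 28.0)]
    noisy, clean = route(batch)
    print("denoise:", [r.path for r in noisy])
    print("keep:", [r.path for r in clean])
```

In this sketch only the low-SNR recordings would be passed to the Denoiser; the rest stay untouched, so good-quality audio is never reprocessed unnecessarily.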
Q: How does the Denoiser perform if part of the recording is noisy and part has good speech quality?
The technology is being developed to automatically detect low-quality audio segments and attempt to reconstruct them. Conversely, well-recorded segments should be recognized automatically and retain their original speech quality.
Q: Is there a way to adapt this technology?
No. Unfortunately, the software does not currently offer easy customization.
Link to API reference