Audio Quality Estimation

This technology indicates the audio quality of the speech during enrolments and verifications, in real time via the API and for future reference via logs. It is available for both single-server and multi-server deployments. Currently, the Audio Quality Estimation feature is limited to polling (SIP, HTTP streams); webhook support is planned for a future release.

Based on the PESQ (Perceptual Evaluation of Speech Quality) metric, the values range from -0.5 to 4.5. To help interpret the data, a binary verdict is also provided: a value below 2 is considered bad audio quality, and a value above 2 is considered good audio quality.
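
The verdict rule can be sketched in a few lines of Python. This is only an illustration of the thresholds described here, not the product's implementation; the function and constant names are hypothetical, and treating a score of exactly 2 as good is an assumption, since the text does not specify that case.

```python
PESQ_MIN = -0.5       # lowest value the estimate can take
PESQ_MAX = 4.5        # highest value the estimate can take
GOOD_THRESHOLD = 2.0  # boundary between "bad" and "good"


def audio_quality_verdict(score: float) -> str:
    """Map a PESQ-based score to the binary verdict ("bad" or "good")."""
    if not PESQ_MIN <= score <= PESQ_MAX:
        raise ValueError(f"score {score} is outside the range {PESQ_MIN}..{PESQ_MAX}")
    # Assumption: a score of exactly 2 counts as "good".
    return "bad" if score < GOOD_THRESHOLD else "good"
```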

Audio Quality Estimation is turned off by default because it consumes hardware resources. To enable it, switch to maintenance mode and call the endpoint
POST /api/v2/maintenance/technologies with the following JSON:

{
    "audio_quality_estimation": "enable"
}

The response should look like this:

{
    "status": "OK",
    "message": "SPE restarted and running."
}
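
As a rough sketch, the enabling call could be prepared with Python's standard library as follows. The helper names and the example host are hypothetical, and authentication and the switch to maintenance mode are deliberately left out because they depend on the deployment.

```python
import json
import urllib.request


def build_enable_request(base_url: str) -> urllib.request.Request:
    """Prepare the POST request that enables Audio Quality Estimation."""
    body = json.dumps({"audio_quality_estimation": "enable"}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/maintenance/technologies",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def restart_confirmed(response_body: str) -> bool:
    """Return True when the response reports status "OK"."""
    return json.loads(response_body).get("status") == "OK"


# Hypothetical host; replace with the address of your own server.
request = build_enable_request("http://spe.example")
```

Passing the prepared request to urllib.request.urlopen would send it; the response body can then be checked with restart_confirmed.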

Note that with a higher number of expected streams, initialization of the technology can take additional time. We recommend waiting 3-4 seconds for the technology to become ready.

To verify the status of Audio Quality Estimation, call:
GET /api/v2/maintenance/technologies
The endpoint returns:

{
    "message": "List of supported technologies and their status.",
    "status": "OK",
    "technologies": {
        "audio_quality_estimation": "enabled",
        "speaker_change_detection": "disabled"
    }
}
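
A client could check this status programmatically, for example while waiting for initialization to finish. The following is a minimal sketch under stated assumptions: the function names, the retry count, and the delay are hypothetical, and fetch_status stands for whatever code performs the GET request and returns the response body as text.

```python
import json
import time
from typing import Callable


def is_enabled(response_body: str, technology: str = "audio_quality_estimation") -> bool:
    """Return True when the status response lists the technology as "enabled"."""
    payload = json.loads(response_body)
    return payload.get("technologies", {}).get(technology) == "enabled"


def wait_until_enabled(
    fetch_status: Callable[[], str],
    attempts: int = 5,
    delay_s: float = 1.0,
) -> bool:
    """Call fetch_status up to `attempts` times until the technology is enabled."""
    for attempt in range(attempts):
        if is_enabled(fetch_status()):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # give the technology time to initialize
    return False
```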
