SID – Phonexia Partner Portal

Q: Why does the system show high score (>90%) even for non-targets?

A: The score threshold is not set up correctly. Adjust the speaker score sharpness value to calibrate how the score is recalculated.

Please see Calibration in the technology documentation.
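
To illustrate why an uncalibrated system can report more than 90% for non-targets, here is a minimal sketch. It assumes the percentage score is a logistic function of the LLR, scaled by a sharpness factor. The exact formula used by the product is not stated here, so the function name, the `sharpness` parameter and the example LLR value are illustrative assumptions only.

```python
import math


def percent_score(llr, sharpness=1.0):
    """Map an LLR to a 0-100 % score with a logistic curve.

    Assumption: the logistic form and the meaning of `sharpness` are
    illustrative and are not taken from the product documentation.
    """
    x = sharpness * llr
    if x >= 0:
        return 100.0 / (1.0 + math.exp(-x))
    # Equivalent form for negative x that avoids overflow in exp().
    e = math.exp(x)
    return 100.0 * e / (1.0 + e)


# Hypothetical non-target trial that still receives a fairly high LLR.
non_target_llr = 5.0
print(percent_score(non_target_llr))                 # above 90 %
print(percent_score(non_target_llr, sharpness=0.2))  # well below 90 %
```

Under this assumption, a smaller sharpness value flattens the curve, so moderately high LLRs no longer saturate near 100%.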

Q: What do LLR, LR and score mean?

A: These abbreviations mean the following:

LR – likelihood ratio, the result of a statistical test comparing two models. It expresses how many times more likely the data are under one model than under the other. LR takes values in the interval [0, +inf).
LLR – log-likelihood ratio, the logarithm of LR. LLR takes values in the interval (-inf, +inf).
Percentage (normalised) score – a commonly used mathematical transformation of the LLR to a percentage. This number is easier for humans to read but may raise doubts if the LLR values are too high (typically in non-adapted installations). It takes values in the interval [0, 100] in % (or sometimes [0, 1]). The higher the score, the better the match.
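
The relation between LR and LLR can be sketched in a few lines. The natural logarithm is assumed here, because the base is not specified above. The function names are hypothetical.

```python
import math


def lr_to_llr(lr):
    """Convert a likelihood ratio to a log-likelihood ratio (natural log assumed)."""
    if lr < 0:
        raise ValueError("LR cannot be negative")
    # LR = 0 is the lower end of the interval and maps to -inf.
    return math.log(lr) if lr > 0 else -math.inf


def llr_to_lr(llr):
    """Convert a log-likelihood ratio back to a likelihood ratio."""
    return math.exp(llr)


print(lr_to_llr(1.0))    # 0.0: the data are equally likely under both models
print(lr_to_llr(100.0))  # about 4.61: first model is 100 times more likely
print(lr_to_llr(0.01))   # about -4.61: second model is 100 times more likely
```

An LLR of 0 is therefore the neutral point. Positive values favour one model and negative values favour the other.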

Q: What are the requirements for SID evaluation dataset?

A: To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated on an SID dataset.

SID dataset (minimum requirements):

To measure SID performance precisely, it’s important to prepare the set of evaluation recordings very carefully.

The requirements are:

50+ known speakers, 200+ recordings in total (i.e. 3 to 5 recordings per speaker*)
1+ minute of net speech in each recording (i.e. usually 2+ minutes recording length)
only one speaker in each recording
a wide variety of genders and ages is recommended
recordings should be as similar to the target use case as possible (device, channel, distance from mic, language distribution)
audio files should be mono, 16-bit linear PCM (lin16) format, 8 kHz+ sample rate

*Note: splitting a single recording into multiple shorter recordings in order to reach at least 3 recordings per speaker is not the right way to proceed. It adds no new information: you are essentially analyzing the details of a single recording several times.
In contrast, 5 unique recordings coming from different audio environments, or even from different times of the day, provide additional details to analyze and lead to better results.
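
Some of the minimum requirements listed above can be checked automatically before an evaluation. The sketch below uses only the Python standard library and assumes the recordings are WAV files. The function names and the mapping from recording path to speaker label are hypothetical. Net speech length and the presence of a single speaker cannot be verified this way and still need a separate check.

```python
import wave
from collections import Counter

MIN_SPEAKERS = 50
MIN_RECORDINGS = 200
MIN_PER_SPEAKER = 3
MIN_SAMPLE_RATE_HZ = 8000


def wav_format_problems(path):
    """Return a list of format problems for one WAV file (empty list = OK)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("not mono")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append("not 16-bit")
        if wav.getframerate() < MIN_SAMPLE_RATE_HZ:
            problems.append("sample rate below 8 kHz")
    return problems


def dataset_problems(speaker_of_recording):
    """Check speaker and recording counts.

    `speaker_of_recording` maps each recording path to its speaker label
    (an assumed layout, chosen only for this example).
    """
    problems = []
    per_speaker = Counter(speaker_of_recording.values())
    if len(per_speaker) < MIN_SPEAKERS:
        problems.append(f"only {len(per_speaker)} speakers")
    if len(speaker_of_recording) < MIN_RECORDINGS:
        problems.append(f"only {len(speaker_of_recording)} recordings")
    too_few = sorted(s for s, n in per_speaker.items() if n < MIN_PER_SPEAKER)
    if too_few:
        problems.append(f"fewer than {MIN_PER_SPEAKER} recordings for: {too_few}")
    return problems
```

Both functions return a list of problems, so an empty list means that the checked requirements are met.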

Warning: Any human error in the preparation of the evaluation set (in speaker uniqueness, placing recordings into the wrong folder, etc.) affects the evaluation results, so it’s very important to prepare the data carefully.
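
One kind of human error, the same file copied into the folders of two different speakers, can be caught with a simple script. The sketch below assumes a layout with one folder per speaker directly under a root directory, which is an assumption made for this example. It finds only byte-identical copies. It does not detect the same speaker stored under two labels or re-encoded copies of a recording.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_cross_speaker_duplicates(root):
    """Return groups of byte-identical files found under different speaker folders.

    Assumed layout (hypothetical): root/<speaker_id>/<recording file>
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).glob("*/*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only groups whose files belong to more than one speaker folder.
    return [
        paths
        for paths in by_digest.values()
        if len({p.parent.name for p in paths}) > 1
    ]
```

Every returned group is worth a manual review, because at least one of its files is probably in the wrong folder.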

See SID Evaluation for more details.
