
Q: What are the requirements for a Speaker Identification (SID) evaluation dataset?

To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with a suitable SID dataset.

SID dataset (minimum requirements):

  • 500 speakers
  • >5 individual recordings per speaker*
  • >30s per recording (>20s speech on each recording)
  • speaker labels
  • 1 speaker per channel
  • phone or mobile phone source
  • spontaneous dialogue (better than scripted or read text)
  • WAV, Opus, or FLAC audio format (for best results, use only natively supported audio formats; see the list of supported audio formats)
  • diversity of age, gender, and time of day

*Note: splitting a single recording into multiple shorter recordings to reach at least 5 recordings per speaker is not the right way to proceed. Doing so adds no new information: you are essentially analyzing the same recording five times. In contrast, 5 unique recordings from different audio environments, or even different times of day, provide additional detail and lead to better results.
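
The minimum requirements above can be checked automatically before calibration. The following is a minimal sketch in Python, not a Phonexia tool: the `Recording` manifest fields, the `check_dataset` function, and the `session_id` field used to detect recordings cut from one original call are all hypothetical names introduced for illustration. It assumes the durations and net speech lengths have already been measured elsewhere, and it reads the per-speaker minimum as "at least 5" as in the note (the list says ">5", so adjust `min_recordings` if you prefer the stricter reading).

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path

# Thresholds taken from the requirements list; the constant names are illustrative.
MIN_SPEAKERS = 500
MIN_RECORDINGS_PER_SPEAKER = 5  # assumption: "at least 5", as in the note
MIN_DURATION_S = 30.0           # each recording must be longer than this
MIN_SPEECH_S = 20.0             # net speech must be longer than this
ALLOWED_EXTENSIONS = {".wav", ".opus", ".flac"}


@dataclass
class Recording:
    """One row of a hypothetical dataset manifest."""
    path: str          # audio file of this recording
    speaker_id: str    # speaker label
    duration_s: float  # total length in seconds
    speech_s: float    # net speech in seconds, measured beforehand
    session_id: str    # original call; pieces cut from one call share it


def check_dataset(recordings, min_speakers=MIN_SPEAKERS,
                  min_recordings=MIN_RECORDINGS_PER_SPEAKER):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    sessions = defaultdict(set)  # speaker_id -> set of unique session ids
    for rec in recordings:
        if Path(rec.path).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{rec.path}: format is not wav, opus, or flac")
        if rec.duration_s <= MIN_DURATION_S:
            problems.append(f"{rec.path}: duration is not above {MIN_DURATION_S:g} s")
        if rec.speech_s <= MIN_SPEECH_S:
            problems.append(f"{rec.path}: speech is not above {MIN_SPEECH_S:g} s")
        if not rec.speaker_id:
            problems.append(f"{rec.path}: missing speaker label")
            continue
        sessions[rec.speaker_id].add(rec.session_id)

    # Count unique sessions, so splitting one call into pieces does not help.
    for speaker, unique in sorted(sessions.items()):
        if len(unique) < min_recordings:
            problems.append(
                f"speaker {speaker}: only {len(unique)} unique session(s), "
                f"need {min_recordings}"
            )
    if len(sessions) < min_speakers:
        problems.append(f"only {len(sessions)} speaker(s), need {min_speakers}")
    return problems
```

For example, calling `check_dataset(manifest)` on a list of `Recording` entries returns one message per violated requirement. Counting unique `session_id` values rather than files reflects the note: five pieces of one call count as a single recording. Requirements that cannot be read from such a manifest (one speaker per channel, telephone source, spontaneous speech, demographic diversity) still need to be verified separately.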