
Search: languages

34 results

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length); only one speaker in each recording; wide variety of gender and age is recommended; recordings should be as similar to the target use case as possible (device, channel, distance from mic, languages distribution); audio files should be mono, lin16 format, 8 kHz+ sample rate. *Note: splitting single recording into multiple shorter…
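The audio format requirements in the snippet above (mono, lin16, 8 kHz+ sample rate) can be checked before a dataset is assembled. The following is a minimal sketch, not part of the Phonexia tooling: it assumes the recordings are WAV files and that "lin16" means 16-bit linear PCM. The function name `check_sid_audio` and its parameter are hypothetical.

```python
import wave


def check_sid_audio(path, min_rate_hz=8000):
    """Return a list of format problems; an empty list means the file passed.

    Hypothetical helper. Assumes a PCM WAV file and that "lin16" means
    16-bit linear PCM (2 bytes per sample).
    """
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() < min_rate_hz:
            problems.append(f"sample rate {wav.getframerate()} Hz is below {min_rate_hz} Hz")
    return problems
```

Calling `check_sid_audio("recording.wav")` on each file and collecting the non-empty results gives a quick list of recordings to convert or exclude. The sketch covers only the format line of the requirements. Recording length, speaker count and similarity to the target use case need separate checks.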

Understand SPE connectors for external TTS

…expected to provide information about actual TTS service capabilities: list of voice names, supported languages and audio quality (sampling frequencies). This info is used during the SPE startup sequence: TTS connectors enabled in the SPE configuration file are started with the –info parameter and SPE reads the connector output. Connectors failing to provide the info won’t be available for use with SPE…

Understand SPE technologies configuration file

This article explains the purpose and structure of the SPE technologies configuration file, technologies.xml, or technologies.json created by Phonexia Browser. An SPE installation usually includes multiple speech technologies (e.g. Speaker Identification, Speech To Text, etc.) in various technological models (e.g. L4, XL4, etc.), or supporting various languages (e.g. 6th generation of EN_US, CS_CZ, etc.). You can select from these technologies/models those…

Understand SPE metafiles

…i.e. should be handled by the application built on top of the SPE API. This includes handling of any metadata associated with the processed audio files, like phone numbers, source of the recording, date/time the audio was recorded, references to the persons speaking in the recording (names, photos, …), languages spoken in the recording, etc. – all this data is expected…