Skip to content Skip to main navigation Skip to footer

Search: language%20print

60 results

Download Speech Platform

…only English models for Speech To Text and Keyword Spotting. Additional supported languages are available upon request. ⓘ Click to show/hide the package content Speech Engine – technologies included: Speech To Text (STT) – model EN_US_6 (US English) Keyword Spotting (KWS) – model EN_US_6 (US English) Phoneme Recognizer (PHNREC) – model EN_US_6 (US English) Speaker Identification 4 (SID4) – model…

Understand SPE user accounts

…not visible by SPE and by the account. Similar trickery can be done with the data directory, allowing to share LID language models and language packs, or SID speaker models, etc. between accounts. User accounts management SPE user accounts can be managed using REST API (see Administration section of the API documentation), or using command line administration utilities phxadmin or…

Gender Identification (GID)

Gender Identification is a language-, domain- and channel-independent technology that uses the acoustic characteristics of the recording to determine the gender of the speaker in question. This technology is able to distinguish between two genders: Male (M) and Female (F). Minimum of speech signal for identification: 7+ sec recommended with XL5, XL4 and L4 model (9+ sec for previous generation…

Voice Activity Detection (VAD)

Voice Activity Detection is a language-, domain- and channel-independent technology that identifies parts of audio recordings with speech content vs. non-speech content. It creates labels for speech and other signals in the recording; this can then serve as a decision point whether to process the recording by other technologies or not. VAD is usually part of rapid filtration process in…

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio record based by the individual speaker´s voice. It is a language-, domain- and channel-independent technology. It performs not only the segmentation of speakers but of technical signals and silence as well. The outputs of the technology can be both log files with labels and/or split audio files/one new…

Speech Quality Estimation (SQE)

…channels. The statistics of all channels include the numbers for many aspects of recording quality, and the overall global score. Technology The technology is language-, accent-, text-, and channel- independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits…

Installation of Phonexia Browser

Some packages are distributed with only a limited set of speech technologies and languages or without speech technologies. First installation Our software is distributed as a ZIP file. Installation procedure is as simple as: unzip the archive paste additional KWS, STT… models paste the license.dat file to the root directory where you have BROWSER folder and run_browser(.exe) script run the…

Age Estimation (AGE)

Phonexia Age Estimation (AGE) estimates the age of a speaker from audio recording or voiceprint. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel- independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Audio: WAV or RAW (8 or 16 bits linear…

Speech Engine update

…up to you, based on the actual content of the directory and your new package NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, it’s recommended practice to recreate the STT model using the…

Phonexia technologies introduction

…and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender Identification (GID) Speech Analytics technologies 11:43 Speech Transcription (STT) 12:30 Keyword Spotting (KWS) 13:32 Phoneme Recognition (PHNREC) 13:54 Time Analysis…

STT: Results explained

…a speaker does not pronounce a word correctly and the one-best results do not correspond to what the speaker actually said. Start– and end time is in HTK units. 1 HTK unit equals to 100 nanoseconds ==> dividing the values by 10000 gives the time in milliseconds. Score is a rate of match with the acoustic and language model from…

Support

Support is available 5 business days a week (Monday – Friday) / 8 business hours (09:00 – 17:00 CET) in English language. If you have issue with Speech Engine, please include a report in the ticket, to help the support staff to resolve your issue faster: Go to the Speech Engine installation directory Open command line/terminal (in Ubuntu Linux Right…

Understand SPE home directory

…Data The data directory holds additional data files for entities created by that user – e.g. SID Speaker Models, or LID language packs. If no such entities exist for that user, this directory is empty. Unlike the storage, content of this directory is intended to be manipulated by SPE only and should not be manipulated directly on the filesystem level….

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length) only one speaker in each recording wide variety of gender and age is recommended recordings should be as similar to the target use case as possible (device, channel, distance from mic, languages distribution) audio files should be mono, lin16 format, 8 kHz+ sample rate *Note: splitting single recording into multiple shorter…

Video – Voice Biometrics technologies

MODULE 3: Voice Biometrics technologies (23 min) Common generic rules for CLI, REST and GUI Speaker Identification (SID) in CLI, REST and GUI Language Identification (LID) in CLI, REST and GUI Gender Identification (GID) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=AyEoPfYVel8…