
Search: language pack

66 results

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio recording based on the individual speaker's voice. It is a language-, domain- and channel-independent technology. It segments not only speakers but also technical signals and silence. The outputs of the technology can be log files with labels and/or split audio files/one new…

Installation of Phonexia Browser

Some packages are distributed with only a limited set of speech technologies and languages, or without speech technologies. First installation: Our software is distributed as a ZIP file. The installation procedure is as simple as: unzip the archive; paste additional KWS, STT… models; paste the license.dat file to the root directory where you have the BROWSER folder and the run_browser(.exe) script; run the…

Speaker Identification

…in a recording are also unique, thus the technology can be language-, accent-, text-, and channel-independent. How does Speaker Identification work? Automatic speaker recognition systems extract features from a voice into a voiceprint. A voiceprint is a small numerical representation of the voice, capturing the most distinctive characteristics of a speaker’s voice. The whole voice verification process consists of…

About Phonexia Orbis

…speakers and their corresponding recordings. Speech Transcription In an Orbis edition that includes Speech to Text technology, the user may have the audio transcribed, automatically or on demand, in a language chosen from the portfolio that Phonexia Speech to Text offers. Network Map The solution visualizes the relations between persons and assets based on time on a network map. Persons, Assets and Relations…

FAQs (Voice Verify)

…calibration and pulling results to Call Center SW is estimated to take several weeks. The customer’s internal processes can affect the timeframe considerably. Additional info including timelines can be found here. Q: Does Voice Verify provide information about gender, age or language used by a speaker during a verification? A: No, Voice Verify does…

Orbis 1.4.0 Release Notes

…model. Speech to Text in Orbis New editions of Orbis Investigator may also include Speech to Text technology. This technology converts audio into text for better and faster understanding of the content. The box with the transcribed text is directly under the recording itself. Limitation: Transcription is provided for one chosen language per Orbis instance. Search for…

Age Estimation (AGE)

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording or voiceprint. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel-independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Audio: WAV or RAW (8 or 16 bits linear…

Speech Engine update

…up to you, based on the actual content of the directory and your new package NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, it’s recommended practice to recreate the STT model using the…

Support

Support is available 5 business days a week (Monday – Friday) / 8 business hours (09:00 – 17:00 CET) in English. If you have an issue with Speech Engine, please include a report in the ticket to help the support staff resolve your issue faster: Go to the Speech Engine installation directory Open command line/terminal (in Ubuntu Linux Right…

Video – Voice Biometrics technologies

MODULE 3: Voice Biometrics technologies (23 min) Common generic rules for CLI, REST and GUI Speaker Identification (SID) in CLI, REST and GUI Language Identification (LID) in CLI, REST and GUI Gender Identification (GID) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=AyEoPfYVel8…

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length); only one speaker in each recording; a wide variety of gender and age is recommended; recordings should be as similar to the target use case as possible (device, channel, distance from mic, languages distribution); audio files should be mono, lin16 format, 8 kHz+ sample rate. *Note: splitting a single recording into multiple shorter…
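The audio format requirement in this snippet (mono, lin16, 8 kHz or higher) can be checked before an evaluation run. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_sid_audio` is hypothetical and not part of any Phonexia tool, and the check assumes that "lin16" means 16-bit linear PCM stored in a WAV container.

```python
import wave
from typing import List


def check_sid_audio(path: str) -> List[str]:
    """Return a list of format problems for one WAV file (empty list = OK).

    Hypothetical helper: checks only mono / 16-bit linear PCM / >= 8 kHz.
    It does not check recording length or the number of speakers.
    """
    problems = []
    # wave.open raises wave.Error for files that are not PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() < 8000:
            problems.append(f"expected 8 kHz or higher, found {wav.getframerate()} Hz")
    return problems
```

Calling `check_sid_audio` on each file in a dataset directory and collecting the non-empty results gives a quick list of recordings that need conversion first.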

Speech Quality Estimation (SQE)

…channels. The statistics of all channels include the numbers for many aspects of recording quality, and the overall global score. Technology The technology is language-, accent-, text-, and channel-independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits…

Phonexia technologies introduction

…and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender Identification (GID) Speech Analytics technologies 11:43 Speech Transcription (STT) 12:30 Keyword Spotting (KWS) 13:32 Phoneme Recognition (PHNREC) 13:54 Time Analysis…

Gender Identification (GID)

Gender Identification is a language-, domain- and channel-independent technology that uses the acoustic characteristics of the recording to determine the gender of the speaker in question. This technology is able to distinguish between two genders: Male (M) and Female (F). Minimum of speech signal for identification: 7+ sec recommended (with XL4 and L4 model (9+ sec for previous generation of…

STT: Results explained

…a speaker does not pronounce a word correctly and the one-best results do not correspond to what the speaker actually said. Start and end times are in HTK units. 1 HTK unit equals 100 nanoseconds, so dividing the values by 10000 gives the time in milliseconds. Score is a rate of match with the acoustic and language model from…
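The unit conversion described in this snippet can be sketched in a few lines of Python. The function names and the sample value are illustrative assumptions, not part of the STT output format or any Phonexia API; only the 100-nanosecond unit size comes from the text.

```python
# 1 HTK unit = 100 ns, so 10,000 units = 1 ms and 10,000,000 units = 1 s.
HTK_UNITS_PER_MS = 10_000
HTK_UNITS_PER_SECOND = 10_000_000


def htk_to_ms(htk_units: int) -> float:
    """Convert a time stamp in HTK units to milliseconds."""
    return htk_units / HTK_UNITS_PER_MS


def htk_to_seconds(htk_units: int) -> float:
    """Convert a time stamp in HTK units to seconds."""
    return htk_units / HTK_UNITS_PER_SECOND


# Illustrative value: a word starting at 12,500,000 HTK units.
print(htk_to_ms(12_500_000))       # 1250.0
print(htk_to_seconds(12_500_000))  # 1.25
```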