Search: voice verify.yml

57 results

Input audio quality

Quality of the audio is extremely important for satisfactory results of any speech processing technology, being it simple voice activity detection, speech transcription, voice biometry, or other. There are two main aspects of audio quality: technical quality of the audio data (format, codec, bitrate, SNR, …) sound quality of the actual content (background noise, reverberations, …) Technical quality Using inappropriate…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching Processing results can be optionally stored in results cache database to speed up eventual re-processing of the same recordings by the same technology…

Speech to Text (STT)

…language model to be used for transcription. As an output the transcription in one of the formats is provided. The technology extract features out of voice, using acoustic and language models together with pronunciation all in recognition network creates a hypothesis of transcribed words and „decode“ the most possible transcription. Based on requested output types one or more transcribed text…

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio record based by the individual speaker´s voice. It is a language-, domain- and channel-independent technology. It performs not only the segmentation of speakers but of technical signals and silence as well. The outputs of the technology can be both log files with labels and/or split audio files/one new…

Designing and Developing Application

…specific hardware (mainly CPU, virtualized infrastructure vs. HW) or are you going to buy specific HW for customer? What is short/long time storage requirements (ie. audio and results availability, desktop vs. distributed system)? Is there any synchronization required (ie. voiceprint database to clients)? What is the topology of the solution/app (ie. where to store audio, voiceprints, results, …)? How to…

SID: TUTORIAL: Speaker Identification – How to Do a Basic Test

Phonexia Speaker Identification is a voice biometry tool for recognition of speakers by their voice. In this video, we will show you how to start using this technology! You will learn how to create a “Speaker Model” to identify a speaker in a set of data. Ready to test it? Start with our video: What else is needed? 1. Phonexia…

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

…noise, technical signals like ringing, DTMF tones, etc). This metric is useful for finding performance on actual audio data coming into audio processing pipeline. Regular recording with Voice and Silence segments in waveform Net Speech based FtRT is conservative, purely technical number. It is calculated from only spoken speech data, i.e. with all non-speech parts (silence, noise, DTMF tones, etc.)…

Understand SPE technologies, instances and workers

…(or bank branch): Post office is a place providing different kinds of services – one can go there to send letters, send or pick up packages, get a POBox, get some financial services, insurance, etc.). Speech Engine has various speech technologies configured – one can analyze the audio quality, extract voiceprints from recordings, compare voiceprints, transcribe audio to text, etc….

Recommended OS and HW (PSP)

…external dependencies like databases, storages, etc.) would require additional resources. Therefore you should always perform a proper load test using your entire system to determine the actual HW requirements. To give you a picture, here are recommendations for typical configurations: Voice Biometrics, basic 100 hours/day package (***) files processing CPU: 8 physical cores, 1x Intel® Xeon E5-2640 v4 or similar…

Time Analysis Extraction (TAE)

…in the particular direction and details about crosstalk, for example where the other speaker is talking “over” this speaker Segmentation This section is optional and need to be explicitly turned on. It describes segments of detected voice and silence (the same as Voice Activity Detection technology). More information You can find more information in corresponding chapter of API documentation: https://download.phonexia.com/docs/spe/#Time%20Analysis…

Phonexia Academy

About Main idea of the Phonexia Academy is to help partners to understand the market, Phonexia’s products and technologies. Sell more, deliver your projects on time and at the highest quality, and support your clients effectively. We provide following trainings: Phonexia technologies introduction (online video course) Technical Training Essentials (online video course) Technical Training Advanced – 2 courses: Voice Biometrics…

Q: I am getting the error message “Your license is not for this application.”

A: Check your license file (license.dat) by opening it in Notepad. Make sure the license contains records for all required modules. See Licensing article for additional information…

Q: When I start VIN software, it gives me “License is not for this application” error.

A: Please attach the licensing file (license.dat) to the support ticket at our Service Desk….

Download Speech Platform

…XL5 Diarization (DIAR) – model XL4 Language Identification (LID) – model L4 Gender Identification (GID) – model XL5 Age Estimation (AGE) ) – model XL5 Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5 Speech Quality Estimation (SQE) Time Analysis Extraction (TAE) Waveform Denoiser (DENOISER) Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/) Step #2 – First start To get…

Understand SPE configuration

…of MySQL database connections at the time. Default is 32 # server.db.mysql.max_connections = 32 # Maximum size of in-memory cache for calibrated voice-prints of speaker models. Default is 100 # server.db.sid_model_calib_vp_cache_size = 100 Sizing of the system The selection of speech technologies and the number of instances per technology which are instantiated when starting the SPE is configured by the…