Search: quality

24 results

Speech Quality Estimation (SQE)

Phonexia’s Speech Quality Estimation quantifies the acoustic quality of recordings. This helps the user quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. In response to an SQE request, SPE returns a JSON or XML file. This file includes general information about the technology and statistics of all (one or two)…

Input audio quality

Quality of the audio is extremely important for satisfactory results of any speech processing technology, be it simple voice activity detection, speech transcription, voice biometry, or other. There are two main aspects of audio quality: technical quality of the audio data (format, codec, bitrate, SNR, …) and sound quality of the actual content (background noise, reverberations, …). Technical quality: Using inappropriate…

Q: How do you calculate SNR in Speech Quality Estimation?

A: Signal-to-Noise Ratio (SNR) is an important indicator of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in the waveform of speech follows a Gamma distribution. In contrast, noise…
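For orientation, the quantity being estimated can be sketched with the textbook definition of SNR in decibels, 10 · log10(P_speech / P_noise). This is not Phonexia's Gamma-distribution-based estimator, which the snippet above only begins to describe. The function name `snr_db` is hypothetical, and the sketch assumes the speech and noise samples have already been separated, which is exactly the step that makes automatic estimation hard.

```python
import math


def snr_db(speech_samples, noise_samples):
    """Textbook SNR in dB from the mean power of two sample sequences.

    Assumption: the caller already knows which samples are speech and
    which are noise. This is NOT how SQE separates them.
    """
    if not speech_samples or not noise_samples:
        raise ValueError("both sample sequences must be non-empty")

    # Mean power = average of squared amplitudes.
    speech_power = sum(x * x for x in speech_samples) / len(speech_samples)
    noise_power = sum(x * x for x in noise_samples) / len(noise_samples)

    if noise_power == 0:
        return math.inf   # no noise at all
    if speech_power == 0:
        return -math.inf  # no speech energy at all

    return 10.0 * math.log10(speech_power / noise_power)
```

With this definition, speech that carries 100 times the power of the noise yields about 20 dB, and equal powers yield 0 dB.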

Speech to Text (STT)

…business value for measuring the accuracy? What type of output will be used in the use case for which accuracy is measured? It may be that only nouns, verbs and adjectives are important for machine understanding of speech context, or all of the words are important when the output text is intended for human processing. Data quality Every metric requires…

Waveform Denoiser (DENOISER)

…or SID technologies. Q: How does the Denoiser perform if part of the recording is noisy and part of the speech is of good quality? The technology is being developed to automatically detect low-quality audio segments and try to reconstruct them. In contrast, well-recorded segments should be automatically recognized and retain their original speech quality. Q: Is there a…

Release Notes

…PESQ is a standard way of expressing speech quality as perceived by human beings. SQE: Real-time processing. A new technology model SQE_STREAM was added for real-time quality estimation on streams. Added Speaker Clustering endpoint for SID4 (SURPRISE of this release): allows comparing a set of voiceprints and receiving clusters of them. It will bring another level of effectiveness in…

SPE and Browser installation: standalone SPE

…10) Speaker Identification 4 VoicePrint Extractor [active model: XL5(1x)] 11) Speaker Identification 4 VoicePrint Comparator [active model: XL5(1x)] 12) Speaker Identification 4 VoicePrint Calibration [active model: XL5(1x)] 13) Speaker Identification 4 VoicePrint Stream Extractor [active model: XL5(1x)] 14) Speaker Identification 4 VoicePrint Stream Comparator [active model: XL5(1x)] 15) Speech Quality Estimation [active model: GENERIC(1x)] 16) Speech Quality Estimation Stream [active…

Releases and Changelogs (SPE)

…Evaluation of Speech Quality (PESQ) score estimation (PESQ is turned off by default for performance reasons) Fixed: Empty “info” in VAD result when recording contains seconds of speech for model GENERIC_3 Fixed: Incorrect timestamps in PHNREC results Fixed: Segmentation fault when dynamically changing preferred phrases with new STT decoder (new decoder is currently used only in CS_CZ_6) Fixed: Word separator…

Speaker Identification (SID)

…amount of evidence (amount of speech), channel, speech quality, etc. This step is very important for speaker spotting, or even in some forensic cases, so it is integrated in the SID technology and is performed in each voiceprint comparison. In the 4th generation of Phonexia SID, we have introduced the possibility to easily enhance the results with Audio Source Profiles….

Key Features (PSP)

…– detects the audio part that contains voice, Speech Quality Estimation (SQE) – measures the quality of speech, Phoneme Recognizer (PHNREC) – several languages supported – converts speech into phonemes (written characters representing pronunciation), Waveform Denoiser (DENOISER) – automatically improves the audibility of speech for human listeners. Supported Languages The LID, STT and KWS technologies support various languages as listed…

Understand SPE technologies configuration file

…Diarization; GID: Gender Identification; KWS: Keyword Spotting; KWS_STREAM: Keyword Spotting Stream; LIDC: Language Identification Languageprint Comparator; LIDE: Language Identification Languageprint Extractor; PHNREC: Phoneme Recognition; SID4C: Speaker Identification 4 Voiceprint Comparator; SID4C_STREAM: Speaker Identification 4 Voiceprint Stream Comparator; SID4CALIB: Speaker Identification 4 VoicePrint Calibration; SID4E: Speaker Identification 4 Voiceprint Extractor; SID4E_STREAM: Speaker Identification 4 Voiceprint Stream Extractor; SQE: Speech Quality Estimation…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching: Processing results can optionally be stored in a results cache database to speed up any later re-processing of the same recordings by the same technology…

Phonexia Academy

About: The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies. Sell more, deliver your projects on time and at the highest quality, and support your clients effectively. We provide the following trainings: Phonexia technologies introduction (online video course), Technical Training Essentials (online video course), Technical Training Advanced – 2 courses: Voice Biometrics…