Search: mean normalization

35 results

SID: Speaker Identification: Results Enhancement

…PSTN profile on the side of the PSTN voiceprint, and the YouTube profile on the side of the YouTube voiceprint). Results Enhancement Modes: There are currently 3 modes of result enhancement: Mean Normalization, False Acceptance Calibration, and User Calibration. These modes can be used either separately or in combination. Possible combinations are: Mean Normalization + False Acceptance Calibration, Mean Normalization +…

STT: Configuring word detection parameters for stream transcription

…i.e. the backward extension value actually says for how long the processing must be delayed (processing has to wait until that much input signal arrives) ⇒ increasing this value means that speech activity is detected with a longer delay (e.g. delayed barge-in detection in a voicebot implementation). The forward extension value basically means “add this much of a following signal to…

Q: What do LLR, LR and score mean?

A: These abbreviations mean the following: LR – likelihood ratio, the result of a statistical test comparing two models. It returns a number which expresses how many times more likely the data are under one model than the other. LR takes values in the interval <0;+inf). LLR – abbreviation for log-likelihood ratio statistic, a logarithmic function of LR. LLR takes values in the interval…
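
The relationship between LR and LLR described in this answer can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the natural logarithm is an assumption, since the snippet does not state which logarithm base is used.

```python
import math


def lr_to_llr(lr: float) -> float:
    """Convert a likelihood ratio to a log-likelihood ratio.

    Assumption: natural logarithm (the source does not state the base).
    """
    if lr < 0:
        raise ValueError("LR must lie in the interval <0;+inf)")
    if lr == 0:
        # log(0) is undefined; treat it as the limit, minus infinity.
        return -math.inf
    return math.log(lr)


def llr_to_lr(llr: float) -> float:
    """Convert a log-likelihood ratio back to a likelihood ratio."""
    return math.exp(llr)
```

Under this assumption, an LR of 1 (the data are equally likely under both models) corresponds to an LLR of 0, a positive LLR favours the first model, and a negative LLR favours the second.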

Speaker Identification (SID)

…well-calibrated system, the score of 1000 means that the user can be 1000 times more sure that the speaker in the questioned recording is the suspected speaker rather than someone else. Technically, it also means that 1 out of 1000 speakers was incorrectly detected in the development set. Another reason for calibration is for the score to be independent of the…

Terms of Service

…any law, regulation or guideline in appropriate jurisdictions. Services provided by PHONEXIA will not be accessed or attempted to be accessed through any means other than the user interface provided by PHONEXIA, including through any automated means. This includes, but is not limited to, any attempt to breach security features, to probe or test the website’s defenses, or to access…

Understand SPE configuration file

…authentication is not available. Using the authentication token mode (default) means that you need to call the /login endpoint (using HTTP Basic authentication) to get the X-SessionID token and then use this token in the HTTP request header for any subsequent REST queries which require authentication. Using the HTTP Basic authentication (when this setting is set to false) means that you…

Understand SPE workers configuration

…technologies) to ensure optimal performance and server utilization. These new defaults make the content of this article below obsolete; however, we keep it here for those who still want to fine-tune the configuration manually. The default workers configuration in settings/phxspe.properties is as shown below – 8 workers for file processing and 8 workers for realtime stream processing. These numbers mean

Release Notes

…dialog are updated to better express their meaning. For a complete list of the changes in SW, see these changelogs: SPE → CHANGELOG.txt included in the distribution, or Releases and Changelogs (SPE); BROWSER → CHANGELOG.txt included in the distribution, or Releases and Changelogs (Browser). Previous Releases: Speech Platform Public Release Fall 2022 (SPE v3.55). Welcome to the page introducing…

Speech Quality Estimation (SQE)

…is usually considered to be a useful signal; the higher the better. SNR > 15 usually means a signal with good quality. SNR = means the same amount of speech and noise. waveform_max_abs_value – the maximum amplitude; spread of the signal without measure. Usual encoding is in the range from -32768 to +32767; the ideal usage is all across the range if…

FAQs (PSP)

…FAQ Phonexia Browser, FAQ Speech Platform. Q: What do LLR, LR and score mean? A: These abbreviations mean the following: LR – likelihood ratio, the result of a statistical test comparing two models. It returns a number which expresses how many times more likely the data are under one model than the other. LR takes values in the interval <0;+inf). LLR…

Speech to Text (STT)

…list of words also n-grams of words are present. N-grams are useful during decoding and making a decision. The technology takes into account the word sequences gained from training to „decide“ which of the possible transcriptions is most accurate. Language models can differ for the same acoustic models. This means that they can include different words and different weights for…

Phonexia End User License Agreement

…UPDATES & UPGRADES. Updates means new version of the Software with same major version number with minor changes (correct program bugs etc.). The generic naming used by Phonexia for Software Updates is X.y (X = major version, y = minor version). Upgrade means new generation version of the Software with major changes (new generation of algorithm, expanded functionalities, etc.). Client…

STT: Results explained

…outputs The outputs can contain the following special tokens:

Token (5th STT generation and newer) | Token (legacy STT generations) | Meaning
<segment> | <s> | start of utterance
</segment> | </s> | end of utterance
<silence/> | _SILENCE_ or <sil/> | silent part (or no speech detected)
<null/> | _DELETE_ | time slot should not go to one-best output

Realtime stream processing output modes NOTE: Only single-channel (mono) audio…
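
The special tokens listed in the snippet above suggest a simple way to bring legacy STT output in line with the newer token names. The sketch below is an illustration only: the mapping is taken from the token list in the snippet, while `LEGACY_TO_CURRENT` and `normalize_tokens` are hypothetical names and not part of any Phonexia API.

```python
# Mapping from legacy STT tokens to their 5th-generation equivalents,
# as listed in the snippet above.
LEGACY_TO_CURRENT = {
    "<s>": "<segment>",         # start of utterance
    "</s>": "</segment>",       # end of utterance
    "_SILENCE_": "<silence/>",  # silent part (or no speech detected)
    "<sil/>": "<silence/>",     # alternative legacy silence token
    "_DELETE_": "<null/>",      # time slot should not go to one-best output
}


def normalize_tokens(tokens):
    """Replace legacy special tokens; leave ordinary words untouched."""
    return [LEGACY_TO_CURRENT.get(token, token) for token in tokens]
```

For example, `normalize_tokens(["<s>", "hello", "_SILENCE_", "</s>"])` yields `["<segment>", "hello", "<silence/>", "</segment>"]`.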