
Search: log

54 results

Understand SPE executable files

…See POST /audiofile endpoint documentation for details. phxclient: example 2 phxclient /login=admin /password=phonexia /method=GET /uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true" ./phxclient --login=admin --password=phonexia --method=GET --uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true" Process myfile.wav file stored in the root of SPE internal storage – e.g. uploaded using the previous example – using the Speech To Text (STT) technology model EN_US_6 (6th generation English), returning one_best and n_best result types, and disabling any…
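The URI passed to phxclient in the example above can be assembled programmatically. The following is a minimal sketch that only builds the query string; the helper name `build_stt_uri` and its parameters are hypothetical and not part of SPE or phxclient.

```python
from urllib.parse import urlencode


def build_stt_uri(host, path, model, result_types, cache_disable=True):
    """Build an STT request URI like the one in the phxclient example.

    Hypothetical helper for illustration only; it does not contact SPE.
    """
    query = urlencode(
        {
            "path": path,
            "model": model,
            "result_type": ",".join(result_types),
            "cache_disable": str(cache_disable).lower(),
        },
        safe="/,",  # keep "/" and "," unescaped, as written in the example
    )
    return f"{host}/technologies/stt/?{query}"


uri = build_stt_uri(
    "127.0.0.1:8600", "/myfile.wav", "en_us_6", ["one_best", "n_best"]
)
print(uri)
```

The resulting string can then be passed as the uri argument of phxclient, using the Windows or Linux syntax shown above.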

Open Source Acknowledgement

…License) link mman-win32 (Windows only) MIT ogg BSD-style license onnxruntime MIT, link openfst Apache License openssl OpenSSL opus BSD range-v3 BSL-1.0 scnlib Apache License 2.0 spdlog MIT speex revised BSD license speexdsp BSD utfcpp BSL-1.0 zlib Zlib stdlibc++, libgcc, libwinpthread (Windows only) GNU GPL with GCC Runtime Library Exception, link SPE dependencies Name License ADVobfuscator GitHub – andrivet/ADVobfuscator: Obfuscation library…

KWS: Results explained

…before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – L_kw and L_bg (any speech = background). The raw score is calculated as the log-likelihood ratio (LLR): score = log_e(L_kw / L_bg). Confidence is calculated from the raw score using a sigmoid function: where:…
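The raw score formula above can be illustrated with a minimal sketch. The function name `llr_score` and the input check are assumptions made for illustration; they are not taken from the KWS implementation.

```python
import math


def llr_score(l_kw, l_bg):
    """Raw score as the natural-log likelihood ratio ln(L_kw / L_bg).

    Hypothetical helper; both likelihoods are assumed to be positive.
    """
    if l_kw <= 0 or l_bg <= 0:
        raise ValueError("likelihoods must be positive")
    return math.log(l_kw / l_bg)


# Equal likelihoods give a score of 0; a keyword model that is more
# likely than the background model gives a positive score.
print(llr_score(1.0, 1.0))
print(llr_score(0.8, 0.2))
```

A positive score means the keyword model explains the audio better than the background model, and a negative score means the opposite.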

FAQs (PSP)

…– abbreviation for log-likelihood ratio statistic, a logarithmic function of LR. LLR takes values in the interval (-inf;+inf). Percentage (normalised) score – a commonly used mathematical transformation of the LLR to a percentage. This number is easier for humans to read but may raise doubts when LLR values are very high (typically for some non-adapted installations). Interval <0;100> (or sometimes <0;1>), in %. The…

Get better support

…information. It will help both parties provide the fastest and most efficient technical support to your customer: Issue data – required: LOG files or console output from the failed speech technology (for the command line), usually in the ./log/ directory; configuration files (technologies.xml from SPE is the minimum), usually in the ./settings/ directory; licensing file (license.dat), usually along the…

Q: What do LLR, LR and score mean?

A: These abbreviations mean the following: LR – likelihood ratio, the result of a statistical test comparing two models. It returns a number which expresses how many times more likely the data are under one model than the other. LR takes values in the interval <0;+inf). LLR – abbreviation for log-likelihood ratio statistic, a logarithmic function of LR. LLR takes values in the interval…

Speaker Identification (SID)

…are the same or if they are two different people. The ratio between these two probabilities is called the Likelihood Ratio (LR), which is often expressed in the form of a logarithm as Log Likelihood Ratio (LLR). Transformation to confidence (or percentage) is usually done using a sigmoid function: where: shift shifts the score to be at ideal decision point…
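The transformation from LLR to confidence can be sketched as follows. The exact formula is not shown in the snippet above, so this is only an illustration using a standard logistic function; the `sharpness` parameter and the sign convention of `shift` are assumptions, not the documented SID formula.

```python
import math


def llr_to_confidence(llr, shift=0.0, sharpness=1.0):
    """Map an LLR in (-inf, +inf) to a confidence in [0, 1].

    Assumed form: 1 / (1 + exp(-sharpness * (llr - shift))).
    `shift` moves the 0.5 point to the chosen decision threshold;
    `sharpness` (hypothetical) controls the steepness of the curve.
    """
    z = sharpness * (llr - shift)
    # Evaluate the logistic function without overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


for llr in (-5.0, 0.0, 5.0):
    print(llr, round(100 * llr_to_confidence(llr), 1), "%")
```

Multiplying the result by 100 gives a percentage score. Under this assumed form, an LLR equal to `shift` always maps to 50 %.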

Understand SPE administration and backup

…All others should have the “user” role (one user does not see the content of another user). See Understand SPE user accounts for details. user.home – where the server stores the users' data, see Understand SPE home directory for details LOG files – log file rotation is configured in phxspe.properties, see Understand SPE configuration file for details SPE database administration –…

Q: How to fix Error 1007: Unsupported audio format?

…ffmpeg utility, powerful and well documented. Please find your distribution package at http://ffmpeg.org Then continue as described below: Using Phonexia Browser with embedded SPE Open the Browser configuration dialog by clicking the “Settings” button located in the tool ribbon. Select the “Speech Engine” tab and configure SPE as described in the documentation. Don’t forget to select the “Enable audio converter” checkbox. Using SPE as service/daemon…

SID4 performance on Intel® Xeon® Platinum 8124M

Benchmark goals: find realistic performance using total recording length; find FTRT based exactly on net_speech (engineering sizing data); find system performance using all physical cores; find system performance using all logical cores. Infrastructure setup: Intel® Xeon® Platinum 8124M is used in a virtual machine with 8 physical cores reserved exclusively for this VM, Hyper Threading is enabled [16 logical cores available],…

What is User configuration file and how to use it

…directory and put the following lines in it (changing the forward extension parameter from default 750 to 1500): [vad.online_segmenter:SOnlineVoiceActivitySegmenterI] forward_extensions_length_ms=1500 Then after restarting SPE – and optionally checking in SPE log that user configuration file stt_cs_cz_5_online.bs.usr was really loaded (this information is available at the ‘trace’ logging level only) – the STT results should show end of segment less frequently….

STT: What is Preferred Phrases feature and how to use it

…details in Adding words to STT language model article. Legacy preferred phrases (SPE 3.32 – 3.42) have a number of limitations: adding words to the dictionary is not supported; only words already known by the language model are allowed in preferred phrases. Phrases containing unknown words are ignored and a warning message is logged to the SPE log. Therefore, to be able…

Understand SPE database scripts

…for SPE database content update As the SPE evolves and new features come or functionality gets improved, the database structure needs to change from time to time. So, when updating from older SPE version to newer version, the database content needs to be updated. Therefore, the database structure is versioned – database version is listed in SPE changelog together with…

Key Features (VIN)

…speakers) Supported audio format: MS Wave or RAW with linear coding (8 or 16 bits), A-law, Mu-law; Sampling frequency 8kHz or higher Output: A scoring table with the results of comparisons in a Likelihood Ratio, Log-Likelihood Ratio (decimal or natural logarithm), and Verbal Ratio The graphical presentation of results in the form of a Probability Density Function plot and a…

FAQs (Browser)

…are under one model than the other. LR takes values in the interval <0;+inf). LLR – abbreviation for log-likelihood ratio statistic, a logarithmic function of LR. LLR takes values in the interval (-inf;+inf). Percentage (normalised) score – a commonly used mathematical transformation of the LLR to a percentage. This number is easier for humans to read but may raise doubts when LLR values are too…