Search: Server configuration

64 results

Understand SPE technologies configuration file

…technologies.xml file containing the following setup: STT (Speech To Text) with 8 instances of SK_SK_5 model; STT_STREAM (Speech To Text for stream processing) with 2 instances of CS_CZ_6 model; SID4E (Speaker Identification 4 Voiceprint Extractor) with 2 instances of L4 model and 3 instances of XL4 model; SID4C (Speaker Identification 4 Voiceprint Comparator) with 2 instances of L4 model and 3 instances…

Terms of Service

…PHONEXIA. To terminate your account, Member must contact PHONEXIA’s Customer Service at phonexia.com or by email, or Member can cancel their account via the PHONEXIA website directly. 10.2. Deletion of Content for Expired or Terminated Accounts. If Member’s account should expire or be terminated, PHONEXIA has the right to delete account content. 10.3. Account Access Upon Expiration or Termination. Upon expiration…

Releases and Changelogs (Browser)

…Added speech length to SQE detail view [G#84] Support for phrases in keyword lists [G#76] Fixed remove metafile from speaker model during creating new one [G#74] Fixed SID Evaluation results are different from command line if calibrated VPs are used Phonexia Browser v3.12.1, BSAPI 3.16.0 – Sep 04 2018 [#70] Fixed Denoiser is not recognized in an embedded SPE Phonexia Browser…

Phonexia technology models EoL

…4th generation models, typically marked with a number 1, 2, 3 or 4 in the model name. Other technology models (SID, LID, GID, DIAR, AGE, SQE, VAD, DENOISE) Tech. models supported (generation specified by number in “Tech. model name”). Technology Tech. model name Released End of support Maintenance SID4 XL5 2022-09 6th gen. SID 5th gen. SID XL4 2020-03…

Support Lifecycle Policy (PSP)

…3 or 4 in the model name. Other technology models (SID, LID, GID, DIAR, AGE, SQE, VAD, DENOISE) Tech. models supported (generation specified by number in “Tech. model name”). Technology Tech. model name Released End of support Maintenance SID4 XL5 2022-09 6th gen. SID 5th gen. SID XL4 2020-03 6th gen. SID 5th gen. SID L4 2019-02 6th gen….

Arabic dialects in Phonexia LID and STT

…code ar-XL, where the XL means “cross-Levantine” 😉 NOTE: To get the best STT results, use the model that corresponds to the given dialect. The AR_XL_* model is best suited to Levantine dialect recordings. When the AR_XL_* model is used for a neighboring dialect, e.g. Iraqi, the results will be much worse… and for e.g. Maghrebi, the results will most probably be completely unusable….

FAQs (PSP)

…hours of audio for each new language model (or 25+ hours of audio containing 80% of speech). Only 1 language per record. For adapting the existing language model (discriminative training): 10+ hours of audio for each language. May be done on customer site. May be done in Phonexia using anonymized data (= language-prints extracted from a .wav audio) in FAQ…

Understand SPE multithreaded technologies initialization

…rather for stable environments, e.g. production deployments. Note that separate threads are used only for distinct technology–model combinations. Multiple instances of the same technology–model combination are NOT initialized in parallel. The number of threads used for the multi-threaded initialization can be configured using server.technology_multithread_initialization.n_threads setting. Default value is 0, which determines the number of threads automatically according to number of…

Phonexia End User License Agreement

…running on servers providing services for Phonexia Partner’s customers. 2.4 Production license which can be used for commercial exploitation. Within the delivery, Client will receive Production license(s) corresponding with the mutually agreed payment model for the particular type of Software. Unless otherwise agreed, the standard Production license validity is set for twenty (20) years from the date the license commences. 2.5 Special…

Understand SPE home directory

…uploading file using POST /audiofile physically creates the file on filesystem in the storage location… and the file stays there until it’s explicitly deleted using DELETE /audiofile. There might be various reasons to NOT use the REST API for uploading files to the Speech Engine, e.g. to save the server from unwanted burden caused by many uploads and/or big files……
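The upload and delete calls named in this snippet could be sketched as follows. This is only an illustration under stated assumptions: the base URL and the `path` query parameter are hypothetical placeholders, authentication is omitted, and the requests are built but never sent. Only the `POST /audiofile` and `DELETE /audiofile` method and endpoint pairs come from the snippet.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address; replace with the real Speech Engine host.
BASE_URL = "http://localhost:8600"


def build_upload(path: str, audio: bytes) -> Request:
    """Build (but do not send) a POST /audiofile request carrying raw audio bytes."""
    query = urlencode({"path": path})  # "path" parameter name is an assumption
    return Request(f"{BASE_URL}/audiofile?{query}", data=audio, method="POST")


def build_delete(path: str) -> Request:
    """Build (but do not send) a DELETE /audiofile request for a stored file."""
    query = urlencode({"path": path})  # same assumed parameter as above
    return Request(f"{BASE_URL}/audiofile?{query}", method="DELETE")


# Placeholder payload and storage path, for illustration only.
upload = build_upload("/calls/example.wav", b"RIFF....")
delete = build_delete("/calls/example.wav")
print(upload.get_method(), upload.full_url)
print(delete.get_method(), delete.full_url)
```

Passing either object to `urllib.request.urlopen` would send it. As the snippet notes, an uploaded file stays in storage until the matching delete call is made.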

Understand SPE benchmark

…use the benchmark to compare performance of new SPE version, new technology generation, or different technology model, on the same HW configuration… and so on. Running benchmark: The benchmark can be run in two ways: (1) by calling the …/benchmark endpoint as documented; (2) by calling the …/benchmark endpoint as documented, with an additional path parameter. The first option uses a set of audio files supplied with…

Understand SPE configuration

…# The value specifies the maximum number of archived log files. If the number is exceeded, # archived log files are deleted, starting with the oldest. # server.logging.file.purge_count = 5 The following directive is important mainly for tuning and initiation of the SPE. Turn it on when the system is prepared to go live. # Use separate thread for logging….

Speaker Identification (SID)

…technological model and can range from 5 to 50 times faster than real time on 1 server CPU core. Voiceprint extraction is the most time-consuming part of the process. Voiceprint comparison, on the other hand, is extremely fast – millions of voiceprint comparisons can be done in 1 second. Voiceprint extraction (Speaker enrollment) Speaker enrollment starts with the extraction…
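As a rough illustration of the speed range quoted in this snippet, the sketch below estimates single-core processing time for a batch of audio. The function name and the 100-hour workload are hypothetical; only the 5x to 50x real-time range comes from the text.

```python
def processing_hours(audio_hours: float, speed_factor: float) -> float:
    """Estimate wall-clock hours on one CPU core for a faster-than-real-time factor."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


# Hypothetical workload: 100 hours of recordings.
slowest = processing_hours(100, 5)   # lower bound of the quoted range
fastest = processing_hours(100, 50)  # upper bound of the quoted range
print(f"{fastest:.1f} to {slowest:.1f} hours on one core")
```

Actual throughput depends on the technology model and hardware, so these figures only bound a first estimate.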

Speech Engine update

…original words list (which is copied in the root of your customized model). Also, before deleting the directory, make sure that the new SPE package contains the same set of technologies (i.e. subdirectories of the bsapi directory) as your existing installation. Unzip the new SPE package (the package must contain all needed technologies!). Compare the settings file template from the new package (data/phxspe.properties.default) against your…

KWS: Results explained

…before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – L_kw and L_bg (any speech = background). Raw score is calculated as log likelihood ratio (LLR): score = log_e(L_kw / L_bg). Confidence is calculated from the raw score using a sigmoid function: where:…
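The raw score described in this snippet can be sketched in a few lines. The snippet cuts off before giving the confidence formula, so the sigmoid below is the plain logistic function, used purely as an assumption; the documented version may include scaling or shift parameters. Function names are illustrative.

```python
import math


def raw_score(l_kw: float, l_bg: float) -> float:
    """Log likelihood ratio: natural log of keyword likelihood over background likelihood."""
    return math.log(l_kw / l_bg)


def confidence(score: float) -> float:
    """Plain logistic sigmoid. ASSUMPTION: the source's exact formula is truncated."""
    return 1.0 / (1.0 + math.exp(-score))


# Equal likelihoods give a raw score of 0, which the plain sigmoid maps to 0.5.
score = raw_score(0.2, 0.2)
print(score, confidence(score))
```

A keyword likelihood above the background likelihood gives a positive score and, under this assumed sigmoid, a confidence above 0.5.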