
Search: spe

127 results

FAQs (PSP)

…play with “pure” daemon installation, then the phxspe.properties file should exist in the ./settings subdirectory. The file is created by the phxadmin utility, or it can be created from the ./data/phxspe.properties.default template file: copy the template file to the ./settings directory; rename it to phxspe.properties; check the server.enable_authentication_token directive and set it as needed; restart phxspe. Basic installation steps are described in the ./doc/INSTALL.html document. in FAQ Phonexia…
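The file-preparation steps in this snippet can be sketched in Python. This is only a minimal illustration, assuming the directory layout named above (./data/phxspe.properties.default and ./settings); the function name `install_default_config` and the `install_root` parameter are hypothetical. Checking server.enable_authentication_token and restarting phxspe remain manual steps.

```python
import shutil
from pathlib import Path


def install_default_config(install_root: str) -> Path:
    """Copy the template into ./settings under its final name.

    Hypothetical helper: paths follow the layout described in the snippet.
    An existing phxspe.properties is left untouched.
    """
    root = Path(install_root)
    template = root / "data" / "phxspe.properties.default"
    target = root / "settings" / "phxspe.properties"
    if target.exists():
        # Do not overwrite a configuration that phxadmin or a user already created.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy and rename in one step: the copy is written under the final name.
    shutil.copyfile(template, target)
    return target
```

Returning early when the target exists is a design choice of this sketch, so that rerunning it never discards local edits.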

Understand SPE workers configuration

A worker is a working thread that performs the actual processing of files or realtime streams in Speech Engine. This article helps you understand Speech Engine workers and explains how to configure them for optimal performance and server utilization. Starting from SPE 3.51, new defaults in settings/phxspe.properties make SPE configure workers automatically according to local conditions (physical CPU cores, configured…

Understand SPE processing queue

This article explains the SPE queue for processing asynchronous requests, the processing task lifecycle and its handling states. When SPE receives an asynchronous API request, a task is created and put in a queue. The size of the queue is defined by the server.n_task_limit setting in the SPE configuration file; the default value is 1000 tasks. Tasks in the queue are then handled according to…
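The idea of a size-limited task queue can be sketched as follows. This is a generic illustration, not SPE's implementation: the class `TaskQueue` and its methods are hypothetical, first-in-first-out order is an assumption (the snippet is truncated before it says how tasks are ordered), and rejecting a task when the queue is full is also an assumption. Only the default limit of 1000 comes from the snippet.

```python
import queue

# Default value of server.n_task_limit according to the snippet.
N_TASK_LIMIT = 1000


class TaskQueue:
    """Hypothetical bounded queue for asynchronous processing tasks."""

    def __init__(self, limit: int = N_TASK_LIMIT):
        self._q = queue.Queue(maxsize=limit)

    def submit(self, task) -> bool:
        """Enqueue a task; return False if the limit is reached (assumed behavior)."""
        try:
            self._q.put_nowait(task)
            return True
        except queue.Full:
            return False

    def next_task(self):
        """Return the oldest queued task; raises queue.Empty if there is none."""
        return self._q.get_nowait()
```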

Sizing of the computing units for speech technologies

…cores = 64 GB Conclusion: The best computing performance can be expected from a CPU with l3_cache_size / #_of_physical_CPU_cores >= 2.5 MB. Memory bandwidth and speed are more important than CPU base frequency. Intel's TLB fixes for the Meltdown and Spectre issues affect performance. Important notice (valid for SPE3): due to internal SPE3 requirements you must multiply the required number of…
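The L3 cache guideline in this snippet is simple arithmetic and can be checked with a short sketch. The function names are hypothetical, and the threshold is read here as "at least 2.5 MB of L3 cache per physical core", which is an interpretation of the snippet's notation.

```python
# Threshold from the snippet, interpreted as MB of L3 cache per physical core.
MIN_L3_MB_PER_CORE = 2.5


def l3_per_core_mb(l3_cache_mb: float, physical_cores: int) -> float:
    """Return the L3 cache size available per physical core, in MB."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return l3_cache_mb / physical_cores


def meets_l3_guideline(l3_cache_mb: float, physical_cores: int) -> bool:
    """True if the CPU has at least 2.5 MB of L3 cache per physical core."""
    return l3_per_core_mb(l3_cache_mb, physical_cores) >= MIN_L3_MB_PER_CORE
```

For instance, an illustrative CPU with 32 MB of L3 cache and 8 physical cores gives 4 MB per core and satisfies the guideline, while 16 MB shared by 16 cores gives 1 MB per core and does not.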

Speech Quality Estimation (SQE)

Phonexia’s Speech Quality Estimation quantifies the acoustic quality of recordings. This helps the user quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. As the SQE result, SPE returns a JSON/XML file. This file includes general information about the technology and statistics of all (one or two)…

Speaker Diarization (DIAR)

…silence as well. The outputs of the technology can be log files with labels and/or split audio files or one new multichannel audio file. Typical use cases: preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversations recorded in mono into several channels, identifying how many speakers are speaking in the recording…

Documentation (SPE)

Partners and customers are encouraged to read the Speech Engine (PhxSpe | PhxSpe.exe) software API reference and the various manuals available as files in [SPE]/doc in the standard software package and installation. You can also find the REST API reference (Speech Engine) documentation online. You might be interested in reading the following information in the manual: REST API reference, Structure of API queries, Asynchronous request…

Releases and Changelogs (Browser)

SPE starts (the --spe-output parameter is not needed anymore and is ignored). Improved: SPE debug output is now configurable in the Settings dialog and is enabled by default (the --spe-debug parameter is ignored). Improved: Quick Setup Guide is now displayed automatically when Browser is run for the first time (the --guide parameter is ignored). Fixed: Quick Setup Guide dialog items…

Understand SPE processing priority

…enabled in the SPE configuration file (enabled by default, see the server.task_priorities_enable option) and default priority value set; the prioritize role enabled for the SPE user creating the processing task. If prioritization is enabled and a processing task is started by a user without the “prioritize” role, the task is started with default priority. Task priority is defined by a number from (highest priority) to…

Understand SPE multithreaded technologies initialization

The server.technology_multithread_initialization setting in the SPE configuration allows SPE to initialize instances of technologies during startup using multiple parallel threads. The default setting is OFF, i.e. instances of technologies are initialized one by one using a single thread. This makes it easier to track possible issues during SPE startup and keeps the technology initialization log messages readable (only a single initialization happens at a time). The downside…

Q: How do you calculate SNR in Speech Quality Estimation?

A: Signal-to-Noise Ratio (SNR) is an important indicator of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in the waveform of speech follows a Gamma distribution. In contrast, noise…

Video – Speech Analytics technologies

MODULE 4: Speech Analytics technologies (23 min): Common generic rules for CLI, REST and GUI; Speech To Text (STT) in CLI, REST and GUI; Keyword Spotting (KWS) in CLI, REST and GUI; Phoneme Recognizer (PHNREC) in CLI, REST and GUI; Time Analysis Extraction (TAE) in CLI, REST and GUI; Summary. https://www.youtube.com/watch?v=-FAoRywqv7U…

LID: Terminology and adaptation

…tool which SPE configuration file to use. The default Speech Engine configuration file is settings/phxspe.properties. However, when using Phonexia Browser in “SPE on localhost” mode (also known as “Embedded SPE”), the configuration file is settings/phxspe.browser.properties. (Make sure to use the right configuration file, otherwise you might register the language pack to a different configuration, and it therefore won’t be visible where you…