Search: task

18 results

Understand SPE configuration file

…are then injected at the front of the queue and processed as soon as possible. You can get more information about task priorities in the Task prioritization section of the REST API documentation and in the Understanding SPE processing priority article. server.task_default_priority # Task default priority. Priority must be number from to 99. This priority will be set # to each task if request doesn’t…

Releases and Changelogs (SPE)

tasks can leave files locked forever Fixed: Files are locked forever when task limit is reached Fixed: Misleading and confusing licensing error when HW profile does not match Fixed: Trace log messages during model initialization say “Unknown interface” Fixed: STT_STREAM fails to process request with preferred phrases when no instance of STT with the same model is running Fixed: STT:…

Understand SPE processing queue

…workers are not shown): non-colored pending tasks are waiting in the queue (task state in the response to GET /pending/{ID} request says “waiting”); colored pending tasks are being handled by processing workers (task state in the response to GET /pending/{ID} request says “running”). SPE is configured with 3 standard and 2 realtime workers, i.e. 5 tasks are handled simultaneously (one…

Understand SPE processing priority

…enabled in SPE configuration file (enabled by default, see server.task_priorities_enable option) and default priority value set; the prioritize role enabled for the SPE user creating the processing task. If prioritization is enabled and a processing task is started by a user without the “prioritize” role, the task is started with the default priority. Task priority is defined by a number from (highest priority) to…

Understand SPE configuration

…Realtime workers must be set separately to enable RTP or HTTP stream processing. Principally, it is the same as setting normal workers: server.n_realtime_workers = The SPE maintains its own queue of task requests waiting to be processed and the following directive determines the allowed number of tasks in the queue. You can do fine-tuning for each SPE instance load and…

Understand SPE workers configuration

…the maximum number of simultaneously running tasks. # Multithread settings server.n_workers = 8 server.n_realtime_workers = 8 Requests for additional file processing tasks are put in a queue and processed according to their order and priorities. Requests for additional stream processing tasks are refused with HTTP status 403 (the realtime nature of stream processing does not allow any queuing). File processing can…
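
The admission behaviour described in this snippet can be illustrated with a small model. This is a sketch only, not SPE code: the `Admission` class, its method names and its in-memory counters are hypothetical, and the default worker counts simply mirror the `server.n_workers = 8` and `server.n_realtime_workers = 8` values from the excerpt. Priorities, queue size limits and task completion are left out.

```python
from collections import deque


class Admission:
    """Toy model of the admission rules quoted above (hypothetical, not SPE code)."""

    def __init__(self, n_workers=8, n_realtime_workers=8):
        self.n_workers = n_workers
        self.n_realtime_workers = n_realtime_workers
        self.running_files = 0
        self.running_streams = 0
        self.file_queue = deque()  # file tasks waiting for a free worker

    def submit_file(self, task_id):
        """File tasks run if a worker is free, otherwise they wait in the queue."""
        if self.running_files < self.n_workers:
            self.running_files += 1
            return "running"
        self.file_queue.append(task_id)
        return "waiting"

    def submit_stream(self, task_id):
        """Stream tasks are never queued: they either run or are refused.

        "refused" stands for the HTTP status 403 mentioned in the excerpt.
        """
        if self.running_streams < self.n_realtime_workers:
            self.running_streams += 1
            return "running"
        return "refused"
```

With two file workers and one realtime worker, for example, a third file task ends up `"waiting"` while a second stream task is `"refused"`.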

LID: Terminology and adaptation

…to train a language using just a few long audio files (like 5 files, 1 hour each) Acoustic channels should be as close as possible to the channel of the intended deployment Adaptation using REST API (SPE 3.38 or newer) SPE 3.38 and newer include LID adaptation tasks in the REST API, which makes the adaptation significantly easier than in previous versions….

STT: What is Preferred Phrases feature and how to use it

Preferred phrases is a feature available for the 5th or newer generation of STT models and Speech Engine 3.32 or later. This article explains what the feature is good for and how it works internally, and gives some tips for practical implementation. What are preferred phrases In speech transcription tasks, there may be situations where similar-sounding words get confused,…

Understand SPE user accounts

…name and password (obviously 😉) whether the account is active or not – accounts may be turned off and on user role – any combination of user – allowed to use all SPE functionality, except for the /admin/… endpoints admin – allowed to use all SPE functionality, including the /admin/… endpoints prioritize – allowed to prioritize SPE tasks, see Task

Release Notes

…consuming) from audio only once and sent for comparison (fast) to both SID4_XL4 and GID_XL4. VAD has been upgraded to a new generation (tech. model GENERIC_3) The model (GENERIC_3) was released for standalone Voice Activity Detection (VAD as part of SPE). It brings higher accuracy in the fundamental task of recognizing speech and non-speech (silence, ringing, etc.) correctly. Using…

Q: How do I get results for a pending operation?

A: If the server responds to a pending request with status 200 – OK, the body of the response contains the result (the server already has the result in cache memory and there is no need to process the file again). If the server responds to a pending request with status 202 – Accepted, the server will create a task and will begin to…
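
A client can branch on the two status codes described in the answer above. The following is a minimal sketch: the function name and the returned labels are hypothetical and not part of any SPE client library. Since the excerpt is cut off after the 202 case, treating 202 as "ask again later" is an assumption.

```python
from http import HTTPStatus


def interpret_pending_response(status_code, body=None):
    """Map the HTTP status of a pending request to what the client should do next."""
    if status_code == HTTPStatus.OK:
        # 200: the result is already in the response body.
        return ("done", body)
    if status_code == HTTPStatus.ACCEPTED:
        # 202: a task was created and processing has started.
        # Assumption: the client should repeat the request later.
        return ("pending", None)
    # Any other status is outside the two cases covered by the excerpt.
    return ("unexpected", None)
```

The function only classifies a response that has already been received, so it can be combined with any HTTP client.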

Sizing of the computing units for speech technologies

…VT features can’t help in performance) Also look for CPUs with a large L3 cache; the better CPUs are those with a higher l3_cache_size/#_of_physical_CPU_cores ratio. We currently assume that CPUs from the current Intel Xeon Family in the 4th generation are the best. For small computation tasks, i7 family CPUs also have a reasonable price/performance ratio) Big challenge: correct SPE3/Speech platform…

Speaker Identification (SID)

…technical capabilities of text-independent speaker recognition. The objective is to drive the technology forward and, through competition, find the most promising algorithmic approaches for our future production-grade technology. Basic use cases and application areas The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker…

Q: How do you calculate SNR in Speech Quality Estimation?

A: Signal-to-Noise Ratio (SNR) is an important metric of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in the waveform of speech follows a Gamma distribution. In contrast, noise…

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

…configurations. And vice versa – using the same metric, you can compare software from different vendors on the same HW configuration and for the same processing task. We recognize two measurable metrics: Audio based FtRT is calculated from actual audio in its original form, i.e. containing parts with spoken speech and also parts with silence or other non-speech signal (background…