Search: multi

40 results

Understand SPE multithreaded technologies initialization

…of single-threaded initialization is that it may take longer to fully initialize the whole system, depending on the actual technology configuration (number of initialized technologies and instances). In a multi-threaded configuration, instances of each technology are initialized in multiple parallel threads, one separate thread for each technology–model combination. This generally results in faster initialization of the whole system. On…
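The difference between the two modes can be illustrated with a minimal Python sketch. It is not SPE code: the technology–model pairs, the `init_instance` function, and the 0.2 s delay are hypothetical stand-ins, chosen only to show why one thread per combination tends to finish sooner than a sequential loop.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical technology-model combinations; the names are placeholders only.
COMBINATIONS = [("STT", "model_a"), ("LID", "model_b"), ("SID", "model_c")]


def init_instance(technology, model):
    """Stand-in for a slow initialization step (not a real SPE call)."""
    time.sleep(0.2)
    return f"{technology}/{model} ready"


def init_single_threaded(combinations):
    # One combination after another: total time is the sum of all steps.
    return [init_instance(t, m) for t, m in combinations]


def init_multi_threaded(combinations):
    # One worker thread per technology-model combination.
    with ThreadPoolExecutor(max_workers=len(combinations)) as pool:
        futures = [pool.submit(init_instance, t, m) for t, m in combinations]
        return [f.result() for f in futures]


def timed(func, *args):
    """Run func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, serial = timed(init_single_threaded, COMBINATIONS)
    _, parallel = timed(init_multi_threaded, COMBINATIONS)
    print(f"single-threaded: {serial:.2f}s, multi-threaded: {parallel:.2f}s")
```

In this toy setup the sequential run takes roughly the sum of the individual delays, while the threaded run takes roughly the longest single delay.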

Understand SPE configuration file

…multiple threads to speed up initialization of technologies server.technology_multithread_initialization = false This setting controls whether technology initialization during SPE startup should run in multiple parallel threads or not. The default value is false, i.e. the initialization runs in a single thread. See the Understanding SPE multithreaded technologies initialization article for more details. server.technology_multithread_initialization.n_threads # Number of threads for initialization of technologies # Use…

Releases and Changelogs (SPE)

…– the following technologies/models used multiple threads when they should not: STT/KWS – all 6th generation models; LID – L4 model; DIAR, GID, SID4 – XL4 model; SQE – GENERIC model; VAD – GENERIC3 model. Speech Engine 3.45.6, DB v1701, BSAPI 3.45.7 (2022-04-14) Fixed: Licensing subsystem fails to get license when multiple applications run under different OS user accounts. Speech…

Understand SPE configuration

…0022 Data storage and multithread settings The home directory of SPE contains all user data including audio recordings and metadata files from speech processing (speaker models, description etc.). This is another good example of using environment variables if your topology design requires multiple instances of SPE processing the same payload. This is great for sharing raw data between multiple physical…

STT: Results explained

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea…

Releases and Changelogs (Browser)

…Public 3.7 2017-03-29 2018-09-29 3.8 Public 3.6 2016-12-15 2019-06-15 3.7 Public 3.5 2016-10-19 2018-04-19 3.6 Public 3.4 2016-09-21 2018-03-21 3.5 Public 3.3 2016-07-13 2018-01-13 3.4 Public 3.1 2016-02-25 2017-08-25 3.3 Public 1.2 2015-10-07 2017-04-07 2017-04-07 Public Changelogs Phonexia Browser 3.60 (Public release) Phonexia Browser 3.60.0, BSAPI 3.60.0 (2023-12-05) New: Transcriptions of multiple files can be saved using the context menu…

Understand SPE directory structure

…on which technologies are included in the particular SPE installation. For testing and first-time evaluation we usually include the full set of technologies; other installations may contain only a limited subset. The location of the bsapi directory can be modified using the bsapi.path option in the SPE configuration file. This might be useful in a complex network infrastructure, for sharing technologies between multiple SPEs, and similar…

Understand SPE database

SPE database serves multiple purposes: stores SPE internal data; stores various information about SPE entities created by SPE user (audio files metadata, speaker models and their voiceprints, speaker groups and their voiceprints, calibration sets, keyword lists, language packs, audio source profiles); stores cached processing results (ON by default, can be set in SPE configuration file); optionally also stores SPE log…

Keyword Spotting (KWS)

…experts. Typical use cases: Call centers: increase operator and supervisor efficiency by searching calls; identify inappropriate expressions from operators; check marketing campaigns with automatic script-compliance control. Mass media and web search servers: index and search multimedia by keyword; route multimedia files and streams according to their content. Security/defense: maintain fast reaction times by routing calls with specific content to human…

STT: Language Model Customization tutorial

…Phonemes_for_STT_and_KWS (or Annex2 in older versions) PDF file. If a pronunciation is not explicitly specified, a default one generated internally will be used. To add multiple pronunciation variants for the same word, enter multiple word–pronunciation pairs, each on a separate line. An example of an English word list: the words iPhone and contract don’t have any specific pronunciations defined; the word schneider…

Language Identification (LID)

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Application areas: preselecting multilingual sources and routing audio files to language-dependent technologies (transcribing, indexing, etc.); analyzing network traffic media (language statistics)…

Releases and Changelogs (VIN)

…for Expert and Organization Voice Inspector 5.0 Voice Inspector 5.0.0, BSAPI 3.57.0 (2023-06-20) New: Speaker Identification XL5 technology model New: Data in lists/tables are now sorted alphabetically New: Enlarge the initial set of speakers included in examples; some of the speakers are multilingual ❗❗❗ Voice Inspector 5.0 requires a new license. To upgrade from version 4 or 3, please contact…

LID: Terminology and adaptation

This article describes various ways of adapting Language Identification. Basic terminology: Languageprint (*.lp file) – numeric representation of the audio, extracted from an audio file for the purpose of language identification (similar to “voiceprint”, but representing the sound of the spoken language, not the sound of the speaking person). Languageprint archive (*.lpa file) – multiple languageprints combined into a single archive. Languageprint archives come pre-created…

Understand SPE workers configuration

…the maximum number of simultaneously running tasks. # Multithread settings server.n_workers = 8 server.n_realtime_workers = 8 Requests for additional file processing tasks are put in a queue and processed according to their order and priorities. Requests for additional stream processing tasks are refused with HTTP status 403 (the realtime nature of stream processing does not allow any queuing). File processing can…
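The queue-versus-refuse behaviour described in this snippet can be modelled with a short Python sketch. This is a toy model, not SPE code: the `Dispatcher` class, its method names, and the returned status strings are hypothetical, and it assumes that a larger priority number means earlier processing, which the snippet does not specify.

```python
import heapq
import itertools


class Dispatcher:
    """Toy model of the behaviour described above; not SPE code."""

    def __init__(self, n_workers, n_realtime_workers):
        self.free_workers = n_workers
        self.free_realtime_workers = n_realtime_workers
        self.file_queue = []  # heap of (-priority, arrival number, name)
        self._arrival = itertools.count()

    def submit_file_task(self, name, priority=0):
        if self.free_workers > 0:
            self.free_workers -= 1
            return "running"
        # No free worker: queue by priority (assumed: higher first),
        # then by arrival order.
        heapq.heappush(self.file_queue, (-priority, next(self._arrival), name))
        return "queued"

    def submit_stream_task(self, name):
        if self.free_realtime_workers > 0:
            self.free_realtime_workers -= 1
            return "running"
        # Streams cannot wait in a queue, so the request is refused.
        return "refused_403"

    def finish_file_task(self):
        """Hand the freed worker to the next queued task, if there is one."""
        if self.file_queue:
            _, _, name = heapq.heappop(self.file_queue)
            return name
        self.free_workers += 1
        return None


if __name__ == "__main__":
    d = Dispatcher(n_workers=1, n_realtime_workers=1)
    print(d.submit_file_task("file-1"))      # worker is free
    print(d.submit_file_task("file-2"))      # worker busy, task waits
    print(d.submit_stream_task("stream-1"))  # realtime worker is free
    print(d.submit_stream_task("stream-2"))  # no realtime worker left
```

The design point is the asymmetry: file tasks can wait for a worker, whereas a stream request that finds no free realtime worker is rejected immediately.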

STT: What is Preferred Phrases feature and how to use it

…support in that model. Using a class token in preferred phrases makes it possible to improve transcription accuracy with a rather generic sentence that represents multiple variants of the sentence. Use the class name prefixed with $ as the class token. Example: My name is $first_name $surname and I live in $municipality at $street street. The words to be added, listed in the…
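To see why one generic sentence "represents multiple variants", the following Python sketch expands `$class` tokens into every concrete sentence they stand for. It only illustrates the idea and says nothing about how the STT engine handles class tokens internally; the `expand` helper and the class members are hypothetical.

```python
import itertools
import re

# A class token is the class name prefixed with "$", e.g. "$first_name".
TOKEN = re.compile(r"\$(\w+)")


def expand(template, classes):
    """Yield every sentence obtained by substituting class members."""
    names = TOKEN.findall(template)
    for combo in itertools.product(*(classes[name] for name in names)):
        values = iter(combo)
        # Tokens are replaced left to right, matching the order in `names`.
        yield TOKEN.sub(lambda _match: next(values), template)


if __name__ == "__main__":
    # Hypothetical class members, for illustration only.
    classes = {
        "first_name": ["Anna", "John"],
        "surname": ["Smith", "Novak"],
    }
    for sentence in expand("My name is $first_name $surname.", classes):
        print(sentence)
```

With two members in each of two classes, the single template stands for four concrete sentences; the count grows multiplicatively with every additional token.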