Search: spe-3.11.1-win64

127 results

Understand SPE multithreaded technologies initialization

The server.technology_multithread_initialization setting in the SPE configuration lets SPE initialize technology instances during startup using multiple parallel threads. The default setting is OFF, i.e. technology instances are initialized one by one on a single thread. This makes it easier to track potential issues during SPE startup and keeps the initialization log messages readable (only one initialization happens at a time). The downside…
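The difference between the two startup modes described above can be pictured with a minimal Python sketch. It is only an illustration of one-by-one versus parallel initialization, not SPE's actual implementation: `init_instance`, `init_sequential`, `init_parallel` and the `workers` parameter are hypothetical names introduced here.

```python
from concurrent.futures import ThreadPoolExecutor


def init_instance(name):
    # Hypothetical stand-in for loading one technology instance.
    return f"{name}: initialized"


def init_sequential(names):
    # Setting OFF (the default): one instance at a time on a single thread,
    # so log messages appear strictly in order.
    return [init_instance(name) for name in names]


def init_parallel(names, workers=4):
    # Setting ON: several instances are initialized concurrently.
    # pool.map returns results in input order, although the work itself
    # (and any log output it produces) may interleave.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(init_instance, names))


if __name__ == "__main__":
    technologies = ["STT", "KWS", "SID"]
    print(init_sequential(technologies))
    print(init_parallel(technologies))
```

Both functions produce the same result; only the order in which the work is carried out differs.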

Input audio quality

…of speech technologies (precision of speaker identification, transcription accuracy, etc.). Therefore it is essential to have audio that is as clean as possible. DO’S and DON’TS: Capture the sound as close to the source as possible, i.e. as close to the speaker’s mouth as possible, as close to the recording source as possible, to minimize the amount of ambient sounds and…

Q: How do you calculate SNR in Speech Quality Estimation?

A: Signal-to-Noise Ratio (SNR) is an important indicator of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in a speech waveform follows a Gamma distribution. In contrast, noise…

STT: Configuring word detection parameters for stream transcription

…the processing, to check if the utterance may continue". Decreasing this value means that even shorter pauses between words are detected as the end of a segment. Conversely, increasing this value means that longer pauses between words are not identified as the end of a segment. Speech threshold is a unitless value specifying the threshold between silence and speech. Default…

Waveform Denoiser (DENOISER)

…software cannot remove unwanted speech or music in the background. Denoiser is used to remove noise from the recording and at the same time amplify the speech signal, for: better intelligibility for human listeners (recommended use); achieving better results with automatic speech recognition technologies (must be tested on customer data first). Input: audio file (format details – see…

Video – Speech Analytics technologies

MODULE 4: Speech Analytics technologies (23 min) Common generic rules for CLI, REST and GUI Speech To Text (STT) in CLI, REST and GUI Keyword Spotting (KWS) in CLI, REST and GUI Phoneme Recognizer (PHNREC) in CLI, REST and GUI Time Analysis Extraction (TAE) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=-FAoRywqv7U…

Q: While trying to install SPE3, I get the error for loading libasound.so.2 libraries

I’m currently trying to install the provided binaries for Linux, but I get the following when running phxadmin: ./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory. I’m trying to run this under CentOS 7. A: Please install the right libraries required for manipulating audio files from the official repository into…

Support Lifecycle Policy (PSP)

The general lifecycle of Phonexia products is governed by the Phonexia Product Support and Lifecycle Policy (valid from Q3/2019). That document also defines the scope of our support and our software versioning approach. Specific versions of our products and languages are supported and maintained according to the following tables. Phonexia Speech Engine Version Release Date End of Support Maintained Until Release type…

Video – Getting started with SPE

MODULE 1: Getting started with Speech Engine (19 min) Installation Technologies configuration Server and database configuration Users configuration Files processing Synchronous and asynchronous requests, results polling Stream processing https://youtu.be/4qrB-GfFdWY…

Q: What are the requirements for SID evaluation dataset?

To evaluate a real-life scenario of the Phonexia Speaker Identification technology, the system needs to be calibrated using a SID dataset. SID dataset (minimum requirements): To measure SID performance precisely, it’s important to prepare the set of evaluation recordings very carefully. The requirements are: 50+ known speakers, 200+ recordings in total (i.e. 3 to 5 recordings per speaker*), 1+ minute of net speech…
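The two count-based requirements above (50+ known speakers, 200+ recordings in total) can be checked before an evaluation with a short Python sketch. It is an illustration only: `check_sid_dataset` and the `(speaker_id, file_name)` input layout are assumptions made here, and the net-speech requirement is not checked because the snippet above is truncated.

```python
from collections import Counter

MIN_SPEAKERS = 50      # "50+ known speakers"
MIN_RECORDINGS = 200   # "200+ recordings in total"


def check_sid_dataset(recordings):
    """Return a list of problems found in a SID evaluation dataset.

    `recordings` is a list of (speaker_id, file_name) pairs; this layout
    is a hypothetical convention chosen for the sketch.
    """
    per_speaker = Counter(speaker for speaker, _ in recordings)
    problems = []
    if len(per_speaker) < MIN_SPEAKERS:
        problems.append(
            f"only {len(per_speaker)} speakers, need {MIN_SPEAKERS}+"
        )
    if len(recordings) < MIN_RECORDINGS:
        problems.append(
            f"only {len(recordings)} recordings, need {MIN_RECORDINGS}+"
        )
    return problems


if __name__ == "__main__":
    # 50 speakers with 4 recordings each: meets both count requirements.
    dataset = [(f"spk{s:03d}", f"spk{s:03d}_{r}.wav")
               for s in range(50) for r in range(4)]
    print(check_sid_dataset(dataset) or "dataset meets the count requirements")
```

An empty list means both count requirements are met; otherwise each entry names the requirement that failed.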

Speech Engine

To create the SPE report: (1) Go to the SPE installation directory; (2) Open a command line/terminal (in Ubuntu Linux, right-click and press E; in Windows, type cmd in the address bar); (3) Run ./phxadmin –report (Linux) or phxadmin.exe /report (Windows); (4) Zip up the created report directory and attach the ZIP file to your issue description. The Report functionality is not present…

Speech To Text / Keyword Spotting supported languages

Languages supported by Speech To Text and Keyword Spotting. Standard = Maintained until newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”.
Language (region)       Model name  Released  End of support  Maintenance
Arabic (Gulf, Kuwait)   AR_KW_6     2022-04   8th gen.        Standard
Arabic (Levantine)      AR_XL_6     2021-05   8th gen.        Standard
                        AR_XL_5     2020-08   7th…

Manuals

This section collects links or locations of manuals for specific Phonexia Speech Platform components. API Phonexia Speech Engine REST API – SPE – latest version manual online (api_reference.html for your version is located in doc subdirectory in SPE folder or distribution ZIP) Brno Speech Application Interface v3 – BSAPI3 – latest version manual online Applications and Tools Phonexia Browser –…

Q: What to do with the ApplicationStartup: Unhandled exception: BsapiException error?

When running SPE, the following error occurs: [Error] ApplicationStartup: Unhandled exception: BsapiException: SWaveformSegmenterI(/mnt/phxspe/home/phx/storage/dfs/a1cabcf7-c761-49f1-a9bc-0a8209a09fd9.opus Requested segment (78056, 102056) is out of waveform range (0,91840). A: It means that this Opus file was created improperly: its header declares much more audio than the file actually contains. Please check that your audio source/originator works properly. Or use ffmpeg / sox…
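The condition reported in the error above can be reproduced with a small Python sketch. It only illustrates the range check implied by the message; `segment_in_range` is a hypothetical helper, not part of SPE or BSAPI, and the unit of the numbers is not stated in the log.

```python
def segment_in_range(segment, waveform_range):
    """Return True if `segment` lies completely inside `waveform_range`.

    Both arguments are (start, end) pairs in the same unit as the
    numbers in the error message (the unit is not given in the log).
    """
    start, end = segment
    low, high = waveform_range
    return low <= start <= end <= high


if __name__ == "__main__":
    # Values copied from the error message: the requested end (102056)
    # lies beyond the end of the available waveform (91840).
    print(segment_in_range((78056, 102056), (0, 91840)))
```

With the values from the log the check fails because the requested segment ends after the available audio, which matches the explanation that the header declares more audio than the file holds.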

Q: I can’t manage to run Phonexia Browser software. I always get an error.

…happen if the initialization of the Speech Engine takes too long. Phonexia Browser treats it as an initialization failure and kills the server. You can fix this by doing the following: increase the timeout in Settings > Speech Engine tab > First connection timeout; use fewer instances of technologies, thus letting the Speech Engine start faster; use smaller models of technologies…