Search: STT performance

10 results

Releases and Changelogs (SPE)

…models for STT and KWS (new VAD generation, dynamic adding of words in preferred phrases, increased transcription precision via updated decoder): VI_VN_6; FR_FR_6; CS_CZ_6 (updated VAD, tuned LM); ES_6 (STT only); EN_US_A_6 (STT only). Fixed: Bad example voiceprints in SID4 L4 model. Fixed: STT grapheme checking inconsistent behavior. Fixed: STT NL_NL_5 (possibly also other 5th gen models) is much slower…

STT: Language Model Customization tutorial

STT model, put its name in the model parameter, like this: GET /technologies/stt?path=foobar.wav&model=<customized_model_name> Using customized STT model in command line STT: To use a customized STT model in command line STT, simply specify the new configuration file belonging to the customized STT model in the -config parameter. For example, assuming that the original pl_pl_5 model was customized, specifying updated as the model…
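As a minimal sketch of the request quoted in the snippet above, the query string can be assembled with the Python standard library. The helper name `build_stt_request_path` and the model name `my_customized_model` are hypothetical placeholders; only the `/technologies/stt` path and the `path` and `model` parameters come from the snippet. The server address is deliberately left out because the snippet does not state it.

```python
from urllib.parse import urlencode


def build_stt_request_path(audio_path: str, model: str) -> str:
    """Return the path and query string for the STT GET request.

    Hypothetical helper: it only formats the request shown in the
    snippet and does not contact any server.
    """
    # urlencode escapes characters that are not safe in a query string
    query = urlencode({"path": audio_path, "model": model})
    return f"/technologies/stt?{query}"


if __name__ == "__main__":
    # "my_customized_model" is a placeholder, not a real model name
    print(build_stt_request_path("foobar.wav", "my_customized_model"))
```

Using `urlencode` rather than string concatenation keeps file names that contain spaces or other reserved characters valid in the query string.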

Understand SPE database

…is intended mainly as lightweight storage for configuration data. Still, it can of course also handle results caching… unless we are talking about real mass-processing. When using results caching AND processing hundreds of thousands or millions of audio files per day, SQLite’s locking mechanism (a simple global database lock) can become a performance bottleneck… and choosing a higher-performance MariaDB…

FAQs (PSP)

…Q: What is the difference between the on-the-fly and off-line types of speech to text transcription (STT)? A: Like a human, the ASR (STT) engine adapts to the acoustic channel, environment and speaker. The ASR (STT) engine also learns more about the content over time, and this information is used to improve recognition…

Understand SPE workers configuration

…CPU cores in the server. Example: Czech STT on stream is approx. 4 times faster than realtime, i.e. 1 CPU core can process 4 realtime streams simultaneously. So a server with 8 CPU cores running only STT stream can be configured as follows: keep 1 core dedicated to the operating system and SPE; the remaining 7 cores can handle 28 realtime workers…
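The arithmetic in the example above can be sketched as a small calculation. The function name and parameters are hypothetical and are not part of any SPE configuration; the sketch assumes each usable core handles a whole number of realtime streams equal to the faster-than-realtime factor rounded down.

```python
import math


def realtime_workers(total_cores: int, reserved_cores: int,
                     speed_vs_realtime: float) -> int:
    """Estimate how many realtime stream workers a server can run.

    Hypothetical helper: reserved_cores are kept for the operating
    system and SPE; every remaining core is assumed to serve
    floor(speed_vs_realtime) realtime streams.
    """
    usable_cores = total_cores - reserved_cores
    if usable_cores < 0:
        raise ValueError("reserved_cores exceeds total_cores")
    streams_per_core = math.floor(speed_vs_realtime)
    return usable_cores * streams_per_core


if __name__ == "__main__":
    # Values from the snippet: 8 cores, 1 reserved, 4x faster than realtime
    print(realtime_workers(8, 1, 4))  # 7 cores * 4 streams = 28
```

Rounding the speed factor down is a conservative choice: a core that runs 4.5 times faster than realtime is still counted as serving 4 streams.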

Recommended OS and HW (PSP)

…Intel® Core Processor; RAM: 16 GB; Storage: 100 GB (depends on your audio retention policy), SSD strongly recommended for superior performance over HDD. Configuration includes: STT 6th generation – 2 languages (half load each), KWS 6th generation – 2 languages, LID L4, VAD, SQE. Voice Biometrics + Transcription System, basic 100 hours/day package (***) files processing; CPU: 14 physical cores,…

Understand SPE benchmark

…(FtRT) processing speed. You can run this benchmark on machines with different CPUs to compare the performance of various Phonexia technologies on them… e.g. to see the difference between Intel processors (for which our technologies are optimized) and AMD processors. You can use the benchmark to check whether a planned HW upgrade will get you the expected performance gain… You can…

FAQs (Browser)

…details, see KWS technology documentation. Q: What languages are supported by STT? A: Please see List of supported STT Languages. For more details, see STT technology documentation. Q: I am getting an SPE-related error after starting the Browser (e.g. SPE server crashed, Error Downloading…,…

Speech Engine update

…up to you, based on the actual content of the directory and your new package. NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, it’s recommended practice to recreate the STT model using the…

Understand SPE multithreaded technologies initialization

…the other hand, parallel threads may cause very intensive disk activity when the system reads source data for multiple technologies at the same time. This is especially noticeable with technologies like STT, where initialization of each model typically needs to read approx. 1 GB of data from disk. Depending on the disk subsystem performance, fragmentation, etc., this high disk activity…