
Search: STT performance

43 results

Understand SPE technologies configuration file

…technologies.xml file containing the following setup: STT (Speech To Text) with 8 instances of SK_SK_5 model; STT_STREAM (Speech To Text for stream processing) with 2 instances of CS_CZ_6 model; SID4E (Speaker Identification 4 Voiceprint Extractor) with 2 instances of L4 model and 3 instances of XL4 model; SID4C (Speaker Identification 4 Voiceprint Comparator) with 2 instances of L4 model, 3 instances…

Releases and Changelogs (Browser)

…Fixed: multi-channel recordings might not be processed by STT for the first time. Added: “Copy text” to context menu of STT widget in Waveform editor. Support for SPE 3.5.x. Phonexia Browser v3.4.0, BSAPI 3.8.0 – Sep 21 2016. Fixed: do not show second label panel in waveform editor when double-click on result. Fixed: import of the speaker models may get…

STT: How to properly convert Confusion Network results to One-best

Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual time slots of the processed speech signal. Therefore, many applications want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and…
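The snippet above is cut off before the article's recommended procedure, so the following is only a naive baseline sketch, not the method the article describes. It assumes a hypothetical in-memory representation (a list of time slots, each a list of `(word, probability)` alternatives) and a hypothetical `<null>` token marking "no word in this slot"; neither is taken from the Speech Engine output format.

```python
# Naive one-best extraction from a confusion network (illustrative only).
# ASSUMPTIONS: the slot structure and the "<null>" token are hypothetical,
# not the actual SPE result schema.

def one_best(slots, null_token="<null>"):
    """Pick the most probable alternative in each time slot, skipping null winners."""
    words = []
    for slot in slots:
        if not slot:
            continue  # an empty slot contributes nothing
        word, _prob = max(slot, key=lambda alt: alt[1])
        if word != null_token:
            words.append(word)
    return words


example = [
    [("hello", 0.9), ("hollow", 0.1)],
    [("<null>", 0.6), ("a", 0.4)],
    [("world", 0.7), ("word", 0.3)],
]
print(one_best(example))  # ['hello', 'world']
```

A real conversion may need extra rules (for example, handling of slot timing), which is presumably what the full article covers.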

Understand SPE workers configuration

…CPU cores in the server. Example: Czech STT on stream is approx. 4 times faster than realtime, i.e. 1 CPU core can process 4 realtime streams simultaneously. So a server with 8 CPU cores running only STT stream can be configured as follows: keep 1 core dedicated to the operating system and SPE; the remaining 7 cores can handle 28 realtime workers…
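The arithmetic in this example can be sketched as a small helper. The function name and parameters are illustrative assumptions, not part of any SPE configuration API; the numbers are the ones quoted in the snippet.

```python
# Worker-count arithmetic from the snippet: (cores - reserved) * speed factor.
# ASSUMPTION: realtime_workers() is a hypothetical helper, not an SPE setting.

def realtime_workers(total_cores: int, reserved_cores: int, streams_per_core: int) -> int:
    """Return how many realtime stream workers the usable cores can serve."""
    usable = total_cores - reserved_cores
    if usable < 0:
        raise ValueError("reserved_cores cannot exceed total_cores")
    return usable * streams_per_core


# 8 cores, 1 kept for the OS and SPE, Czech STT stream ~4x faster than realtime
print(realtime_workers(8, 1, 4))  # 28
```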

FAQs (Browser)

…details, see KWS technology documentation. Q: What languages are supported by STT? A: Please see List of supported STT Languages. For more details, see STT technology documentation. Q: I am getting SPE related error after starting the Browser (e.g. SPE server crashed, Error Downloading…,…

Understand SPE executable files

…See POST /audiofile endpoint documentation for details. phxclient: example 2
phxclient /login=admin /password=phonexia /method=GET /uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true"
./phxclient --login=admin --password=phonexia --method=GET --uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true"
Processes the myfile.wav file stored in the root of SPE internal storage (e.g. uploaded using the previous example) with the Speech To Text (STT) technology model EN_US_6 (6th generation English), returning one_best and n_best result types, and disabling any…
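For callers that assemble the request URI programmatically, a minimal sketch follows. It only builds the same URI string that appears in the phxclient example and sends nothing; the helper name `build_stt_uri` is hypothetical, and the parameter names are copied from the example rather than from the API reference.

```python
# Build the STT request URI shown in the phxclient example (no request is sent).
# ASSUMPTION: build_stt_uri() is an illustrative helper, not part of SPE tooling.
from urllib.parse import urlencode


def build_stt_uri(host, path, model, result_types, cache_disable=True):
    """Return host + /technologies/stt/ with the query string from the example."""
    query = urlencode(
        {
            "path": path,
            "model": model,
            "result_type": ",".join(result_types),
            "cache_disable": str(cache_disable).lower(),
        },
        safe="/,",  # keep "/" and "," readable, as in the original example
    )
    return f"{host}/technologies/stt/?{query}"


uri = build_stt_uri("127.0.0.1:8600", "/myfile.wav", "en_us_6", ["one_best", "n_best"])
print(uri)
```

The resulting string can be passed as the uri argument of phxclient or to any HTTP client, together with the login credentials.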

Q: What languages do you offer?

It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more…

Support Lifecycle Policy (PSP)

…3.7 2017-03-27 2018-09-27 3.8 Public
3.6 2016-12-14 2018-06-14 3.7 Public
3.5 2016-10-04 2018-04-04 3.6 Public
3.4 2016-09-19 2018-03-19 3.5 Public
3.3 2016-07-11 2018-02-11 3.4 Public
3.2 2016-04-22 2017-10-22 3.3 Public
3.1 2016-02-15 2017-08-15 3.2 Public
3.0 2016-02-09 2017-08-09 3.1 Public
2.1 2015-09-16 2017-09-16 2017-09-16 Public
2.0 2015-01-06 2016-07-06 2.1 Public
Speech to Text (STT) and Keyword Spotting (KWS) models Languages…

Key Features (PSP)

…recording, Speech to Text (STT) – several languages supported – converts speech into plain text (words or sentences) automatically, Keyword Spotting (KWS) – several languages supported – detects specific keywords/phrases automatically without conversion to text, Gender identification (GID) – identifies whether a speaker is male or female, Age Estimation (AGE) – estimates the speaker's age group, Voice Activity Detection (VAD)…

Phonexia Speech Engine

Phonexia Speech Engine (SPE) is the main part of the Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing a REST API to the entire portfolio of Phonexia speech technologies. SPE capabilities overview (audio files and stream processing): Speaker Identification (SID): audio files ✓, RTP / HTTP streams ✓; Speech To Text (STT): audio files ✓, RTP / HTTP streams ✓; Keyword Spotting (KWS): audio files ✓…

Speech Engine update

…up to you, based on the actual content of the directory and your new package NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, it’s recommended practice to recreate the STT model using the…

Understand SPE database

…results – file, used technology model, used speaker model, used FAR calibration set, max. FAR, results JSON data
rest_result_sid4: SID4 processing results – file, used technology model, used speaker model, used file- and speaker model Audio Source Profile, results JSON data
rest_result_sqe: SQE processing results – file, used technology model, results JSON data
rest_result_stt: STT processing results – file, used…

Recommended OS and HW (PSP)

…Intel® Core Processor; RAM: 16 GB; Storage: 100 GB (depends on your audio retention policy), SSD strongly recommended for superior performance over HDD. Configuration includes: STT 6th generation – 2 languages (half load each), KWS 6th generation – 2 languages, LID L4, VAD, SQE. Voice Biometrics + Transcription System, basic 100 hours/day package (***), files processing: CPU: 14 physical cores,…

Phoneme Recogniser (PHNREC)

…output) Note: The outputs can contain the following special tokens: sil (silent part, or no speech detected). The list of phonemes is available in the document phonemes_for_stt_and_kws.pdf (delivered as part of the manuals in SPE, STT or KWS). Languages Supported: the list of supported languages in Phoneme Recogniser is the same as in Keyword Spotting. Link to API reference: https://download.phonexia.com/docs/spe/#%2Ftechnologies%2Fphnrec…