…technologies.xml file containing the following setup: STT (Speech To Text) with 8 instances of SK_SK_5 model; STT_STREAM (Speech To Text for stream processing) with 2 instances of CS_CZ_6 model; SID4E (Speaker Identification 4 Voiceprint Extractor) with 2 instances of L4 model and 3 instances of XL4 model; SID4C (Speaker Identification 4 Voiceprint Comparator) with 2 instances of L4 model, 3 instances…
Search: STT Stream
43 results
…Fixed: multi-channel recordings might not be processed by STT for the first time; Added “Copy text” to context menu of STT widget in Waveform editor; Support for SPE 3.5.x. Phonexia Browser v3.4.0, BSAPI 3.8.0 – Sep 21 2016: Fixed: do not show second label panel in waveform editor when double-clicking on a result; Fixed: import of the speaker models may get…
Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual time slots of the processed speech signal. Many applications therefore want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and…
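The snippet above is cut off before the recommended procedure, so what follows is only a generic sketch of the common "pick the most probable alternative in each time slot" reduction, not the article's method. The in-memory layout (a list of slots, each a list of `(word, probability)` pairs) and the `<null>` token for an empty slot are assumptions made for illustration; the actual SPE JSON structure is not shown here.

```python
from typing import List, Tuple

# Hypothetical layout: each time slot is a list of (word, probability) alternatives.
# This is NOT the SPE JSON schema, only an illustrative stand-in.
Slot = List[Tuple[str, float]]

# Assumed marker meaning "no word spoken in this slot".
NULL_TOKEN = "<null>"


def one_best(slots: List[Slot], null_token: str = NULL_TOKEN) -> List[str]:
    """Reduce a confusion network to a single word sequence.

    For every slot the alternative with the highest probability wins;
    slots won by the null token (or empty slots) contribute no word.
    """
    words = []
    for alternatives in slots:
        if not alternatives:
            continue  # nothing to choose from in this slot
        best_word, _ = max(alternatives, key=lambda alt: alt[1])
        if best_word != null_token:
            words.append(best_word)
    return words


if __name__ == "__main__":
    network = [
        [("hello", 0.9), ("hollow", 0.1)],
        [("<null>", 0.6), ("the", 0.4)],
        [("word", 0.5), ("world", 0.45)],
    ]
    print(" ".join(one_best(network)))
```

Keeping the full network and deriving the shorter form on demand means the alternatives stay available for later re-ranking or search.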
…CPU cores in the server. Example: Czech STT on stream is approx. 4 times faster than realtime, i.e. 1 CPU core can process 4 realtime streams simultaneously. So a server with 8 CPU cores running only STT stream can be configured as follows: keep 1 core dedicated to the operating system and SPE; the remaining 7 cores can handle 28 realtime workers…
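The sizing arithmetic in this example can be sketched as a small helper. The function name and parameters are hypothetical and only restate the calculation above: usable cores multiplied by the faster-than-realtime factor.

```python
import math


def realtime_workers(total_cores: int, reserved_cores: int, speed_factor: float) -> int:
    """Estimate how many realtime stream workers a server can run.

    total_cores    -- CPU cores in the server
    reserved_cores -- cores kept free for the operating system and SPE
    speed_factor   -- how many times faster than realtime one core processes a stream
    """
    if reserved_cores > total_cores:
        raise ValueError("reserved_cores cannot exceed total_cores")
    usable_cores = total_cores - reserved_cores
    # Round down: a fraction of a worker cannot keep up with a realtime stream.
    return math.floor(usable_cores * speed_factor)


if __name__ == "__main__":
    # The example from the text: 8 cores, 1 reserved, Czech STT at about 4x realtime.
    print(realtime_workers(total_cores=8, reserved_cores=1, speed_factor=4))  # 28
```

The speed factor depends on the model and hardware, so the value of 4 is taken from the Czech example only.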
A: Yes, you can use Language Model Customization (LMC). For more details, please read the STT Language Model Customization tutorial…
…details, see the KWS technology documentation. Q: What languages are supported by STT? A: Please see the List of supported STT Languages. For more details, see the STT technology documentation. Q: I am getting an SPE-related error after starting the Browser (e.g. SPE server crashed, Error Downloading…,…
…See the POST /audiofile endpoint documentation for details. phxclient: example 2: phxclient /login=admin /password=phonexia /method=GET /uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true" or ./phxclient --login=admin --password=phonexia --method=GET --uri="127.0.0.1:8600/technologies/stt/?path=/myfile.wav&model=en_us_6&result_type=one_best,n_best&cache_disable=true" Process the myfile.wav file stored in the root of SPE internal storage (e.g. uploaded using the previous example) using the Speech To Text (STT) technology model EN_US_6 (6th generation English), returning the one_best and n_best result types, and disabling any…
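The request URI used in this example can also be assembled programmatically. The sketch below only builds the string with the Python standard library and sends nothing; the helper name `build_stt_uri` is hypothetical, while the host, endpoint path and query parameters are copied from the phxclient example.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_stt_uri(host: str, path: str, model: str,
                  result_types: Sequence[str], cache_disable: bool = True) -> str:
    """Build the /technologies/stt/ request URI shown in the phxclient example."""
    params = {
        "path": path,                            # file location in SPE internal storage
        "model": model,                          # technology model name, e.g. en_us_6
        "result_type": ",".join(result_types),   # comma-separated result types
        "cache_disable": "true" if cache_disable else "false",
    }
    # Leave "/" and "," unescaped so the output matches the example verbatim.
    query = urlencode(params, safe="/,")
    return f"{host}/technologies/stt/?{query}"


if __name__ == "__main__":
    print(build_stt_uri("127.0.0.1:8600", "/myfile.wav", "en_us_6",
                        ["one_best", "n_best"]))
```

The resulting string can be passed as the uri argument of phxclient.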
It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more…
…3.7 2017-03-27 2018-09-27 3.8 Public
3.6 2016-12-14 2018-06-14 3.7 Public
3.5 2016-10-04 2018-04-04 3.6 Public
3.4 2016-09-19 2018-03-19 3.5 Public
3.3 2016-07-11 2018-02-11 3.4 Public
3.2 2016-04-22 2017-10-22 3.3 Public
3.1 2016-02-15 2017-08-15 3.2 Public
3.0 2016-02-09 2017-08-09 3.1 Public
2.1 2015-09-16 2017-09-16 2017-09-16 Public
2.0 2015-01-06 2016-07-06 2.1 Public
Speech to Text (STT) and Keyword Spotting (KWS) models Languages…
…recording, Speech to Text (STT) – several languages supported – converts speech into plain text (words or sentences) automatically, Keyword Spotting (KWS) – several languages supported – detects specific keywords/phrases automatically without conversion to text, Gender identification (GID) – identifies whether a speaker is male or female, Age Estimation (AGE) – estimates the speaker's age group, Voice Activity Detection (VAD)…
Phonexia Speech Engine (SPE) is the main part of the Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing a REST API to the entire portfolio of Phonexia speech technologies. SPE capabilities overview: Audio files and stream processing (columns: Audio files | RTP / HTTP streams): Speaker Identification (SID) ✓ ✓; Speech To Text (STT) ✓ ✓; Keyword Spotting (KWS) ✓…
…up to you, based on the actual content of the directory and your new package. NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, the recommended practice is to recreate the STT model using the…
…results – file, used technology model, used speaker model, used FAR calibration set, max. FAR, results JSON data; rest_result_sid4: SID4 processing results – file, used technology model, used speaker model, used file- and speaker model Audio Source Profile, results JSON data; rest_result_sqe: SQE processing results – file, used technology model, results JSON data; rest_result_stt: STT processing results – file, used…
…Intel® Core Processor; RAM: 16 GB; Storage: 100 GB (depends on your audio retention policy), SSD strongly recommended for superior performance over HDD. Configuration includes: STT 6th generation – 2 languages (half load each), KWS 6th generation – 2 languages, LID L4, VAD, SQE. Voice Biometrics + Transcription System, basic 100 hours/day package (***), files processing: CPU: 14 physical cores,…
…output) Note: The outputs can contain the following special tokens: sil – silent part (or no speech detected). The list of phonemes is available in the document phonemes_for_stt_and_kws.pdf (delivered as part of the manuals in SPE, STT or KWS). Languages Supported: The list of supported languages in Phoneme Recogniser is the same as in Keyword Spotting. Link to API reference: https://download.phonexia.com/docs/spe/#%2Ftechnologies%2Fphnrec…