Search: STT vs. STT_STREAM

58 results

Speech to Text (STT)

About STT: Phonexia Speech to Text (STT) converts speech in audio signals into plain text. The technology works with acoustics as well as a dictionary of words, an acoustic model, and pronunciation. This makes it dependent on language and dictionary: only a limited set of words can be transcribed. As input, an audio file or stream is needed, together with a selection of…

What is User configuration file and how to use it

…sentence. So, following the How to configure STT realtime stream word detection parameters article, we create a stt_cs_cz_5_online.bs.usr text file alongside the original stt_cs_cz_5_online.bs configuration file in the <SPE directory>/bsapi/stt/settings directory and put the following lines in it (changing the forward extension parameter from the default 750 to 1500):

    [vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
    forward_extensions_length_ms=1500

Then, after restarting SPE – and optionally checking in the SPE log…
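As a rough illustration of the step described in this snippet, the following Python sketch writes such a `.usr` file next to an existing configuration file. The helper name `write_user_config` and the way the settings directory is passed in are assumptions made for this example; only the file naming (`<original name>.usr`), the section header, and the `forward_extensions_length_ms=1500` line come from the snippet itself.

```python
from pathlib import Path


def write_user_config(settings_dir, base_name, section, overrides):
    """Write '<base_name>.usr' into settings_dir (hypothetical helper).

    The file holds one section header followed by key=value lines,
    mirroring the layout quoted in the snippet above.
    """
    usr_path = Path(settings_dir) / (base_name + ".usr")
    lines = [f"[{section}]"]
    lines += [f"{key}={value}" for key, value in overrides.items()]
    usr_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return usr_path


if __name__ == "__main__":
    # The target directory is an assumption; on a real installation it
    # would be <SPE directory>/bsapi/stt/settings.
    created = write_user_config(
        ".",
        "stt_cs_cz_5_online.bs",
        "vad.online_segmenter:SOnlineVoiceActivitySegmenterI",
        {"forward_extensions_length_ms": 1500},
    )
    print(created.read_text(encoding="utf-8"))
```

According to the snippet, SPE has to be restarted before the change takes effect; the sketch only creates the file.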

Q: What languages are supported by STT?

A: Please see List of supported STT Languages. For more details, see STT technology documentation….

Q: What is the difference between the on-the-fly and off-line types of speech-to-text transcription (STT)?

A: Like a human, the ASR (STT) engine adapts to the acoustic channel, environment, and speaker. The engine also learns more about the content over time, and this information is used to improve recognition. The dictate engine, also known as on-the-fly transcription, does not look into the future and has information about just a few…

Understand SPE directory structure

…in the root of the SPE installation (see above). The example below shows a shared customized STT CS_CZ_6 model. The location of the shared directory can be modified using the server.shared.path option in the SPE configuration file. This might be useful in complex network infrastructures, multi-SPE deployments, and similar advanced configurations.

shared
└── bsapi
    └── stt
        ├── data
        │   └── models_cs_cz_6_customized
        └── settings
            ├── stt_cs_cz_6_customized.bs
            └── stt_cs_cz_6_customized_online.bs…

Phonexia Speech Engine

Phonexia Speech Engine (SPE) is the main part of the Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing a REST API to the entire portfolio of Phonexia speech technologies. SPE capabilities overview (audio files and stream processing):

                               Audio files   RTP / HTTP streams
Speaker Identification (SID)   ✓             ✓
Speech To Text (STT)           ✓             ✓
Keyword Spotting (KWS)         ✓…

Releases and Changelogs (Browser)

…show wrong time labels for long recordings [#4325, #4292]
Support for SPE 3.6.x

Phonexia Browser v3.5.0, BSAPI 3.9.1 – Oct 19 2016
Fixed: multi-channel recordings might not be processed by STT for the first time
Added: “Copy text” to context menu of STT widget in Waveform editor
Support for SPE 3.5.x

Phonexia Browser v3.4.0, BSAPI 3.8.0 – Sep 21 2016…

FAQs (PSP)

… Q: What is the difference between the on-the-fly and off-line types of speech-to-text transcription (STT)? A: Like a human, the ASR (STT) engine adapts to the acoustic channel, environment, and speaker. The engine also learns more about the content over time, and this information is used to improve recognition….

STT: How to properly convert Confusion Network results to One-best

Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual time slots of the processed speech signal. Many applications therefore want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and…
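The snippet is truncated before the recommended procedure, so the following Python sketch only illustrates the general idea of reducing a confusion network to a one-best transcript: keep the highest-scoring alternative in each time slot. It is not the article's method. The data layout (`start`, `end`, `alternatives`, `word`, `score`) and the `<null>` placeholder for an empty slot are assumptions made for this example and may not match the actual Speech Engine output.

```python
def confusion_network_to_one_best(slots, null_token="<null>"):
    """Pick the top-scoring alternative in each time slot.

    `slots` is a list of dicts in a hypothetical layout:
    {"start": float, "end": float,
     "alternatives": [{"word": str, "score": float}, ...]}
    Slots whose best alternative is `null_token` (assumed to mean
    "no word here") are left out of the result.
    """
    one_best = []
    for slot in slots:
        alternatives = slot["alternatives"]
        if not alternatives:
            continue  # nothing to choose from in this slot
        best = max(alternatives, key=lambda alt: alt["score"])
        if best["word"] == null_token:
            continue  # the most likely outcome is "no word"
        one_best.append(
            {
                "start": slot["start"],
                "end": slot["end"],
                "word": best["word"],
                "score": best["score"],
            }
        )
    return one_best


# Made-up input data, for illustration only.
example_slots = [
    {"start": 0.0, "end": 0.4, "alternatives": [
        {"word": "hello", "score": 0.7}, {"word": "hollow", "score": 0.3}]},
    {"start": 0.4, "end": 0.5, "alternatives": [
        {"word": "<null>", "score": 0.9}, {"word": "a", "score": 0.1}]},
    {"start": 0.5, "end": 0.9, "alternatives": [
        {"word": "world", "score": 0.6}, {"word": "word", "score": 0.4}]},
]

if __name__ == "__main__":
    words = [item["word"] for item in confusion_network_to_one_best(example_slots)]
    print(" ".join(words))
```

A per-slot maximum is the simplest possible choice; the full article may treat slot timing or empty alternatives differently.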

Key Features (PSP)

…in the Languages Available section.
Speech To Text (STT) and Keyword Spotting (KWS) languages
Language Identification (LID) languages

Supported Audio input
The Speech Engine server supports various audio formats as listed in API reference > Audio requirements. It also supports RTP/HTTP stream processing as listed in API reference > RTP/HTTP streams. The Speech Engine allows the usage of some…

Understand SPE database

…unregistering files after processing (if using the file registration technique instead of uploading the audio files – see the Understanding SPE home directory article). This causes the file information AND the cached processing results to be kept in the database. Or, you may be saving stream data to a file, but not deleting the created stream audio files using the REST API…

Q: Can I add words to the dictionary?

A: Yes, you can use Language Model Customization (LMC). For more details, please read the STT Language Model Customization tutorial….

FAQs (Browser)

…details, see KWS technology documentation.
Q: What languages are supported by STT? A: Please see List of supported STT Languages. For more details, see STT technology documentation.
Q: I am getting an SPE-related error after starting the Browser (e.g. SPE server crashed, Error Downloading…,…

Phoneme Recogniser (PHNREC)

Phonexia Phoneme Recogniser (PHNREC) converts speech signals into pronunciation characters (so-called phonemes). After the conversion, the pronunciation (text) can be easily indexed and searched by third-party text data mining tools. The technology is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams, and can provide results in several output formats. Phoneme…

Q: What languages do you offer?

It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish, and many more….