Search: voicebot

6 results

STT: Configuring word detection parameters for stream transcription

…i.e. the backward extension value specifies how long the processing must be delayed (processing has to wait until that much input signal arrives) ⇒ increasing this value means that speech activity is detected with a longer delay (e.g. delayed barge-in detection in a voicebot implementation). The forward extension value basically means “add this much of a following signal to…

Phonexia Speech Engine

…audio manipulation: SPE has built-in basic audio file manipulation functionality, such as separating individual channels from stereo recordings, cutting one audio file into several files, saving audio from an incoming stream to a file, and others. Stream audio player: To support voicebot scenarios, SPE can play audio files directly to the output RTP stream. External Text-to-speech (TTS) integration: Easy integration with external TTS…

Releases and Changelogs (SPE)

…of all word confidences in a sentence – helps in judging the credibility of the results. Reduced delay of obtaining results in output – allows for faster detection of barge-in, e.g. in a voicebot application. New: All 5th generation STT models now use Minimum Bayes-Risk Decoding for Confusion Network construction. Confusion Network results now contain precise start and end times for each individual…

STT: Results explained

…available in output. To better support voicebot applications, the following additions were implemented: sentence_info array, containing a confidence value for each sentence present in the one-best results (since version 3.24) (a sentence is the part of the output from the <segment> token to the </segment> token… i.e. if there are 2 such sentences in the results, the sentence_info array contains 2 elements); n_best_result object, containing additional…

STT: What is Words-To-Numbers feature and how to use it

…point zero three ⇒ 1586.03; sixty four million seven hundred thousand ninety ⇒ 64700090. This should help to simplify processing of the transcribed texts by text analytics layers or NLP (Natural Language Processing) engines, e.g. in voicebot applications. Where is the converted output available? The words-to-numbers conversion is available only in n-best output (i.e. where the entire sentence…

STT: What is Preferred Phrases feature and how to use it

…may come in handy, allowing you to prompt the speech transcription with phrases or words that are expected to appear in the utterance, thus increasing the chances of correctly transcribed words and the overall transcription accuracy. The intended application of this feature is mainly voicebots, i.e. question-driven dialogues, where the probable answers to each individual question are predictable and expected. But…