Search: engine channels

65 results

Video – Getting started with SPE

MODULE 1: Getting started with Speech Engine (19 min): Installation; Technologies configuration; Server and database configuration; Users configuration; Files processing; Synchronous and asynchronous requests, results polling; Stream processing. https://youtu.be/4qrB-GfFdWY…

SID: Speaker Identification: Results Enhancement

…(if applicable) parameters for all calibration types (shifts, mean vectors, etc.); optional user comments. Creating Audio Source Profile: An Audio Source Profile can be created either via the aspcreate4 command-line tool or via Speech Engine using the /technologies/speakerid4/audiosourceprofiles/{name} endpoint. Profiles created from the same data are identical, regardless of the interface used to create them. On creation, the content of…

Voice Inspector – Interpretation of results

This part requires a higher (non-anonymous) access level.

STT: Configuring word detection parameters for stream transcription

One of the improvements implemented since Speech Engine 3.24 is a neural-network-based VAD, used for word and segment detection. This article describes the segmenter configuration parameters and how they affect real-time stream STT results. The default segmenter parameters are shown below: [vad.online_segmenter:SOnlineVoiceActivitySegmenterI] backward_extensions_length_ms=150 forward_extensions_length_ms=750 speech_threshold=0.5 Backward and forward extensions are intervals in milliseconds, which extend the part…
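The default parameters quoted in this snippet use an INI-like layout, so they can be read with Python's standard `configparser`. The following is a minimal sketch under that assumption; the helper name `load_segmenter_params` is hypothetical, and Speech Engine itself may parse its configuration files differently.

```python
import configparser

# Default segmenter section exactly as quoted in the snippet above.
DEFAULT_SEGMENTER_CONFIG = """\
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5
"""

SECTION = "vad.online_segmenter:SOnlineVoiceActivitySegmenterI"


def load_segmenter_params(config_text):
    """Parse the segmenter section into typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser[SECTION]
    return {
        "backward_extensions_length_ms": section.getint("backward_extensions_length_ms"),
        "forward_extensions_length_ms": section.getint("forward_extensions_length_ms"),
        "speech_threshold": section.getfloat("speech_threshold"),
    }


params = load_segmenter_params(DEFAULT_SEGMENTER_CONFIG)
print(params)
```

Reading the values as typed numbers makes it easier to validate them (for example, that the threshold is a number between 0 and 1) before writing a modified configuration back.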

What is User configuration file and how to use it

…name / User configuration file name: stt_cs_cz_5_online.bs / stt_cs_cz_5_online.bs.usr; kws_nl_nl_5.bs / kws_nl_nl_5.bs.usr; phnrec_pashto.bs / phnrec_pashto.bs.usr; vpextract4_xl4.bs / vpextract4_xl4.bs.usr. During technology initialization (e.g. during Speech Engine startup), the initialization routine checks for the existence of such a user config file. If found, it’s automatically loaded after the main configuration file, and the settings from the user config are automatically applied over the settings from the main configuration file. Usage…
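The naming rule and the override order described in this snippet can be sketched as follows. This is an illustration only: the function names are hypothetical, and settings are modeled as a flat dictionary, which is a simplification of the real configuration files.

```python
from typing import Dict, Optional


def user_config_name(main_config_name: str) -> str:
    """Append '.usr' to the main config file name, as in the listed examples."""
    return main_config_name + ".usr"


def apply_user_settings(
    main: Dict[str, str], user: Optional[Dict[str, str]]
) -> Dict[str, str]:
    """Apply user settings over the main settings (hypothetical flat model).

    If no user config exists (user is None), the main settings are used as-is.
    """
    merged = dict(main)
    if user:
        merged.update(user)  # user values win over main values
    return merged


print(user_config_name("stt_cs_cz_5_online.bs"))
print(apply_user_settings({"a": "1", "b": "2"}, {"b": "3"}))
```

Keeping overrides in a separate file means the main configuration file stays untouched.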

STT: How to properly convert Confusion Network results to One-best

Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual time slots of the processed speech signal. Many applications therefore want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and…
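The article's recommended procedure is cut off in this snippet, so the sketch below is only a naive illustration of the general idea: pick the highest-scoring alternative in each time slot. It is not Phonexia's recommended method. The data layout (a list of slots, each a list of `(word, score)` pairs) and the `<null>` label for an empty alternative are assumptions made for the example.

```python
from typing import List, Tuple

Slot = List[Tuple[str, float]]

# Hypothetical label for "no word in this slot"; the real output may differ.
NULL_WORD = "<null>"


def naive_one_best(network: List[Slot]) -> List[str]:
    """Return the top-scoring word of each slot, skipping empty alternatives."""
    best_words = []
    for slot in network:
        if not slot:
            continue  # nothing to choose from in this slot
        word, _score = max(slot, key=lambda alternative: alternative[1])
        if word != NULL_WORD:
            best_words.append(word)
    return best_words


example_network = [
    [("hello", 0.9), ("hallo", 0.1)],
    [(NULL_WORD, 0.6), ("a", 0.4)],
    [("world", 0.7), ("word", 0.3)],
]
print(" ".join(naive_one_best(example_network)))
```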

Phonexia Partner Program for Government Partners

…and cooperation history as part of the decision to grant/extend Gold partnership. Contact your Phonexia representative for more details. How long do I keep my Gold partner status? Phonexia understands that the sales cycle in the government segment can be long; therefore, your partner status is reviewed once a year in July. How can I test the Phonexia Speech Engine…

STT: What is Words-To-Numbers feature and how to use it

…point zero three ⇒ 1586.03; sixty four million seven hundred thousand ninety ⇒ 64700090. This should help to simplify processing of the transcribed texts by text analytics layers or NLP (Natural Language Processing) engines, e.g. in voicebot applications. Where is the converted output available? The words-to-numbers conversion is available only in n-best output (i.e. where the entire sentence…

Understand SPE directory structure

A good understanding of the SPE directory structure makes it easier to understand the inner workings of SPE and simplifies troubleshooting. It’s also useful for expert-level tuning of parameters of individual technologies and for optimizing the SPE configuration, e.g. for deployments with shared resources or deployments in virtualized environments. The SPE directory structure looks like this (the tree depth is limited for better readability):…

Understand SPE database scripts

This article explains the details and usage of the SQL database scripts stored in the /data/database subdirectory of the SPE installation directory. These scripts are intended for setup and maintenance of the SPE database for the supported database types, currently SQLite and MariaDB (from SPE 3.46) / MySQL (up to SPE 3.45). Script types: For each database type, there are two directories with two types of…

Waveform Denoiser (DENOISER)

…Speech Engine documentation); stream not supported, technology model name to be used for processing. Output: audio file (WAV or RAW), together with xml/json report (in SPE only). Fig.: Comparison of original recording (david_noisy.wav, top half of image) and same recording processed by Denoiser (david_denoised.wav, bottom half of the image). Typical Questions Q: What do you recommend for deploying this technology?…

Phoneme Recogniser (PHNREC)

…user can add to language model of speech-to-text technology (better accuracy of KWS technology). Input: audio file (format details – see Speech Engine documentation); stream not supported, technology model name (i.e. language code) to be used for phoneme transcription. Output: In the process of transcribing speech to phonemes, the Phoneme Recogniser usually identifies individual speech segments and converts them to pronunciation. Example…

Understand SPE user accounts

SPE has a simple built-in system of user accounts and user roles. This allows for flexible usage of SPE in your projects: you can use it e.g. for different individual applications (each application uses its own SPE user), or simply for different user roles within a single application (standard users, administrators). Each user account has the following attributes defined: login…

Understand SPE processing priority

SPE has a simple built-in system of task prioritization. This allows for flexible management of the processing queue, which is especially useful in mass audio processing. For example, if there is a long queue of files waiting to be processed and another batch of files needs to be processed urgently, those files can be sent for processing with a higher priority… and…
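The general idea of priority-based queueing can be sketched with Python's standard `heapq` module. This is a generic illustration, not SPE's actual scheduler: the class name, the `priority` parameter, and the convention that a larger number means higher priority are all assumptions made for the example.

```python
import heapq
import itertools


class PriorityTaskQueue:
    """Toy processing queue: higher priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # preserves submission order on ties

    def submit(self, filename, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), filename))

    def next_task(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = PriorityTaskQueue()
queue.submit("backlog_001.wav")
queue.submit("backlog_002.wav")
queue.submit("urgent.wav", priority=5)  # jumps ahead of the waiting backlog

while queue:
    print(queue.next_task())
```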

Understand SPE connectors for external TTS

SPE can be easily connected with external Text-To-Speech (TTS) services using a simple connector system. This article describes the principles and how-tos; by following these instructions you can create your own connector, which lets you use a custom third-party TTS service via SPE. The TTS connector should be a command-line (CLI) application or script, which communicates with the external TTS service…