Search: speech to text

121 results

Understand SPE administration and backup

Each Partner has their own administration and backup policy. Here we highlight the most important SPE components to be administered and backed up. Administration: It is strongly recommended to describe your own administration approach for the following components. SPE users (accounts): the Partner should maintain a list of SPE users (accounts). Only a few persons should have the “admin” role…

Designing and Developing Application

…measure evaluation results and how to process calibration? Etc. We encourage the Partner to also become familiar with the following points: Phonexia Speech Engine features and the list of technologies; best practices (typical processing flows and architecture from our previous projects); database schema; other Phonexia components and tools, such as example applications that can give you inspiration; licensing possibilities of the Phonexia…

Q: How to fix the Error 1013: Unsupported: Server does not support authentication with token?

A: Please check the SPE subdirectory ./settings for configuration files. If only phxspe.browser.properties exists, then your Browser uses SPE as an embedded component and the file sets this directive: server.enable_authentication_token = false. In that case you can still use SPE with Basic HTTP authentication, as described in the documentation, section “Basic authentication”. If you would like to play with a “pure” daemon installation,…
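The check described in this answer can be sketched as follows. This is a minimal illustration, not the SPE client API: the helper names and the credentials are hypothetical, and the properties file is assumed to use plain `key = value` lines as in the directive quoted above. The header itself follows the standard HTTP Basic scheme.

```python
import base64
from typing import Optional


def token_auth_enabled(properties_text: str) -> Optional[bool]:
    """Return the value of server.enable_authentication_token, or None if absent.

    Assumes simple `key = value` lines; the real file may contain more syntax.
    """
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "server.enable_authentication_token":
            return value.strip().lower() == "true"
    return None  # directive not present; the server default is not assumed here


def basic_auth_header(login: str, password: str) -> dict:
    """Build a standard HTTP Basic Authorization header."""
    raw = f"{login}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Hypothetical file content and credentials, for illustration only.
settings = "server.enable_authentication_token = false\n"
if token_auth_enabled(settings) is False:
    headers = basic_auth_header("admin", "secret")
```

The resulting `headers` dictionary can be passed to whichever HTTP client the application already uses.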

Q: What are the supported audio formats?

Formats supported directly and natively are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8), unsigned 16-bit PCM (u16le), IEEE float 32-bit (f32le), A-law (alaw), µ-law (mulaw), ADPCM; FLAC codec inside FLAC (*.flac) container; OPUS codec inside OGG (*.opus) container. Other audio formats must be converted to one of those natively supported using external tools. SPE server can be…
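As a rough pre-flight check before sending files to SPE, an application could filter by file extension. The sketch below is an assumption-laden illustration: the function name is hypothetical, and an extension match does not confirm that the codec inside a WAVE container is one of those listed above.

```python
from pathlib import Path

# Extensions of the natively supported containers listed in the FAQ answer.
NATIVE_EXTENSIONS = {".wav", ".flac", ".opus"}


def needs_conversion(filename: str) -> bool:
    """Return True when the file extension is not a natively supported container.

    This is only a first filter: a .wav file may still hold an unsupported codec.
    """
    return Path(filename).suffix.lower() not in NATIVE_EXTENSIONS


candidates = ["call_001.WAV", "interview.mp3", "stream.opus"]
to_convert = [name for name in candidates if needs_conversion(name)]
```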

Video – Getting started with SPE

MODULE 1: Getting started with Speech Engine (19 min): Installation; Technologies configuration; Server and database configuration; Users configuration; Files processing; Synchronous and asynchronous requests, results polling; Stream processing. https://youtu.be/4qrB-GfFdWY…

What is User configuration file and how to use it

…name | User configuration file name
stt_cs_cz_5_online.bs | stt_cs_cz_5_online.bs.usr
kws_nl_nl_5.bs | kws_nl_nl_5.bs.usr
phnrec_pashto.bs | phnrec_pashto.bs.usr
vpextract4_xl4.bs | vpextract4_xl4.bs.usr

During technology initialization (e.g. during Speech Engine startup), the initialization routine checks for the existence of such a user config file. If found, it is automatically loaded after the main configuration file, and the settings from the user config are automatically applied over the settings from the main configuration file. Usage…
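The naming rule and the "applied over" behaviour can be sketched as below. This is only an illustration of the idea: the function names are hypothetical, and settings are modelled as a flat dictionary, which is an assumption since the snippet does not show the actual configuration file format.

```python
def user_config_name(main_config_name: str) -> str:
    """Derive the user configuration file name by appending '.usr'."""
    return main_config_name + ".usr"


def apply_user_config(main_settings: dict, user_settings: dict) -> dict:
    """Overlay user settings on the main settings; user values win on conflict."""
    merged = dict(main_settings)   # copy, so the main settings stay untouched
    merged.update(user_settings)   # user config is applied after the main config
    return merged


# Hypothetical setting names and values, for illustration only.
main = {"beam": 10, "threads": 1}
user = {"threads": 4}
effective = apply_user_config(main, user)
```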

Understand SPE database scripts

This article explains the details and usage of the SQL database scripts stored in the /data/database subdirectory of the SPE installation directory. These scripts are intended for setup and maintenance of the SPE database for supported database types, currently SQLite and MariaDB (from SPE 3.46) / MySQL (up to SPE 3.45). Script types: For each database type, there are two directories with two types of…

Solving everyday challenges through voice

Phonexia Products Speech Platform Voice Inspector…

FAQs (VIN)

…our Service Desk. Q: I am getting the error message “Your license is not for this application.” A: Check your license file (license.dat) by opening it in Notepad. Make sure the license contains records for all required modules. See the Licensing article for additional information…

Download Voice Inspector 5.2

…models; VIN application (graphical user interface, GUI) with the following technologies built in: Speaker Identification (SID4_XL5), Speaker Diarization (DIAR), Voice Activity Detection (VAD), Speech Quality Estimator (SQE), Phoneme Recogniser (PHNREC); example population sets and audio (in ./examples/) and example report templates (in ./templates/). Hardware requirements: minimum – CPU: Intel® Core™ i5, RAM: 4 GB, required HDD space: 0.5 GB for software…

Understand SPE user accounts

SPE has a simple built-in system of user accounts and user roles. This allows for flexible usage of SPE in your projects – you can use it e.g. for different individual applications (each application uses its own SPE user), or simply for different user roles within a single application (standard users, administrators). Each user account has the following attributes defined: login…

Understand SPE processing priority

SPE has a simple built-in system of task prioritization. This allows for flexible management of the processing queue, which is especially useful in mass audio processing. For example, if there is a long queue of files waiting to be processed, and one needs to urgently process another batch of files, these files can be sent for processing using a higher priority… and…
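The queueing behaviour described here can be illustrated with a generic priority queue. This is a conceptual sketch only, not SPE's scheduler: the class, its method names, and the convention that a larger number means higher priority are all assumptions made for the example.

```python
import heapq
import itertools


class TaskQueue:
    """Generic priority queue: higher priority first, FIFO within one priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, name: str, priority: int = 0) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def next_task(self) -> str:
        return heapq.heappop(self._heap)[2]


queue = TaskQueue()
for filename in ["backlog_1.wav", "backlog_2.wav"]:
    queue.submit(filename)                 # long-running backlog, default priority
queue.submit("urgent.wav", priority=10)    # urgent file submitted later
order = [queue.next_task() for _ in range(3)]
```

In this sketch the urgent file is processed first even though it was submitted last, while the backlog keeps its original order.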

Understand SPE multithreaded technologies initialization

The server.technology_multithread_initialization setting in the SPE configuration allows SPE to initialize instances of technologies during startup using multiple parallel threads. The default setting is OFF, i.e. instances of technologies are initialized using a single thread, one by one. This makes it easier to track possible issues during SPE startup and improves the readability of technology initialization log messages (only a single initialization happens at a time). The downside…

Understand SPE audio converter

SPE directly supports a limited list of audio formats (codecs and containers); see the Supported audio formats FAQ. Other audio formats must be converted using external tools. This conversion can be done either completely outside of SPE, before passing the files to SPE, or you can set up SPE to convert the files automatically. Then, depending on the capabilities of the conversion…

STT: Adding words to language model on the fly

…using the input example shown above. The added parts are highlighted. { "result": { "version": 5, "name": "SpeechRecognitionResult", "file": "/test.wav", "model": "EN_US_6", . . . "phrases": [ { "phrase": "this is preferred phrase" }, { "phrase": "and some other phrase" } ], "dictionary": [ { "word": "preferred", "pronunciations": [ { "phonemes": "p r ih f er d", "out_of_vocabulary": false, "class":…