SPE3 – Releases and Changelogs

Posted on: 2019-03-14

Phonexia Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases.

Releases Changelogs

== SPE v3.15.x ==

Phonexia Speech Engine 3.15.6 (03/14/2018) - DB v1101, BSAPI 3.19.2
* [BSAPI#370] Added SK_SK 5th generation of STT, Dictate, KWS and PHNREC
NOTE: STT output format has changed in 5th generation:
* _DELETE_ token was changed to <null/>
* _SILENCE_ and <sil/> tokens were changed to <silence/>
* <s> and </s> tokens were changed to <segment> and…
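The token changes listed in the note above can be sketched as a small lookup table for code that still emits or expects the older tokens. This is only an illustration: the names `LEGACY_TO_GEN5` and `normalize_tokens` are hypothetical and not part of SPE, and the `<s>` / `</s>` mapping is left out because the excerpt is cut off before it is fully stated.

```python
# Legacy STT tokens and their 5th-generation replacements, taken from the
# changelog note. The <s> / </s> mapping is intentionally omitted because
# the excerpt is truncated before the replacement is fully stated.
LEGACY_TO_GEN5 = {
    "_DELETE_": "<null/>",
    "_SILENCE_": "<silence/>",
    "<sil/>": "<silence/>",
}


def normalize_tokens(tokens):
    """Return a new list with legacy tokens replaced by their 5th-generation form.

    Tokens that are not in the mapping (ordinary words, already-new tokens)
    are passed through unchanged.
    """
    return [LEGACY_TO_GEN5.get(token, token) for token in tokens]


if __name__ == "__main__":
    legacy = ["hello", "_SILENCE_", "world", "_DELETE_", "<sil/>"]
    print(normalize_tokens(legacy))
```

A plain dictionary lookup is enough here because the changes are one-to-one token renames; no parsing of the surrounding output is assumed.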

Browser3 – Releases and Changelogs

Posted on: 2019-03-14

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases.

Releases Changelogs

Phonexia Browser v3.15.0, BSAPI 3.19.1 - Mar 08 2019
- [#115] First support for SID v4
- [#117] Improved behavior of Enable/Disable All buttons in Settings->Speech Engine dialog
- [#116] 32-bit Windows builds are not provided anymore
- [#103] Waveform editor can open STT Confusion Network in label panel

Phonexia Browser v3.14.0, BSAPI 3.18.0 - Jan 29 2019
- [#14] Password for server is…

Phonexia technology models EoL

Posted on: 2018-07-11

Information about release dates, support and maintenance periods of Phonexia technology models.

Voice Biometrics

Posted on: 2018-04-07

Overview: Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform which allows you to understand the nature of audio without having to listen to it. The product helps people use the power of voice biometrics to verify speakers or identify crimes. The technologies automatically reveal WHO is speaking, of what GENDER and in what LANGUAGE, along with much other metadata. Voice Biometrics - Typical Use-Cases: The Speaker Verification use case is tailored to banks, insurance companies, money lending companies and others that need to confirm whether the caller or the voice in an audio file is the same person who is known to the customer. For this use…

Speech Analytics

Posted on: 2018-04-06

Overview: Phonexia Speech Analytics allows you to understand the content of audio without having to listen to it. The results help both commercial entities and security/defense forces make immediate, precise decisions and responses. The technologies automatically reveal WHAT content, TOPIC and KEY PHRASES are spoken, along with much other metadata. Speech Analytics - Typical Use-Cases: Speech transcription is used in various applications. Knowing the content of a whole call brings business value to the customer compared to having an analyst or supervisor listen to the audio files. Reading the text is also faster than listening to the audio. Speech Analytics output is often…

Speech Quality Estimator – Essential

Posted on: 2018-04-04

Phonexia’s Speech Quality Estimator quantifies the acoustic quality of recordings. This helps the user quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. In response to an SQE request, SPE returns a JSON or XML file. This file includes general information about the technology and statistics for all channels (one or two). The per-channel statistics cover many aspects of recording quality as well as the overall global score. Technology: The technology is language-, accent-, text-, and channel-independent. Compatibility with the widest range of audio sources possible (applies…

Speech Quality Estimation

Posted on: 2018-04-02

Speech Quality Estimation is a language-, domain- and channel-independent technology that quantifies the quality of an audio recording. The two most important statistics on which it bases its score are the SNR (signal-to-noise ratio) and the bitrate of the recording. SQE is usually part of a rapid filtration process in deployment. SQE also measures over 20 other properties of the recording, all of which can be found in the output file and processed further. See the description in the SPE documentation. Typical use cases are: verifying the quality of input recordings, searching based on quality of the recording, noise of environment or speaker's…


Posted on: 2018-03-23

Prefiltering is an important part of almost any speech technology architecture. Two technologies are used for it: Speech Quality Estimation (SQE) and Voice Activity Detection (VAD). Both are very fast, and by sorting out files with unacceptable quality or too little net speech they can significantly decrease the load on the technologies that follow and increase their precision (the exact gain depends on the type of your data).
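A minimal sketch of such a prefiltering step is shown below. It assumes that an SQE quality score and a VAD net speech length have already been obtained for each file; the `Recording` fields, the `prefilter` function and the threshold values are hypothetical and chosen only for illustration, not taken from SPE.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    path: str
    quality_score: float       # hypothetical overall SQE score, higher is better
    net_speech_seconds: float  # hypothetical net speech length reported by VAD


def prefilter(recordings, min_quality, min_net_speech_seconds):
    """Split recordings into (accepted, rejected) lists.

    A recording is accepted only if it passes both checks: its quality score
    and its net speech length must each reach the given minimum.
    """
    accepted, rejected = [], []
    for rec in recordings:
        passes = (
            rec.quality_score >= min_quality
            and rec.net_speech_seconds >= min_net_speech_seconds
        )
        (accepted if passes else rejected).append(rec)
    return accepted, rejected


if __name__ == "__main__":
    batch = [
        Recording("call_a.wav", quality_score=80.0, net_speech_seconds=45.0),
        Recording("call_b.wav", quality_score=20.0, net_speech_seconds=60.0),
        Recording("call_c.wav", quality_score=90.0, net_speech_seconds=3.0),
    ]
    # Threshold values are placeholders; suitable values depend on your data.
    ok, dropped = prefilter(batch, min_quality=50.0, min_net_speech_seconds=10.0)
    print([r.path for r in ok], [r.path for r in dropped])
```

Only the accepted files would then be passed to the slower downstream technologies.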

SPE configuration

Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview for beginners.

Sizing of the computing units for speech technologies

Posted on: 2018-02-02

Best practices for sizing Phonexia technologies depend on a few facts: Intense work with large data sets requires good performance and bandwidth between RAM and CPU. It all depends on the size of the files with technology model data, which are usually loaded into RAM and used intensively for computing operations. Always count only the physical CPU cores (HT and VT features do not improve performance). Also look for CPUs with a large L3 cache; the better CPUs are those with a higher l3_cache_size/#_of_physical_CPU_cores ratio. We currently assume that CPUs from the current Intel Xeon Family in the 4th generation…
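The L3 cache per physical core ratio mentioned above can be compared across candidate CPUs with a few lines of code. The sketch below is only an illustration: the function names and the CPU entries are hypothetical placeholders, not real product specifications or Phonexia recommendations.

```python
def l3_per_core_mb(l3_cache_mb, physical_cores):
    """Return the L3 cache size per physical core, in MB."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return l3_cache_mb / physical_cores


def rank_cpus(cpus):
    """Sort CPU descriptions by L3 cache per physical core, best first.

    Only physical cores are counted; logical (hyper-threaded) cores are
    deliberately ignored.
    """
    return sorted(
        cpus,
        key=lambda cpu: l3_per_core_mb(cpu["l3_cache_mb"], cpu["physical_cores"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical candidates with made-up numbers, for illustration only.
    candidates = [
        {"name": "cpu_a", "l3_cache_mb": 20.0, "physical_cores": 8},
        {"name": "cpu_b", "l3_cache_mb": 30.0, "physical_cores": 16},
    ]
    for cpu in rank_cpus(candidates):
        ratio = l3_per_core_mb(cpu["l3_cache_mb"], cpu["physical_cores"])
        print(cpu["name"], ratio)
```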