Search Results for: wer

Results 1 - 10 of 55 Page 1 of 6
Results per-page: 10 | 20 | 50 | 100

SPE3 – Releases and Changelogs

     Posted on: 2021-06-11

Speech Engine (SPE) is developed as RESTfull API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). Releases Changelogs Speech Engine 3.40.5, DB v1700, BSAPI 3.40.4 (2021-05-09) Public release Fixed: When trying to register webhook over existing webhook for any stream technology, SPE returns HTTP 400 (1069) error instead of HTTP 500 Fixed: Invalid SQL syntax when overwriting voiceprint in a database Speech Engine 3.35.7, DB v1601, BSAPI 3.35.5 (2021-05-09) Public release Fixed: Invalid SQL syntax when overwriting voiceprint in a database Speech Engine 3.40.4, DB v1700, BSAPI…

Browser3 – Releases and Changelogs

     Posted on: 2021-06-10

Phonexia Browser v3 (Browser3) is developed as client on top of Phonexia Speech Engine v3. Phonexia Browser is a successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser 3.40.4, BSAPI 3.40.4 (2021-06-10) Public release Changed: SID Evaluator - do not interrupt processing when an error occurs, but view all errors and continue creating the evaluation set Fixed: SID Evaluator - invalid GID score values Fixed: SID Evaluator - missing SQE information in report Fixed: SID Evaluator - don't save disabled recordings to evaluation set Phonexia Browser 3.40.3, BSAPI 3.40.4 (2021-05-28)…

Understanding SPE database

     Posted on: 2021-06-05

SPE database serves multiple purposes: stores SPE internal data stores various information about SPE entities created by SPE user audio files metadata speaker models and their voiceprints speaker groups and their voiceprints calibration sets keyword lists language packs audio source profiles stores cached processing results (optional, can be set in SPE configuration file) stores SPE log data (optional and MySQL only, can be set in SPE configuration file) To cache or not to cache? Well, that's a question... ;-) It depends on the particular use case AND on the design of your application, whether using the built-in results caching would be…

Understanding SPE directory structure

     Posted on: 2021-05-15

Good understanding of SPE directory structure helps to better understand the inner workings of SPE and simplifies troubleshooting. It's also useful for expert-level tuning of parameters of individual technologies and optimizing SPE configuration e.g. for deployments with shared resources, or deployments in virtualized environments, etc. The SPE directory structure looks like this (the tree depth is limited for better readability): {SPE_installation_directory} ├── bsapi │ ├── age │ │ ├── data │ │ ├── example . . └── settings . . . . │ └── vad │ ├── data │ ├── example │ └── settings ├── data │ ├── benchmark │…

SPE configuration file explained

     Posted on: 2021-05-03

In this article we explain details of the Speech Engine configuration file phxspe.properties, located in settings subdirectory in SPE installation location. Settings in this configuration file affect the Speech Engine behavior and performance. The configuration file is usually created after SPE installation – on first use of phxadmin, a default configuration filephxspe.properties is created in the settings directory. The file is loaded during SPE startup, i.e. you need to restart SPE to apply any changes made in the file. If Speech Engine is used together with Phonexia Browser in so-called "embedded" mode (see details about "embedded SPE" mode in Browser…

What is STT words-to-numbers feature and how to use it

     Posted on: 2021-04-22

Speech Engine 3.30 and later includes new STT feature for native numbers and dates in n‍-best output. This article explains details of the feature and gives some tips for fine-tuning the results. NOTE: The feature is currently implemented for Czech and Slovak language only! If you would like to help adding support for other languages (available in 5th or newer generation), please contact your Phonexia sales representative, or [email protected] What is the words-to-numbers feature Words-to-numbers feature allows to convert raw transcription of numbers, dates (or similar patterns like credit card numbers) to their native form: two thousand twenty one ⇒…

Phonexia Voice Inspector v3

     Posted on: 2021-04-09

About Phonexia Voice Inspector v3 (VIN3) provides police forces and forensic experts with a highly accurate speaker identification tool during investigation of criminal matters. It uses the power of voice biometry to automatically recognize speakers by their voice. Main features of the VIN3 application: Automatic speaker identification tool to strengthen results of the standard linguistics- and phonetics-based approach Scoring in Likelihood Ratio (LR) – result from a statistical test for a comparison of two hypotheses. The system returns a number from the interval <0, +∞>, which expresses how many times more likely the data are under one hypothesis than the…

LID adaptation

     Posted on: 2021-03-02

This article describes various ways of Language Identification adaptation. Basic terminology Languageprint (*.lp file) – numeric representation of the audio, extracted from audio file for language identification purpose of (similar to “voiceprint”, but representing the spoken language, not the speaking person) Languageprint archive (*.lpa file) – multiple languageprints combined into single archive Creation of languageprint archives is not supported by SPE, these are supported as input only.   Language model – digital characteristics of a specific language Language model can be trained from languageprints (*.lp), language prints archives (*.lpa), or from combination of both. LID language model should not be…

Arabic dialects in Phonexia LID and STT

     Posted on: 2021-01-18

Arabic language has (a) one standardised variety, and (b) many non-standard varieties (dialects). In this article, our linguistic team explains differences between Modern Standard Arabic and Arabic dialects in the context of Phonexia Arabic models. Standard variety:  Modern Standard Arabic (MSA) All Arabs learn it at school (not from their parents, so we cannot say it is their native variety) It is lingua franca (common language) for the Arabic world – like English for Europeans; however, Arabs speak it much better since they are schooled in MSA from early age MSA is more similar to some dialects (e.g. Levantine), but…

What are STT preferred phrases and how to use them

     Posted on: 2020-11-26

Speech Engine version 3.32 and later includes new STT feature called Preferred phrases. This article explains what is the feature good for, how does it work internally and gives some tips for practical implementation. What are preferred phrases In the speech transcription tasks, there may be situations where similar sounding words get confused, e.g. "WiFi" vs. "HiFi", "route" vs. "root", "cell" vs. "sell", etc. Normally, the language model part of the Speech To Text does its job here and in the context of longer phrase or entire sentence prefers the correct word:  ×    I'm going to cell my car. Hmmm, such…