Search Results for: language pack


Language Identification (LID)

Relevance: 100%      Posted on: 2020-07-09

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent of text and channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas: Preselecting multilingual sources and routing audio streams/files to language dependent…

Language Identification results explained

Relevance: 55%      Posted on: 2019-05-20

This article aims to give more details about Language Identification scoring and hints on how to tailor Language Identification to best suit your needs. Scoring and results explanation: When Phonexia Language Identification identifies a language in an audio recording (or languageprint) using a language pack, it creates a languageprint of the recording (if the input is an audio recording), compares that languageprint with each language in the language pack, and calculates the probability that these two languages are the same. The final scores are returned as logarithms of these individual probabilities, i.e. as values from the (-inf, 0] interval, for each language in the language pack.…
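
Because the scores are logarithms of probabilities, they can be mapped back to probabilities by exponentiation. The following is a minimal sketch only: it assumes natural logarithms (the snippet does not state the base), and the language codes, score values, and function names are hypothetical illustrations, not actual LID output or Phonexia API calls.

```python
import math

# Hypothetical log-probability scores per language (illustrative values only,
# not real LID output). Natural logarithm is an assumption.
log_scores = {"en": math.log(0.7), "cs": math.log(0.2), "de": math.log(0.1)}


def to_probabilities(scores):
    """Map log scores from (-inf, 0] back to probabilities in [0, 1]."""
    return {lang: math.exp(score) for lang, score in scores.items()}


def best_language(scores):
    """Return the language with the highest (least negative) log score."""
    return max(scores, key=scores.get)


probs = to_probabilities(log_scores)
top = best_language(log_scores)
```

Since the logarithm is monotonic, the best-scoring language can be picked directly from the log scores without converting them first.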

STT Language Model Customization tutorial

Relevance: 25%      Posted on: 2019-04-24

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. In simplified terms, it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to the general diversity of spoken speech, the default generic language model may not acknowledge the importance of certain words over other words in certain situations. Language model customization is a way to inform the…

SPE3 – Quick Start Guide

Relevance: 19%      Posted on: 2018-04-16

Do you want to run SPE3 for the first time? This post can help you. Distribution, installation and configuration: SPE is distributed by Phonexia in .zip archives. These are downloaded from the Phonexia package manager using a link provided by a Phonexia employee. Installation is done by simply unzipping the content of the downloaded .zip archive to the SPE installation folder. Configuration of SPE is done in two places. The first is the executable file ./phxadmin or .\phxadmin.exe, which serves to: set the path to configuration and license files; configure speech technologies; configure user accounts; set up a few other settings. Running the ./phxadmin or .\phxadmin.exe command…

Q: How can I add a new language to LID?

Relevance: 18%      Posted on: 2017-06-27

A: There are multiple methods to train a new language; please see the article in Components > Speech Technologies > LID.

SPE configuration

Relevance: 18%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview for beginners.

Speech To Text

Relevance: 16%      Posted on: 2019-05-27

Phonexia Speech To Text, also known as voice-to-text or speech recognition, converts speech signals into plain text. After the conversion, the text can be easily read, edited, searched, processed by text-based data mining tools, or archived. Phonexia Speech To Text is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams, and can provide results in several output formats. Typical use cases: look for specific information in large call archives (e.g., claims inspection); get additional value by advanced analysis of call traffic (e.g., topic detection); maintain short reaction times by routing calls…


Relevance: 16%      Posted on: 2017-06-15

A document that briefly describes processes and relations in Phonexia Technologies, with attention to correct word usage. SID - Speaker Identification Technology (about SID technology), which recognizes the speaker in the audio based on the input data (usually a database of voiceprints). XL3, L3, L2, S2 - Technology models of SID. Speaker enrollment - Process in which the speaker model is created (usually a new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…

Q: Please give me a recommendation for an LID adaptation set.

Relevance: 7%      Posted on: 2017-06-27

A: The following is recommended. For adding a new language to a language pack: 20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech); only 1 language per record. For adapting an existing language model (discriminative training): 10+ hours of audio for each language. May be done on customer site. May be done in Phonexia using anonymized data (= language-prints extracted from a .wav audio).
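
The two figures given for a new language agree with each other: 25 hours of audio containing 80% speech amounts to 20 hours of speech. A rough sketch of that arithmetic follows, assuming the 20-hour figure is read as net speech; the constant and function names are hypothetical and not part of any Phonexia tool.

```python
# Assumption: the "20+ hours" recommendation is interpreted as net speech.
MIN_SPEECH_HOURS_NEW_LANGUAGE = 20.0


def net_speech_hours(audio_hours, speech_ratio):
    """Hours of actual speech in a data set, given the fraction that is speech."""
    if not 0.0 <= speech_ratio <= 1.0:
        raise ValueError("speech_ratio must be between 0 and 1")
    return audio_hours * speech_ratio


def enough_for_new_language(audio_hours, speech_ratio):
    """Check a data set against the recommended minimum for a new language."""
    # Small tolerance guards against floating-point rounding at the boundary.
    return net_speech_hours(audio_hours, speech_ratio) >= MIN_SPEECH_HOURS_NEW_LANGUAGE - 1e-9


meets_minimum = enough_for_new_language(25, 0.8)    # 25 h at 80% speech
too_small = enough_for_new_language(10, 0.8)        # 10 h at 80% speech
```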

Speech To Text results explained

Relevance: 7%      Posted on: 2019-05-27

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides various output types, which show either a single transcription or multiple transcription alternatives. For processing real-time streams, two result modes are supported: one mode provides complete transcription, the second mode provides incremental results. Output types…

SPE3 – Releases and Changelogs

Relevance: 6%      Posted on: 2020-07-30

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs Speech Engine 3.30.10 (07/29/2020) - DB v1401, BSAPI 3.30.10 Public release New: Updated STT model RU_RU_A to version 4.4.0 Speech Engine 3.31.1 (07/02/2020) - DB v1500, BSAPI 3.31.0 Non-public Feature Preview release Fixed: SQLite database update from version v1401 fails Speech Engine 3.31.0 (07/01/2020) - DB v1500, BSAPI 3.31.0 Non-public Feature Preview release New: SPE now requires CentOS 7 or other Linux…

Speaker Identification: Results Enhancement

Relevance: 5%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by using Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile: An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…


Relevance: 5%      Posted on: 2018-02-01

Language Print Archive - a pack of language prints from recordings spoken in the same language/dialect. Used for language identification in LID comparison.

Phonexia Speech Engine

Relevance: 3%      Posted on: 2017-05-18

About: Phonexia Speech Engine v3 (SPE3) is the main executive part of the Phonexia Speech Platform. It is a server application with a REST API interface through which you can access all available speech technologies. Both Linux 64-bit and Windows 64-bit operating systems are supported. Phonexia Speech Engine (SPE3) is an adjustable server component which houses all speech technologies. SPE3 provides a RESTful application programming interface to access various technologies. Aside from the technologies themselves, SPE implements various other functionality supporting work with speech technologies, recordings and streams, and others. Features: The main purpose of SPE is to work as a processing unit for…

Speaker Identification (SID)

Relevance: 3%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas: The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Speech Intelligence Resolver v1

Relevance: 3%      Posted on: 2017-05-18

About: Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically visualizes the record and effectively filters the speech metadata uncovered from your records. Speech technologies implemented: Phonexia Speaker Identification (SID2), Phonexia Language Identification (LID2), Phonexia Gender Identification (GID), Phonexia Voice Activity Detection (VAD), Phonexia Speaker Diarization (DIAR), Phonexia Keyword Spotting (KWS), Phonexia Speech Quality Estimator (SQE), Phonexia Speech Transcription (STT). SIR is a client application cooperating with REST servers. It can be used as a standalone application due to the integrated local REST server. It was…

Voice Biometrics

Relevance: 3%      Posted on: 2018-04-07

Overview: Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform which allows you to understand the nature of audio without having to listen to it. The product helps people utilize the power of voice biometrics to verify speakers or identify crimes. The technologies automatically reveal WHO is speaking, what GENDER, what LANGUAGE, and many other metadata. Voice Biometrics - Typical Use-Cases: The Speaker Verification use case is tailored to banks, insurance companies, money lending companies and others, where it is necessary to confirm whether the caller/voice in an audio file is the same person who is known to the customer. For this use…

Browser3 – Releases and Changelogs

Relevance: 2%      Posted on: 2020-07-24

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.31.2, BSAPI 3.31.0 - Jul 24 2020 Non-public Feature Preview release Fixed: STT result version mismatch Phonexia Browser v3.31.1, BSAPI 3.31.0 - Jul 08 2020 Non-public Feature Preview release New: Browser now requires CentOS 7 or other Linux based OS with glibc >= 2.17 Version 3.31.0 was skipped Phonexia Browser v3.30.8, BSAPI 3.30.8 - Jun 29 2020 Public release Fixed: SID Evaluator…

Product Portfolio

Relevance: 2%      Posted on: 2018-04-02

Phonexia Speech Platform is an umbrella concept for all Phonexia’s products and services related to speech technologies. It gives us the ability to customize various products to a wide range of customer needs. A Platform Edition is an encapsulation of a specific setup of speech technologies, modules, applications, utilities and services designed for a specific market segment. We distinguish Speech Analytics (SAL) and Voice Biometrics (VBS) as the most common domains of usage. It is also a tool for marketing and sales. Voice Biometrics is focused more on identifying speaker, gender, language spoken and more. Speech Analytics focuses on gathering information about content…