Search Results for: ROM


SPE3 – Releases and Changelogs

     Posted on: 2019-08-22

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs == SPE v3.17.x == Speech Engine 3.17.3 (08/22/2019) - DB v1200, BSAPI 3.21.3 [G_#191] Fixed: KWS getting phonemes/graphemes in specific circumstances returns unknown error [G_BSAPI#413] Fixed: duplicated output from KWS Speech Engine 3.17.2 (08/02/2019) - DB v1200, BSAPI 3.21.2 [G_BSAPI#300] Fixed: KWS stream results are displayed with a delay Speech Engine 3.17.1 (07/22/2019) - DB v1200, BSAPI 3.21.1 Added 5th generation of…

Phonexia Workflow

     Posted on: 2019-08-06

About Phonexia Workflow combines Phonexia technologies into scenarios, which can be easily configured and deployed. Phonexia Workflow uses Phonexia Speech Engine internally. Provided Phonexia Workflow scenarios: SalEssentials - Speech Analytics Essentials filters out low-quality audio files and provides demographic information, age estimation and speech to text processing. VbsEssentials - Voice Biometrics Essentials filters out low-quality audio files and provides gender identification, age estimation and speaker identification. Our team can help you implement your custom scenario. The scenario is a tiny Java application which interacts with Phonexia technologies and can optionally use your service or database. First steps Installation Go through…

Workflow – Releases and Changelogs

     Posted on: 2019-08-06

Phonexia Workflow combines Phonexia technologies into scenarios. It uses Phonexia Speech Engine internally. This page lists changes in Workflow releases. Releases n/a Changelogs == Phonexia Workflow v1 == Phonexia Workflow 1.4.0 - SPE 3.16 - 3.17 Support for SID4. Rapid filtering component enhanced by more options on channel selection Phonexia Workflow 1.3.0 - SPE 3.13 - 3.17 Internal stuff. Phonexia Workflow 1.2.0 - SPE 3.13 - 3.14 * Scenarios support more repositories. * parameter `repositoryType` renamed to `repositoryTypes` * e.g.: repositoryTypes: [file, memory] * Scenarios no longer support parameter `repositoryFormat`. Phonexia Workflow 1.1.0 - SPE 3.12 new type of Phonexia…
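The 1.2.0 parameter changes listed above could be applied to an existing scenario configuration roughly as follows. This is only a sketch: it assumes the configuration is available as a Python dict and that the old `repositoryType` held a single value. The function name `migrate_scenario_config` and the sample values are hypothetical and are not part of Phonexia Workflow.

```python
def migrate_scenario_config(config: dict) -> dict:
    """Return a copy of a scenario config that uses the 1.2.0 parameter names.

    Assumption: the old `repositoryType` held one value, so it is wrapped
    in a list to form `repositoryTypes`.
    """
    migrated = dict(config)

    # `repositoryType` was renamed to `repositoryTypes`, e.g. [file, memory].
    if "repositoryType" in migrated:
        old_value = migrated.pop("repositoryType")
        migrated.setdefault("repositoryTypes", [old_value])

    # `repositoryFormat` is no longer supported, so it is dropped.
    migrated.pop("repositoryFormat", None)
    return migrated


# Hypothetical pre-1.2.0 configuration used only for illustration.
old_config = {"repositoryType": "file", "repositoryFormat": "example"}
print(migrate_scenario_config(old_config))
```

If `repositoryTypes` is already present, the sketch keeps it and only removes the outdated keys.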

Browser3 – Releases and Changelogs

     Posted on: 2019-07-03

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.17.0, BSAPI 3.21.0 - Jul 01 2019 [G#106] Added possibility to activate/deactivate created filter rules [G#125] Running Browser in "embedded SPE" mode now creates SPE log file (phxspe.browser.log located in SPE log directory) Phonexia Browser v3.16.1, BSAPI 3.20.1 - May 17 2019 [G#112] Fixed Denoiser which created duplicate recordings under specific circumstances [G#127] Fixed comparison of SID Evaluation sets using Audio Source…

Voice Inspector – supporting technologies

     Posted on: 2019-06-28

Automatic Speaker Identification (SID) is the most important but not the only Phonexia technology that is implemented in Voice Inspector (VIN). Apart from SID, forensic experts, the users of VIN, can benefit from automatic Signal-to-Noise Ratio calculation, Voice Activity detection, Phoneme search, and a Wave editor which incorporates the waveform, spectrum and power panel. Let's have a look at how to utilize the individual technologies. Signal-to-Noise Ratio Recording quality can strongly influence the reliability of SID results and so the outcome of a forensic case. Therefore, VIN uses a module of Phonexia Speech Quality Estimation (SQE) to calculate the Signal-to-Noise Ratio (SNR)…

Voice Inspector – Interpretation of results

     Posted on: 2019-06-24

Introduction Phonexia Voice Inspector (VIN) is a tool for forensic automatic speaker identification, compliant with the Methodological Guidelines for Best Practice in Forensic Semiautomatic and Automatic Speaker Recognition, published by the European Network of Forensic Science Institutes. This post explains individual SID score types and ways to visualize the results in a speaker identification case implemented in Voice Inspector. Evidence In VIN, the term evidence has two meanings. In general, it refers to any SID score that the system calculates for any pair of recordings in the case. These scores are the output of the Phonexia SID technology which runs…

Speaker Identification (SID)

     Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e., to decide whether the voice in two recordings belongs to the same person or to two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Keyword Spotting results explained

     Posted on: 2019-06-12

This article aims to give more details about Keyword Spotting outputs and hints on how to tailor Keyword Spotting to best suit your needs. Scoring and results explanation Keyword Spotting works by calculating the likelihoods that a given spot contains a keyword or just any other speech, and comparing those two likelihoods. The following scheme shows a Background model for anything before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – Lkw and Lbg (any speech = background). Raw score is calculated…

Keyword Spotting

     Posted on: 2019-06-03

Phonexia Keyword Spotting (KWS) identifies occurrences of keywords and/or keyphrases in audio recordings. It can help you to get valuable information from huge quantities of speech recordings. You only need to specify the keywords or phrases you wish to find. This technology identifies all recordings with keyword occurrences and allows you to automatically route important recordings or calls to your experts. Typical use cases Call centers increase operator and supervisor efficiency by searching calls identify inappropriate expressions from operators check marketing campaigns with automatic script-compliance control Mass media and web search servers index and search multimedia by keyword route multimedia…

Speaker Identification: Results Enhancement

     Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by using Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Speech To Text results explained

     Posted on: 2019-05-27

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides several types of output to show only one or more transcription alternatives. One-best output 1-best output provides a transcription containing only the highest-scoring words. Each segment provides information about the transcribed word itself, the…
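The idea behind one-best output, keeping only the highest-scoring alternative for each segment, can be illustrated with a small sketch. The data layout (a list of scored alternatives per segment), the function name `one_best`, and the scores are all hypothetical. This is not the Speech To Text output format, which the excerpt above does not show in full.

```python
from typing import List, Tuple

# Hypothetical representation: each alternative is (text, score).
Alternative = Tuple[str, float]


def one_best(segments: List[List[Alternative]]) -> List[str]:
    """Keep only the highest-scoring alternative of every non-empty segment."""
    return [max(alts, key=lambda alt: alt[1])[0] for alts in segments if alts]


# Illustrative scores only; they do not come from a real transcription.
segments = [
    [("eighty machines", 0.62), ("eight tea machines", 0.38)],
    [("were delivered", 0.90), ("where delivered", 0.10)],
]
print(" ".join(one_best(segments)))
```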

Speech To Text

     Posted on: 2019-05-27

Phonexia Speech To Text – also known as voice-to-text or speech recognition – converts speech signals into plain text. After the conversion, text can be easily read, edited, searched, processed by text-based data mining tools or archived. Phonexia Speech To Text is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams and can provide results in several output formats. Typical use cases look for specific information in large call archives (e.g., claims inspection) get additional value by advanced analysis of call traffic (e.g., topic detection) maintain short reaction times by routing calls…

Language Identification (LID)

     Posted on: 2019-05-20

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent of any text, language, dialect, or channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files…

Language Identification results explained

     Posted on: 2019-05-20

This article aims to give more details about Language Identification scoring and hints on how to tailor Language Identification to best suit your needs. Scoring and results explanation When Phonexia Language Identification identifies a language in an audio recording (or languageprint) using a language pack, it creates a languageprint of the recording (if the input is an audio recording), compares that languageprint with each language in the language pack, and calculates the probability that these two languages are the same. The final scores are returned as logarithms of these individual probabilities – i.e. as values in the interval from -inf to 0 – for each language in the language pack.…
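Because the scores are described as logarithms of probabilities, they can be turned back into probabilities by exponentiation. The sketch below assumes natural logarithms, which the excerpt does not state, and uses made-up language codes and scores. The function names are hypothetical and are not part of any Phonexia API.

```python
import math
from typing import Dict


def log_scores_to_probabilities(log_scores: Dict[str, float]) -> Dict[str, float]:
    """Convert log scores (values from -inf to 0) back to probabilities.

    Assumption: the scores are natural logarithms.
    """
    return {lang: math.exp(score) for lang, score in log_scores.items()}


def best_language(log_scores: Dict[str, float]) -> str:
    """Return the language with the highest score (closest to 0)."""
    return max(log_scores, key=log_scores.get)


# Made-up scores for illustration only.
scores = {"en": math.log(0.7), "cs": math.log(0.2), "de": float("-inf")}
print(best_language(scores))
print(log_scores_to_probabilities(scores))
```

A score of 0 corresponds to a probability of 1, and a score of -inf corresponds to a probability of 0.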

STT Language Model Customization tutorial

     Posted on: 2019-04-24

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. In a simplified way it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to the general diversity of speech, the default generic language model may not acknowledge the importance of certain words over other words in certain situations. Language model customization is a way to inform the…

Phonexia End User License Agreement

     Posted on: 2019-02-27

Please read the terms and conditions of this End User License Agreement (the “Agreement”) carefully before you use the Phonexia proprietary software providing speech solutions, technologies and accompanying services (the “Software”) delivered and marketed by Phonexia s.r.o.

Phonexia technologies introduction

     Posted on: 2019-01-25

Core objective: Basic understanding of Phonexia speech technologies and products; typical use cases, implementations and deployment topologies Duration: 35 minutes intended for idea makers and product designers assumes generic knowledge of Phonexia and speech technologies in general Content 00:00 Introduction What information can we get from speech? Overview of basic use cases Phonexia Speech Platform brief 4:21 Phonexia technologies overview and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender…

Error 1013: Unsupported: Server does not support authentication with token

     Posted on: 2018-12-10

Please check the SPE subdirectory ./settings for configuration files. If only phxspe.browser.properties exists, then your Browser uses SPE as an embedded component and sets this directive inside the file: server.enable_authentication_token = false. In that case you can still use SPE with Basic HTTP authentication, as described in the documentation, section "Basic authentication". If you would like to play with a "pure" daemon installation, then the phxspe.properties file should exist in the ./settings subdirectory. The phxspe.properties file is created by the phxadmin utility or can be created from the ./data/phxspe.properties.default template file. Copy the template file to the ./settings directory Rename it to phxspe.properties Check for the server.enable_authentication_token directive and set it as…
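A minimal sketch of the two checks described above: reading the `server.enable_authentication_token` directive from properties-style text, and building a standard HTTP Basic `Authorization` header as a fallback. The parser handles only simple `key = value` lines. Treating a missing directive as "disabled" is an assumption, and the helper names and credentials are hypothetical. None of this is taken from the SPE documentation.

```python
import base64
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple `key = value` lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def token_auth_enabled(props: Dict[str, str]) -> bool:
    """Assumption: a missing directive is treated as disabled."""
    return props.get("server.enable_authentication_token", "").lower() == "true"


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build a standard HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(credentials).decode("ascii")}


# Example content mirroring the directive quoted above.
settings = parse_properties("server.enable_authentication_token = false\n")
if not token_auth_enabled(settings):
    # Placeholder credentials; substitute a real SPE user account.
    headers = basic_auth_header("user", "pass")
    print(headers)
```

The resulting header can be passed to any HTTP client when calling the server with Basic authentication.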

Phonexia technology models EoL

     Posted on: 2018-07-11

Information about release dates, support and maintenance periods of Phonexia technology models.

SPE3 – Quick Start Guide

     Posted on: 2018-04-16

Do you want to run SPE3 for the first time? This post can help you. Distribution, installation and configuration SPE is distributed by Phonexia in .zip archives. These are downloaded from the Phonexia package manager using a link provided by a Phonexia employee. Installation is done by simply unzipping the content of the downloaded .zip archive to the SPE installation folder. Configuration of SPE is done in two places. The first is the executable file ./phxadmin or .\phxadmin.exe, which serves to set the configuration and license files, configure speech technologies, configure user accounts, and set up a few other settings. Running the ./phxadmin or .\phxadmin.exe command…