Search Results for: speaker

Results 1 - 39 of 39

Speaker Diarization (DIAR)

Relevance: 100%      Posted on: 2017-06-26

About DIAR Phonexia Speaker Diarization (DIAR) enables segmentation of voices in one mono-channel audio recording. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel-independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling Output Log file with processed information (segmentation of speech, silence, and technical signals – i.e., elimination of phone line beeps, DTMF tones, music, pauses, etc.) Audio file extracted for each…

Speaker Identification (SID)

Relevance: 100%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice... i.e. to decide whether the voice in two recordings belongs to the same person or to two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Speaker Identification: Results Enhancement

Relevance: 100%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by the use of Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Speaker Diarization

Relevance: 100%      Posted on: 2018-04-02

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio recording based on the individual speaker's voice. It is a language-, domain- and channel-independent technology. It performs not only the segmentation of speakers, but of technical signals and silence as well. The output of the technology can be a log file with labels and/or split audio files or one new multichannel audio file. Accurate speaker diarization is still an open research task. Typical use cases: Preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversation recorded in mono into several…

Phonexia Voice Inspector v3

Relevance: 9%      Posted on: 2018-04-02

About Phonexia Voice Inspector v3 (VIN3) provides police forces and forensic experts with a highly accurate speaker identification tool during investigation of criminal matters. It uses the power of voice biometry to automatically recognize speakers by their voice. Main features of the VIN3 application: Automatic speaker identification tool to strengthen results of the standard linguistics- and phonetics-based approach Scoring in Likelihood Ratio (LR) – result from a statistical test for a comparison of two hypotheses. The system returns a number from the interval <0, +∞>, which expresses how many times more likely the data are under one hypothesis than the…

Phonexia Speech Platform for Commerce

Relevance: 9%      Posted on: 2017-05-18

Phonexia Speech Analytics is a special edition of Phonexia Speech Platform COM which allows you to boost analysis of your call traffic. It is an effective solution for commercial, telecom, utilities, financial sector, and other contact centers. It provides 4 main parts: Dialog Analysis, Demographic Information, Script Alignment, Speech Transcription (automatic).   Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform COM which allows you to boost security and enhance customer experience with voice biometrics technologies. It is an effective solution for commercial and financial sectors, especially for banks, insurance companies, and call centers. It covers both use cases: Fraud Detection…

SPE3 – Releases and Changelogs

Relevance: 9%      Posted on: 2019-08-13

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs == SPE v3.17.x == Speech Engine 3.17.2 (08/02/2019) - DB v1200, BSAPI 3.21.2 [G_BSAPI#300] Fixed: KWS stream results are displayed with a delay Speech Engine 3.17.1 (07/22/2019) - DB v1200, BSAPI 3.21.1 Added 5th generation of ES_ES of STT/Dictate/KWS/PHNREC NOTE: STT output format has changed in 5th generation: _DELETE_ token was changed to <null/> _SILENCE_ and <sil/> tokens were changed to <silence/> <s>…
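
The token renames mentioned in the 3.17.1 note above could be applied to older transcripts with a small lookup table. This is only a sketch: the function name `normalize_tokens` and the idea of representing a transcript as a list of string tokens are assumptions made for illustration, and because the changelog text is truncated, the table covers only the three renames visible in the snippet.

```python
# Renames visible in the SPE 3.17.1 changelog snippet.
# The snippet is truncated, so this table may be incomplete.
TOKEN_RENAMES = {
    "_DELETE_": "<null/>",
    "_SILENCE_": "<silence/>",
    "<sil/>": "<silence/>",
}


def normalize_tokens(tokens):
    """Map older STT special tokens to their 5th-generation names.

    `tokens` is a hypothetical list of transcript tokens (plain strings);
    ordinary words pass through unchanged.
    """
    return [TOKEN_RENAMES.get(token, token) for token in tokens]


if __name__ == "__main__":
    print(normalize_tokens(["hola", "_SILENCE_", "<sil/>", "_DELETE_"]))
```

A lookup table keeps the mapping easy to extend if the full changelog lists further token changes.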

Software Vetting (Best Practice)

Relevance: 9%      Posted on: 2017-06-15

The purpose of this document is to help clients meet their high security standards when integrating Phonexia software into their critical infrastructure. The vetting ensures that Phonexia software is not dangerous to the client’s infrastructure in any way. It means there are no backdoors, viruses, worms, Trojan horses, spyware, adware, critical bugs, or unwanted functionality, and no information is sent outside the client’s infrastructure. Vetting context Speech technology is a very dynamic, fast-developing area. For example, the speaker identification error rate is halved between each two consecutive evaluations organized by National Institute of Standards and Technology,…

Speech Analytics

Relevance: 9%      Posted on: 2018-04-06

Overview Phonexia Speech Analytics allows you to understand the content of audio without having to listen to it. The results help both commercial entities and security/defense forces make immediate, precise decisions and responses. The technologies automatically reveal WHAT content, TOPIC and KEY PHRASES are spoken, and many other metadata.   Speech Analytics - Typical Use-Cases Speech transcription is used in various applications. Knowing the content of a whole call brings business value to the customer, compared to having an analyst or supervisor listen to the audio files. Reading the text is also faster than listening to the audio. Speech Analytics output is often…

Phonexia Speech Platform for Government

Relevance: 9%      Posted on: 2017-05-18

Phonexia Voice Biometrics GOV is a special edition of Phonexia Speech Platform for Government which allows you to understand the nature of audio without having to listen to it. The product helps people to utilize the power of voice biometrics to filter audio and prevent or identify crimes. The technologies reveal automatically WHO, what GENDER, what LANGUAGE is speaking, and many other metadata. The product can be used typically for investigation support, SIGINT or other types of operations. It serves 4 main use-cases: Voice Biometrics - Speaker Search in Archive (Investigation) Voice Biometrics - Speaker Spotting Tactical Voice Biometrics -…

Phonexia technologies introduction

Relevance: 9%      Posted on: 2019-01-25

Core objective: Basic understanding of Phonexia speech technologies and products; typical use cases, implementations and deployment topologies Duration: 35 minutes intended for idea makers and product designers assumes generic knowledge of Phonexia and speech technologies in general Content 00:00 Introduction What information can we get from speech? Overview of basic use cases Phonexia Speech Platform brief 4:21 Phonexia technologies overview and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender…

Phonexia Speech Engine

Relevance: 9%      Posted on: 2017-05-18

About Phonexia Speech Engine v3 (SPE3) is the main executive part of the Phonexia Speech Platform. It is a server application with a REST API through which you can access all available speech technologies. Both Linux 64-bit and Windows 64-bit operating systems are supported. Phonexia Speech Engine (SPE3) is an adjustable server component which houses all speech technologies. SPE3 provides a RESTful application programming interface to access various technologies. Aside from the technologies themselves, SPE implements various other functionality supporting work with speech technologies, recordings and streams, and others. Features The main purpose of SPE is to work as a processing unit for…

Phonexia Workflow

Relevance: 9%      Posted on: 2019-08-06

About Phonexia Workflow combines Phonexia technologies into scenarios, which can be easily configured and deployed. Phonexia Workflow uses Phonexia Speech Engine internally. Provided Phonexia Workflow scenarios: SalEssentials - Speech Analytics Essentials filters out low quality audio files, provides demographic information, age estimation and speech to text processing. VbsEssentials - Voice Biometrics Essentials filters out low quality audio files, provides gender identification, age estimation and speaker identification. Our team can help you implement your custom scenario. The scenario is a tiny Java application which interacts with Phonexia technologies and optionally can use your service or database. First steps Installation Go through…

Gender Identification

Relevance: 9%      Posted on: 2018-04-16

Gender Identification is a language-, domain- and channel-independent technology that uses the acoustic characteristics of the recording to determine the gender of the speaker in question. This technology is able to distinguish between two genders: Male (M) and Female (F). Minimum of speech signal for identification: 9+ sec recommended Output scoring: likelihood ratio and percentage metric (0-100%) Typical use cases: filtering calls by gender, playing advertisement focused on specific gender, getting quick demographic analysis of the recordings. The speed of Gender Identification is up to 150 FtRT (depending on the model).
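
To put the quoted speed into perspective, the arithmetic can be sketched as below. The sketch assumes that FtRT means "faster than real time", so a factor of 150 would process 150 seconds of audio per second of computation; the function name is hypothetical. Since the figure is stated as "up to", the result is a best-case lower bound, not a guarantee.

```python
def best_case_processing_seconds(audio_seconds: float, ftrt: float = 150.0) -> float:
    """Best-case processing time for a recording of the given length.

    Assumes FtRT ("faster than real time") is the ratio of audio duration
    to processing time; 150 is the upper bound quoted in the entry above.
    """
    if ftrt <= 0:
        raise ValueError("ftrt must be positive")
    return audio_seconds / ftrt


if __name__ == "__main__":
    # One hour of audio at the quoted upper bound of 150 FtRT.
    print(best_case_processing_seconds(3600.0))
```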

Time Analysis (TAE)

Relevance: 9%      Posted on: 2017-05-18

Technology description Technology Time Analysis Extraction by Phonexia extracts base information from dialogue in a recording, providing essential knowledge about conversation flow. That makes it easy to identify long reaction time, crosstalk, or responses of speakers in both channels.  This technology is only meaningful when used on recordings with 2 channels. As an answer to the TAE technology, SPE returns a json/xml file. This file includes general information about the technology and details of the time analysis. The technology can work either with a closed recording or with a stream. Monologue Describes the statistics of a recording related to one…

Difference between on-the-fly and off-line type of transcription (STT)

Relevance: 9%      Posted on: 2017-12-11

Like a human, the ASR (STT) engine adapts to the acoustic channel, environment and speaker. The ASR (STT) engine also learns more about the content over time, and this information is used to improve recognition. The dictate engine, also known as on-the-fly transcription, does not look into the future and has information about just a few seconds of speech at the beginning of recordings. As the output is requested immediately during processing of the audio, the engine can't predict what will come in the next seconds of the speech. When access to the whole recording is granted during off-line transcription…

Speech Quality Estimation

Relevance: 9%      Posted on: 2018-04-02

Speech Quality Estimation is a language-, domain- and channel-independent technology that serves to quantify the quality of an audio recording. The two most important statistics on which it bases its score are the SNR (signal-to-noise ratio) and the bitrate of the recording. SQE is usually part of a rapid filtering process in deployment. SQE also measures over 20 other properties of the recording, all of which can be found in the output file and further processed. See description in SPE documentation. Typical use cases are: verification of recordings' quality on the input, searching based on quality of the recording, noise of environment or speaker's…

Phonexia Browser

Relevance: 9%      Posted on: 2017-05-18

About Phonexia Browser v3 (Browser v3) is software that combines the power of speech technologies in a single desktop application. The application automatically performs visualization of records as well as effective filtering of speech metadata uncovered from the user's records. Speech technologies implemented: Speaker Identification (SID) Language Identification (LID) Gender identification (GID) Voice Activity Detection (VAD) Speaker Diarization (DIAR) Keyword Spotting (KWS, 10+ languages available) Speech Quality Estimator (SQE) Speech to Text (STT, 10+ languages available) Age Estimation (AGE) Browser v3 is a client application cooperating with Speech Engine v3 (SPE3). It is possible to use it as a client -…

Speech To Text results explained

Relevance: 9%      Posted on: 2019-05-27

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides several types of output to show only one or more transcription alternatives. One-best output 1-best output provides transcription containing only the highest-scoring words. Each segment provides information about the transcribed word itself, the…

DIAR

Relevance: 9%      Posted on: 2018-02-01

Phonexia Speaker Diarization

Voice Inspector

Relevance: 9%      Posted on: 2017-05-18

About Phonexia Voice Inspector v3 (VIN3) provides police forces and forensic experts with a highly accurate speaker identification tool during investigation of criminal matters. It uses the power of voice biometry to automatically recognize speakers by their voice. Main features of the VIN3 application: Automatic speaker identification tool to strengthen results of the standard phonetics-based approaches Scoring in likelihood ratio (LR) – the result of a statistical test comparing two models. It returns a number which expresses how many times more likely the data are under one model than the other. LnLR or LogLR meets numbers in interval <-∞;+∞>...), and verbal…
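
The relation between the likelihood ratio and its logarithm mentioned above can be sketched in a few lines. This is generic arithmetic, not Voice Inspector's implementation: the function names are hypothetical, and the natural logarithm is assumed for the log form (the "LnLR" reading).

```python
import math


def log_likelihood_ratio(lr: float) -> float:
    """Natural log of a likelihood ratio.

    LR lies in [0, +inf); its logarithm covers (-inf, +inf).
    LR > 1 favors the first hypothesis, LR < 1 the second.
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if lr == 0:
        return float("-inf")
    return math.log(lr)


def likelihood_ratio(llr: float) -> float:
    """Inverse mapping: recover LR from its natural logarithm."""
    return math.exp(llr)


if __name__ == "__main__":
    for lr in (0.01, 1.0, 100.0):
        print(lr, log_likelihood_ratio(lr))
```

The log form is symmetric around zero, which makes support for either hypothesis easier to compare at a glance than the raw ratio.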

Voice Inspector – Interpretation of results

Relevance: 9%      Posted on: 2019-06-24

Introduction Phonexia Voice Inspector (VIN) is a tool for forensic automatic speaker identification, compliant with the Methodological Guidelines for Best Practice in Forensic Semiautomatic and Automatic Speaker Recognition, published by the European Network of Forensic Science Institutes.  This post explains individual SID score types and ways to visualize the results in a speaker identification case implemented in Voice Inspector. Evidence In VIN, the term evidence has two meanings. In general, it refers to any SID score that the system calculates for any pair of recordings in the case. These scores are the output of the Phonexia SID technology which runs…

SID

Relevance: 9%      Posted on: 2018-02-01

Phonexia Speaker Identification, multiple generations available marked by version like SIDv2 or SIDv3

Software Vetting

Relevance: 9%      Posted on: 2018-04-06

The purpose of this document is to help clients meet their high security standards when integrating Phonexia software into their critical infrastructure. The vetting ensures that Phonexia software is not dangerous to the client’s infrastructure in any way. It means there are no backdoors, viruses, worms, Trojan horses, spyware, adware, critical bugs, or unwanted functionality, and no information is sent outside the client’s infrastructure. Vetting context Speech technology is a very dynamic, fast-developing area. For example, the speaker identification error rate is halved between each two consecutive evaluations organized by National Institute of Standards and Technology,…

Speech Intelligence Resolver v1

Relevance: 9%      Posted on: 2017-05-18

About Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically performs visualization of the record as well as filtering the speech metadata uncovered from your records effectively. Speech technologies implemented: Phonexia Speaker Identification (SID2) Phonexia Language Identification (LID2) Phonexia Gender identification (GID) Phonexia Voice Activity Detection (VAD) Phonexia Speaker Diarization (DIAR) Phonexia Keyword Spotting (KWS) Phonexia Speech Quality Estimator (SQE) Phonexia Speech Transcription (STT) SIR is a client application cooperating with REST servers. It can be used as a standalone application due to the integrated local REST server. It was…

Voice Inspector – supporting technologies

Relevance: 9%      Posted on: 2019-06-28

Automatic Speaker Identification (SID) is the most important but not the only Phonexia technology that is implemented in Voice Inspector (VIN). Apart from SID, forensic experts who use VIN can benefit from automatic Signal-to-Noise Ratio calculation, Voice Activity detection, Phoneme search, and a Wave editor which incorporates the waveform, spectrum and power panel. Let's have a look at how to use the individual technologies. Signal-to-Noise Ratio Recording quality can strongly influence the reliability of SID results and so the outcome of a forensic case. Therefore, VIN uses a module of Phonexia Speech Quality Estimation (SQE) to calculate the Signal-to-Noise Ratio (SNR)…
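
For readers unfamiliar with the metric, the textbook definition of SNR in decibels can be sketched as follows. This is not how SQE or VIN computes it (the snippet does not say); it assumes the speech and noise portions have already been separated into two lists of samples, and all names are hypothetical.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a non-empty sequence of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_samples, noise_samples):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    noise_power = mean_power(noise_samples)
    signal_power = mean_power(signal_samples)
    if noise_power == 0:
        return math.inf  # no measurable noise
    if signal_power == 0:
        return -math.inf  # no measurable signal
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    speech = [1.0, -1.0, 1.0, -1.0]   # toy samples, amplitude 1.0
    noise = [0.1, -0.1, 0.1, -0.1]    # toy samples, amplitude 0.1
    print(round(snr_db(speech, noise), 2))
```

A tenfold amplitude difference corresponds to a hundredfold power difference, which is why the toy input lands at roughly 20 dB.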

VP

Relevance: 9%      Posted on: 2018-02-01

Voice Print – output from spoken speech extraction process of SID. Unique mathematical representation of the specific speaker. It is created from iVectors.

Browser3 – Releases and Changelogs

Relevance: 9%      Posted on: 2019-07-03

Phonexia Browser v3 (Browser3) is developed as client on top of Phonexia Speech Engine v3. Phonexia Browser is a successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.17.0, BSAPI 3.21.0 - Jul 01 2019 [G#106] Added possibility to activate/deactivate created filter rules [G#125] Running Browser in "embedded SPE" mode now creates SPE log file (phxspe.browser.log located in SPE log directory) Phonexia Browser v3.16.1, BSAPI 3.20.1 - May 17 2019 [G#112] Fixed Denoiser which created duplicate recordings under specific circumstances [G#127] Fixed comparison of SID Evaluation sets using Audio Source…

Phonexia Voice Inspector v1

Relevance: 9%      Posted on: 2017-05-18

About Phonexia Voice Inspector v1 (VIN1) provides police forces and forensic experts with highly accurate speaker identification tools to be used during the investigation of criminal matters. It utilizes the power of voice biometry to automatically recognize the speaker by their voice. Main features of the VIN1 application: An automatic speaker identification tool to strengthen the results of the standard phonetic based approaches Scoring of the likelihood ratio (LR), log-likelihood ratio (LLR), and an option of a verbal presentation of the results Graphic presentation of the likelihood ratio (LR), probability density function and Tippett plot Generating detailed reports (expert opinion…

SPE configuration

Relevance: 9%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview of phxspe.properties for beginners.

VIN3 – Releases and Changelogs

Relevance: 9%      Posted on: 2018-04-08

Phonexia Voice Inspector v3 (VIN) is developed as a desktop application on top of Phonexia BSAPI. This page lists changes in VIN releases. Releases Changelogs Voice Inspector v3.2.2, BSAPI 3.15.0 - Jun 5 2018 - Fixed possible application crash on Windows - Added phoneme type 'affricate' and fixed phoneme types: * phoneme 'C' changed from 'fricative' to 'affricate' * phoneme 'D' changed from 'fricative' to 'plosive' * phoneme 'T' changed from 'fricative' to 'plosive' * phoneme 'c' changed from 'plosive' to 'affricate' Voice Inspector v3.2.1, BSAPI 3.15.0 - Mar 16 2018 - Export of Speakers/Populations allows export only voiceprints -…

Voice Biometrics Course (technical training)

Relevance: 9%      Posted on: 2017-05-18

The Voice Biometrics course consists of the following modules. Please ask your Phonexia contact for a detailed description. (YES = this part of the course is mandatory)   VBS course Required time [h] Block name Block description YES 0,5 Intro & Phonexia Portfolio Intro & Phonexia Portfolio YES 0,5 Project focus - Explain basic needs Discussion of partner project focused mainly on finalizing training topics and agenda YES 0,75 Apps Designing and Developing - Licensing Gives the trainee knowledge about types of licensing, and how to use the license file YES 0,75 Technologies - Data gathering and Quality measurement - basic Data gathering…

Product Portfolio

Relevance: 9%      Posted on: 2018-04-02

Phonexia Speech Platform is an umbrella concept for all Phonexia’s products and services related to speech technologies. It gives us the ability to customize various products to a wide range of customer needs. Platform Edition is an encapsulation of a specific setup of speech technologies, modules, applications, utilities and services designed for a specific market segment. We distinguish Speech Analytics (SAL) and Voice Biometrics (VBS) as the most common domains of use. It is also a tool for marketing and sales. Voice Biometrics is focused more on identifying the speaker, gender, language spoken and more. Speech Analytics focuses on gathering information about content…

Age Estimation

Relevance: 9%      Posted on: 2018-04-12

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording. The process of voiceprint extraction is similar to the extraction of SID, but as a result different features get extracted; therefore, the voiceprints extracted from AGE and SID are not mutually compatible. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel-independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling…

Speech Analytics Course (technical training)

Relevance: 9%      Posted on: 2017-05-18

The Speech Analytics course consists of the following modules. Please ask your Phonexia contact for detailed description. (YES = this part of the course is obligatory)   SAL course Required time [h] Block name Block description YES 0,5 Intro & Phonexia Portfolio Intro & Phonexia Portfolio YES 0,5 Project focus – Explain basic needs Discussion of partner project focused mainly on finalizing the training topics and agenda. YES 0,75 Application Design & Development – Licensing Presentation of types of licensing, and how to use the license file. YES 0,75 Technologies – Data gathering and Quality measurement – basic Description of…

Voice Biometrics

Relevance: 9%      Posted on: 2018-04-07

Overview Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform which allows you to understand the nature of audio without having to listen to it. The product helps people to utilize the power of voice biometrics to verify a speaker or identify crimes. The technologies automatically reveal WHO, what GENDER, what LANGUAGE is speaking, and many other metadata. Voice Biometrics - Typical Use-Cases The Speaker Verification use case is tailored to banks, insurance companies, money lending companies and others that need to confirm whether the caller/voice in an audio file is the same person who is known to the customer. For this use…

Time Analysis

Relevance: 9%      Posted on: 2018-04-15

Time Analysis Extraction (TAE) by Phonexia extracts base information from dialogue in a recording, providing essential knowledge about conversation flow. That makes it easy to identify long reaction time, crosstalk, or responses of speakers in both channels. This technology is only meaningful when used on recordings with 2 channels. As an answer to the TAE technology, SPE returns a json/xml file. This file includes general information about the technology and details of the time analysis. The technology can work either with a closed recording or with a stream. Monologue Describes the statistics of a recording related to one channel. channel…
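
To illustrate what a crosstalk figure for a two-channel recording might mean, the sketch below sums the time during which both channels contain speech at once. It is not the TAE algorithm or its output format: the segment representation (lists of `(start, end)` tuples in seconds, non-overlapping within a channel) and the function name are assumptions made for this example.

```python
def crosstalk_seconds(channel_a, channel_b):
    """Total time during which both channels contain speech.

    Each argument is a hypothetical list of (start, end) speech segments
    in seconds; segments within one channel are assumed not to overlap.
    """
    total = 0.0
    for a_start, a_end in channel_a:
        for b_start, b_end in channel_b:
            # Length of the intersection of the two intervals, if any.
            overlap = min(a_end, b_end) - max(a_start, b_start)
            if overlap > 0:
                total += overlap
    return total


if __name__ == "__main__":
    agent = [(0.0, 5.0), (8.0, 10.0)]
    customer = [(4.0, 9.0)]
    print(crosstalk_seconds(agent, customer))
```

The pairwise comparison is quadratic in the number of segments, which is fine for a sketch; sorted segments would allow a linear sweep instead.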

Terminology

Relevance: 9%      Posted on: 2017-06-15

A document that briefly describes processes and relations in Phonexia technologies, with attention to correct word usage.   SID - Speaker Identification Technology (about SID technology) which recognizes the speaker in the audio based on the input data (usually a database of voiceprints). XL3, L3, L2, S2 - Technology models of SID. Speaker enrollment - Process in which the speaker model is created (usually a new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…