Search Results for: Audio Source Profile

Results 41 - 53 of 53 Page 3 of 3
Results per-page: 10 | 20 | 50 | 100

Speech Quality Estimation

Relevance: 9%      Posted on: 2018-04-02

Speech Quality Estimation is a language-, domain- and channel-independent technology that serves to quantify the quality of an audio recording. 2 most important statistics that it bases its score on are SNR (Speech-to-noise ratio) and bitrate of the recording. SQE is usually part of rapid filtration process in deployment. SQE also measures over 20 other properties of the recording, all of which can be found in the output file and further processed. See description in SPE documentation. Typical use cases are: verification of recordings' quality on the input, searching based on quality of the recording, noise of environment or speaker's…

Terminology

Relevance: 9%      Posted on: 2017-06-15

Document which briefly describes processes and relations in Phonexia Technologies with consideration on correct word usage.   SID - Speaker Identification Technology (about SID technology) which recognize the speaker in the audio based on the input data (usually database of voiceprints). XL3, L3,L2,S2 - Technology models of SID. Speaker enrollment - Process, where the speaker model is created (usually new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…

Age Estimation

Relevance: 9%      Posted on: 2018-04-12

Phonexia Age Estimation (AGE) estimates the age of a speaker from audio recording. The process of voiceprint extraction is similar to the extraction of SID, but as a result different features get extracted; therefore, the voiceprints extracted from AGE and SID are not mutually compatible. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel- independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling…

Voice Activity Detection

Relevance: 9%      Posted on: 2018-04-02

Voice Activity Detection is a language-, domain- and channel-independent technology that identifies parts of audio recordings with speech content vs. non-speech content. It creates labels for speech and other signals in the recording; this can then serve as a decision point whether to process the recording by other technologies or not. VAD is usually part of rapid filtration process in deployment. Typical use cases are: detection of present or absent human speech for voice processing, filtering non-speech parts of the recording, filtering out recordings with not enough net speech to be processed by other technologies voice activated process, etc. The…

Software Vetting (Best Practice)

Relevance: 9%      Posted on: 2017-06-15

The purpose of this document is to help client to satisfy their high security standards during integration of Phonexia software to their critical infrastructure. The vetting ensures that Phonexia software is not dangerous to the client’s infrastructure in any way. It means there are no backdoors, viruses, worms, Trojan horses, spyware, adware, critical bugs, unwanted functionality, no information is sent outside the client’s infrastructure. Vetting context Speech technology is a very dynamic area with a very fast development. For example the speaker identification error rate decreases to half between each two evaluations organized by National Institute of Standards and Technology,…

Licensing (technical details)

Relevance: 9%      Posted on: 2018-03-02

This document describes all licensing types for Phonexia product licensing available to our partners and customers. Each partner/customer can choose the licensing variant which best fits the current project or infrastructure. The document does not describe business conditions of Phonexia licensing. What is the License? The License is a formal agreement regarding “The Product Usage Rights” between Phonexia s.r.o. and a user of any Phonexia technology or Phonexia product. Licenses are issued by the Business Department for all speech technologies and products, and may be required in order to use utilities and tools developed by Phonexia or partners. For technical…

Voice Inspector – supporting technologies

Relevance: 9%      Posted on: 2019-06-28

Automatic Speaker Identification (SID) is the most important but not the only Phonexia technology that is implemented in Voice Inspector (VIN). Apart from SID, forensic experts, users of VIN, can benefit from automatic Signal-to-Noise Ratio calculation, Voice Activity detection, Phoneme search, and a Wave editor which incorporates the waveform, spectrum and power panel. Let's have a look on how to utilize individual technologies. Signal-to-Noise Ratio Recording quality can strongly influence the reliability of SID results and so the outcome of a forensic case. Therefore, VIN uses a module of Phonexia Speech Quality Estimation (SQE) to calculate the Signal-to-Noise Ratio (SNR)…

Speaker Diarization

Relevance: 9%      Posted on: 2018-04-02

Speaker Diarization labels segments of the same voice(s) in one mono channel audio record based by the individual speaker´s voice. It is a language-, domain- and channel-independent technology. It performs not only the segmentation of speakers, but of technical signals and silence as well. The outputs of the technology can be both log file with labels and/or split audio files/one new multichannel audio file. The correct speaker diarization is still research task nowadays. Typical use cases: Preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversation recorded in mono into several…

Language Identification (LID)

Relevance: 9%      Posted on: 2019-05-20

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent on any text, language, dialect, or channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files…

Designing and Developing Application

Relevance: 9%      Posted on: 2018-04-15

Before designing and developing the application, we encourage Partner to find clear answer for the following questions: Customer requirements: Do my customers need file processing (audio) or stream processing in real time? What is the human power of the customer that can analyze the results? How many minutes per day or streams in parallel do my customer need to process? What are real benefits for customer (finding the needle in haystack, approaching new information, processing only few data with highest possible accuracy)? How the solution match the current processes and infrastructure of the customer? How many false alarms are acceptable…

Workflow – Releases and Changelogs

Relevance: 9%      Posted on: 2019-10-07

Phonexia Workflow is a set of tools complementing Phonexia Speech Engine (SPE), which allow users to chain speech technologies into scenarios and process audio recordings automatically using these scenarios. This page lists changes in Workflow releases. Changelogs == Phonexia Workflow v1 == Phonexia Workflow 1.4.1 (10/07/2019) - SPE 3.16 - 3.17 Support for IPv4 only (since SPE does not support IPv6) Configurable application webhook address in both Workflow Runner and Data Discovery Tool This address is auto-detected when no value is supplied - default In some cases like network specific configuration it might be necessary to configure it manually Rapid…

Keyword Spotting

Relevance: 9%      Posted on: 2019-06-03

Phonexia Keyword Spotting (KWS) identifies occurrences of keywords and/or keyphrases in audio recordings. It can help you to get valuable information from huge quantities of speech recordings. You only need to specify the keywords or phrases you wish to find. This technology identifies all recordings with keyword occurrences and allows you to automatically route important recordings or calls to your experts. Typical use cases Call centers increase operator and supervisor efficiency by searching calls identify inappropriate expressions from operators check marketing campaigns with automatic script-compliance control Mass media and web search servers index and search multimedia by keyword route multimedia…

Speaker Diarization (DIAR)

Relevance: 9%      Posted on: 2017-06-26

About DIAR Phonexia Speaker Diarization (DIAR) enables segmentation of voices in one monochannel audio record. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel- independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling Output Log file with processed information (segmentation of speech, silence, and technical signals – ie. elimination of phone lines beeps, DTMF tones, music, pauses, etc.) Audio file extracted for each…