Search Results for: Audio Source Profile

Error 1007: Unsupported audio format

Relevance: 100%      Posted on: 2018-12-10

The Phonexia Browser application may return the error "1007: Unsupported audio format" while uploading an audio file. Please check whether your audio files are in a supported format. If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. We recommend the ffmpeg utility, which is powerful and well documented. Find the package for your distribution at http://ffmpeg.org. Then continue as described below: Using Phonexia Browser with embedded SPE: Open the Browser configuration dialog by clicking the "Settings" button in the tool ribbon. Select the "Speech Engine" tab and configure SPE as described…

Open Source Acknowledgement

Relevance: 100%      Posted on: 2018-04-06

This page collects information about Open Source code and licenses. You may want to ask your Phonexia contact which parts of the page are relevant to your project. BSAPI 3 dependencies (name / version / license / link type): antlr / 3c-3.4 / BSD license / static; boost / 1.55 / Boost License / static; botan / 1.10.9 / Simplified BSD / static; FLAC / 1.2.1 / BSD license / static; Open Fst / 1.3.4 / Apache license / static; OpenGrm NGram / 1.1.0 / Apache license / static; ogg / 1.3.2 / BSD license / static; opus / 1.1 / New BSD License / static; libogg / 1.3.2 / BSD license / static; speex / 1.2rc1 / BSD license / static; stdlibc++, libgcc / - / GNU GPL with GCC Runtime Library Exception…

Supported audio formats

Relevance: 100%      Posted on: 2018-12-10

Supported audio formats are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8), unsigned 16-bit PCM (u16le), IEEE float 32-bit (f32le), A-law (alaw), µ-law (mulaw), ADPCM; FLAC codec inside FLAC (*.flac) container; OPUS codec inside OGG (*.opus) container. Other audio formats must be converted using external tools. The SPE server can be configured to convert audio automatically in the background; see the SPE configuration hints. Good tools for converting unsupported formats to supported ones are ffmpeg (http://www.ffmpeg.org) and SoX (http://sox.sourceforge.net/). Both are multiplatform software tools for MS Windows, Linux and Apple OS X. Example of usage (ffmpeg): ffmpeg -i <source_audio_file_name>…
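As a rough illustration of the conversion step described in the excerpt above, the following Python sketch decides by file extension whether a recording needs conversion and builds an ffmpeg argument list that re-encodes it to FLAC. The helper names, the choice of FLAC as the target, and the extension-based check are assumptions made for illustration only. An extension alone does not prove that the codec inside the container is supported.

```python
from pathlib import Path
from typing import List

# Extensions of the containers listed as supported in the excerpt above.
SUPPORTED_EXTENSIONS = {".wav", ".flac", ".opus"}


def needs_conversion(file_name: str) -> bool:
    """Return True when the extension is not one of the supported containers."""
    return Path(file_name).suffix.lower() not in SUPPORTED_EXTENSIONS


def build_ffmpeg_command(source: str) -> List[str]:
    """Build an ffmpeg argument list that converts `source` to a FLAC file.

    ffmpeg selects the FLAC encoder from the .flac output extension.
    """
    target = str(Path(source).with_suffix(".flac"))
    return ["ffmpeg", "-i", source, target]
```

The returned list can be passed to `subprocess.run(command, check=True)` on a machine where ffmpeg is installed.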

Speaker Identification: Results Enhancement

Relevance: 27%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved through the use of Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile: An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Browser3 – Releases and Changelogs

Relevance: 27%      Posted on: 2019-10-09

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.18.0, BSAPI 3.22.0 - Oct 03 2019 New: Waveform editor can now process stereo files by Diarization in per-channel mode New: Added Gender balance and Score sharpness in Settings -> Scoring New: Multiple columns in Result pane can be turned on/off at once using context menu New: Minimum speech length changed to 7 seconds Fixed: LID results information chart is not updated…

SPE3 – Releases and Changelogs

Relevance: 27%      Posted on: 2019-10-02

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs == SPE v3.18.x == Speech Engine 3.18.2 (10/14/2019) - DB v1300, BSAPI 3.22.1 Fixed: Customized STT model fails on Windows with the "Request for next state but ending state reached." error message Speech Engine 3.18.1 (10/01/2019) - DB v1300, BSAPI 3.22.0 New: DICTATE technology has been renamed to STT_STREAM (/technologies/dictate -> /technologies/stt/stream) (for backward compatibility, the /technologies/dictate endpoint is internally redirected) New: SID/SID4…

Speech To Text

Relevance: 18%      Posted on: 2019-05-27

Phonexia Speech To Text – also known as voice-to-text or speech recognition – converts speech signals into plain text. After the conversion, the text can be easily read, edited, searched, processed by text-based data mining tools or archived. Phonexia Speech To Text is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams, and can provide results in several output formats. Typical use cases: look for specific information in large call archives (e.g., claims inspection); get additional value by advanced analysis of call traffic (e.g., topic detection); maintain short reaction times by routing calls…

Q: I found the following error: ApplicationStartup: Unhandled exception: BsapiException. What does it mean?

Relevance: 18%      Posted on: 2017-06-27

[Error] ApplicationStartup: Unhandled exception: BsapiException: SWaveformSegmenterI(/mnt/phxspe/home/phx/storage/dfs/a1cabcf7-c761-49f1-a9bc-0a8209a09fd9.opus Requested segment (78056, 102056) is out of waveform range (0,91840). Any ideas what this means? A: It means that this opus file was created improperly and its header declares much more audio than the file actually contains. Please check that your audio source/originator works properly. Alternatively, use the ffmpeg or sox utility as an audio preprocessor and normalize the recordings by converting them from opus to opus before they are processed by SPE.
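To make the error message above concrete, here is a minimal Python sketch of the kind of range check it reports. The function name is hypothetical and this is not the BSAPI implementation; the unit of the numbers (samples, frames, or something else) is not stated in the message and is left open here.

```python
def segment_in_range(start: int, end: int, waveform_length: int) -> bool:
    """Return True when the segment [start, end] lies inside [0, waveform_length]."""
    return 0 <= start <= end <= waveform_length


# Values copied from the error message above.
requested_start, requested_end = 78056, 102056
available_length = 91840

inside = segment_in_range(requested_start, requested_end, available_length)
overshoot = requested_end - available_length  # how far the request runs past the end
```

With the values from the message, the check fails because the requested end lies beyond the available waveform, which matches the explanation that the header declares more audio than the file contains.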

Speaker Identification (SID)

Relevance: 18%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or to two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas: The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

STT Language Model Customization tutorial

Relevance: 18%      Posted on: 2019-04-24

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. Put simply, it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to the general diversity of speech, the default generic language model may not reflect the importance of certain words over other words in certain situations. Language model customization is a way to inform the…

Language Identification results explained

Relevance: 18%      Posted on: 2019-05-20

This article aims to give more details about Language Identification scoring and hints on how to tailor Language Identification to best suit your needs. Scoring and results explanation: When Phonexia Language Identification identifies a language in an audio recording (or languageprint) using a language pack, it creates a languageprint of the recording (if the input is an audio recording), compares that languageprint with each language in the language pack, and calculates the probability that these two languages are the same. The final scores are returned as logarithms of these individual probabilities – i.e. as values from the (-inf, 0] interval – for each language in the language pack.…
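Because the scores are described as logarithms of probabilities, they can be mapped back to probabilities by exponentiation. The Python sketch below shows this under the assumption that the natural logarithm is used; the excerpt does not state the base, and the function names and sample scores are made up for illustration.

```python
import math
from typing import Dict, Tuple


def scores_to_probabilities(log_scores: Dict[str, float]) -> Dict[str, float]:
    """Map each log score in (-inf, 0] back to a probability in [0, 1].

    Assumes natural-log scores; the base is not confirmed by the source.
    """
    return {language: math.exp(score) for language, score in log_scores.items()}


def best_language(log_scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the language with the highest score and its probability."""
    language = max(log_scores, key=log_scores.get)
    return language, math.exp(log_scores[language])


# Hypothetical scores for a three-language pack.
example_scores = {"en": 0.0, "cs": -2.0, "de": float("-inf")}
probabilities = scores_to_probabilities(example_scores)
```

A score of 0 corresponds to probability 1 and a score of -inf to probability 0, so scores closer to 0 indicate a better match.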

SPE configuration

Relevance: 18%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview of phxspe.properties for beginners.

Voice Inspector – Interpretation of results

Relevance: 18%      Posted on: 2019-06-24

Introduction: Phonexia Voice Inspector (VIN) is a tool for forensic automatic speaker identification, compliant with the Methodological Guidelines for Best Practice in Forensic Semiautomatic and Automatic Speaker Recognition, published by the European Network of Forensic Science Institutes. This post explains individual SID score types and ways to visualize the results in a speaker identification case implemented in Voice Inspector. Evidence: In VIN, the term evidence has two meanings. In general, it refers to any SID score that the system calculates for any pair of recordings in the case. These scores are the output of the Phonexia SID technology which runs…

Product Portfolio

Relevance: 18%      Posted on: 2018-04-02

Phonexia Speech Platform is an umbrella concept for all Phonexia’s products and services related to speech technologies. It gives us the ability to customize various products to a wide range of customer needs. A Platform Edition is an encapsulation of a specific setup of speech technologies, modules, applications, utilities and services designed for a specific market segment. We distinguish Speech Analytics (SAL) and Voice Biometrics (VBS) as the most common domains of usage. It is also a tool for marketing and sales. Voice Biometrics focuses more on identifying the speaker, gender, language spoken and more. Speech Analytics focuses on gathering information about content…

Voice Biometrics

Relevance: 9%      Posted on: 2018-04-07

Overview: Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform which allows you to understand the nature of audio without having to listen to it. The product helps people use the power of voice biometrics to verify speakers or identify crimes. The technologies automatically reveal WHO is speaking, what GENDER, what LANGUAGE, and many other metadata. Voice Biometrics - Typical Use-Cases: The Speaker Verification use case is tailored to banks, insurance companies, money lending companies and others that need to confirm whether the caller/voice in an audio file is the same person who is known to the customer. For this use…

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

Relevance: 9%      Posted on: 2019-10-30

Faster Than Real Time (FTRT) is a metric developed to define a reference point for software performance. Using this metric you can collect "benchmark" data on the real processing speed of the reviewed software, which should be obtained, and reproducible, on exactly defined HW. Then, by comparing various benchmark results, you can compare the performance of the specified software and its parts on different HW configurations. Conversely, using the same metric you can compare software from different vendors on the same HW configuration and for the same processing task. We recognize two measurable metrics: Recording based FTRT is calculated from real recordings…

Phonexia Speech Platform

Relevance: 9%      Posted on: 2017-05-18

Phonexia Speech Platform (Speech Platform) provides partners with a complete portfolio of speech technologies with an easy-to-use design. The platform allows users to design and deploy a wide range of speech processing systems in a short time and without extensive knowledge of the underlying technologies. Products: On top of Speech Platform, several products are provided: for the commercial market: Phonexia Speech Analytics, Phonexia Voice Biometrics; for the government market: Phonexia Speech Analytics GOV, Phonexia Voice Biometrics GOV. Characteristics: Completeness – all speech technologies in one place; Simple to use – RESTful API for rapid development; Modularity – build your own specific process workflow…

Q: While trying to install SPE3, I get the error for loading libasound.so.2 libraries.

Relevance: 9%      Posted on: 2017-06-27

Currently I’m trying to install the provided binaries for Linux, but I get the following when running phxadmin: ./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory. I’m trying to run this under CentOS 7. A: Please install the sound libraries required for manipulating audio files from the official repository of your OS. For CentOS you may use: sudo yum install alsa-utils alsa-lib Hint: A great utility for finding related Redhat/Fedora/CentOS libraries is https://www.rpmfind.net/linux/RPM/index.html
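Before or after installing the packages, it can help to confirm that the dynamic loader can actually find the library named in the error. The following Python sketch does this with the standard `ctypes` module; the function name is hypothetical, and the check is only a convenience and not part of the SPE installation procedure.

```python
import ctypes


def shared_library_available(name: str = "libasound.so.2") -> bool:
    """Return True when the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # The loader could not find or open the library.
        return False
    return True
```

If the function returns False for `libasound.so.2`, the ALSA library is still missing or is not on the loader's search path.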