Search Results for: spe

Results 21 - 30 of 121 Page 3 of 13
Results per-page: 10 | 20 | 50 | 100

Speaker Identification: Results Enhancement

     Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by use of Audio Source Profiles, that represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust in such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Speech To Text results explained

     Posted on: 2019-05-27

This article aims on giving more details about Speech To Text outputs and hints on how to tailor Speech To Text to suit best your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides various output types which show only single or multiple transcription alternatives. For processing realtime streams, two result modes are supported – one mode provides complete transcription, second mode provides incremental results. Output types…

Speech To Text

     Posted on: 2019-05-27

Phonexia Speech To Text – also known as a voice-to-text or speech recognition – converts speech signals into plain text. After the conversion, text can be easily read, edited, searched, processed by text-based data mining tools or archived. Phonexia Speech To Text is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams and can provide results in several output formats. Typical use cases look for specific information in large call archives (e.g., claims inspection) get additional value by advanced analysis of call traffic (e.g., topic detection) maintain short reaction times by routing calls…

Language Identification (LID)

     Posted on: 2019-05-20

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent on text and channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files to language dependent…

STT Language Model Customization tutorial

     Posted on: 2019-04-24

Language Model Customization tool (LMC) provides a way to improve the Speech To Text performance by creating customized language model. Language model is an important part of Phonexia Speech To Text. In a simplified way it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to general diversity of spoken speech, the default generic language model may not acknowledge the importance of certain words over other words in certain situations. Language model customization is a way to inform the…

Phonexia End User License Agreement

     Posted on: 2019-02-27

Please read the terms and conditions of this End User License Agreement (the “Agreement”) carefully before you use the Phonexia proprietary software providing speech solutions, technologies and accompanying services (the “Software”) delivered and marketed by Phonexia s.r.o.

Phonexia technologies introduction

     Posted on: 2019-01-25

Core objective: Basic understanding of Phonexia speech technologies and products; typical use cases, implementations and deployment topologies Duration: 35 minutes intended for idea makers and product designers assumes generic knowledge of Phonexia and speech technologies in general Content 00:00 Introduction What information can we get from speech? Overview of basic use cases Phonexia Speech Platform brief 4:21 Phonexia technologies overview and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender…

Error 1007: Unsupported audio format

     Posted on: 2018-12-10

Phonexia Browser application may return error "1007: Unsupported audio format" during uploading audio file. Please consider if your audio files are in . But if you need use as input audio recordings in other formats, you can configure SPE for audio automated conversion. As prerequisite install external tool for audio conversion. Recommend is ffmpeg utility, powerful and well documented. Please find your distribution package at http://ffmpeg.org Then continue as described below: Using Phonexia Browser with embed SPE Open the Browser configuration dialog by click on button "Settings" located in tool ribbon. Select tab "Speech Engine" and configure SPE as described…

Supported audio formats

     Posted on: 2018-12-10

Supported audio format are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8) unsigned 16-bit PCM (u16le) IEEE float 32-bit (f32le) A-law (alaw) µ-law (mulaw) ADPCM FLAC codec inside FLAC (*.flac) container OPUS codec inside OGG (*.opus) container   Other audio formats must be converted using external tools. SPE server can be configured to support automated conversion on background, see SPE configuration hints. Great tools for converting other than supported formats to supported are ffmpeg (http://www.ffmpeg.org) or SoX (http://sox.sourceforge.net/). Both are multiplatform software tools for MS Windows, Linux and Apple OS X. Example of usage: ffmpeg ffmpeg -i <source_audio_file_name>…

Error 1013: Unsupported: Server does not support authentication with token

     Posted on: 2018-12-10

Please check SPE subdirectory ./settings for configuration files. If only phxspe.browser.properties exists, then your Browser uses SPE as embedded component and set inside the file this directive: server.enable_authentication_token = false In that case you can still use SPE with Basic HTTP authentication, as described in documentation, section "Basic authentication" If you would like to play with "pure" daemon installation, then phxspe.properties file should exist in ./settings subdirectory. File phxspe.properties is created by phxadmin utility or can be created from ./data/phxspe.properties.default template file. Copy template file to ./settings directory Rename it to phxspe.properties Check for server.enable_authentication_token directive and setup it as…