Search Results for: file format


Q: How can I tell in which format the .wav file is?

Relevance: 100%      Posted on: 2017-06-27

A: Run "ffprobe <file_name>"; it prints information about the file, including its format. Note: the "ffprobe" utility is not included in our package(s). It is part of ffmpeg (https://ffmpeg.org/ffprobe.html) and must be installed separately.
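
The ffprobe call above can also be scripted. The following is a minimal Python sketch, assuming ffprobe is installed separately and available on the PATH. The helper names (ffprobe_command, summarize_streams, probe) are illustrative and not part of any Phonexia package.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe call that prints container and stream info as JSON."""
    return [
        "ffprobe",
        "-v", "error",      # print errors only, no banner
        "-show_format",     # container-level info (format name, duration)
        "-show_streams",    # per-stream info (codec, sample rate, channels)
        "-of", "json",      # machine-readable output
        path,
    ]


def summarize_streams(ffprobe_json):
    """Keep the fields that usually matter when checking a .wav file."""
    info = json.loads(ffprobe_json)
    return [
        {
            "codec": stream.get("codec_name"),
            "sample_rate": stream.get("sample_rate"),
            "channels": stream.get("channels"),
        }
        for stream in info.get("streams", [])
        if stream.get("codec_type") == "audio"
    ]


def probe(path):
    """Run ffprobe on the given file and summarize its audio streams."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize_streams(result.stdout)
```

Calling probe("recording.wav") (a placeholder file name) returns one dictionary per audio stream. ffprobe reports the sample rate as a string and the channel count as a number.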

SPE3 – Releases and Changelogs

Relevance: 84%      Posted on: 2020-10-14

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases.

Speech Engine 3.35.1 (10/13/2020) - DB v1600, BSAPI 3.35.1
Public release
Fixed: Missing input stream task name in log messages
Fixed: Missing arguments in "word not found" error messages (when using preferred phrases)

Speech Engine 3.35.0 (10/01/2020) - DB v1600, BSAPI 3.35.0
Public release
New: LID model L4 was promoted to production (LID BETA_L4 renamed to LID L4)
New: Added new language tag…

SPE configuration

Relevance: 80%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview of phxspe.properties for beginners.

Error 1007: Unsupported audio format

Relevance: 65%      Posted on: 2018-12-10

The Phonexia Browser application may return the error "1007: Unsupported audio format" when uploading an audio file. Please check whether your audio files are in a supported format. If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. We recommend the ffmpeg utility, which is powerful and well documented. You can find the package for your distribution at http://ffmpeg.org. Then continue as described below: Using Phonexia Browser with embedded SPE: Open the Browser configuration dialog by clicking the "Settings" button in the tool ribbon. Select the "Speech Engine" tab and configure SPE as described…

What is a user configuration file and how to use it

Relevance: 64%      Posted on: 2020-03-28

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to adapt to the nature of their audio data. Modifying the original BSAPI configuration files directly can be dangerous: inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore a working state. User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files. WARNING: Inappropriate configuration changes may cause serious issues! Make sure you really know what you are doing. User configuration file is a…

Q: How to choose answer format from server (xml/json)?

Relevance: 59%      Posted on: 2017-06-27

A: There are two ways: via the HTTP header “Accept” (application/json or application/xml), or via the request query parameter “format=json/xml”. If the format is not defined (or the HTTP header "Accept" has one of these values: application/*, */*, *), the server returns JSON.
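
Both ways of selecting the format can be sketched in Python using only the standard library. The server address and the "/technologies" path used in the usage note are placeholders chosen for illustration, not confirmed SPE values, and the helper names are hypothetical. The sketch only builds the requests; it does not send them.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8600"  # assumption: placeholder server address
FORMATS = ("json", "xml")


def request_with_accept(path, fmt="json"):
    """Select the response format through the HTTP "Accept" header."""
    if fmt not in FORMATS:
        raise ValueError("format must be 'json' or 'xml'")
    req = urllib.request.Request(BASE_URL + path)
    req.add_header("Accept", "application/" + fmt)
    return req


def request_with_query(path, fmt="json"):
    """Select the response format through the "format" query parameter."""
    if fmt not in FORMATS:
        raise ValueError("format must be 'json' or 'xml'")
    parts = urllib.parse.urlsplit(BASE_URL + path)
    # Keep any query parameters already present and append "format".
    query = urllib.parse.parse_qsl(parts.query)
    query.append(("format", fmt))
    url = urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )
    return urllib.request.Request(url)
```

For example, request_with_query("/technologies", "xml") builds a request for the placeholder URL with "format=xml" appended, and request_with_accept("/technologies", "xml") sets "Accept: application/xml" instead. Either request can then be passed to urllib.request.urlopen.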

Licensing (technical details)

Relevance: 53%      Posted on: 2018-03-02

This document describes all licensing types for Phonexia product licensing available to our partners and customers. Each partner/customer can choose the licensing variant which best fits the current project or infrastructure. The document does not describe business conditions of Phonexia licensing. What is the License? The License is a formal agreement regarding “The Product Usage Rights” between Phonexia s.r.o. and a user of any Phonexia technology or Phonexia product. Licenses are issued by the Business Department for all speech technologies and products, and may be required in order to use utilities and tools developed by Phonexia or partners. For technical…

Language Identification (LID)

Relevance: 21%      Posted on: 2020-07-09

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent of text and channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files to language dependent…

Speech Intelligence Resolver v1

Relevance: 19%      Posted on: 2017-05-18

About: Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically visualizes your recordings and filters the speech metadata extracted from them. Speech technologies implemented: Phonexia Speaker Identification (SID2), Phonexia Language Identification (LID2), Phonexia Gender Identification (GID), Phonexia Voice Activity Detection (VAD), Phonexia Speaker Diarization (DIAR), Phonexia Keyword Spotting (KWS), Phonexia Speech Quality Estimator (SQE), Phonexia Speech Transcription (STT). SIR is a client application cooperating with REST servers. It can be used as a standalone application due to the integrated local REST server. It was…

STT Language Model Customization tutorial

Relevance: 15%      Posted on: 2019-04-24

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. In simplified terms, it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to the general diversity of speech, the default generic language model may not acknowledge the importance of certain words over other words in certain situations. Language model customization is a way to inform the…