Search Results for: config file

Results 1 - 10 of 55 Page 1 of 6
Results per-page: 10 | 20 | 50 | 100

What is a user configuration file and how to use it

Relevance: 100%      Posted on: 2020-03-28

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to finetune behavior of the technologies to adapt to the nature of their audio data. Modifying original BSAPI configuration files directly can be dangerous – inappropriate changes may cause unpredicatble behavior and without having a backup of the unmodified file it's difficult to restore working state. User configuration files provide a way to override processing parameters without modifying original BSAPI configuration files. WARNING: Inappropriate configuration changes may cause serious issues! Make sure you really know what you are doing. User configuration file is a…

SPE configuration

Relevance: 93%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview of for beginners.

Licensing (technical details)

Relevance: 73%      Posted on: 2018-03-02

This document describes all licensing types for Phonexia product licensing available to our partners and customers. Each partner/customer can choose the licensing variant which best fits the current project or infrastructure. The document does not describe business conditions of Phonexia licensing. What is the License? The License is a formal agreement regarding “The Product Usage Rights” between Phonexia s.r.o. and a user of any Phonexia technology or Phonexia product. Licenses are issued by the Business Department for all speech technologies and products, and may be required in order to use utilities and tools developed by Phonexia or partners. For technical…

SPE3 – Releases and Changelogs

Relevance: 70%      Posted on: 2020-09-12

Speech Engine (SPE) is developed as RESTfull API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs Speech Engine 3.30.13 (09/11/2020) - DB v1401, BSAPI 3.30.13 Public release New: Updated STT and KWS model AR_XL to version 5.1.0 Speech Engine 3.32.0 (08/28/2020) - DB v1500, BSAPI 3.32.0 Non-public Feature Preview release New: Added support for Webhooks and WebSockets in stream processing New: Added support for preferred phrases in 5th generation of STT (see POST /technologies/stt or POST /technologies/stt/input_stream) New:…

Q: How can I tell in which format the .wav file is?

Relevance: 64%      Posted on: 2017-06-27

A: From the utilities in the package, you can find it in "ffprobe <file_name>", it will write out the info about the file. *Utility "ffprobe" is not included in our package(s). It is part of ffmpeg ( and it is neccessary to install it separately.

STT Language Model Customization tutorial

Relevance: 47%      Posted on: 2019-04-24

Language Model Customization tool (LMC) provides a way to improve the Speech To Text performance by creating customized language model. Language model is an important part of Phonexia Speech To Text. In a simplified way it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the proper text equivalents. Due to general diversity of spoken speech, the default generic language model may not acknowledge the importance of certain words over other words in certain situations. Language model customization is a way to inform the…

How to configure STT realtime stream word detection parameters

Relevance: 45%      Posted on: 2020-03-28

One of the improvements implemented since Speech Engine 3.24 is neural-network based VAD, used for word- and segment detection. This article describes the segmenter configuration parameters and how they are affecting the realtime stream STT results. The default segmenter parametrs are as shown below: [vad.online_segmenter:SOnlineVoiceActivitySegmenterI] backward_extensions_length_ms=150 forward_extensions_length_ms=750 speech_threshold=0.5 Backward- and forward extension are intervals in miliseconds, which extend the part of the signal going to the decoder. Decoder is a component, which determines what a particular part of the signal contains (speech, silence, etc.). Based on that, decoder also decides whether segment has finished or not. Unlike in file processing…

Language Identification (LID)

Relevance: 36%      Posted on: 2020-07-09

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent on text and channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files to language dependent…

Browser3 – Releases and Changelogs

Relevance: 25%      Posted on: 2020-08-21

Phonexia Browser v3 (Browser3) is developed as client on top of Phonexia Speech Engine v3. Phonexia Browser is a successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.30.12, BSAPI 3.30.11 - Aug 20 2020 Public release Fixed: Transcription results intermittently displays words in wrong order Versions 3.30.9, 3.30.10 and 3.30.11 were skipped Phonexia Browser v3.31.3, BSAPI 3.30.11 - Aug 20 2020 Non-public Feature Preview release Fixed: Transcription results intermittently displays words in wrong order Phonexia Browser v3.31.2, BSAPI 3.31.0 - Jul 24 2020 Non-public Feature Preview release Fixed: STT…

Speech Intelligence Resolver v1

Relevance: 23%      Posted on: 2017-05-18

About Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically performs visualization of the record as well as filtering the speech metadata uncovered from your records effectively. Speech technologies implemented: Phonexia Speaker Identification (SID2) Phonexia Language Identification (LID2) Phonexia Gender identification (GID) Phonexia Voice Activity Detection (VAD) Phonexia Speaker Diarization (DIAR) Phonexia Keyword Spotting (KWS) Phonexia Speech Quality Estimator (SQE) Phonexia Speech Transcription (STT) SIR is a client application cooperating with REST servers. It can be used as a standalone application due to the integrated local REST server. It was…