Search Results for: language model customization

Results 51 - 60 of 63 Page 6 of 7

Voice Activity Detection – Essential

Relevance: 3%      Posted on: 2018-04-04

Phonexia Voice Activity Detection (VAD) identifies parts of audio recordings with speech content vs. nonspeech content.

Technology:
- Trained with emphasis on spontaneous telephony conversation
- The technology is language-, accent-, text-, and channel-independent
- Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc.

Input:
- Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling

Output:
- Log file with processed information (speech vs. nonspeech segments)

Segmentation: The section Segmentation describes the results of VAD, which are segments of detected voice and silence. Segments are…

Q: What languages do you offer?

Relevance: 3%      Posted on: 2017-09-07

It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 30+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 10+ languages, including English, French, German, Russian, and American Spanish.

Technical Training Essentials

Relevance: 3%      Posted on: 2019-09-27

Core objective: Understanding technical essentials of using Phonexia technologies and products
Duration: ~94 minutes (7 + 19 + 22 + 23 + 23 min chapters)
- intended for product architects or developers
- assumes you have already watched Phonexia technologies introduction video
- assumes understanding of: working in command line, REST API principles, processing JSON or XML

Introduction (7 min):
- technologies recap
- CLI, REST and GUI interfaces overview

MODULE 1: Getting started with Speech Engine (19 min):
- Installation
- Technologies configuration
- Server and database configuration
- Users configuration
- Files processing
- Synchronous and asynchronous requests, results polling
- Stream processing

MODULE 2: Filtering and supporting…

Voice Biometrics

Relevance: 3%      Posted on: 2018-04-07

Overview: Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform that lets you understand the nature of audio without having to listen to it. The product helps people use the power of voice biometrics to verify speakers or identify crimes. The technologies automatically reveal WHO is speaking, what GENDER and what LANGUAGE, and many other metadata.

Voice Biometrics - Typical Use-Cases: The Speaker Verification use case is tailored to banks, insurance companies, money-lending companies, and others that need to confirm whether the caller/voice in an audio file is the same person who is known to the customer. For this use…

How to configure STT realtime stream word detection parameters

Relevance: 3%      Posted on: 2020-03-28

One of the improvements implemented since Speech Engine 3.24 is a neural-network-based VAD, used for word and segment detection. This article describes the segmenter configuration parameters and how they affect the realtime stream STT results. The default segmenter parameters are as shown below:

[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5

Backward and forward extensions are intervals in milliseconds that extend the part of the signal going to the decoder. The decoder is a component that determines what a particular part of the signal contains (speech, silence, etc.). Based on that, the decoder also decides whether a segment has finished or not. Unlike in file processing…
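A minimal sketch of how the default segmenter parameters quoted in this excerpt could be read and applied, using Python's standard `configparser`. The helper names `load_segmenter_params` and `extend_segment` are hypothetical, and the extension arithmetic (backward extension moves the segment start earlier, forward extension moves the end later) is an assumption drawn from the excerpt's wording, not Phonexia's actual implementation.

```python
import configparser

# Default segmenter section, copied from the excerpt.
DEFAULT_CONFIG = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5
"""

SECTION = "vad.online_segmenter:SOnlineVoiceActivitySegmenterI"


def load_segmenter_params(text):
    """Parse the segmenter section into typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[SECTION]
    return {
        "backward_ms": section.getint("backward_extensions_length_ms"),
        "forward_ms": section.getint("forward_extensions_length_ms"),
        "speech_threshold": section.getfloat("speech_threshold"),
    }


def extend_segment(start_ms, end_ms, params):
    """Widen a detected speech segment before it goes to the decoder.

    ASSUMPTION: the backward extension is subtracted from the start
    (clamped at 0) and the forward extension is added to the end.
    """
    new_start = max(0, start_ms - params["backward_ms"])
    new_end = end_ms + params["forward_ms"]
    return new_start, new_end


params = load_segmenter_params(DEFAULT_CONFIG)
print(extend_segment(1000, 2000, params))
```

Under these assumptions, a larger forward extension keeps more trailing signal attached to each segment, while a smaller one cuts segments off sooner after speech ends.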

Open Source Acknowledgement

Relevance: 3%      Posted on: 2018-04-06

This page collects information about Open Source code and licenses. You may want to ask your Phonexia contact which parts of the page are relevant to your project.

BSAPI 3 dependencies

Name            Version      License          Link type
ADVobfuscator   1.1          link             static
boost           1.70         Boost License    static
botan           2.7.0        Simplified BSD   static
duktape         2.5.0        MIT              static
FLAC            1.3.2        BSD license      static
fmt             5.2.1        MIT              static
glibc           -            GNU LGPL         dynamic (Linux)
minizip         1.2.11       link             static
mkl             2019.1.144   ISSL             static
nowide          0.1.1        Boost License    static
Open Fst        1.6.9        Apache license   static
ogg             1.3.3        BSD license      static
onnxruntime     1.1.0        MIT              static
opus            1.2.1…


Relevance: 3%      Posted on: 2018-02-01

Gaussian mixture model – Statistical model for representing values.

Speech Analytics

Relevance: 3%      Posted on: 2018-04-06

Overview: Phonexia Speech Analytics allows you to understand the content of audio without having to listen to it. The results help both commercial entities and security/defense forces make immediate, precise decisions and responses. The technologies automatically reveal WHAT content, TOPIC and KEY PHRASES are spoken, and many other metadata.

Speech Analytics - Typical Use-Cases: Speech transcription is used in various applications. Knowing the content of a whole call brings business value to the customer, compared to having an analyst or supervisor listen to the audio files. Reading the text is also faster than listening to the audio. Speech Analytics output is often…

What is a user configuration file and how to use it

Relevance: 3%      Posted on: 2020-03-28

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to adapt to the nature of their audio data. Modifying the original BSAPI configuration files directly can be dangerous: inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore a working state. User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files. WARNING: Inappropriate configuration changes may cause serious issues! Make sure you really know what you are doing. User configuration file is a…
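The override idea described in this excerpt can be illustrated generically. The excerpt is truncated before it shows the actual user configuration file format, so the sketch below only assumes an INI-style layout and reuses the segmenter section quoted elsewhere on this page. The function `effective_config` and the override value 500 are hypothetical, and the merge shown is the behavior of Python's `configparser`, not necessarily how BSAPI applies user configuration files.

```python
import configparser

# Stand-in for an original configuration file, which is never edited.
BASE = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5
"""

# Hypothetical user overrides: only the values that should change.
USER_OVERRIDES = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=500
"""


def effective_config(base_text, user_text):
    """Layer user overrides on top of the base settings.

    configparser merges successive reads: a key read later replaces
    the same key read earlier, and untouched keys keep base values.
    """
    parser = configparser.ConfigParser()
    parser.read_string(base_text)
    parser.read_string(user_text)
    return {name: dict(parser[name]) for name in parser.sections()}


merged = effective_config(BASE, USER_OVERRIDES)
for key, value in merged["vad.online_segmenter:SOnlineVoiceActivitySegmenterI"].items():
    print(key, "=", value)
```

Keeping overrides in a separate layer means that discarding the override text restores the base behavior, which is the safety property the excerpt emphasizes.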


Relevance: 3%      Posted on: 2018-02-01

Phonexia Keyword Spotting – acoustics-based ASR; several technologies possible; language-dependent