Search: voice activity detection

55 results

Releases and Changelogs (Browser)

…v3.12.0, BSAPI 3.16.0 – Aug 17 2018 [#56] Added support for Denoiser technology * process recording from context menu in test table [#53] Fixed error “Authorisation failed” during starting embedded SPE server [#52] Fixed wrong displaying of time of very long length audio files [#3] Changed minimum speech length for calibrated voiceprint to 60 s [#51] Fixed application crash when…

LID: Terminology and adaptation

This article describes various ways of adapting Language Identification. Basic terminology: Languageprint (*.lp file) – numeric representation of the audio, extracted from an audio file for the purpose of language identification (similar to a “voiceprint”, but representing the sound of the spoken language, not the sound of the speaking person). Languageprint archive (*.lpa file) – multiple languageprints combined into a single archive. Languageprint archives come pre-created…

STT: Results explained

…available in output. To better support voicebot applications, the following additions were implemented: sentence_info array, containing a confidence value for each sentence present in the one-best results (since version 3.24) (a sentence is the part of the output from a <segment> token to a </segment> token… i.e. if there are 2 such sentences in the results, the sentence_info array contains 2 elements) n_best_result object, containing additional…

STT: Configuring word detection parameters for stream transcription

…i.e. the backward extension value actually says how long the processing must be delayed (processing has to wait until that much input signal arrives) ⇒ increasing this value means that speech activity is detected with a longer delay (e.g. delayed barge-in detection in a voicebot implementation). The forward extension value basically means “add this much of the following signal to…

Phonexia Partner Program for Government Partners

This partnership program rewards partners in the government sector for selling and integrating Phonexia’s speech recognition and voice biometrics product portfolio. Program Enrollment: If you aspire to become a Phonexia partner, you can enroll in the Phonexia Partner Program and complete a three-month onboarding period. During this period, you will enjoy the same…

Understand SPE configuration file

…voiceprint. server.bsapi_comparator_fa_cache_size = 100000 See False Acceptance cache for more details. Runtime server.enable_authentication_token # Authentication mode # Set true for authentication with sessions # Set false for basic authentication server.enable_authentication_token = true Controls which user authentication mode SPE should accept: either the X-SessionID authentication token or HTTP Basic authentication. The default value true enables the X-SessionID authentication, i.e. the HTTP Basic…

STT: What is Words-To-Numbers feature and how to use it

…point zero three ⇒ 1586.03 sixty four million seven hundred thousand ninety ⇒ 64700090 This should help to simplify processing of the transcribed texts by text analytics layers or NLP (Natural Language Processing) engines, e.g. in voicebot applications. Where is the converted output available? The words-to-numbers conversion is available only in the n-best output (i.e. where the entire sentence…

FAQs (PSP)

…license contains records for all required modules. See the Licensing article for additional information. Q: What are the requirements for an SID evaluation dataset? To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with an SID dataset. SID dataset (minimum requirements): To measure SID…

FAQs (Browser)

…Q: What are the requirements for an SID evaluation dataset? To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with an SID dataset. SID dataset (minimum requirements): To measure SID performance precisely, it’s important to prepare the set of evaluation recordings very carefully. The requirements are: 50+ known speakers, 200+ recordings in…

Video – Filtering and supporting technologies

MODULE 2: Filtering and supporting technologies (22 min) Common generic rules for CLI, REST and GUI Filtering, sorting, pre-/post-processing overview Speech Quality Estimation (SQE) in CLI, REST and GUI Voice Activity Detection (VAD) in CLI, REST and GUI Diarization (DIAR) in CLI, REST and GUI Age Estimation (AGE) in CLI, REST and GUI Denoiser (DENOISER) in CLI, REST and GUI…