
Search: voice cerify

75 results

STT: Configuring word detection parameters for stream transcription

…i.e. the backward extension value specifies how long the processing must be delayed (processing has to wait until that much input signal arrives) ⇒ increasing this value means that speech activity is detected with a longer delay (e.g. delayed barge-in detection in a voicebot implementation). The forward extension value basically means “add this much of the following signal to…

Download Speech Platform

…XL5; Diarization (DIAR) – model XL4; Language Identification (LID) – model L4; Gender Identification (GID) – model XL5; Age Estimation (AGE) – model XL5; Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5; Speech Quality Estimation (SQE); Time Analysis Extraction (TAE); Waveform Denoiser (DENOISER); Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/). Step #2 – First start: To get…

Releases and Changelogs (Browser)

…v3.12.0, BSAPI 3.16.0 – Aug 17 2018 [#56] Added support for Denoiser technology * process recording from context menu in test table [#53] Fixed error “Authorisation failed” during starting embedded SPE server [#52] Fixed wrong displaying of time of very long length audio files [#3] Changed minimum speech length for calibrated voiceprint to 60 s [#51] Fixed application crash when…

Phonexia Partner Program for Government Partners

This partnership program rewards partners in the government sector for selling and integrating Phonexia’s speech recognition and voice biometrics product portfolio. Program Enrollment: If you aspire to become a Phonexia partner, you can enroll in the Phonexia Partner Program and complete a three-month onboarding period. During this period, you will enjoy the same…

Phonexia Academy

About: The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies, so that they can sell more, deliver projects on time and at the highest quality, and support their clients effectively. We provide the following trainings: Phonexia technologies introduction (online video course), Technical Training Essentials (online video course), Technical Training Advanced – 2 courses: Voice Biometrics…

Video – Filtering and supporting technologies

MODULE 2: Filtering and supporting technologies (22 min) Common generic rules for CLI, REST and GUI Filtering, sorting, pre-/post-processing overview Speech Quality Estimation (SQE) in CLI, REST and GUI Voice Activity Detection (VAD) in CLI, REST and GUI Diarization (DIAR) in CLI, REST and GUI Age Estimation (AGE) in CLI, REST and GUI Denoiser (DENOISER) in CLI, REST and GUI…

Speech Quality Estimation (SQE)

…of an empty recording SNR would divide by zero => is_valid would be false. waveform_snr – the signal-to-noise ratio (SNR) describes the ratio of the useful signal to the noise signal; it is measured in dB and calculated from the waveform distribution (silence has a Gaussian distribution, voice has a Gamma distribution); SNR = 20 * log10(S/N). technical signal…
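The formula in this snippet, SNR = 20 * log10(S/N), can be illustrated with a minimal Python sketch. It assumes S and N are already available as amplitude levels (for example RMS values). The function name `snr_db` and the returned `(value, is_valid)` pair are hypothetical and only mirror the snippet's remark that an empty recording would divide by zero and be flagged invalid. How SQE actually estimates S and N from the waveform distribution is not shown in the snippet and is not reproduced here.

```python
import math


def snr_db(signal_level: float, noise_level: float):
    """Return (snr_in_db, is_valid) for two amplitude levels.

    Hypothetical helper: the inputs are assumed to be non-negative
    amplitude levels (e.g. RMS), not values produced by SQE itself.
    """
    # An empty recording has no signal and no noise, so the ratio S/N
    # is undefined; report the result as invalid instead of dividing.
    if signal_level <= 0 or noise_level <= 0:
        return None, False
    # Amplitude ratio expressed in decibels.
    return 20.0 * math.log10(signal_level / noise_level), True


if __name__ == "__main__":
    print(snr_db(10.0, 1.0))  # signal amplitude ten times the noise
    print(snr_db(0.0, 0.0))   # empty recording
```

With a signal amplitude ten times the noise amplitude, the ratio is 10 and the result is 20 dB.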

Understand SPE configuration file

voiceprint. server.bsapi_comparator_fa_cache_size = 100000 See False Acceptance cache for more details. Runtime server.enable_authentication_token # Authentication mode # Set true for authentication with sessions # Set false for basic authentication server.enable_authentication_token = true Controls which user authentication mode SPE accepts: either the X-SessionID authentication token or HTTP Basic authentication. The default value true enables the X-SessionID authentication, i.e. the HTTP Basic…
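From a client's point of view, the two authentication modes named in this snippet differ in which HTTP header a request carries. The sketch below is an illustration only: the helper `auth_headers` and its parameters are hypothetical, and it assumes the session token is sent in a header called X-SessionID, as the snippet's wording suggests. HTTP Basic authentication is encoded in the standard way (base64 of `user:password`). How a session ID is obtained from SPE is not covered by the snippet and is left out.

```python
import base64


def auth_headers(use_token, session_id=None, user=None, password=None):
    """Build request headers for one of the two authentication modes.

    use_token mirrors server.enable_authentication_token on the server:
    True  -> session token header (header name assumed to be X-SessionID)
    False -> standard HTTP Basic authentication
    """
    if use_token:
        if not session_id:
            raise ValueError("session_id is required for token authentication")
        return {"X-SessionID": session_id}
    if user is None or password is None:
        raise ValueError("user and password are required for Basic authentication")
    # HTTP Basic: base64-encode "user:password" and prefix it with "Basic ".
    credentials = f"{user}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(credentials).decode("ascii")}


if __name__ == "__main__":
    print(auth_headers(True, session_id="example-session-id"))
    print(auth_headers(False, user="admin", password="secret"))
```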

LID: Terminology and adaptation

This article describes various ways of Language Identification adaptation. Basic terminology: Languageprint (*.lp file) – numeric representation of the audio, extracted from an audio file for the purpose of language identification (similar to “voiceprint”, but representing the sound of the spoken language, not the sound of the speaking person). Languageprint archive (*.lpa file) – multiple languageprints combined into a single archive. Languageprint archives come pre-created…

STT: Results explained

…available in output. To better support voicebot applications, the following additions were implemented: sentence_info array, containing a confidence value for each sentence present in the one-best results (since version 3.24) (a sentence is a part of the output from the <segment> token to the </segment> token… i.e. if there are 2 such sentences in the results, the sentence_info array contains 2 elements); n_best_result object, containing additional…

STT: What is Words-To-Numbers feature and how to use it

…point zero three ⇒ 1586.03 sixty four million seven hundred thousand ninety ⇒ 64700090 This should help to simplify processing of the transcribed texts by text analytic layers or NLP (Natural Language Processing) engines, e.g. in voicebot applications. Where is the converted output available? The words to numbers conversion is available only in n-best output (i.e. where the entire sentence…
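To illustrate the kind of conversion this snippet describes, here is a minimal Python sketch that turns English number words into digits. It is not Phonexia's implementation: the function `words_to_number` and its word tables are hypothetical, it covers only plain English cardinals up to billions plus a "point" followed by single digits, and it returns the result as a string.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}


def words_to_number(text: str) -> str:
    """Convert spoken-form English number words to a digit string."""
    tokens = text.lower().replace("-", " ").split()
    # Split off the fractional part, which is read digit by digit.
    if "point" in tokens:
        idx = tokens.index("point")
        int_tokens, frac_tokens = tokens[:idx], tokens[idx + 1:]
    else:
        int_tokens, frac_tokens = tokens, []

    total = 0    # sum of completed groups (millions, thousands, ...)
    current = 0  # value of the group being read (0..999)
    for tok in int_tokens:
        if tok in UNITS:
            current += UNITS[tok]
        elif tok in TENS:
            current += TENS[tok]
        elif tok == "hundred":
            current = max(current, 1) * 100
        elif tok in SCALES:
            total += max(current, 1) * SCALES[tok]
            current = 0
        elif tok != "and":
            raise ValueError(f"not a number word: {tok!r}")

    result = str(total + current)
    if frac_tokens:
        for tok in frac_tokens:
            if tok not in UNITS or UNITS[tok] > 9:
                raise ValueError(f"not a single digit: {tok!r}")
        result += "." + "".join(str(UNITS[tok]) for tok in frac_tokens)
    return result


if __name__ == "__main__":
    print(words_to_number("sixty four million seven hundred thousand ninety"))
```

For the complete example visible in the snippet, "sixty four million seven hundred thousand ninety", the sketch yields 64700090.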

Understand SPE configuration

…of MySQL database connections at the time. Default is 32 # server.db.mysql.max_connections = 32 # Maximum size of in-memory cache for calibrated voice-prints of speaker models. Default is 100 # server.db.sid_model_calib_vp_cache_size = 100 Sizing of the system The selection of speech technologies and the number of instances per technology which are instantiated when starting the SPE is configured by the…