Search: voice inspector

55 results

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio recording based on each individual speaker's voice. It is a language-, domain-, and channel-independent technology. It segments not only speakers but also technical signals and silence. The outputs of the technology can be log files with labels and/or split audio files/one new…

Designing and Developing Application

…specific hardware (mainly CPU, virtualized infrastructure vs. HW) or are you going to buy specific HW for the customer? What are the short-/long-term storage requirements (i.e. audio and results availability, desktop vs. distributed system)? Is there any synchronization required (i.e. voiceprint database to clients)? What is the topology of the solution/app (i.e. where to store audio, voiceprints, results, …)? How to…

SID: TUTORIAL: Speaker Identification – How to Do a Basic Test

Phonexia Speaker Identification is a voice biometry tool for recognition of speakers by their voice. In this video, we will show you how to start using this technology! You will learn how to create a “Speaker Model” to identify a speaker in a set of data. Ready to test it? Start with our video: What else is needed? 1. Phonexia…

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

…noise, technical signals like ringing, DTMF tones, etc). This metric is useful for measuring performance on actual audio data coming into an audio processing pipeline. (Figure: regular recording with voice and silence segments in the waveform.) Net-speech-based FtRT is a conservative, purely technical number. It is calculated from spoken speech data only, i.e. with all non-speech parts (silence, noise, DTMF tones, etc.)…
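The difference between the two FtRT figures can be illustrated with a minimal sketch. It assumes FtRT is the ratio of audio duration to processing time (the snippet above is truncated, so this definition is an assumption), and the segment format, the `"voice"` label, and both function names are hypothetical, not part of any Phonexia API.

```python
from typing import Iterable, Tuple

# Hypothetical segment format: (start_seconds, end_seconds, label)
Segment = Tuple[float, float, str]


def ftrt(audio_seconds: float, processing_seconds: float) -> float:
    """Assumed definition: seconds of audio handled per second of processing."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def net_speech_ftrt(segments: Iterable[Segment], processing_seconds: float) -> float:
    """FtRT computed from speech only; non-speech segments are ignored."""
    speech_seconds = sum(end - start for start, end, label in segments if label == "voice")
    return ftrt(speech_seconds, processing_seconds)


# A 60-second recording with 30 seconds of speech, processed in 2 seconds.
example_segments = [(0.0, 10.0, "silence"), (10.0, 40.0, "voice"), (40.0, 60.0, "silence")]
whole_recording = ftrt(60.0, 2.0)
net_speech = net_speech_ftrt(example_segments, 2.0)
print(whole_recording, net_speech)
```

Under these assumptions the net-speech figure is always the lower of the two, which is why it reads as the conservative number.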

Understand SPE technologies, instances and workers

…(or bank branch): A post office is a place providing different kinds of services – one can go there to send letters, send or pick up packages, get a PO box, get some financial services, insurance, etc. Speech Engine has various speech technologies configured – one can analyze the audio quality, extract voiceprints from recordings, compare voiceprints, transcribe audio to text, etc….

Recommended OS and HW (PSP)

…external dependencies like databases, storages, etc.) would require additional resources. Therefore you should always perform a proper load test using your entire system to determine the actual HW requirements. To give you a picture, here are recommendations for typical configurations: Voice Biometrics, basic 100 hours/day package (***) files processing CPU: 8 physical cores, 1x Intel® Xeon E5-2640 v4 or similar…

Time Analysis Extraction (TAE)

…in the particular direction and details about crosstalk, for example where the other speaker is talking “over” this speaker. Segmentation: This section is optional and needs to be explicitly turned on. It describes segments of detected voice and silence (the same as the Voice Activity Detection technology). More information: You can find more information in the corresponding chapter of the API documentation: https://download.phonexia.com/docs/spe/#Time%20Analysis…

Phonexia Academy

About: The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies. Sell more, deliver your projects on time and at the highest quality, and support your clients effectively. We provide the following trainings: Phonexia technologies introduction (online video course); Technical Training Essentials (online video course); Technical Training Advanced – 2 courses: Voice Biometrics…

Download Speech Platform

…XL5; Diarization (DIAR) – model XL4; Language Identification (LID) – model L4; Gender Identification (GID) – model XL5; Age Estimation (AGE) – model XL5; Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5; Speech Quality Estimation (SQE); Time Analysis Extraction (TAE); Waveform Denoiser (DENOISER); Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/). Step #2 – First start: To get…

Understand SPE configuration

…of MySQL database connections at the time. Default is 32 # server.db.mysql.max_connections = 32 # Maximum size of in-memory cache for calibrated voice-prints of speaker models. Default is 100 # server.db.sid_model_calib_vp_cache_size = 100 Sizing of the system The selection of speech technologies and the number of instances per technology which are instantiated when starting the SPE is configured by the…

Speech Quality Estimation (SQE)

…of an empty recording SNR would divide by zero => is_valid would be false. waveform_snr – the signal-to-noise ratio (SNR) describes the ratio of the useful signal to the noise signal; it is measured in dB and calculated from the waveform distribution (silence has a Gaussian distribution, voice has a Gamma distribution); SNR = 20 * log10(S/N). technical signal…
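The formula SNR = 20 * log10(S/N) and the empty-recording case can be sketched as follows. This is only an illustration of the arithmetic: the function name, its arguments, and the tuple return value are hypothetical, and how SQE actually estimates S and N from the waveform distribution is not shown in the snippet.

```python
import math
from typing import Optional, Tuple


def waveform_snr_db(signal_level: float, noise_level: float) -> Tuple[Optional[float], bool]:
    """Return (snr_db, is_valid) for amplitude-like levels S and N.

    Assumption: a non-positive S or N (e.g. an empty recording) makes the
    ratio undefined, so the result is flagged invalid instead of dividing by zero.
    """
    if signal_level <= 0 or noise_level <= 0:
        return None, False
    return 20.0 * math.log10(signal_level / noise_level), True


# A signal ten times stronger than the noise, and an empty recording.
snr_db, is_valid = waveform_snr_db(10.0, 1.0)
empty_snr_db, empty_is_valid = waveform_snr_db(0.0, 0.0)
print(snr_db, is_valid, empty_snr_db, empty_is_valid)
```

The factor of 20 (rather than 10) corresponds to S and N being amplitude-like quantities rather than powers.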

Open Source Acknowledgement

…license speexdsp BSD stdlibc++, libgcc, libwinpthread (Windows only) GNU GPL with GCC Runtime Library Exception: License utfcpp BSL-1.0 xxhash-cpp https://github.com/RedSpah/xxhash_cpp/blob/master/LICENSE BSD-2 Copyright: 2012-2020: Yann Collet, 2017-2020: Red Gavin zlib Zlib Phonexia BROWSER and Voice Inspector dependencies Library License ADVobfuscator GitHub – andrivet/ADVobfuscator BaseMatrixOps Apache License blaze BSD-3-Clause boost BSL-1.0 botan BSD-2-Clause bzip2 bzip2-1.0.8 cpp-httplib MIT…