MODULE 1: Getting started with Speech Engine (19 min) Installation Technologies configuration Server and database configuration Users configuration Files processing Synchronous and asynchronous requests, results polling Stream processing https://youtu.be/4qrB-GfFdWY…
Search: sqe
23 results
MODULE 4: Speech Analytics technologies (23 min) Common generic rules for CLI, REST and GUI Speech To Text (STT) in CLI, REST and GUI Keyword Spotting (KWS) in CLI, REST and GUI Phoneme Recognizer (PHNREC) in CLI, REST and GUI Time Analysis Extraction (TAE) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=-FAoRywqv7U…
MODULE 3: Voice Biometrics technologies (23 min) Common generic rules for CLI, REST and GUI Speaker Identification (SID) in CLI, REST and GUI Language Identification (LID) in CLI, REST and GUI Gender Identification (GID) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=AyEoPfYVel8…
A: Signal-to-Noise Ratio (SNR) is an important indicator of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in the waveform of speech follows a Gamma distribution. In contrast, noise…
…3 or 4 in the model name. Other technology models (SID, LID, GID, DIAR, AGE, SQE, VAD, DENOISE) Tech. models supported (generation specified by number in “Tech. model name”). Technology Tech. model name Released End of support Maintenance SID4 XL5 2022-09 6th gen. SID 5th gen. SID XL4 2020-03 6th gen. SID 5th gen. SID L4 2019-02 6th gen….
…XL5 Diarization (DIAR) – model XL4 Language Identification (LID) – model L4 Gender Identification (GID) – model XL5 Age Estimation (AGE) – model XL5 Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5 Speech Quality Estimation (SQE) Time Analysis Extraction (TAE) Waveform Denoiser (DENOISER) Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/) Step #2 – First start To get…
…models VIN application (graphical user interface, GUI) with the following technologies built in Speaker Identification (SID4_XL5) Speaker Diarization (DIAR) Voice Activity Detection (VAD) Speech Quality Estimator (SQE) Phoneme Recogniser (PHNREC) example population sets and audio (in ./examples/) and example report templates (in ./templates/) Hardware requirements minimum – CPU: Intel® Core™ i5, RAM: 4 GB, Required HDD space: 0.5 GB for software…
…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching Processing results can optionally be stored in a results cache database to speed up any later re-processing of the same recordings by the same technology…
…and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender Identification (GID) Speech Analytics technologies 11:43 Speech Transcription (STT) 12:30 Keyword Spotting (KWS) 13:32 Phoneme Recognition (PHNREC) 13:54 Time Analysis…
…– detects the audio part that contains voice, Speech Quality Estimation (SQE) – measures the quality of speech, Phoneme Recognizer (PHNREC) – several languages supported – converts speech into phonemes (written characters representing pronunciation), Waveform Denoiser (DENOISER) – automatically improves the audibility of speech for human listeners. Supported Languages The LID, STT and KWS technologies support various languages as listed…
…advanced configurations. bsapi ├── age ├── denoiser ├── diar ├── gid ├── kws ├── lid ├── sid4 ├── sqe ├── stt ├── tae └── vad Each individual technology directory typically contains three main subdirectories: data Technology data, in separate directories for individual technology- or language-specific models example Audio files for quick testing, in some cases also in separate directories…
…technologies setup. If we assume that the whole machine is dedicated as a “speech computing unit” then, in general, we can calculate it as follows: file: phxspe.properties server.n_workers = <#_of_cores> file: technologies.xml (no. of threads per technology, can be also set up by the phxadmin tool) SQE: <#_of_cores>/4 VAD: <#_of_cores>/2 other technologies: <#_of_cores> RAM: 8 cores = 32 GB 16…
…physical server, configure your technologies.xml to the following number of instances: SQE: <#_cpu_cores>/4 VAD: <#_cpu_cores>/2 any other technology: <#_cpu_cores> (Note: your license should also be configured properly. Ask our Sales department for cooperation in case of hot-load evaluation tests. The production license will be configured with our assistance, of course) Optimal RAM recommendation: 4 cores: 16 GB RAM 8 cores:…
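The per-technology instance counts quoted above (SQE: cores/4, VAD: cores/2, any other technology: cores) can be worked out with a few lines of Python. This is only a rough sketch: the function name `recommended_instances` is hypothetical, and rounding down with a minimum of one instance is an assumption, since the snippets do not say how fractional results should be handled. It does not read or write technologies.xml.

```python
import os


def recommended_instances(cpu_cores: int) -> dict[str, int]:
    """Return instance counts per technology for a dedicated machine.

    Ratios follow the snippet above: SQE = cores/4, VAD = cores/2,
    any other technology = cores.
    Assumption (not from the source): round down, but never below 1.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be a positive integer")
    return {
        "SQE": max(1, cpu_cores // 4),
        "VAD": max(1, cpu_cores // 2),
        "other": cpu_cores,
    }


if __name__ == "__main__":
    # os.cpu_count() may return None, so fall back to a single core.
    cores = os.cpu_count() or 1
    for technology, count in recommended_instances(cores).items():
        print(f"{technology}: {count}")
```

For an 8-core machine, `recommended_instances(8)` gives 2 SQE instances, 4 VAD instances, and 8 instances for each other technology. Note that `os.cpu_count()` reports logical cores, which may differ from the physical core count the recommendation has in mind.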
…when an error occurs, but view all errors and continue creating the evaluation set Fixed: SID Evaluator – invalid GID score values Fixed: SID Evaluator – missing SQE information in report Fixed: SID Evaluator – don’t save disabled recordings to evaluation set Phonexia Browser 3.40.3, BSAPI 3.40.4 (2021-05-28) Fixed: Some minor bugs in licensing system Phonexia Browser 3.40.2, BSAPI 3.40.2…