The natively supported formats are: WAVE (*.wav) container with any of unsigned 8-bit PCM (u8), unsigned 16-bit PCM (u16le), IEEE float 32-bit (f32le), A-law (alaw), µ-law (mulaw), or ADPCM; FLAC codec inside a FLAC (*.flac) container; OPUS codec inside an OGG (*.opus) container. Other audio formats must be converted to one of the natively supported formats using external tools. SPE server can be…
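As a rough illustration of the conversion step, the sketch below decides by file extension whether a recording needs conversion and builds a command line that re-encodes it to FLAC. It assumes ffmpeg as the external tool (the text above does not name one), and the function names are hypothetical. The extension check is only a first filter: a .wav file can still hold a codec that is not on the list.

```python
from pathlib import Path

# Extensions of the natively supported containers listed above.
NATIVE_EXTENSIONS = {".wav", ".flac", ".opus"}


def needs_conversion(path: str) -> bool:
    """Return True if the extension is not one of the native containers."""
    return Path(path).suffix.lower() not in NATIVE_EXTENSIONS


def build_flac_command(src: str) -> list[str]:
    """Build (but do not run) an ffmpeg command that re-encodes src to FLAC.

    Assumption: ffmpeg is installed and is the chosen external tool.
    """
    dst = str(Path(src).with_suffix(".flac"))
    # -y overwrites an existing output, -c:a flac selects the FLAC encoder.
    return ["ffmpeg", "-y", "-i", src, "-c:a", "flac", dst]


if __name__ == "__main__":
    for name in ["call.mp3", "call.wav"]:
        if needs_conversion(name):
            print(" ".join(build_flac_command(name)))
        else:
            print(f"{name}: extension is natively supported")
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the tool and the target format have been confirmed for the deployment.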
MODULE 1: Getting started with Speech Engine (19 min): Installation; Technologies configuration; Server and database configuration; Users configuration; Files processing; Synchronous and asynchronous requests, results polling; Stream processing. https://youtu.be/4qrB-GfFdWY…
Our technologies run on both Windows and Linux. For details on the supported operating systems and the recommended hardware setup, see Recommended OS and HW…
It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) support 20+ languages, including English, French, German, Russian, Spanish and many more…
MODULE 4: Speech Analytics technologies (23 min): Common generic rules for CLI, REST and GUI; Speech To Text (STT) in CLI, REST and GUI; Keyword Spotting (KWS) in CLI, REST and GUI; Phoneme Recognizer (PHNREC) in CLI, REST and GUI; Time Analysis Extraction (TAE) in CLI, REST and GUI; Summary. https://www.youtube.com/watch?v=-FAoRywqv7U…
A: Check your license file (license.dat) by opening it in Notepad. Make sure the license contains records for all required modules. See the Licensing article for additional information…
A: Please see the List of supported LID Languages. For more details, see the LID technology documentation…
A: Please see the List of supported STT Languages. For more details, see the STT technology documentation…
MODULE 2: Filtering and supporting technologies (22 min): Common generic rules for CLI, REST and GUI; Filtering, sorting, pre-/post-processing overview; Speech Quality Estimation (SQE) in CLI, REST and GUI; Voice Activity Detection (VAD) in CLI, REST and GUI; Diarization (DIAR) in CLI, REST and GUI; Age Estimation (AGE) in CLI, REST and GUI; Denoiser (DENOISER) in CLI, REST and GUI…
A: These abbreviations mean the following. LR: likelihood ratio, the result of a statistical test that compares two models. It is a number expressing how many times more likely the data are under one model than under the other. LR takes values in the interval [0, +inf). LLR: log-likelihood ratio, the logarithm of LR. LLR takes values in the interval…
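The relation between the two scores can be sketched in a few lines. This example assumes the natural logarithm, since the text above does not state the base; the function names are hypothetical and are not part of any Phonexia API.

```python
import math


def lr_to_llr(lr: float) -> float:
    """Convert a likelihood ratio to a log-likelihood ratio.

    Assumption: natural logarithm. LR must be non-negative.
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if lr == 0:
        # log(0) is undefined in floating point; use the mathematical limit.
        return -math.inf
    return math.log(lr)


def llr_to_lr(llr: float) -> float:
    """Inverse conversion: LR = exp(LLR)."""
    return math.exp(llr)


if __name__ == "__main__":
    # LR = 1 means both models explain the data equally well, so LLR = 0.
    print(lr_to_llr(1.0))
    # LR > 1 favours the first model (positive LLR), LR < 1 the second.
    print(lr_to_llr(20.0), lr_to_llr(0.05))
```

Working in the log domain makes the score symmetric around zero: evidence of equal strength for either model gives values of equal magnitude and opposite sign.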
A: The score threshold isn’t set correctly. Adjust the speaker score sharpness value to calibrate the score recalculation. Please see Calibration in the technology documentation…
MODULE 3: Voice Biometrics technologies (23 min): Common generic rules for CLI, REST and GUI; Speaker Identification (SID) in CLI, REST and GUI; Language Identification (LID) in CLI, REST and GUI; Gender Identification (GID) in CLI, REST and GUI; Summary. https://www.youtube.com/watch?v=AyEoPfYVel8…
To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated using a SID dataset. SID dataset (minimum requirements): To measure SID performance precisely, it’s important to prepare the set of evaluation recordings very carefully. The requirements are: 50+ known speakers; 200+ recordings in total (i.e. 3 to 5 recordings per speaker*); 1+ minute of net speech…
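A quick sanity check of a candidate dataset against the first two requirements might look like the sketch below. It assumes the dataset is described by a hypothetical mapping from recording file name to speaker label. The net-speech requirement is left out because its full wording is not visible here.

```python
from collections import Counter

# Minimum requirements quoted above.
MIN_SPEAKERS = 50
MIN_RECORDINGS = 200


def check_sid_dataset(recording_to_speaker: dict[str, str]) -> list[str]:
    """Return a list of unmet requirements (empty when both checks pass)."""
    problems = []
    per_speaker = Counter(recording_to_speaker.values())
    if len(per_speaker) < MIN_SPEAKERS:
        problems.append(
            f"only {len(per_speaker)} speakers, need {MIN_SPEAKERS}+"
        )
    if len(recording_to_speaker) < MIN_RECORDINGS:
        problems.append(
            f"only {len(recording_to_speaker)} recordings, need {MIN_RECORDINGS}+"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical dataset: 50 speakers with 4 recordings each.
    dataset = {
        f"spk{s:03d}_rec{r}.wav": f"spk{s:03d}"
        for s in range(50)
        for r in range(4)
    }
    print(check_sid_dataset(dataset) or "minimum counts are met")
```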
This package lets new users try and evaluate the semantic search functionality on their own data, privately. No internet connection is needed for operation. Recommended hardware: Intel Core i7 or better, 10 GB RAM, 26 GB storage (SSD preferred). You have two options to choose from: a Docker image, or a Virtual Appliance with the Docker image already imported on…
A: Please see the List of supported KWS Languages. For more details, see the KWS technology documentation…