Search: SID API

54 results

SID: TUTORIAL: Speaker Identification – How to Do a Basic Test

Phonexia Speaker Identification is a voice biometry tool for recognition of speakers by their voice. In this video, we will show you how to start using this technology! You will learn how to create a “Speaker Model” to identify a speaker in a set of data. Ready to test it? Start with our video: What else is needed? 1. Phonexia…

FAQs (PSP)

…license contains records for all required modules. See the Licensing article for additional information. Q: What are the requirements for an SID evaluation dataset? To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with an SID dataset. SID dataset (minimum requirements): To measure SID…

FAQs (Browser)

…Q: What are the requirements for an SID evaluation dataset? To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with an SID dataset. SID dataset (minimum requirements): To measure SID performance precisely, it’s important to prepare the evaluation recording set very carefully. The requirements are: 50+ known speakers, 200+ recordings in…
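The minimum counts quoted in the snippet above can be checked before an evaluation run. The following is a minimal sketch, not part of any Phonexia tool: the function name `check_sid_dataset` and the `(speaker_id, path)` input layout are hypothetical, and because the snippet is truncated after "200+ recordings in…", treating 200 as a total recording count is an assumption.

```python
from collections import Counter

# Thresholds taken from the FAQ snippet; the scope of the "200+ recordings"
# requirement is truncated in the source, so a total count is ASSUMED here.
MIN_SPEAKERS = 50
MIN_RECORDINGS = 200


def check_sid_dataset(recordings):
    """Return a list of problems for a candidate SID evaluation set.

    `recordings` is an iterable of (speaker_id, path) pairs. This layout is
    hypothetical and chosen only for illustration.
    """
    per_speaker = Counter(speaker for speaker, _path in recordings)
    n_speakers = len(per_speaker)
    n_recordings = sum(per_speaker.values())

    problems = []
    if n_speakers < MIN_SPEAKERS:
        problems.append(
            f"only {n_speakers} known speakers, need at least {MIN_SPEAKERS}"
        )
    if n_recordings < MIN_RECORDINGS:
        problems.append(
            f"only {n_recordings} recordings, need at least {MIN_RECORDINGS}"
        )
    return problems
```

An empty list means both quoted minimums are met; any further requirements cut off in the snippet are not covered by this check.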

Understand SPE directory structure

…data directory holds additional data files for entities created by that user – e.g. SID Speaker Models, or LID language packs. If no such entities exist for that user, this directory is empty. Here is an example of admin’s data directory containing a custom LID language pack for model L4 and SID speaker models named “David” and “Paul” (the tree…

Phonexia Speech Engine

Phonexia Speech Engine (SPE) is the main part of Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing a REST API to the entire portfolio of Phonexia speech technologies. SPE capabilities overview: Audio files and stream processing Audio files RTP / HTTP streams Speaker Identification (SID) ✓ ✓ Speech To Text (STT) ✓ ✓ Keyword Spotting (KWS) ✓…

Q: Why does the system show high score (>90%) even for non-targets?

A: The score threshold isn’t set up correctly. Adjust the speaker score sharpness value to calibrate the score recalculation. Please see Calibration in the technology documentation….
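To illustrate why a sharpness setting can change the percentage shown for a non-target, here is a minimal sketch that assumes the percentage is derived from a raw score through a logistic curve. That mapping, the function name `score_to_percent`, and the `sharpness` and `shift` parameters are assumptions made for illustration only; the actual recalculation is described in the Calibration documentation the answer refers to.

```python
import math


def score_to_percent(score, sharpness=1.0, shift=0.0):
    """Map a raw score to a 0-100 percentage with a logistic curve.

    ASSUMED model, not a confirmed product formula: `sharpness` controls how
    steeply the percentage rises around `shift`, the raw score that maps
    to 50 %.
    """
    return 100.0 / (1.0 + math.exp(-sharpness * (score - shift)))


# A mildly positive raw score from a non-target speaker (made-up value).
raw_score = 3.0

steep = score_to_percent(raw_score, sharpness=1.0)   # above 90 %
gentle = score_to_percent(raw_score, sharpness=0.2)  # much closer to 50 %
print(f"sharpness=1.0 -> {steep:.1f} %, sharpness=0.2 -> {gentle:.1f} %")
```

Under this assumed model, a lower sharpness flattens the curve, so borderline raw scores no longer saturate near 100 %.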

Download Speech Platform

…only English models for Speech To Text and Keyword Spotting. Additional supported languages are available upon request. Speech Engine – technologies included: Speech To Text (STT) – model EN_US_6 (US English), Keyword Spotting (KWS) – model EN_US_6 (US English), Phoneme Recognizer (PHNREC) – model EN_US_6 (US English), Speaker Identification 4 (SID4) – model…

Understand SPE metafiles

Certain SPE entities – SID Speaker models, SID Audio source profiles, LID Language packs – can have additional information associated with them in the form of “metafiles”. This article explains the intended usage of metafiles. In general, SPE is intended as under-the-hood engine, focusing purely on the speech-related audio processing. Any additional functionality should be done on the application layer,…

Download Voice Inspector 5.2

…models VIN application (graphical user interface, GUI) with the following technologies built in: Speaker Identification (SID4_XL5), Speaker Diarization (DIAR), Voice Activity Detection (VAD), Speech Quality Estimator (SQE), Phoneme Recogniser (PHNREC); example population sets and audio (in ./examples/) and example report templates (in ./templates/). Hardware requirements: minimum – CPU: Intel® Core™ i5, RAM: 4 GB, Required HDD space: 0.5 GB for software…

Recommended OS and HW (PSP)

…or 10th Gen Intel® Core Processor RAM: 16 GB Storage: 100 GB (depends on audio retention policy) SSD strongly recommended for superior performance over HDD Configuration includes: SID4 XL4, GID XL4, LID L4, AGE L4, VAD, SQE Transcription System, basic 100 hours/day package (***) files processing CPU: 8 physical cores, 1x Intel® Xeon E5-2640 v4 or similar or 10th Gen…

Video – Voice Biometrics technologies

MODULE 3: Voice Biometrics technologies (23 min) Common generic rules for CLI, REST and GUI Speaker Identification (SID) in CLI, REST and GUI Language Identification (LID) in CLI, REST and GUI Gender Identification (GID) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=AyEoPfYVel8…

Contact

Visit Us at Address: Chaloupkova 3002/1a, CZ 612 00 Brno, Czech Republic, European Union GPS: N 49° 13.426′, E 016° 35.898 General Queries and Sales [email protected] landline: +420 511 205 265 Company registration details Identification number (ICO): 27680258 VAT identification (DIC): CZ27680258 Registered in the Business Register kept at the District Court in Brno, File C, Inset 51524….

Speech to Text (STT)

About STT: Phonexia Speech to Text (STT) converts speech in audio signals into plain text. The technology works with acoustics as well as a dictionary of words, an acoustic model and pronunciation. This makes it dependent on language and dictionary – only a given set of words can be transcribed. As input, an audio file or stream is needed, together with selection of…

Keyword Spotting (KWS)

Phonexia Keyword Spotting (KWS) identifies occurrences of keywords and/or keyphrases in audio recordings. It can help you to get valuable information from huge quantities of speech recordings. You only need to specify the keywords or phrases you wish to find. This technology identifies all recordings with keyword occurrences and allows you to automatically route important recordings or calls to your…