
Search: file format

44 results

FAQs (Voice Verify)

…not extract these characteristics from the user’s voice. You can obtain them by using Speech Platform, which is another Phonexia product. More details about Speech Platform can be found here. Q: Can stereo recordings be used for enrollment? A: It is possible to enroll users via recordings. In this case, however, only mono files are…

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length); only one speaker in each recording; a wide variety of gender and age is recommended; recordings should be as similar to the target use case as possible (device, channel, distance from mic, language distribution); audio files should be mono, lin16 format, 8 kHz+ sample rate. *Note: splitting a single recording into multiple shorter…
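The audio-format requirement above (mono, lin16, 8 kHz+) can be checked before a dataset is submitted. The following is a minimal sketch, assuming the recordings are stored as WAV files and that "lin16" means 16-bit linear PCM; the function name `check_sid_audio` is hypothetical and is not part of any Phonexia tool.

```python
import wave


def check_sid_audio(path):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse as WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() < 8000:
            problems.append(f"expected 8 kHz or higher, found {wav.getframerate()} Hz")
    return problems
```

Calling `check_sid_audio("call_001.wav")` on each file (the file name is only an illustration) and collecting the non-empty results gives a quick list of recordings that need conversion. The other requirements, such as one speaker per recording, cannot be verified this way.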

Keyword Spotting (KWS)

…experts. Typical use cases. Call centers: increase operator and supervisor efficiency by searching calls; identify inappropriate expressions from operators; check marketing campaigns with automatic script-compliance control. Mass media and web search servers: index and search multimedia by keyword; route multimedia files and streams according to their content. Security/defense: maintain fast reaction times by routing calls with specific content to human…

Phonexia technologies introduction

…technologies; 11:43 Speech Transcription (STT); 12:30 Keyword Spotting (KWS); 13:32 Phoneme Recognition (PHNREC); 13:54 Time Analysis Extraction (TAE); 14:22 Speech Platform architecture: Speech Engine, Phonexia Browser, Phonexia Voice Inspector brief; 18:52 HW and SW requirements, typical deployment topologies; 21:34 Supported file and stream formats, typical implementations and data flows; 27:29 Licensing technical options; 32:24 Summary, recommended next steps. https://youtu.be/DDu0Y1rgQ6k…

STT: Results explained

…These can be recognized by a recording-level confidence value of -1. "one_best_result": { "confidence": -1, "segmentation": [ … N-best output: { "phrase": "can you hear me okay i wanted to", "channel": 0, "score": 509.71384, "confidence": 0.33733934, "start": 1500000, "end": 28200000 } This format can be used by analytical applications to further process the alternatives. It can also be useful when…
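An analytical application could consume entries like the one above roughly as follows. This is only a sketch: it assumes the N-best output arrives as a list of such objects, and the helper names `has_flagged_confidence` and `rank_alternatives` are hypothetical. The units of `start` and `end` are not stated in the snippet, so the values are left untouched.

```python
import json

# Entry copied from the N-best example above, with straight quotes.
NBEST_ENTRY = json.loads("""
{"phrase": "can you hear me okay i wanted to", "channel": 0,
 "score": 509.71384, "confidence": 0.33733934,
 "start": 1500000, "end": 28200000}
""")


def has_flagged_confidence(one_best_result):
    """True when the recording-level confidence carries the special value -1."""
    return one_best_result.get("confidence") == -1


def rank_alternatives(entries):
    """Sort N-best entries from highest to lowest confidence (assumed list input)."""
    return sorted(entries, key=lambda entry: entry["confidence"], reverse=True)
```

For example, `rank_alternatives(entries)[0]["phrase"]` would give the most confident alternative, while `has_flagged_confidence(result["one_best_result"])` lets the caller treat the -1 case separately before reading the segmentation.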

Time Analysis Extraction (TAE)

…dialogue. This can be used to improve calls between operators and callers or to indicate potential stress points in phone calls (for example, a change of speech speed during the conversation). Input: TAE can process both audio files and streams (for format details, see Speech Engine documentation). By its nature, TAE is usable mainly on two-channel phone call recordings, where…

Speaker Identification (SID)

…speaker’s voice. It cannot be used to recreate the original audio file, which is useful when the content has to stay anonymous. The recommended minimum amount of net speech for enrollment is approx. 30 seconds (the latest generation of Phonexia SID lowers this requirement to 20 seconds). Voiceprints can then be stored in a database in the form of binary blobs,…

Waveform Denoiser (DENOISER)

…software cannot remove unwanted speech or music in the background. Denoiser is used to remove noise from the recording and at the same time to amplify the speech signal for: better intelligibility for human listeners (recommended use); better results with automatic speech recognition technologies (testing on customer data first is necessary). Input: audio file (format details – see…

Speech Quality Estimation (SQE)

Phonexia’s Speech Quality Estimation quantifies the acoustic quality of recordings. This helps the user quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. In response to an SQE request, the SPE returns a JSON/XML file. This file includes general information about the technology and statistics of all (one or two)…

Speech to Text (STT)

About STT: Phonexia Speech to Text (STT) converts speech in audio signals into plain text. The technology works with acoustics as well as a dictionary of words, an acoustic model, and pronunciation. This makes it dependent on language and dictionary – only a limited set of words can be transcribed. As input, an audio file or stream is needed, together with a selection of…

Quick Start Guide (VIN)

…licensed does not need an Internet connection. Copy a license file (license.dat, obtained from Phonexia) to the application’s root directory (e.g., next to the VoiceInspector (Linux) / VoiceInspector.exe (Windows) executable); the file must be in the VIN folder before the executable is run. Run VoiceInspector (Linux) / VoiceInspector.exe (Windows). To access this manual, press F1 or select…

Phoneme Recogniser (PHNREC)

Phonexia Phoneme Recogniser (PHNREC) converts speech signals into pronunciation characters (so-called phonemes). After the conversion, the pronunciation (text) can be easily indexed and searched by third-party text data mining tools. The technology is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams, and can provide results in several output formats. Phoneme…

Orbis 1.0.0 Release Notes

…support metadata files in proprietary formats. Only the Orbis JSON format is supported for metadata upload in version 1.0. Solution: Convert your proprietary metadata format into the specified JSON format. Hit feature: Due to performance issues, the Hits are automatically calculated only on recording upload. When a new rule is defined, the Hits recalculation is not performed automatically. Solution: Push…

Key Features (PSP)

…audio conversion tools. Tested with sox or ffmpeg. For the configuration of this functionality, see [SPE]/settings/phxspe.properties. Note: You should be aware that audio format conversion (e.g., if the original audio format is highly compressed) can decrease the accuracy of speech technologies. Integration Possibilities: Phonexia Speech Platform can be integrated into a partner’s application using the Speech Engine component (REST API)…
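The kind of conversion an external tool such as ffmpeg performs can be sketched as below. This is an illustration only: the target format (mono, 16-bit PCM WAV, 8 kHz) is an assumption, the function names are hypothetical, and the actual settings used by Speech Engine live in phxspe.properties, which is not shown here.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=8000):
    """Build an ffmpeg command that writes mono 16-bit PCM at the given rate."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", src,               # input file in any format ffmpeg can decode
        "-ac", "1",              # downmix to one channel
        "-ar", str(sample_rate), # resample to the target rate
        "-c:a", "pcm_s16le",     # 16-bit little-endian linear PCM
        dst,
    ]


def convert(src, dst, sample_rate=8000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
```

As the note above warns, converting from a highly compressed source cannot restore lost detail, so accuracy should be tested on converted samples.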

Orbis 1.2.0 Release Notes

…number of recordings for analysis. A FIFO algorithm is used to automatically remove the older entries. Limitations (known issues). Recording metadata formats: Orbis doesn’t support metadata files in proprietary formats. Only the Orbis JSON format is supported for metadata upload in the current version. Solution: Convert your proprietary metadata format into the specified JSON format. Hit feature: Due to performance issues,…