
Search: multi

55 results

FAQs (Voice Verify)

…accepted. There is currently no mechanism to detect which channel in a stereo or multi-channel recording contains the voice of the desired speaker. For that reason, the Voice Verify admin must ensure that recordings used for voiceprint creation are mono and contain only the voice of the desired speaker. Q: What are the audio/stream quality…

FAQs (Browser)

…this conversion automatically in the background; see the Understand SPE audio converter article. Good tools for converting unsupported formats to supported ones are FFmpeg (http://www.ffmpeg.org) and SoX (http://sox.sourceforge.net/). Both are multiplatform tools available for Microsoft Windows, Linux and Apple OS X. Example of usage (FFmpeg): ffmpeg -i <source_audio_file_name> <output_audio_base_name>.wav This command converts any supported format/codec audio file to normalized WAV audio…
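The FFmpeg invocation quoted in the snippet above can be wrapped in a small script when many files need converting. The following is only a minimal sketch: the helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and it assumes the `ffmpeg` executable is installed and on the PATH. It uses only the `-i` option shown in the snippet.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: str, output_base: str) -> list[str]:
    """Build the argument list for: ffmpeg -i <source> <output_base>.wav"""
    return ["ffmpeg", "-i", str(source), f"{output_base}.wav"]


def convert_to_wav(source: str, output_base: str) -> Path:
    """Run FFmpeg and return the path of the WAV file it was asked to write.

    Hypothetical helper: assumes the ffmpeg binary is available on PATH.
    """
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    # check=True raises CalledProcessError if FFmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(source, output_base), check=True)
    return Path(f"{output_base}.wav")
```

Calling `convert_to_wav("call.mp3", "call")` would run the same command as the snippet and write `call.wav` in the current directory.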

Q: What are the supported audio formats?

…configured to do this conversion automatically in the background; see the Understand SPE audio converter article. Good tools for converting unsupported formats to supported ones are FFmpeg (http://www.ffmpeg.org) and SoX (http://sox.sourceforge.net/). Both are multiplatform tools available for Microsoft Windows, Linux and Apple OS X. Example of usage (FFmpeg): ffmpeg -i <source_audio_file_name> <output_audio_base_name>.wav This command converts any supported format/codec audio file to normalized…

Speaker Diarization (DIAR)

…silence as well. The outputs of the technology can be log files with labels and/or split audio files or one new multichannel audio file. Typical use cases: preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversations recorded in mono into several channels, identifying how many speakers are speaking in the recording…

Speaker Identification

…the customer’s use case (security versus Customer convenience) (done by Phonexia) Mean normalization Every customer’s data are different. Even though Phonexia Speaker Identification technology is trained on a variety of data coming from multiple environments, it is worth performing Mean normalization, which adapts the technology to the data of a Customer. The existence of an evaluation set is a prerequisite for…

Understand SPE benchmark

…lengths and speech/non-speech ratios, it is recommended to run the benchmark using multiple different audio files and calculate the average FtRT processing speed yourself. Alternatively, you can tune (or hack) SPE and prepare your own, or replace the default set of benchmarking recordings – see further below… Benchmark recording sets The default sets of audio files supplied with SPE are…

Phonexia Partner Program for Government Partners

…the Starter Kit during the onboarding period? Yes, the Starter Kit can be purchased anytime during our cooperation. Can I purchase the Starter Kit multiple times? Yes, for each project, proof of concept, and product line, you can purchase a Starter Kit again. Phonexia consultants can’t wait to support your business. How do you deliver technical training? Phonexia technical training…

Speaker change on enrollments

…Please note that the voiceprint created from a stream is checked for the number of speakers only after the stream has ended. If more than one speaker is detected in the voice, an error message is produced when using the /api/v2/verify endpoint: { "stream_uuid": "30f22400-2809-4ea3-8191-1f7289c3d009", "external_id": "JohnDoe", "detail": "Multiple speakers were detected in the voiceprint and it cannot be used." }…
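A client might check for this error before relying on a verification result. The sketch below parses a response body shaped like the example in the snippet above. The function name `is_multiple_speakers_error` is hypothetical, and matching on the text of the `detail` field is an assumption made for illustration; the snippet does not show whether the API also exposes a dedicated error code or HTTP status.

```python
import json

# Assumption: the error is recognised by this phrase in the "detail" field,
# taken from the example response; the real API may offer a better signal.
MULTIPLE_SPEAKERS_PHRASE = "Multiple speakers were detected"


def is_multiple_speakers_error(response_body: str) -> bool:
    """Return True if the JSON body reports a multi-speaker voiceprint."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    detail = payload.get("detail")
    return isinstance(detail, str) and MULTIPLE_SPEAKERS_PHRASE in detail


example = """{
  "stream_uuid": "30f22400-2809-4ea3-8191-1f7289c3d009",
  "external_id": "JohnDoe",
  "detail": "Multiple speakers were detected in the voiceprint and it cannot be used."
}"""
```

With the example body, `is_multiple_speakers_error(example)` is true, so the caller could prompt for a new enrollment instead of retrying verification.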

About Phonexia Orbis

…technologies will highlight the parts of interest for you, allowing you to spend energy only on those most relevant recordings. Analyze Key Findings First Stay laser focused. Phonexia Orbis displays audio recordings ordered based on their relevance derived from multiple criteria so you can quickly grasp the context and connect all the necessary dots, progressing from the most-related recording to…

KWS: Results explained

…the detected pronunciation. Start and end times are in HTK units. 1 HTK unit is 100 nanoseconds, so dividing the times by 10000 gives the number of milliseconds. Score is a log likelihood ratio from the (-inf, +inf) interval. Confidence is a probability from the [0, 1] interval. To convert it to a percentage, multiply the confidence value by 100. Example This example of Keyword Spotting…
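The two conversions described in the snippet above are simple arithmetic: 10,000 HTK units of 100 ns each make 1 ms, and a confidence in [0, 1] becomes a percentage when multiplied by 100. A minimal sketch follows; the function names are hypothetical and are not part of any Phonexia API.

```python
HTK_UNITS_PER_MS = 10_000  # 1 HTK unit = 100 ns, so 10,000 units = 1 ms


def htk_to_ms(htk_time: int) -> float:
    """Convert a time in HTK units (100 ns each) to milliseconds."""
    return htk_time / HTK_UNITS_PER_MS


def confidence_to_percent(confidence: float) -> float:
    """Convert a confidence from the [0, 1] interval to a percentage."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in the [0, 1] interval")
    return confidence * 100.0
```

For instance, a start time of 12,500,000 HTK units corresponds to 1250 ms (1.25 s), and a confidence of 0.5 corresponds to 50 percent.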

Phonexia Speech Engine

…or stream exists. Once the recording is deleted from SPE storage, or the stream is ended, SPE removes all information, metadata and technology results from the database. Basic user management SPE allows you to define multiple users with different user roles and user rights. Each SPE user has access only to their own data storage, files, metadata and processing results. Load management…

Understand SPE metafiles

…DELETE methods to upload, download or delete any kind of file with metadata of your choice, associated with the corresponding SPE entity. There are no limits on the content of the metafiles, their names, etc. (apart from those imposed by the underlying operating system and/or filesystem). Plain text files, structured formats like JSON or XML, pictures, documents, multimedia files… you…

Understand SPE technologies, instances and workers

…Different post offices may provide different (sets of) services – smaller offices may provide only a small set of services, while big ones can have multiple floors with a wide service portfolio. Different Speech Engine installations may provide different (sets of) technologies – smaller installations may have only a single technology configured, while big ones can have a wide set of various technologies…

Audio Quality Estimation

This technology provides an indication of the audio quality of the speech during enrolments and verifications, in real time via the API and for future reference via logs. It is available for both single-server and multi-server deployments. Currently, the Audio Quality Estimation feature is limited to polling (SIP, HTTP streams) only; webhook support is planned for the future. Based on…