Search: Audio Source Profile

72 results

Terms of Service

…any electronic data provided by you, including but not limited to any kind of audio and video materials, music, sounds, texts and pictures. You shall remain at all times solely responsible for the content of your account. You shall avoid uploading any kind of illegal, harmful, abusive or otherwise inappropriate content to your account: PHONEXIA reserves the right to remove…

Age Estimation (AGE)

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording or voiceprint. Technology: Trained with emphasis on spontaneous telephony conversation. The technology is language-, accent-, text-, and channel-independent. Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input: Audio: WAV or RAW (8 or 16 bits linear…

Support

…the Product partially functional, the use of which in a production environment is substantially reduced. The Issue contains an error that impairs the ability of the system to process a majority of audio files or audio streams, or that renders the setup and maintenance of the system inoperable. Permalink Critical Issue The system is inoperative, and it has a critical…

Understand SPE home directory

…uploading a file using POST /audiofile physically creates the file on the filesystem in the storage location… and the file stays there until it’s explicitly deleted using DELETE /audiofile. There might be various reasons NOT to use the REST API for uploading files to the Speech Engine, e.g. to spare the server the unwanted load caused by many uploads and/or big files…
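
As a rough illustration of the upload/delete lifecycle described in this snippet, the following Python sketch only builds the two HTTP requests and does not send them. The `POST /audiofile` and `DELETE /audiofile` endpoints come from the snippet; the `path` query parameter, the `audio/wav` content type, the host and the function names are assumptions made for illustration, and authentication is left out entirely. Check the SPE REST API reference before relying on any of these details.

```python
from urllib.parse import quote
from urllib.request import Request


def build_upload_request(base_url: str, remote_path: str, audio: bytes) -> Request:
    """Build (but do not send) a POST /audiofile request.

    The `path` query parameter and the Content-Type header are assumptions.
    """
    url = f"{base_url.rstrip('/')}/audiofile?path={quote(remote_path, safe='/')}"
    return Request(url, data=audio, method="POST",
                   headers={"Content-Type": "audio/wav"})


def build_delete_request(base_url: str, remote_path: str) -> Request:
    """Build (but do not send) the matching DELETE /audiofile request."""
    url = f"{base_url.rstrip('/')}/audiofile?path={quote(remote_path, safe='/')}"
    return Request(url, method="DELETE")


if __name__ == "__main__":
    # Hypothetical server address and file path, for illustration only.
    upload = build_upload_request("http://localhost:8600", "/calls/demo.wav", b"RIFF")
    delete = build_delete_request("http://localhost:8600", "/calls/demo.wav")
    print(upload.get_method(), upload.full_url)
    print(delete.get_method(), delete.full_url)
```

Passing either object to `urllib.request.urlopen` would actually send it. Since an uploaded file stays in storage until it is explicitly deleted, a client written this way would typically pair every upload with a later delete.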

Understand SPE metafiles

Certain SPE entities – SID Speaker models, SID Audio source profiles, LID Language packs – can have additional information associated with them in the form of “metafiles”. This article explains the intended usage of metafiles. In general, SPE is intended as an under-the-hood engine, focusing purely on speech-related audio processing. Any additional functionality should be done on the application layer,…

Privacy Policy

…usage such as how often you use your Phonexia Account, how often you upload audio, video or other files, the size of generated content and other activity related to your use of Phonexia services. 1.2 Computer browser Some information is also provided by your computer browser through cookies. By using our services, you agree to use of the cookies. Certain…

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in a single mono-channel audio recording based on each individual speaker's voice. It is a language-, domain- and channel-independent technology. It segments not only speakers but also technical signals and silence. The outputs of the technology can be log files with labels and/or split audio files/one new…

Speech Quality Estimation (SQE)

…channels. The statistics of all channels include the numbers for many aspects of recording quality, and the overall global score. Technology: The technology is language-, accent-, text-, and channel-independent. Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input: Input format for processing: WAV or RAW (8 or 16 bits…

Open Source Acknowledgement

…dlfcn-win32 (Windows only) LGPL duktape MIT (link) eigen MPL flac BSD-style license fmt MIT gsl-lite MIT libbacktrace (Linux only) BSD-3-Clause minizip Zlib mkl freeware under ISSL (Intel Simplified Software License) mman-win32 (Windows only) MIT ogg BSD-style license onnxruntime MIT, onnxruntime/LICENSE at main openfst Apache License optim Apache license opus BSD portaudio PortAudio – an Open-Source Cross-Platform Audio API Qt LGPL…

Designing and Developing Application

…specific hardware (mainly CPU, virtualized infrastructure vs. HW) or are you going to buy specific HW for the customer? What are the short-/long-term storage requirements (i.e. audio and results availability, desktop vs. distributed system)? Is there any synchronization required (i.e. voiceprint database to clients)? What is the topology of the solution/app (i.e. where to store audio, voiceprints, results, …)? How to…

Recommended OS and HW (PSP)

…or 10th Gen Intel® Core Processor RAM: 16 GB Storage: 100 GB (depends on audio retention policy) SSD strongly recommended for superior performance over HDD Configuration includes: SID4 XL4, GID XL4, LID L4, AGE L4, VAD, SQE Transcription System, basic 100 hours/day package (***) files processing CPU: 8 physical cores, 1x Intel® Xeon E5-2640 v4 or similar or 10th Gen…

Waveform Denoiser (DENOISER)

…software cannot remove unwanted speech or music in the background. Denoiser is used to remove noise from the recording and at the same time to amplify the speech signal for: better intelligibility for human listeners (recommended use); achieving better results with automatic speech recognition technologies (necessary to test on customer data first). Input: audio file (format details – see…

Phoneme Recogniser (PHNREC)

Phonexia Phoneme Recogniser (PHNREC) converts speech signals into pronunciation characters (so-called phonemes). After the conversion, the pronunciation (text) can be easily indexed and searched by third-party text data mining tools. The technology is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams, and can provide results in several output formats. Phoneme…

Releases and Changelogs (VIN)

…1.3 2015-06-04 2016-12-04 2016-12-04 Public Changelogs Voice Inspector 5.2 Voice Inspector 5.2.0, BSAPI 3.61.0 (2024-04-04) New: New Case wizard checks for presence of Questioned and Reference recordings New: Number of audio channels is displayed in Case view Recording details view Score table view Report Fixed: Application crash with phoneme search Fixed: Generalized logistic distribution for Suspected speaker vs. Suspected speaker…