Search: Audio Source Profile

72 results

Understand SPE technologies, instances and workers

…(or bank branch): Post office is a place providing different kinds of services – one can go there to send letters, send or pick up packages, get a POBox, get some financial services, insurance, etc.). Speech Engine has various speech technologies configured – one can analyze the audio quality, extract voiceprints from recordings, compare voiceprints, transcribe audio to text, etc….

STT: What is Preferred Phrases feature and how to use it

…it can help in other applications, too – e.g. when transcribing domain-specific audio, frequently used domain-specific phrases can be boosted. How preferred phrases work: The picture below shows a simplified standard speech transcription process – the digitized speech signal spectrum is analyzed in the neural network acoustic model (which describes the pronunciations of a given language) and goes into…

Arabic dialects in Phonexia LID and STT

…a bit, but you won’t understand Moroccan. Data acquisition: AUDIO (used for LID and STT training). MSA is used in formal speaking situations such as sermons, lectures, news broadcasts, and speeches, so it is difficult or impossible to find recordings of spontaneous phone conversations in MSA. Available MSA recordings are usually from broadcasting (microphone) or rather formal scripted speeches (also microphone)…

Key Features (VIN)

…Speaker Identification, Speaker Diarization, Phoneme Recognizer, Voice Activity Detection, Speech Quality Estimation. A search for repetitive sound patterns across all recordings in audio, enabled by the automatic phonemic transcription. Input: questioned recordings (a minimum of 1 recording); suspected speaker recordings (a minimum of 1 recording); the population set (a technical minimum of 10 speakers, and a recommended minimum of 50…

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length); only one speaker in each recording; a wide variety of gender and age is recommended; recordings should be as similar to the target use case as possible (device, channel, distance from mic, language distribution); audio files should be mono, lin16 format, 8 kHz+ sample rate. *Note: splitting a single recording into multiple shorter…
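The format requirements in the snippet above (mono, lin16, 8 kHz+ sample rate) can be checked before an evaluation run. The following is a minimal sketch using only the Python standard library; it assumes the recordings are stored as WAV files and that "lin16" means 16-bit linear PCM. The function name check_sid_audio and its min_rate_hz parameter are hypothetical and not part of any Phonexia tool.

```python
import wave


def check_sid_audio(path, min_rate_hz=8000):
    """Return a list of format problems; an empty list means the file passes.

    Assumptions (not from the source): the file is a WAV container and
    "lin16" means 16-bit linear PCM, i.e. 2 bytes per sample.
    """
    problems = []
    # wave.open raises wave.Error for files that are not PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() < min_rate_hz:
            problems.append(f"sample rate {wav.getframerate()} Hz is below {min_rate_hz} Hz")
    return problems
```

Calling check_sid_audio("recording.wav") on each file and collecting the non-empty results gives a quick list of recordings that need conversion. The sketch does not check the other listed requirements, such as recording length or the number of speakers.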

Minor Issue

Any scenario that does not fall under the Critical or Severe Issue definitions above. The Product is still operable but contains Issues that occur in a minority of audio files or audio streams or that are of a minor nature….

SPE and Browser installation: embedded SPE

…multimedia converter. By default, the Speech Engine accepts only a limited list of audio formats. To process non-native formats, install a multimedia converter. The recommended software for this is FFmpeg. FFmpeg on Windows: Download the latest version from https://www.gyan.dev/ffmpeg/builds/ffmpeg-release-essentials.zip After unzipping the package, move the ffmpeg.exe executable to the /SPE/ directory. You can delete the rest…
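The snippet above describes the Speech Engine calling FFmpeg itself for non-native formats. For cases where files are converted by hand beforehand, the following is a rough sketch of an FFmpeg invocation from Python. The target format (mono, 16-bit PCM WAV, 8 kHz) is an assumption borrowed from the SID dataset requirements listed elsewhere in these results, not a documented Speech Engine setting, and the function names are hypothetical. It also assumes an ffmpeg executable is reachable on the PATH.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate_hz=8000, ffmpeg="ffmpeg"):
    """Build an FFmpeg command line that converts src to mono 16-bit PCM WAV.

    The target format is an assumption for illustration only.
    """
    return [
        ffmpeg,
        "-y",                       # overwrite the output file if it exists
        "-i", src,                  # input file in any format FFmpeg can read
        "-ac", "1",                 # downmix to one channel (mono)
        "-ar", str(sample_rate_hz), # resample to the requested rate
        "-c:a", "pcm_s16le",        # 16-bit little-endian linear PCM
        dst,
    ]


def convert(src, dst, **kwargs):
    """Run the conversion; raises CalledProcessError if FFmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, **kwargs), check=True)
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact command before running it.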

Major Issue

An Issue that renders the Product partially functional, the use of which in a production environment is substantially reduced. The Issue contains an error that impairs the ability of the system to process a majority of audio files or audio streams, or that renders the setup and maintenance of the system inoperable….

Q: While trying to install SPE3, I get the error for loading libasound.so.2 libraries

Currently I’m trying to install the provided binaries for Linux, but I get the following when running phxadmin: "./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory". I’m trying to run this under CentOS 7. A: Please install the libraries required for manipulating audio files from the official repository into…

Q: How can we test Phonexia technologies?

We can prepare a testing package for you with full functionality of all technologies. The license is valid for 90 days to allow you to test the technologies. Note: by default a NET license is provided for testing. This license needs an active Internet connection to a Phonexia licensing server in order to function. Rest assured no data – audio,…

Download Speech Platform

…(DIAR) – model XL4 Language Identification (LID) – model L4 Gender Identification (GID) – model XL5 Age Estimation (AGE) – model XL5 Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5 Speech Quality Estimation (SQE) Time Analysis Extraction (TAE) Waveform Denoiser (DENOISER) Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/) Step #2 – First start To get started, please…

Voice Activity Detection (VAD)

Voice Activity Detection is a language-, domain- and channel-independent technology that identifies parts of audio recordings with speech content vs. non-speech content. It creates labels for speech and other signals in the recording; this can then serve as a decision point whether to process the recording by other technologies or not. VAD is usually part of a rapid filtration process in…