Search: PCM

10 results

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself Output Log file with processed information (age estimate) Processing speed Approx. 20x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…
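The throughput figure in this snippet can be checked with simple arithmetic. The sketch below uses only the numbers quoted above (20x real time per core, 8 cores, 24 hours) and assumes throughput scales linearly with the number of cores, which the snippet implies but does not state.

```python
# Figures quoted in the snippet; linear scaling across cores is an assumption.
SPEEDUP_PER_CORE = 20   # hours of audio processed per core-hour
CORES = 8               # "standard 8 CPU core server"
HOURS_PER_DAY = 24      # one day of computing


def audio_hours_per_day(speedup=SPEEDUP_PER_CORE, cores=CORES, wall_hours=HOURS_PER_DAY):
    """Hours of audio processed in `wall_hours` of wall-clock time."""
    return speedup * cores * wall_hours


print(audio_hours_per_day())  # 20 * 8 * 24 = 3840
```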

Speech Quality Estimation (SQE)

…linear coding), A-law or Mu-law, PCM, 8kHz+ sampling Output global score – percentage expression of audio quality (range <0;100>); by default, the global score is calculated based on the waveform_n_bits and waveform_snr variables. pesq – value inspired by PESQ (Perceptual Evaluation of Speech Quality). The value ranges from -0.5 to 4.5; the higher the rating, the better the quality of the recording. Other important statistics…

Understand SPE configuration file

PCM 16-bit, 8 kHz. See audio_converter.command for more details.

audio_converter.command

    # Set converter command
    # %1 is for input file
    # %2 is for output file
    # ffmpeg example:
    # audio_converter.command = ffmpeg -loglevel warning -y -i %1 %2
    # sox example:
    # audio_converter.command = sox %1 %2
    audio_converter.command = ffmpeg -loglevel warning -y -i %1 %2

Sets the command…
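To illustrate how the `%1` and `%2` placeholders in `audio_converter.command` map to an input and output file, here is a minimal sketch of placeholder expansion. How SPE itself expands and runs the command is not shown in the snippet; the function name `build_converter_argv` and the split-then-substitute approach are assumptions made for illustration.

```python
import re
import shlex


def build_converter_argv(template: str, input_path: str, output_path: str) -> list:
    """Expand %1 (input file) and %2 (output file) in a converter command template.

    Hypothetical helper: the template is split into arguments first and the
    placeholders are substituted afterwards, so paths containing spaces stay
    a single argument.
    """
    mapping = {"1": input_path, "2": output_path}
    return [
        re.sub(r"%([12])", lambda m: mapping[m.group(1)], token)
        for token in shlex.split(template)
    ]


argv = build_converter_argv(
    "ffmpeg -loglevel warning -y -i %1 %2", "in file.mp3", "out.wav"
)
print(argv)
# The resulting list could be passed to subprocess.run(argv, check=True),
# provided ffmpeg is installed on the machine.
```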

FAQs (PSP)

…see Recommended OS and HW in FAQ Phonexia Browser, FAQ Speech Platform Permalink Q: What are the supported audio formats? Formats supported directly and natively are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8) unsigned 16-bit PCM (u16le) IEEE float 32-bit (f32le) A-law (alaw) µ-law (mulaw) ADPCM FLAC codec inside FLAC (*.flac) container OPUS codec inside OGG (*.opus)…

FAQs (Browser)

…directly and natively are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8) unsigned 16-bit PCM (u16le) IEEE float 32-bit (f32le) A-law (alaw) µ-law (mulaw) ADPCM FLAC codec inside FLAC (*.flac) container OPUS codec inside OGG (*.opus) container Other audio formats must be converted to one of those natively supported using external tools. SPE server can be configured do…

Understand SPE connectors for external TTS

…via the service native API and with SPE via standard input (stdin) and output (stdout). The connector behavior should be as follows: if the connector is started with the --info parameter, it outputs TTS service capabilities information in JSON format to stdout; if the connector is started without a parameter, it reads input JSON data from stdin and outputs raw PCM signed 16-bit little-endian mono…
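The stdin/stdout contract described in this snippet could be sketched roughly as follows. This is an illustration only: the JSON fields (`voices`, `sample_rate`, `duration`), the 8 kHz rate, and the `--info` spelling of the flag are assumptions, since the snippet is truncated and does not show the real schema. The "synthesis" is a placeholder tone rather than a call to a real TTS service.

```python
import argparse
import json
import math
import struct
import sys

SAMPLE_RATE = 8000  # assumption: the snippet does not state the sample rate


def capabilities() -> dict:
    # Hypothetical capability fields; the real schema is not shown in the snippet.
    return {"voices": ["demo"], "sample_rate": SAMPLE_RATE}


def synthesize(request: dict) -> bytes:
    """Placeholder for the TTS call: a 440 Hz tone as raw PCM s16le mono.

    A real connector would call the TTS service's native API here.
    """
    duration = float(request.get("duration", 0.5))  # hypothetical field
    n_samples = int(SAMPLE_RATE * duration)
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE))
        for i in range(n_samples)
    )
    # "<h" packs each sample as a signed 16-bit little-endian integer.
    return b"".join(struct.pack("<h", s) for s in samples)


def main(argv=None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("--info", action="store_true")
    args = parser.parse_args(argv)
    if args.info:
        # Started with --info: print capabilities as JSON to stdout.
        json.dump(capabilities(), sys.stdout)
        return 0
    # Started without parameters: read the JSON request, write raw PCM.
    request = json.load(sys.stdin)
    sys.stdout.buffer.write(synthesize(request))
    return 0

# To use this as a script, call sys.exit(main()) at the end of the file.
```

Writing through `sys.stdout.buffer` keeps the PCM output binary, so no newline translation or text encoding is applied to the samples.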

Releases and Changelogs (SPE)

…3.7.2 contains wrong version of BSAPI that may cause some errors Speech Engine 3.7.2 (03/27/2017) – BSAPI 3.11.0 [#4579] Fixed registering VAD stream returns HTTP code 500 if realtime workers limit exceeded [#2807] RTP streams now support payload 0 (PCMU) and 8 (PCMA) [#4536] Added new configuration option “stream.http.timeout” [#4588] Update BSAPI to 3.11.0 [#4529] Added French stream KWS [#4305] Added…
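For context on the RTP entry: in the RTP audio/video profile (RFC 3551), static payload type 0 is PCMU (µ-law) and payload type 8 is PCMA (A-law). The sketch below reads the payload type from the fixed RTP header. It is a generic illustration, not SPE code, and the function name is hypothetical.

```python
# Static payload types from the RTP audio/video profile (RFC 3551).
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA"}


def rtp_payload_type(packet: bytes) -> int:
    """Return the 7-bit payload type from an RTP packet's fixed header."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    if packet[0] >> 6 != 2:
        raise ValueError("not RTP version 2")
    # Byte 1: top bit is the marker bit, the low 7 bits are the payload type.
    return packet[1] & 0x7F


header = bytes([0x80, 0x08]) + bytes(10)  # version 2, payload type 8
print(STATIC_PAYLOAD_TYPES.get(rtp_payload_type(header), "other"))
```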

Q: What are the supported audio formats?

Formats supported directly and natively are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8) unsigned 16-bit PCM (u16le) IEEE float 32-bit (f32le) A-law (alaw) µ-law (mulaw) ADPCM FLAC codec inside FLAC (*.flac) container OPUS codec inside OGG (*.opus) container Other audio formats must be converted to one of those natively supported using external tools. SPE server can be…
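Before sending a WAVE file, it can help to check which encoding it actually contains. The sketch below reads the format tag from the `fmt ` chunk of a RIFF/WAVE file. The tag values are the standard WAVE format codes; which ADPCM variants SPE accepts is not stated in the snippet, and files using the extensible format tag are simply reported as unknown here.

```python
import struct

# Standard WAVE format tags (wFormatTag).
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0002: "ADPCM (Microsoft)",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0011: "ADPCM (IMA)",
}


def wav_format(data: bytes):
    """Return (format name, channels, sample rate, bits per sample) for WAV bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            name = FORMAT_TAGS.get(tag, "unknown (0x%04x)" % tag)
            return name, channels, rate, bits
        pos += 8 + size + (size & 1)  # chunks are padded to an even size
    raise ValueError("no fmt chunk found")
```

For a file on disk, `wav_format(open(path, "rb").read(4096))` is usually enough, since the `fmt ` chunk normally sits near the start of the file.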

General

…8-bit PCM (u8) unsigned 16-bit PCM (u16le) IEEE float 32-bit (f32le) A-law (alaw) μ-law (mulaw) ADPCM FLAC codec inside FLAC (*.flac) container OPUS codec inside OGG (*.opus) container Other formats are converted using ffmpeg, but it cannot be guaranteed that the quality of these recordings will be sufficient. One recording should contain only one speaker. Enrollment/verification Once the deployment is…