
Search: Wav

32 results

Waveform Denoiser (DENOISER)

…Speech Engine documentation); stream not supported, technology model name to be used for processing. Output: audio file (WAV or RAW), together with xml/json report (in SPE only). Fig.: Comparison of original recording (david_noisy.wav, top half of image) and same recording processed by Denoiser (david_denoised.wav, bottom half of the image). Typical Questions Q: What do you recommend for deploying this technology?…

Understand SPE benchmark

…as follows (the version number 1.0 is present only for some historical reasons and is ignored):

benchmark
└── 1.0
    ├── default
    │   ├── 030.wav
    │   ├── 060.wav
    │   ├── 090.wav
    │   ├── 120.wav
    │   ├── 150.wav
    │   ├── 180.wav
    │   ├── 210.wav
    │   ├── 240.wav
    │   ├── 270.wav
    │   └── 300.wav
    └── czech
        ├── 030.wav
        ├── 060.wav
        ├── 090.wav
        ├──…

Q: How can I tell in which format the .wav file is?

A: Run ffprobe <file_name>; it prints information about the file, including its format. Note that the “ffprobe” utility is not included in our package(s). It is part of FFmpeg (https://ffmpeg.org/ffprobe.html) and must be installed separately.…
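Where ffprobe is not available, the basic format fields can also be read straight from the WAV header. The following is a minimal sketch, not part of any Phonexia package: the function name `describe_wav` and the returned dictionary keys are hypothetical, and only a handful of common format tags are mapped to names.

```python
import struct

# A few common WAVE format tags; anything else is reported as a hex number.
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0xFFFE: "extensible",
}


def describe_wav(data: bytes) -> dict:
    """Return basic format fields from the 'fmt ' chunk of a WAV file.

    `data` is the raw file content (hypothetical helper for illustration).
    """
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    # Walk the chunk list until the 'fmt ' chunk is found.
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            if size < 16 or body + 16 > len(data):
                raise ValueError("truncated 'fmt ' chunk")
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, body
            )
            return {
                "format": FORMAT_TAGS.get(tag, hex(tag)),
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
            }
        # Chunk bodies are padded to an even number of bytes.
        pos = body + size + (size & 1)
    raise ValueError("no 'fmt ' chunk found")
```

For a file on disk, the content can be passed in with `describe_wav(open(path, "rb").read())`. For anything beyond these four fields (codec details, duration, metadata), ffprobe remains the more complete tool.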

Understand SPE audio converter

…file format ‘C:\TMP\tmp9408aaaaaa’: BsapiException: SWaveFileI(1751): Corrupted WAVE file format: ‘C:\TMP\tmp9408aaaaaa’.
2021-01-30 20:49:26 [Trace] ConverterSubsystem: Converting C:\TMP\tmp9408aaaaaa -> C:\TMP\tmp9408baaaaa.wav
2021-01-30 20:49:27 [Debug] ConverterSubsystem: File C:\TMP\tmp9408aaaaaa has been converted.
2021-01-30 20:49:27 [Trace] ConverterSubsystem: Removed temporary file: C:\TMP\tmp9408aaaaaa
2021-01-30 20:49:27 [Trace] Data: Moving: ‘C:\TMP\tmp9408baaaaa.wav‘ -> ‘D:\SPE\home\admin\storage\test1.wav‘
2021-01-30 20:49:27 [Trace] Data: Moved: ‘C:\TMP\tmp9408baaaaa.wav‘ -> ‘D:\SPE\home\admin\storage\test1.wav‘
2021-01-30 20:49:27 [Trace] Data: File ‘/test1.wav‘ registered in database…

Releases and Changelogs (Browser)

…Fixed multi-channel recordings might not be processed by STT for the first time
Added “Copy text” to context menu of STT widget in Waveform editor
Support for SPE 3.5.x

Phonexia Browser v3.4.0, BSAPI 3.8.0 – Sep 21 2016
Fixed do not show second label panel in waveform editor when double-click on result
Fixed import of the speaker models may get…

Speech Quality Estimation (SQE)

…linear coding), A-law or Mu-law, PCM, 8kHz+ sampling
Output
global score – audio quality expressed as a percentage (range <0;100>); by default, the global score is calculated from the waveform_n_bits and waveform_snr variables.
pesq – value inspired by PESQ (Perceptual Evaluation of Speech Quality). The value ranges from -0.5 to 4.5; the higher the rating, the better the quality of the recording.
Other important statistics…

Releases and Changelogs (SPE)

…in a database

Speech Engine 3.40.4, DB v1700, BSAPI 3.40.4 (2021-05-28)
Fixed: BSAPI 3.40.3 does not include fixes from 3.40.2
Fixed: Different results in LID L4 for waveform and languageprint input
Fixed: Requested segment is out of waveform range error in TAE
Fixed: End time may be before start time in STT “one best” transcription
Fixed: When creating a new…

Understand SPE executable files

…to URL (e.g. “http://server:port”)
priority=number – Set request priority (see Understanding SPE processing priority for more details)
phxclient: example 1
phxclient /login=admin /password=phonexia /method=POST /uri="127.0.0.1:8600/audiofile?path=/myfile.wav" /data="c:\audio files\example recording.wav"
Uploads the example recording.wav file from the c:\audio files folder to the SPE running on this machine (i.e. with IP address 127.0.0.1) and stores it in the root of the SPE internal storage under the name myfile.wav.…

FAQs (PSP)

…OS X. Example of usage:
FFmpeg
ffmpeg -i <source_audio_file_name> <output_audio_base_name>.wav
This command converts an audio file in any supported format/codec to normalized WAV with 16-bit little-endian PCM audio, which is the default for WAV output. For more parameters, please check the FFmpeg manual pages.
SoX
sox <source_audio_file_name> -b 16 <output_audio_base_name>.wav
The number of bits must be specified with the -b parameter.
in FAQ Phonexia…

Age Estimation (AGE)

…time
Representation of the results:
For the CMD version
Name_of_the_file.wav Age[integer – limited to 99]
example/david_1.wav 41
example/david_2.wav 40
For the SPE version
name – representing the age
score – representing the score for the age [1/0]
In order to get a result, each age receives a score; when the score equals “1”, it represents the value of the…
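The CMD output shown above (one line per file, file name followed by the estimated age) is simple to post-process. The sketch below assumes exactly that layout and nothing more; `parse_age_output` is a hypothetical helper name, and lines whose last field is not an integer (for example a header line) are skipped.

```python
def parse_age_output(text: str) -> dict:
    """Map each file name to its estimated age.

    Assumes one result per line in the form '<file name> <age>'.
    Hypothetical helper for illustration only.
    """
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so that file names
        # containing spaces stay intact.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, age = parts
        try:
            results[name] = int(age)
        except ValueError:
            # Not a result line (e.g. a header); ignore it.
            continue
    return results
```

Called on the two example lines above, this returns a dictionary with `example/david_1.wav` mapped to 41 and `example/david_2.wav` mapped to 40.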

Q: What are the supported audio formats?

…Linux and Apple OS X. Example of usage:
FFmpeg
ffmpeg -i <source_audio_file_name> <output_audio_base_name>.wav
This command converts an audio file in any supported format/codec to normalized WAV with 16-bit little-endian PCM audio, which is the default for WAV output. For more parameters, please check the FFmpeg manual pages.
SoX
sox <source_audio_file_name> -b 16 <output_audio_base_name>.wav
The number of bits must be specified with the -b parameter.…
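When many files need converting, the FFmpeg call can be wrapped in a small script. This is a sketch under assumptions: it requires ffmpeg to be installed and on the PATH, the helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and the codec is requested explicitly with `-c:a pcm_s16le` rather than relying on the default.

```python
import shutil
import subprocess


def build_ffmpeg_command(source: str, output_base: str) -> list:
    """Build the ffmpeg argument list for a 16-bit little-endian PCM WAV."""
    return [
        "ffmpeg",
        "-i", source,          # input file in any format ffmpeg can decode
        "-c:a", "pcm_s16le",   # explicit audio codec: signed 16-bit LE PCM
        output_base + ".wav",  # output container chosen by the extension
    ]


def convert_to_wav(source: str, output_base: str) -> None:
    """Run the conversion; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(source, output_base), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.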

Releases and Changelogs (VIN)

…to ‘affricate’
phoneme ‘D’ changed from ‘fricative’ to ‘plosive’
phoneme ‘T’ changed from ‘fricative’ to ‘plosive’
phoneme ‘c’ changed from ‘plosive’ to ‘affricate’

Voice Inspector v3.2.1, BSAPI 3.15.0 (2018-03-16)
Export of Speakers/Populations allows exporting only voiceprints
Wave editor’s Spectrum settings allow setting smaller values for Window length
Added generic label panel in Wave editor
The new version of…

LID: Terminology and adaptation

…directory
created languageprints get stored to /path/to/my/languageprints, each language to its own separate subdirectory
we use the “L4” model, hence the _l4 configuration file suffix
./phxcmd lpextract -v -c /path/to/lid/settings/lpextract_l4.bs -d /path/to/my/audio/MyLanguage -e wav -D /path/to/my/languageprints/MyLanguage
./phxcmd lpextract -v -c /path/to/lid/settings/lpextract_l4.bs -d /other/path/to/audio/MyOtherLanguage -e wav -D /path/to/my/languageprints/MyOtherLanguage
etc. for more languages…
where:
-v parameter tells the tool to provide verbose…

FAQs (Browser)

…directly and natively are:
WAVE (*.wav) container including any of:
    unsigned 8-bit PCM (u8)
    signed 16-bit PCM (s16le)
    IEEE float 32-bit (f32le)
    A-law (alaw)
    µ-law (mulaw)
    ADPCM
FLAC codec inside FLAC (*.flac) container
OPUS codec inside OGG (*.opus) container
Other audio formats must be converted to one of those natively supported using external tools. SPE server can be configured to…