Search: le

145 results

Speaker Identification (SID)

…or in files (typically with .vp extension). Here is an example of the voiceprint content in a human-readable form:
Version: 40
Speech length: 18.799999 s
Content length: 1000 bytes
User data length: bytes
Content data: 5.573507 3.315646 3.519821 -2.426645 6.843580 -1.432263 -2.243011 -1.649323 2.136194 -2.494688 -3.733182 2.041409 -4.760473 -1.576752 -3.024205 -1.927082 1.253917 0.153468 -0.923258 -0.509448 -4.984461 3.218744 2.757949 4.167875…

STT: Language Model Customization tutorial

…in command line STT, simply specify the new configuration file belonging to the customized STT model in the -config parameter. For example, assuming that original pl_pl_5 model was customized, specifying updated as the model suffix, the corresponding STT command line to use the customized model would look similar to this: phxcmd stt -config settings\stt_pl_pl_5_updated.bs -in-file <input_file> -out-file <output_file> … …
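As a rough illustration of the command quoted above, the following Python sketch assembles the argument list for a customized model. The naming pattern `stt_<model>_<suffix>.bs` is only inferred from the single `stt_pl_pl_5_updated.bs` example, the helper name `stt_command` is hypothetical, and only the flags visible in the excerpt are included (the original command line is truncated, so further parameters may exist).

```python
def stt_command(model: str, suffix: str, in_file: str, out_file: str) -> list[str]:
    """Build a phxcmd argument list for a customized STT model.

    Assumption: the customized config file is named stt_<model>_<suffix>.bs
    and lives in the settings directory, as in the quoted example.
    """
    # Windows-style separator, matching the path shown in the excerpt.
    config = f"settings\\stt_{model}_{suffix}.bs"
    return [
        "phxcmd", "stt",
        "-config", config,
        "-in-file", in_file,
        "-out-file", out_file,
    ]
```

For the example in the text, `stt_command("pl_pl_5", "updated", ...)` yields a `-config` value of `settings\stt_pl_pl_5_updated.bs`.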

SPE and Browser installation: standalone SPE

…change the following lines to enable the FFMPEG converter. Change the line:
# Enable or disable audio converter
audio_converter.enabled = false
to:
# Enable or disable audio converter
audio_converter.enabled = true
6. Start Speech Engine
In order to start the Speech Engine, start the SPE executable called phxspe. On Windows – type cmd in the Address bar, to open the…
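The edit described above can also be scripted. Below is a minimal Python sketch that flips the `audio_converter.enabled` setting in the text of a configuration file. It assumes a plain `key = value` layout as quoted in the excerpt; the function name `enable_audio_converter` is hypothetical, and reading or writing the actual file is left to the caller.

```python
import re

# Matches the setting line and captures everything up to the value.
_SETTING = re.compile(
    r"^([ \t]*audio_converter\.enabled[ \t]*=[ \t]*)\S+[ \t]*$",
    re.MULTILINE,
)

def enable_audio_converter(properties_text: str) -> str:
    """Return the configuration text with audio_converter.enabled set to true.

    Comment lines and all other settings are left untouched.
    """
    return _SETTING.sub(lambda m: m.group(1) + "true", properties_text)
```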

Q: While trying to install SPE3, I get the error for loading libasound.so.2 libraries

Currently I’m trying to install the provided binaries for Linux, but I do get the following when running phxadmin: ./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory I’m trying to run this under CentOS 7. A: Please install the right libraries required for manipulation with audio files from official repository into…

Q: What to do with the ApplicationStartup: Unhandled exception: BsapiException error?

When running SPE, the following error occurs: [Error] ApplicationStartup: Unhandled exception: BsapiException: SWaveformSegmenterI(/mnt/phxspe/home/phx/storage/dfs/a1cabcf7-c761-49f1-a9bc-0a8209a09fd9.opus Requested segment (78056, 102056) is out of waveform range (0,91840). A: It means that this opus file is created improperly and declares internally (in header) much more audio than available in real file. Please check your audio source/originator for proper functionality. Or use ffmpeg / sox…
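To make the error message concrete, here is a small Python sketch of the kind of bounds check it reports. This is not Phonexia's implementation; the function name `segment_fits` is hypothetical, and the numbers are taken as-is from the log line, in whatever unit the log uses.

```python
def segment_fits(segment: tuple[int, int], waveform_range: tuple[int, int]) -> bool:
    """Return True if the requested segment lies entirely inside the waveform range."""
    seg_start, seg_end = segment
    wav_start, wav_end = waveform_range
    return wav_start <= seg_start <= seg_end <= wav_end
```

With the values from the log, the segment starts inside the waveform (78056 < 91840) but ends beyond it (102056 > 91840), which matches the answer's explanation that the header declares more audio than the file contains.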

Understand SPE user accounts

…other” accounts still need to register the file to be able to actually use it in SPE… otherwise, the file would be visible only to the account which originally uploaded the file. This is because SPE keeps some file metadata (name, timestamps, …) in its database, and files not having their database record (associating them with the SPE account) are…

Phonexia Speech Engine

…file to configure server and optionally database. Optionally, run phxadmin --add-user in console to configure user account(s) to access the REST API (or use pre-configured user admin). Finally, run phxspe in console to start Speech Engine. Now your SPE server is running and you can access the REST API via the IP address and port set in the properties file (settings/phxspe.properties). Details…

STT: What is Preferred Phrases feature and how to use it

…details in Adding words to STT language model article. Legacy preferred phrases (SPE 3.32 – 3.42) have a number of limitations: adding words to dictionary is not supported; only words already known by the language model are allowed in preferred phrases. Phrases containing unknown words are ignored and a warning message is logged to SPE log. Therefore, to be able…

Voice Inspector – supporting technologies


Understand SPE home directory

…location might be useful e.g. in complex deployments with multiple separate SPEs which need to be accessing single centralized file storage placed on high-performance networked disk array, etc. Similarly to the operating systems, the SPE home directory contains subdirectories for each SPE user (see SPE user management article). These subdirectories contain data belonging to the respective users: – user’s file…

Understand SPE connectors for external TTS

…(taking longer time to synthesize). Connector naming, location, configuration: TTS connectors should be placed in {SPE_installation_directory}/external/technologies/tts directory, each connector in a separate subdirectory. To enable a connector, add its subdirectory name to the external.technologies.tts_connectors setting in SPE configuration file. Connector executable file must be named connector (i.e. without file extension). Connector configuration – like TTS service address, access credentials, API…

SID4 performance on Intel® Xeon® Platinum 8124M

Benchmark goals: find realistic performance using total recording length; find FTRT based exactly on net_speech (engineering sizing data); find system performance using all physical cores; find system performance using all logical cores. Infrastructure setup: Intel® Xeon® Platinum 8124M is used in a virtual machine with 8 physical cores reserved exclusively for this VM, Hyper Threading is enabled [16 logical cores available],…

Understand SPE benchmark

…SPE in the {SPE}/data/benchmark directory. The second option uses a single audio file of your choice uploaded to SPE storage, specified by the path parameter. The set of audio files supplied with SPE contains recordings of various lengths (from 30 seconds to 5 minutes) and with various speech/non-speech ratios. This is to account for the fact that both the length of…

Speech to Text (STT)

…including discriminative training and neural network-based features Output One-best transcription – i.e. a file with a time-aligned speech transcript (time of word’s start and end) Variants for transcriptions – i.e. hypotheses for words at each moment (confusion network) or hypotheses for utterances at each slot (n-best transcription) Processing speed – several versions available: from 8x faster than real-time processing on…

KWS: Results explained

…sheet demonstrating the sigmoid function: Score-to-Confidence. Score-to-confidence conversion tuning: Starting with SPE/BSAPI 3.24 (October 2019) it’s possible to modify the confidence calculation using confidence_shift and confidence_sharpness values in the user configuration file, in the [score_calib:SKeywordScoreCalibrationI] section. The user configuration file must have the same name as the original configuration file, with added .usr extension, e.g. kws_en_us_5.bs.usr – see the What is a user configuration…
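For intuition about how a shift and a sharpness value can reshape a sigmoid score-to-confidence mapping, here is a minimal Python sketch. The excerpt does not state the exact formula SPE uses, so the parametrization below (shift moves the midpoint, sharpness scales the slope) is an assumption, and the function name `score_to_confidence` is hypothetical.

```python
import math

def score_to_confidence(score: float, shift: float = 0.0, sharpness: float = 1.0) -> float:
    """Map a raw score to a confidence in (0, 1) with a logistic sigmoid.

    Assumed parametrization, not confirmed by the source:
    - shift: the score at which confidence equals 0.5
    - sharpness: how steeply confidence rises around that point
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (score - shift)))
```

Under this assumed form, raising the shift lowers the confidence reported for any given score, while raising the sharpness pushes confidences further from 0.5 on both sides of the midpoint.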