…or in files (typically with .vp extension). Here is an example of the voiceprint content in a human readable form:
Version: 40
Speech length: 18.799999 s
Content length: 1000 bytes
User data length: bytes
Content data: 5.573507 3.315646 3.519821 -2.426645 6.843580 -1.432263 -2.243011 -1.649323 2.136194 -2.494688 -3.733182 2.041409 -4.760473 -1.576752 -3.024205 -1.927082 1.253917 0.153468 -0.923258 -0.509448 -4.984461 3.218744 2.757949 4.167875…
…in command line STT, simply specify the new configuration file belonging to the customized STT model in the -config parameter. For example, assuming that the original pl_pl_5 model was customized with updated as the model suffix, the STT command line that uses the customized model would look similar to this:
phxcmd stt -config settings\stt_pl_pl_5_updated.bs -in-file <input_file> -out-file <output_file> … …
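The command line above can be assembled programmatically. The following is a minimal sketch, not part of the product: the helper name `build_stt_command` is hypothetical, and the `stt_<model>_<suffix>.bs` naming pattern is only inferred from the single example shown, so check the actual file name of your customized model.

```python
def build_stt_command(model, suffix, in_file, out_file, settings_dir="settings"):
    """Build the phxcmd argument list for a customized STT model.

    Assumption: the customized configuration file is named
    stt_<model>_<suffix>.bs, as in the single example above.
    """
    # Windows-style separator, matching the example command line.
    config = f"{settings_dir}\\stt_{model}_{suffix}.bs"
    return [
        "phxcmd", "stt",
        "-config", config,
        "-in-file", in_file,
        "-out-file", out_file,
    ]


if __name__ == "__main__":
    # Print the command for the pl_pl_5 model customized with suffix "updated".
    print(" ".join(build_stt_command("pl_pl_5", "updated", "call.wav", "call.txt")))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with file names that contain spaces.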
…change the following lines to enable the FFMPEG converter. Change the line:
# Enable or disable audio converter
audio_converter.enabled = false
to:
# Enable or disable audio converter
audio_converter.enabled = true
6. Start Speech Engine
To start the Speech Engine, run the SPE executable called phxspe. On Windows: type cmd in the Address bar to open the…
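The configuration change described above can also be scripted. This is a minimal sketch that works on the text of the properties file; the function name `enable_audio_converter` is hypothetical, and reading and writing the actual file is left to the caller.

```python
import re

# Matches the "audio_converter.enabled = false" line, tolerating extra spaces or tabs.
_DISABLED = re.compile(
    r"^([ \t]*audio_converter\.enabled[ \t]*=[ \t]*)false[ \t]*$",
    re.MULTILINE,
)


def enable_audio_converter(properties_text):
    """Return the properties text with audio_converter.enabled set to true.

    Comment lines and all other settings are left untouched. If the
    setting is already true, the text is returned unchanged.
    """
    return _DISABLED.sub(r"\g<1>true", properties_text)


if __name__ == "__main__":
    sample = (
        "# Enable or disable audio converter\n"
        "audio_converter.enabled = false\n"
    )
    print(enable_audio_converter(sample))
```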
Currently I’m trying to install the provided binaries for Linux, but I get the following error when running phxadmin:
./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory
I’m trying to run this under CentOS 7.
A: Please install the libraries required for working with audio files from the official repository into…
When running SPE, the following error occurs:
[Error] ApplicationStartup: Unhandled exception: BsapiException: SWaveformSegmenterI(/mnt/phxspe/home/phx/storage/dfs/a1cabcf7-c761-49f1-a9bc-0a8209a09fd9.opus Requested segment (78056, 102056) is out of waveform range (0,91840).
A: It means that this opus file was created improperly: its header declares much more audio than the file actually contains. Please check that your audio source/originator works properly. Or use ffmpeg / sox…
…other” accounts still need to register the file to be able to actually use it in SPE… otherwise, the file would be visible only to the account which originally uploaded it. This is because SPE keeps some file metadata (name, timestamps, …) in its database, and files without a database record (associating them with the SPE account) are…
…file to configure server and optionally database. Optionally, run phxadmin –add-user in console to configure user account(s) to access the REST API (or use pre-configured user admin). Finally, run phxspe in console to start Speech Engine. Now your SPE server is running and you can access the REST API via IP address and port set in properties file (settings/phxspe.properties). Details…
…details in Adding words to STT language model article. Legacy preferred phrases (SPE 3.32 – 3.42) have a number of limitations:
- adding words to dictionary is not supported
- only words already known by the language model are allowed in preferred phrases
Phrases containing unknown words are ignored and a warning message is logged to SPE log. Therefore, to be able…
…location might be useful e.g. in complex deployments with multiple separate SPEs which need to access a single centralized file storage placed on a high-performance networked disk array, etc. Similarly to operating systems, the SPE home directory contains subdirectories for each SPE user (see SPE user management article). These subdirectories contain data belonging to the respective users: – user’s file…
…(taking longer time to synthesize).
Connector naming, location, configuration
TTS connectors should be placed in the {SPE_installation_directory}/external/technologies/tts directory, each connector in a separate subdirectory. To enable a connector, add its subdirectory name to the external.technologies.tts_connectors setting in the SPE configuration file. The connector executable file must be named connector (i.e. without file extension). Connector configuration – like TTS service address, access credentials, API…
Benchmark goals
- Find realistic performance using total recording length
- Find FTRT based exactly on net_speech (engineering sizing data)
- Find system performance using all physical cores
- Find system performance using all logical cores
Infrastructure setup
Intel® Xeon® Platinum 8124M is used in virtual machine with 8 physical cores reserved exclusively for this VM, Hyper Threading is enabled [16 logical cores available],…
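The first two benchmark goals differ only in which audio duration is divided by the processing time. The sketch below assumes FTRT means "faster than real time", i.e. seconds of audio handled per second of processing; that definition, the function name `ftrt` and the sample numbers are illustrative assumptions, not measured results.

```python
def ftrt(audio_seconds, processing_seconds):
    """Return the faster-than-real-time factor.

    Assumption: FTRT = audio duration / processing time, so a value of
    8.0 means the audio was processed 8x faster than real time.
    """
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # Hypothetical figures: a 600 s recording containing 300 s of net speech,
    # processed in 20 s.
    total_length = 600.0
    net_speech = 300.0
    processing_time = 20.0
    print("FTRT on total recording length:", ftrt(total_length, processing_time))
    print("FTRT on net_speech:", ftrt(net_speech, processing_time))
```

Because non-speech parts are usually cheap to skip, the figure based on total recording length tends to look better than the one based on net_speech, which is why the latter is the more conservative input for sizing.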
…SPE in the {SPE}/data/benchmark directory. The second option uses a single audio file of your choice uploaded to SPE storage, specified by the path parameter. The set of audio files supplied with SPE contains recordings of various lengths (from 30 seconds to 5 minutes) and with various speech/non-speech ratios. This is to account for the fact that both the length of…
…including discriminative training and neural network-based features
Output
- One-best transcription – i.e. a file with a time-aligned speech transcript (time of word’s start and end)
- Variants for transcriptions – i.e. hypotheses for words at each moment (confusion network) or hypotheses for utterances at each slot (n-best transcription)
Processing speed – several versions available: from 8x faster than real-time processing on…
…sheet demonstrating the sigmoid function: Score-to-Confidence.
Score-to-confidence conversion tuning
Starting with SPE/BSAPI 3.24 (October 2019), it’s possible to modify the confidence calculation using the confidence_shift and confidence_sharpness values in the [score_calib:SKeywordScoreCalibrationI] section of the user configuration file. The user configuration file must have the same name as the original configuration file, with an added .usr extension, e.g. kws_en_us_5.bs.usr – see the What is a user configuration…
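To get a feel for what the two values do, the sketch below uses a generic logistic (sigmoid) function. The exact formula used by SPE/BSAPI is not given here, so the way `shift` and `sharpness` enter the calculation, as well as both function names, are assumptions made for illustration only.

```python
import math


def score_to_confidence(score, shift=0.0, sharpness=1.0):
    """Map a raw score to a confidence in [0, 1] with a logistic curve.

    Assumed form (not confirmed by the source):
        confidence = 1 / (1 + exp(-sharpness * (score - shift)))
    shift moves the score at which confidence is 0.5,
    sharpness controls how steep the curve is around that point.
    """
    z = sharpness * (score - shift)
    # Numerically stable evaluation for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def user_config_name(config_name):
    """Return the user configuration file name: original name plus .usr."""
    return config_name + ".usr"


if __name__ == "__main__":
    print(user_config_name("kws_en_us_5.bs"))
    for s in (-2.0, 0.0, 2.0):
        print(s, score_to_confidence(s), score_to_confidence(s, shift=1.0, sharpness=3.0))
```

Under this assumed form, raising the shift makes the same score less confident, and raising the sharpness pushes confidences further towards 0 or 1.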