Search: file format

94 results

Q: How to fix the Error 1013: Unsupported: Server does not support authentication with token?

…would like to play with a “pure” daemon installation, then the phxspe.properties file should exist in the ./settings subdirectory. The phxspe.properties file is created by the phxadmin utility or can be created from the ./data/phxspe.properties.default template file: copy the template file to the ./settings directory, rename it to phxspe.properties, check the server.enable_authentication_token directive and set it as needed, then restart phxspe. Basic installation steps are described in the ./doc/INSTALL.html document…

Q: How to fix Error 1007: Unsupported audio format?

The Phonexia Browser application may return the error “1007: Unsupported audio format” when uploading an audio file. Please check whether your audio files are in one of the formats listed in Q: What are the supported audio formats? If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. Recommended is…

Release Notes

…and fixes. Speech Engine: General. Reduced RAM consumption (since 3.58.0): RAM consumption can be up to several gigabytes lower, depending on the technologies configuration and the processed audio. This is mainly visible in Speech To Text when processing many audio files, long audio files, or both. The effect may be less visible in other technologies. Fixed issues with non-ASCII / Unicode file names…

KWS: Results explained

…sheet demonstrating the sigmoid function: Score-to-Confidence. Score-to-confidence conversion tuning: Starting with SPE/BSAPI 3.24 (October 2019), it’s possible to modify the confidence calculation using the confidence_shift and confidence_sharpness values in the [score_calib:SKeywordScoreCalibrationI] section of the user configuration file. The user configuration file must have the same name as the original configuration file, with an added .usr extension, e.g. kws_en_us_5.bs.usr – see the What is a user configuration…
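
To illustrate how a shift and a sharpness value can reshape a sigmoid score-to-confidence mapping, here is a minimal sketch. The exact formula SPE uses is not given in this excerpt, so the generic logistic form below, the function name, and the way the two parameters enter it are assumptions made for illustration only.

```python
import math


def score_to_confidence(score, shift=0.0, sharpness=1.0):
    """Map a raw score to a value in [0, 1] with a logistic curve.

    ASSUMPTION: generic logistic form, not confirmed as the SPE formula.
    `shift` moves the score at which confidence is 0.5;
    `sharpness` controls how steeply confidence rises around that point.
    """
    z = sharpness * (score - shift)
    # Numerically stable evaluation for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    for s in (-2.0, 0.0, 2.0):
        print(s, round(score_to_confidence(s, shift=0.0, sharpness=1.0), 3))
```

In this sketch, raising `shift` makes the same score map to a lower confidence, and raising `sharpness` makes the transition from low to high confidence more abrupt.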

Understand SPE workers configuration

…the maximum number of simultaneously running tasks.
# Multithread settings
server.n_workers = 8
server.n_realtime_workers = 8
Requests for additional file processing tasks are put in a queue and processed according to their order and priorities. Requests for additional stream processing tasks are refused with HTTP status 403 (the realtime nature of stream processing does not allow any queuing). File processing can…
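
The difference between the two limits described above (file tasks are queued, stream tasks are refused) can be sketched as a toy model. This is not SPE code: the class, its method names, and the returned strings are hypothetical, and only the two setting names and the 403 status come from the excerpt.

```python
from collections import deque


class WorkerPoolModel:
    """Toy model of the two worker limits; hypothetical, not SPE code."""

    def __init__(self, n_workers=8, n_realtime_workers=8):
        self.n_workers = n_workers                    # mirrors server.n_workers
        self.n_realtime_workers = n_realtime_workers  # mirrors server.n_realtime_workers
        self.running_files = set()
        self.running_streams = set()
        self.file_queue = deque()

    def submit_file(self, task_id):
        # File tasks over the limit wait in a queue.
        if len(self.running_files) < self.n_workers:
            self.running_files.add(task_id)
            return "running"
        self.file_queue.append(task_id)
        return "queued"

    def submit_stream(self, stream_id):
        # Stream tasks over the limit are refused, never queued.
        if len(self.running_streams) < self.n_realtime_workers:
            self.running_streams.add(stream_id)
            return "accepted"
        return "refused (HTTP 403)"

    def finish_file(self, task_id):
        # A freed worker picks up the next queued file task, if any.
        self.running_files.discard(task_id)
        if self.file_queue and len(self.running_files) < self.n_workers:
            self.running_files.add(self.file_queue.popleft())


if __name__ == "__main__":
    pool = WorkerPoolModel(n_workers=2, n_realtime_workers=1)
    print([pool.submit_file(t) for t in ("a", "b", "c")])
    print([pool.submit_stream(s) for s in ("s1", "s2")])
```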

Understand SPE processing priority

SPE has a simple built-in system of task prioritization. This allows for flexible management of the processing queue, which is especially useful in mass audio processing. For example, if there is a long queue of files waiting to be processed, and one needs to urgently process another batch of files, these files can be sent for processing using a higher priority… and…
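
The idea of letting urgent files jump ahead of a long queue can be sketched with a small priority queue. This is an illustration only: the class name is hypothetical, and the convention that a larger number means more urgent is an assumption, since the excerpt does not state how SPE encodes priority values.

```python
import heapq
import itertools


class PriorityTaskQueue:
    """Priority queue with first-in-first-out order among equal priorities.

    ASSUMPTION: a larger `priority` value means "more urgent".
    """

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker preserving arrival order

    def push(self, name, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._order), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = PriorityTaskQueue()
    for name in ("file_1.wav", "file_2.wav", "file_3.wav"):
        q.push(name)                   # normal backlog
    q.push("urgent.wav", priority=5)   # submitted last, processed first
    while q:
        print(q.pop())
```

Tasks with equal priority leave the queue in the order they arrived, which matches the excerpt's description of processing by order and priority.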

Privacy Policy

…usage such as how often you use your Phonexia Account, how often you upload audio, video or other files, the size of generated content and other activity related to your use of Phonexia services. 1.2 Computer browser: Some information is also provided by your computer browser through cookies. By using our services, you agree to the use of cookies. Certain…

STT: What is Words-To-Numbers feature and how to use it

…the numeric.pegjs file to tune or extend the conversion functionality. ⚠ WARNING: Create a backup copy of numeric.pegjs before editing the file! Making incorrect changes can have unpredictable effects and eventually make STT stop working. Rules are described using PEG.js syntax, which is a JavaScript-like modification of Parsing Expression Grammar (PEG). Details about the syntax can be found at PEG.js…

FAQs (VIN)

Voice Inspector FAQ. Q: When I start the VIN software, it gives me a “License is not for this application” error. A: Please attach the licensing file (license.dat) to the support ticket at our Service Desk. Q: When I start the VIN software, it gives me a “License expired” error. A: Please check that the licensing file (license.dat)…

Understand SPE processing queue

…can be handled simultaneously is defined by the server.n_workers setting (audio file processing) and the server.n_realtime_workers setting (realtime stream processing) in the SPE configuration file. By default, these are set automatically, based on your hardware and software configuration – see the How to configure Speech Engine workers article. The picture below demonstrates the queue processing (for the sake of simplicity, technologies assignments to…

Q: While trying to install SPE3, I get the error for loading libasound.so.2 libraries

I’m trying to install the provided binaries for Linux, but I get the following error when running phxadmin:
./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory
I’m trying to run this under CentOS 7. A: Please install the libraries required for working with audio files from the official repository into…

Sizing of the computing units for speech technologies

…technologies setup. If we assume that the whole machine is dedicated as a “speech computing unit” then, in general, we can calculate it as follows:
file: phxspe.properties
  server.n_workers = <#_of_cores>
file: technologies.xml (number of threads per technology; can also be set up by the phxadmin tool)
  SQE: <#_of_cores>/4
  VAD: <#_of_cores>/2
  other technologies: <#_of_cores>
RAM: 8 cores = 32 GB 16…
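
The thread ratios listed above can be turned into a small helper. This is a sketch under stated assumptions: the function name and the returned dictionary keys are hypothetical, integer division is used for the ratios, and the minimum of one thread per technology on small machines is an assumption not stated in the excerpt. The RAM rule is left out because the excerpt is truncated after its first data point.

```python
def suggest_thread_settings(n_cores):
    """Suggest worker and thread counts for a machine dedicated to speech processing.

    Ratios follow the excerpt: SQE = cores/4, VAD = cores/2, others = cores.
    ASSUMPTION: results are rounded down and never drop below 1.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return {
        "server.n_workers": n_cores,        # phxspe.properties
        "SQE": max(1, n_cores // 4),        # technologies.xml
        "VAD": max(1, n_cores // 2),        # technologies.xml
        "other technologies": n_cores,      # technologies.xml
    }


if __name__ == "__main__":
    for cores in (8, 16):
        print(cores, suggest_thread_settings(cores))
```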

Terms of Service

…the County Court in Brno under file C, insert 5124), provider of the PHONEXIA technology (hereinafter referred to as “PHONEXIA”) and you (“you”, “your”, „user“ or “Member”), and your use of and access to the website, PHONEXIA services or any other services provided by PHONEXIA (“Services”). 1.2. Agreeing to the Terms. To begin use of Services, you must agree to…

Speaker Diarization (DIAR)

…silence as well. The output of the technology can be log files with labels, split audio files, or one new multichannel audio file. Typical use cases: preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversations recorded in mono into several channels, identifying how many speakers are speaking in the recording…

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling
Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself
Output: Log file with processed information (age estimate)
Processing speed: Approx. 20x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…
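
The 3,840-hour figure above follows from multiplying the per-core speed factor by the number of cores and the hours in a day. A minimal sketch of that arithmetic, assuming throughput scales linearly with the number of cores (the function name is hypothetical):

```python
def audio_hours_per_day(speed_factor=20, cpu_cores=8, wall_clock_hours=24):
    """Hours of audio processed in the given wall-clock time.

    ASSUMPTION: throughput scales linearly with the number of CPU cores.
    """
    return speed_factor * cpu_cores * wall_clock_hours


if __name__ == "__main__":
    # 20x real time per core * 8 cores * 24 hours
    print(audio_hours_per_day())
```

With the defaults taken from the excerpt, the function returns 3840, which matches the quoted figure.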