Search: spe-3.11.1-win64

127 results

Voice Activity Detection (VAD)

…VAD is usually part of a rapid filtering stage in deployment. Typical use cases are: detecting the presence or absence of human speech for voice processing, filtering non-speech parts out of a recording, filtering out recordings with too little net speech to be processed by other technologies, voice-activated processes, etc. The speed of Voice Activity Detection is 140 FtRT (faster than real time) per instance….
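As a rough illustration of what a 140x faster-than-real-time figure means in practice, the sketch below estimates wall-clock processing time for a given amount of audio. The function name and the assumption that throughput scales linearly with the number of instances are illustrative only and are not taken from the Phonexia documentation.

```python
def processing_seconds(audio_seconds: float, ftrt: float = 140.0, instances: int = 1) -> float:
    """Estimate wall-clock seconds needed to process `audio_seconds` of audio.

    `ftrt` is the faster-than-real-time factor per instance (140 in the snippet above).
    Linear scaling across `instances` is an assumption, not a documented guarantee.
    """
    if audio_seconds < 0 or ftrt <= 0 or instances < 1:
        raise ValueError("audio_seconds must be >= 0, ftrt > 0, instances >= 1")
    return audio_seconds / (ftrt * instances)


if __name__ == "__main__":
    # One hour of audio on a single instance: 3600 / 140 seconds.
    print(f"{processing_seconds(3600):.1f} s")
```

Under these assumptions, one hour of audio takes roughly 26 seconds on a single instance; real throughput depends on hardware and configuration.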

Open Source Acknowledgement

…License) link
mman-win32 (Windows only): MIT
ogg: BSD-style license
onnxruntime: MIT, link
openfst: Apache License
openssl: OpenSSL
opus: BSD
range-v3: BSL-1.0
scnlib: Apache License 2.0
spdlog: MIT
speex: revised BSD license
speexdsp: BSD
utfcpp: BSL-1.0
zlib: Zlib
libstdc++, libgcc, libwinpthread (Windows only): GNU GPL with GCC Runtime Library Exception, link
SPE dependencies (Name, License):
ADVobfuscator: GitHub – andrivet/ADVobfuscator: Obfuscation library…

Phonexia End User License Agreement

Please read the terms and conditions of this End User License Agreement (the “Agreement”) carefully before you use the Phonexia proprietary software providing speech solutions, technologies and accompanying services (the “Software”) delivered and marketed by Phonexia s.r.o., having its registered seat at Chaloupkova 3002/1a, 612 00 Brno, Czech Republic, identification number: 27680258, registered in the Commercial Registry maintained by the…

What is User configuration file and how to use it

…example: When using Czech STT on real-time streams, the results show that the system outputs end of segment too often, i.e. longer pauses between words are misidentified as the end of a sentence while the speakers in fact continue to speak. It is therefore desirable to fine-tune the system to accept a longer delay between words without ending a…

Recommended OS and HW (PSP)

…tested by Phonexia on these systems. (**) Speech Platform components (e.g. Speech Engine) are known to be successfully deployed on these systems.
Recommended hardware
Required HW resources depend on the set of technologies (i.e. the SPE configuration) and the load that should be processed per day (or during a peak hour). Additionally, your own application built on top of SPE (including any…

Download other languages for Speech platform

This part requires higher (and non-anonymous) access level.

Get better support

…product executable file
“properties” files (phxspe.properties from SPE is the minimum), usually in the ./settings/ directory
Issue data – supporting:
Actual and active HW configuration (CPU, OS, RAM, storage status (free space)). To get this information, you can use:
The Benchmark function in SPE, or
Phonexia’s hw-gen for generating a basic HW print:
Windows 64bit: http://download.phonexia.com/utils/hw-gen64.exe
GNU/Linux 64bit: http://download.phonexia.com/utils/hw-gen64
System…

DELETE – Software Vetting (Best Practice)

This part requires higher (and non-anonymous) access level.

Q: Why does the system show high score (>90%) even for non-targets?

A: The score threshold isn’t set up correctly. Adjust the speaker score sharpness value to calibrate the recalculation. Please see Calibration in the technology documentation….

Q: How can I add new language to LID?

A: There are multiple methods to train a new language. Please see the article in Components > Speech Technologies > LID….

Q: What are the recommendations for LID adaptation set?

A: The following is recommended:
For adding a new language to a language pack:
  20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech)
  Only 1 language per record
For adapting the existing language model (discriminative training):
  10+ hours of audio for each language
  May be done on customer site.
  May be done in…
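The "25+ hours of audio containing 80% of speech" figure above works out to 20 hours of net speech (25 × 0.8). A minimal sketch of that arithmetic follows; the function names, the reading of the 20-hour figure as net speech, and the idea of checking a data set this way are assumptions for illustration, not part of any Phonexia tool.

```python
# Threshold taken from the recommendation above; interpreting it as net speech is an assumption.
NEW_LANGUAGE_MIN_SPEECH_HOURS = 20.0


def net_speech_hours(audio_hours: float, speech_fraction: float) -> float:
    """Hours of actual speech in `audio_hours` of recordings."""
    if audio_hours < 0 or not 0.0 <= speech_fraction <= 1.0:
        raise ValueError("audio_hours must be >= 0 and speech_fraction in [0, 1]")
    return audio_hours * speech_fraction


def enough_for_new_language(audio_hours: float, speech_fraction: float) -> bool:
    """True if the set reaches the assumed 20-hour net speech minimum."""
    return net_speech_hours(audio_hours, speech_fraction) >= NEW_LANGUAGE_MIN_SPEECH_HOURS


if __name__ == "__main__":
    print(net_speech_hours(25, 0.8))
    print(enough_for_new_language(30, 0.8))
```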

Home Page

KWS: Results explained

…before the keyword (1), the Keyword model (2) and a Background model of any speech running in parallel with the keyword model (3). Models 2 and 3 produce two likelihoods, Lkw and Lbg (any speech = background). The raw score is calculated as a log likelihood ratio (LLR): score = ln(Lkw / Lbg). Confidence is calculated from the raw score using a sigmoid function: where:…
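The raw-score calculation above can be sketched in a few lines. The snippet cuts off before the sigmoid formula, so the `confidence` function below uses the plain logistic function only as a stand-in; the actual parameters used by KWS are not shown here, and both function names are hypothetical.

```python
import math


def raw_score(l_kw: float, l_bg: float) -> float:
    """Log likelihood ratio: ln(Lkw / Lbg). Both likelihoods must be positive."""
    if l_kw <= 0 or l_bg <= 0:
        raise ValueError("likelihoods must be positive")
    return math.log(l_kw / l_bg)


def confidence(score: float) -> float:
    """Map a raw score to (0, 1) with a plain logistic function.

    This is an assumed stand-in; the real KWS sigmoid may include
    additional shift or scale parameters that the snippet does not show.
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    # Numerically stable branch for negative scores.
    e = math.exp(score)
    return e / (1.0 + e)


if __name__ == "__main__":
    s = raw_score(3.0, 1.0)  # keyword model three times as likely as background
    print(s, confidence(s))
```

With equal likelihoods the raw score is 0 and this stand-in sigmoid returns 0.5; a keyword model more likely than the background pushes the value towards 1.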

Q: What languages are supported by LID?

A: Please see List of supported LID Languages. For more details, see LID technology documentation….

Q: What languages are supported by KWS?

A: Please see List of supported KWS Languages. For more details, see KWS technology documentation….