Search: server

54 results

Understand SPE home directory

The SPE home directory is analogous to a user home directory in an operating system (e.g. /home/ on *nix, /Users/ on macOS or Windows): it is the place where SPE stores data for the users configured in SPE. The default SPE home directory location is {SPE_installation_directory}/home/. This location can be changed using the server.user.home setting in the phxspe.properties SPE configuration file. Changing the home…
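The lookup described in this snippet can be sketched as follows. This is only an illustration, not SPE code: the helper names `parse_properties` and `resolve_spe_home` are hypothetical, the file is assumed to use plain `key = value` lines, and the example paths are made up.

```python
from pathlib import PurePosixPath


def parse_properties(text):
    """Parse simple 'key = value' lines (assumed format); skip blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_spe_home(install_dir, properties_text):
    """Return server.user.home if it is set, else the default {install_dir}/home."""
    configured = parse_properties(properties_text).get("server.user.home")
    if configured:
        return PurePosixPath(configured)
    return PurePosixPath(install_dir) / "home"


# Hypothetical paths, for illustration only.
print(resolve_spe_home("/opt/spe", ""))
print(resolve_spe_home("/opt/spe", "server.user.home = /data/spe_home"))
```

With no setting present, the sketch falls back to the home/ directory under the installation directory, matching the default described above.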

Recommended OS and HW (PSP)

Recommended operating systems: Windows 64-bit – Windows Server 2019 (*), latest version of Windows 10 (*); Linux 64-bit – latest version of RHEL/CentOS 7 (*). Compatible operating systems (**): 64-bit Windows 8.1, Windows Server 2016, and newer; 64-bit Linux with glibc >= 2.17, e.g. Ubuntu 20.04, Mint 19.3, RHEL/CentOS 8.2, … (*) Speech Platform components (e.g. Speech Engine) are…

Understand SPE multithreaded technologies initialization

The server.technology_multithread_initialization setting in the SPE configuration allows SPE to initialize instances of technologies during startup using multiple parallel threads. The default setting is OFF, i.e. instances of technologies are initialized one by one, using a single thread. This makes it easier to track potential issues during SPE startup and improves the readability of technology initialization log messages (only a single initialization happens at a time). The downside…

Understand SPE audio converter

…the subsystem again failed to detect the format (BSAPI exception) and SPE couldn’t call the converter because it’s disabled – the upload fails with Unsupported audio format error response.
SPE log
=======
2021-01-30 20:49:26 [Debug] server: Incoming request: [RID=2] from=127.0.0.1:52762, method=POST, URI=/audiofile?path=%2Ftest1.wav&format=json
2021-01-30 20:49:26 [Trace] ConverterSubsystem: Request stream saved to temporary file: C:\TMP\tmp9408aaaaaa
2021-01-30 20:49:26 [Error] ConverterSubsystem: Error during detecting…

Speech to Text (STT)

…1 CPU core (e.g. a standard 8 CPU core server (8 instances of STT) can process 1010 hours of audio in 1 day of computing time (flat load, depends on the technology model)). Supported languages: List of supported languages. Acoustic models: An acoustic model is created by training on training data. It includes characteristics of the voices of a set of speakers provided…

Keyword Spotting (KWS)

…experts. Typical use cases: Call centers: increase operator and supervisor efficiency by searching calls; identify inappropriate expressions from operators; check marketing campaigns with automatic script-compliance control. Mass media and web search servers: index and search multimedia by keyword; route multimedia files and streams according to their content. Security/defense: maintain fast reaction times by routing calls with specific content to human…

Q: My NET license has stopped working, returning “Not enough free licenses” error.

…more instances than allowed by the license file (using the -j parameter on the command line). In rare cases your SW copy may have accidentally crashed. If this is the case, please wait for the automatic license renewal period (60 minutes after the last check). Check whether your connection to the license server has changed. Check that the license has not expired….

Q: We prefer USB dongle but without the USB storage

A: We don’t provide USB dongles without memory storage. Possible solutions are: establish security directives for working with the USB dongle (authorized persons, in/out memory scan checks), use HW-based licensing, or use a license server….

Q: I can’t manage to run Phonexia Browser software. I always get an error.

I always get the same error messages: unable to connect to the SPE; unable to start the localhost: giving up and kill the localhost. A: This error may happen if the initialization of the SPE engine takes too long. Phonexia Browser treats it as an initialization failure and kills the server. You can fix this by doing the following: Increase timeout…

Q: How can we test Phonexia technologies?

We can prepare a testing package for you with the full functionality of all technologies. The license is valid for 90 days to allow you to test the technologies. Note: by default, a NET license is provided for testing. This license needs an active Internet connection to a Phonexia licensing server in order to function. Rest assured no data – audio,…

Sizing of the computing units for speech technologies

…technologies setup. If we assume that the whole machine is dedicated as a “speech computing unit” then, in general, we can calculate it as follows:
file: phxspe.properties
server.n_workers = <#_of_cores>
file: technologies.xml (no. of threads per technology; can also be set up by the phxadmin tool)
SQE: <#_of_cores>/4
VAD: <#_of_cores>/2
other technologies: <#_of_cores>
RAM: 8 cores = 32 GB 16…
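As a rough illustration of the rule of thumb in this snippet, the per-technology thread counts can be computed from the core count. The function name `size_speech_unit` is hypothetical, and rounding fractions down with a minimum of one thread is an assumption that the snippet does not state. RAM is left out because the snippet's RAM figures are truncated.

```python
def size_speech_unit(cores):
    """Suggest worker and thread counts for a machine dedicated to speech processing."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return {
        "server.n_workers": cores,   # phxspe.properties
        "SQE": max(1, cores // 4),   # technologies.xml: cores/4 (rounding is an assumption)
        "VAD": max(1, cores // 2),   # technologies.xml: cores/2 (rounding is an assumption)
        "other": cores,              # all other technologies
    }


print(size_speech_unit(8))
```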

Speaker Identification (SID)

…technological model and can range from 5 to 50 times faster than real time on 1 server CPU core. Voiceprint extraction is the most time-consuming part of the process. Voiceprint comparison, on the other hand, is extremely fast – millions of voiceprint comparisons can be done in 1 second. Voiceprint extraction (Speaker enrollment): Speaker enrollment starts with the extraction…

Speech Quality Estimation (SQE)

…of bits used by the waveform absolute value; if less than 8, the signal has insufficient quality. wfilter_technical_signal_length – the length of technical signals (tones, wide-band noise, etc.), measured in seconds. Processing speed: approx. 2,000x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 384,000 hours of audio in 1 day of computing time…
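The daily throughput figure follows from simple arithmetic: the speed factor times the number of cores times 24 hours. A minimal sketch, assuming a flat load and one processing instance per core (the function name `audio_hours_per_day` is hypothetical):

```python
def audio_hours_per_day(speed_factor, cores, wall_clock_hours=24):
    """Hours of audio processed, given an 'N times faster than real time' factor per core."""
    return speed_factor * cores * wall_clock_hours


# SQE: approx. 2,000x real time per core on an 8-core server
print(audio_hours_per_day(2000, 8))  # 384000
```

The same formula applied to a 20x speed factor on 8 cores gives 3,840 hours, which is the figure quoted for AGE on this page.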

Understand SPE administration and backup

…All others should have the “user” role (one user does not see the content of another user). See Understand SPE user accounts for details. user.home – where the server stores the users’ data; see Understand SPE home directory for details. LOG files – log file rotation is configured in phxspe.properties; see Understand SPE configuration file for details. SPE database administration –…

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling. Voiceprints: the AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself. Output: log file with processed information (age estimate). Processing speed: approx. 20x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…