Understand SPE configuration

Sizing of the system

The phxadmin utility configures which speech technologies are instantiated when the SPE starts and how many instances of each are created. By default, this specification is saved to the file {spe_root}/settings/technologies.xml. The XML structure is described in another article. Note that the technology configuration can be saved elsewhere; for example, it can be shared between multiple SPE instances using network storage, virtual storage, etc. This is useful for orchestrator-driven architectures that use virtualization, containerization, or other kinds of managed instantiation.

# Path to technologies configuration
technologies.configuration = ${application.dir}settings/technologies.xml

Performance tuning

The server.n_workers directive determines how many instances of the initialized technologies can work in parallel. Each instance of a Phonexia technology can use the full power of one physical CPU core. This has the following consequences:

Hyperthreading brings no significant improvement in CPU utilization
Virtual machines should be configured with careful consideration of performance
RAM speed is more important than CPU clock frequency
Because large amounts of data (statistical models) are loaded into RAM, the bandwidth between RAM and CPU is important
L3 cache (shared between CPU cores) plays a key role

TIP: For best CPU utilization on a single physical server, configure the following numbers of instances in your technologies.xml:

SQE: <#_cpu_cores>/4

VAD: <#_cpu_cores>/2

any other technology: <#_cpu_cores>

(Note: your license must also be configured accordingly. Contact our Sales department if you plan hot-load evaluation tests. Production licenses are configured with our assistance.)

Recommended RAM:

4 cores: 16 GB RAM

8 cores: 32 GB RAM

16 cores: 64 GB RAM
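The rules of thumb above can be turned into a small calculation. The following Python sketch is only an illustration: the function name suggested_sizing is hypothetical, rounding down with a minimum of one instance is an assumption (the tip does not say how to round), and the RAM figure simply extrapolates the three listed values as 4 GB per core.

```python
def suggested_sizing(physical_cores: int) -> dict:
    """Apply the instance and RAM rules of thumb to a physical core count."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return {
        # Assumption: round down, but never go below one instance.
        "SQE": max(1, physical_cores // 4),
        "VAD": max(1, physical_cores // 2),
        # Any other technology: one instance per core.
        "other": physical_cores,
        # Assumption: 16/32/64 GB for 4/8/16 cores read as 4 GB per core.
        "ram_gb": 4 * physical_cores,
    }


if __name__ == "__main__":
    for cores in (4, 8, 16):
        print(cores, suggested_sizing(cores))
```

Treat the output as a starting point for technologies.xml; the actual limits also depend on your license.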

The server.n_workers directive is crucial for best performance of the whole system. The example below is optimal for a CPU with 8 physical cores:

# Multithread settings
server.n_workers = 8

TIP: You can use a system environment variable to pass the number of physical cores to the configuration. To determine the number of physical cores, run one of the following commands in a Linux console or in the MS Windows Command Prompt:

Linux: grep "^core id" /proc/cpuinfo | sort -u | wc -l

Windows: wmic cpu get NumberOfCores /value

Assign the value found this way to an environment variable (say, P_CORES) and reference it in the configuration like this: server.n_workers = ${system.env.P_CORES}
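On Linux, the same count can be obtained with a short script. The Python sketch below is a hedged alternative to the grep command, not part of the SPE tooling: it counts distinct (physical id, core id) pairs, because core id values can repeat across CPU sockets on multi-socket machines. The function name count_physical_cores is hypothetical, and falling back to the logical CPU count when no core id fields are present is an assumption.

```python
import os


def count_physical_cores(cpuinfo_text: str) -> int:
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo text."""
    cores = set()
    physical_id = "0"  # assumed default when the field is absent
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            n = count_physical_cores(f.read())
    except OSError:
        n = 0
    # Assumption: fall back to the logical CPU count if nothing was found.
    n = n or os.cpu_count() or 1
    # Prints a line that can be pasted into a shell profile.
    print(f"export P_CORES={n}")
```

The printed line sets the P_CORES variable used in the server.n_workers example above.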

Realtime workers must be set separately to enable RTP or HTTP stream processing. In principle, this works the same way as setting normal workers:

server.n_realtime_workers = 0

The SPE maintains its own queue of task requests waiting to be processed, and the following directive sets the maximum number of tasks allowed in that queue. You can fine-tune it to the load of each SPE instance to get optimal usage of that instance. This directive should also be taken into account when setting user rights for server resource utilization (see the SPE documentation).

# Sets limit for number of pending operations.
server.n_task_limit = 1000

The results of finished tasks are held in memory for fast retrieval for a specified number of seconds. If database use is disabled, the results are lost after the timeout. If database use is enabled (default), the results can be retrieved from the database as long as the audio recording is stored in the SPE storage.

# Timeout auto remove finished task
server.finished_task_timeout = 60
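To review the directives discussed in this article in one place, a simple script can read them from a configuration text. The Python sketch below is a minimal illustration under stated assumptions: parse_properties is a hypothetical helper that handles only plain "key = value" lines and "#" comments, it does not expand ${...} placeholders, and the check against 8 physical cores mirrors the example machine used above rather than any rule enforced by the SPE.

```python
def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# The directive values shown in this article.
EXAMPLE = """\
# Multithread settings
server.n_workers = 8
server.n_realtime_workers = 0
# Sets limit for number of pending operations.
server.n_task_limit = 1000
# Timeout auto remove finished task
server.finished_task_timeout = 60
"""

if __name__ == "__main__":
    settings = parse_properties(EXAMPLE)
    physical_cores = 8  # hypothetical machine, as in the example above
    workers = int(settings["server.n_workers"])
    if workers > physical_cores:
        print("warning: more workers than physical cores")
    for key, value in settings.items():
        print(f"{key} = {value}")
```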
