
Sizing of the computing units for speech technologies

Good sizing of hardware for Phonexia technologies depends on a few facts:

  1. Intensive work with large data sets requires high performance and bandwidth between RAM and CPU. How much depends on the size of the technology model files, which are usually loaded into RAM and used intensively during computation.
  2. Count only the physical CPU cores (HT and VT features do not improve performance).
  3. Look for CPUs with a large L3 cache. The better CPUs are those with a higher l3_cache_size/#_of_physical_CPU_cores ratio. We currently consider CPUs from the 4th generation of the Intel Xeon family to be the best. For small computation tasks, i7 family CPUs also offer a reasonable price/performance ratio.
  4. The big challenge is setting up the SPE3/Speech Platform technologies correctly. If we assume that the whole machine is dedicated as a “speech computing unit”, then, in general, the settings can be calculated as follows:

file: phxspe.properties

server.n_workers = <#_of_cores>

file: technologies.xml (no. of threads per technology, can be also set up by the phxadmin tool)
SQE: <#_of_cores>/4
VAD: <#_of_cores>/2
other technologies: <#_of_cores>

RAM:
8 cores = 32 GB
16 cores = 64 GB
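
The rules above can be sketched as a small calculator. This is only an illustration, not an official Phonexia tool: the function name `size_speech_unit` is hypothetical, the thread counts are rounded down with a minimum of one thread (an assumption, since the article does not say how to round), and the RAM figure assumes the two data points above extend linearly at 4 GB per physical core.

```python
def size_speech_unit(physical_cores: int) -> dict:
    """Suggest settings for a machine dedicated as a speech computing unit.

    Assumptions (not stated in the article):
    - divisions are rounded down, with a minimum of 1 thread;
    - RAM scales linearly at 4 GB per physical core
      (derived from 8 cores = 32 GB and 16 cores = 64 GB).
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")

    return {
        # phxspe.properties
        "server.n_workers": physical_cores,
        # technologies.xml: number of threads per technology
        "threads": {
            "SQE": max(1, physical_cores // 4),
            "VAD": max(1, physical_cores // 2),
            "other": physical_cores,
        },
        # Recommended RAM in GB
        "ram_gb": physical_cores * 4,
    }


if __name__ == "__main__":
    for cores in (8, 16):
        print(cores, size_speech_unit(cores))
```

For example, `size_speech_unit(16)` suggests 16 workers, 4 SQE threads, 8 VAD threads, and 64 GB of RAM.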

Conclusion:

  1. The best computing performance can be expected from a CPU with:
    l3_cache_size/#_of_physical_CPU_cores >= 2.5 MB
  2. Memory bandwidth and speed are more important than CPU base frequency.
  3. Intel's TLB fixes for the Meltdown and Spectre issues affect performance.
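
The L3 cache guideline from the conclusion can be checked with a few lines of code. This is a minimal sketch: the function names and the example CPU figures are hypothetical and do not describe any specific processor model.

```python
def l3_per_core_mb(l3_cache_mb: float, physical_cores: int) -> float:
    """Return the L3 cache size per physical core, in MB."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return l3_cache_mb / physical_cores


def meets_l3_guideline(l3_cache_mb: float, physical_cores: int,
                       threshold_mb: float = 2.5) -> bool:
    """Check l3_cache_size/#_of_physical_CPU_cores >= 2.5 MB."""
    return l3_per_core_mb(l3_cache_mb, physical_cores) >= threshold_mb


if __name__ == "__main__":
    # Hypothetical CPUs, for illustration only.
    print(meets_l3_guideline(30, 12))  # 2.5 MB per core
    print(meets_l3_guideline(16, 8))   # 2.0 MB per core
```

Remember to pass the number of physical cores, not logical (hyper-threaded) ones.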

Important notice (valid for SPE3) – due to internal SPE3 requirements, you must multiply the required number of licenses for the following technologies as follows:

  • LID_license_no=<no_of_planned_instances>*2
  • SID_license_no=<no_of_planned_instances>*3
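
As a quick illustration of the multipliers above, the license count can be computed as follows. The helper name `required_licenses` is hypothetical. It covers only LID and SID, because the article gives no multipliers for other technologies.

```python
# Multipliers taken from the SPE3 notice above.
LICENSE_MULTIPLIERS = {"LID": 2, "SID": 3}


def required_licenses(technology: str, planned_instances: int) -> int:
    """Return the number of licenses to order for LID or SID under SPE3."""
    if planned_instances < 0:
        raise ValueError("planned_instances must not be negative")
    if technology not in LICENSE_MULTIPLIERS:
        # The article does not define multipliers for other technologies.
        raise ValueError(f"no multiplier defined for {technology!r}")
    return planned_instances * LICENSE_MULTIPLIERS[technology]


if __name__ == "__main__":
    print(required_licenses("LID", 4))  # 4 instances * 2
    print(required_licenses("SID", 4))  # 4 instances * 3
```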

If you need more instances for hot-load testing of Phonexia technologies, please contact your Sales Representative at the Phonexia Sales Department.
