Understanding SPE database

SPE database serves multiple purposes:

  • stores SPE internal data
  • stores various information about SPE entities created by SPE users
    • audio files metadata
    • speaker models and their voiceprints
    • speaker groups and their voiceprints
    • calibration sets
    • keyword lists
    • language packs
    • audio source profiles
  • stores cached processing results (optional, can be set in SPE configuration file)
  • stores SPE log data (optional and MySQL only, can be set in SPE configuration file)

To cache or not to cache?

Well, that’s a question… 😉 Whether the built-in results caching is beneficial depends on the particular use case AND on the design of your application.
In general, the built-in results caching can be useful when creating a simple, lightweight application. When building a complex voice-processing system with multiple SPE processing units, load balancing, etc., it is generally better to disable the built-in results caching and create your own caching layer, tailored to your particular system architecture and/or processing workflow.
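
As an illustration of what such a custom caching layer could look like, here is a minimal in-memory sketch. It is not part of SPE: the class name, the method names and the key layout (audio content hash, technology, technology model, processing parameters) are assumptions chosen for this example, and a real system would likely use a shared store instead of a Python dictionary.

```python
import hashlib
import json
from typing import Callable, Dict, Tuple

# (audio content hash, technology, technology model, serialized parameters)
CacheKey = Tuple[str, str, str, str]


class ResultCache:
    """Hypothetical application-side cache for processing results."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, dict] = {}

    @staticmethod
    def audio_hash(audio: bytes) -> str:
        # Hash the audio content so the key does not depend on file names.
        return hashlib.sha256(audio).hexdigest()

    def make_key(self, audio: bytes, technology: str, model: str, params: dict) -> CacheKey:
        # sort_keys makes the key independent of parameter ordering.
        return (self.audio_hash(audio), technology, model, json.dumps(params, sort_keys=True))

    def get_or_compute(self, audio: bytes, technology: str, model: str,
                       params: dict, compute: Callable[[], dict]) -> dict:
        """Return the cached result, calling compute() only on a cache miss."""
        key = self.make_key(audio, technology, model, params)
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]

    def invalidate_audio(self, audio: bytes) -> int:
        """Drop every result for the given audio; return how many were removed."""
        digest = self.audio_hash(audio)
        stale = [key for key in self._store if key[0] == digest]
        for key in stale:
            del self._store[key]
        return len(stale)
```

In this sketch, `compute` stands for whatever function your application uses to request processing from SPE, and `invalidate_audio` would be called when the application deletes the audio file.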

Cached data persistency

Cached processing results are kept in the database as long as the audio file exists. When the audio file is deleted from SPE storage, all related information, metadata and processing results are deleted from the database. Stream processing data is not cached at all.

If data privacy and security are a concern, disabling the built-in results caching ensures that processing results are returned only via the REST API response and are not kept in the database at all.

Supported databases

SPE supports the SQLite and MySQL 5.x database engines.
The database engine is configured in the phxspe.properties SPE configuration file – see the Database section of the SPE configuration file article for more details.

SQLite

SQLite is the out-of-the-box SPE default database type.
By its nature, SQLite is intended mainly as lightweight storage for configuration data. It can also handle results caching, unless we are talking about real mass-processing.
When using results caching AND processing hundreds of thousands or millions of audio files per day, SQLite’s locking mechanism (a simple global database lock) can become a performance bottleneck, and the higher-performance MySQL database is the way to go.
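
The effect of the global lock can be reproduced with a small, self-contained sketch using Python's standard sqlite3 module. It does not touch an SPE database; the table name demo_result and the function name are made up for this illustration. While one connection holds the write lock, a second writer on the same database file is rejected.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path: str) -> bool:
    """Return True if a second connection cannot start a write transaction
    while the first connection holds the database-wide write lock."""
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    writer_a = sqlite3.connect(db_path, isolation_level=None)
    # timeout=0: fail immediately instead of waiting for the lock.
    writer_b = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    try:
        writer_a.execute(
            "CREATE TABLE IF NOT EXISTS demo_result (id INTEGER PRIMARY KEY, payload TEXT)"
        )
        writer_a.execute("BEGIN IMMEDIATE")  # take the write lock for the whole file
        writer_a.execute("INSERT INTO demo_result (payload) VALUES ('a')")
        try:
            writer_b.execute("BEGIN IMMEDIATE")  # second writer asks for the same lock
        except sqlite3.OperationalError:
            return True  # "database is locked"
        writer_b.execute("ROLLBACK")
        return False
    finally:
        if writer_a.in_transaction:
            writer_a.execute("ROLLBACK")
        writer_a.close()
        writer_b.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_is_blocked(os.path.join(tmp, "demo.sqlite")))
```

In a real deployment the second writer would normally wait and retry rather than fail, which is exactly where the waiting time accumulates under heavy load.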

When SPE is configured to use an SQLite database, the database is created and initialized automatically by running phxadmin or phxspe.
The SQLite database is typically created during first-time SPE setup, when configuring technologies using phxadmin – it’s created silently behind the scenes, using values from the phxspe.properties configuration file (location, file name) [1] and the default SPE configuration (users, roles, etc.).

SQLite database updates are also handled automatically by SPE. From time to time, as we add new features or improve existing functionality, the internal database structure may change in newer SPE versions. When using SQLite, if a new SPE version detects that the database needs an update, the update is performed automatically behind the scenes.

[1] If Speech Engine is used together with Phonexia Browser in the so-called “embedded” mode (see details about “embedded SPE” mode in the Browser manual), Phonexia Browser creates its own separate SPE configuration file, and the SQLite database file is located in the SPE home directory and named phxserver.sqlite.
This might be important in certain scenarios, e.g. when registering a LID language pack using phxadmin – you need to point phxadmin to the appropriate SPE configuration file so that the changes are made to the correct database.

MySQL

The MySQL database is a high-performance alternative to SQLite.
As opposed to SQLite, MySQL uses fine-grained locking mechanisms, resulting in higher performance in environments with high concurrency – e.g. in mass-processing deployments with multiple SPE processing units and results caching in a central database.

When SPE is configured to use a MySQL database, the database must first be created and initialized manually using the SQL scripts provided in the SPE distribution package.
Similarly, when updating SPE to a newer version, any required MySQL database updates must be applied manually, using the SQL scripts, as part of the manual SPE update process.
See more details in the Understanding SPE database scripts article.

Database size

SPE does not vacuum/optimize/shrink the database. However, the database space freed by deleted data is reused by newly added data.
Therefore it is normal that the database size grows over time to a certain extent. Assuming that a) the daily input load is more or less the same, and b) processed/unneeded audio files get removed from SPE storage, the database grows to a certain size and then stays at that size, as the number/size of new DB records comes into balance with the number/size of deleted DB records.

In any case, if the database becomes oversized, e.g. due to one-off processing of an unusual amount of audio, it can still be manually vacuumed/optimized/shrunk using the appropriate commands of the database engine.
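
The behavior described above can be observed on a throwaway SQLite file. The following sketch uses Python's standard sqlite3 module and a made-up table named demo_result (it is not an SPE table): deleted rows leave free pages behind, re-inserting data reuses them without growing the file, and VACUUM returns the free pages to the file system. For MySQL, the corresponding statement is OPTIMIZE TABLE.

```python
import os
import sqlite3
import tempfile


def page_stats(conn: sqlite3.Connection) -> tuple:
    """Return (total pages in the file, pages on the free list)."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return page_count, free_pages


def demo_reuse_and_vacuum(db_path: str) -> dict:
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA auto_vacuum = NONE")  # the usual SQLite default, set explicitly
    conn.execute("CREATE TABLE demo_result (id INTEGER PRIMARY KEY, payload BLOB)")
    rows = [(b"x" * 2000,) for _ in range(500)]
    insert = "INSERT INTO demo_result (payload) VALUES (?)"
    stats = {}

    conn.executemany(insert, rows)
    conn.commit()
    stats["after_insert"] = page_stats(conn)

    conn.execute("DELETE FROM demo_result")  # frees pages, file keeps its size
    conn.commit()
    stats["after_delete"] = page_stats(conn)

    conn.executemany(insert, rows)  # new data goes into the freed pages
    conn.commit()
    stats["after_reinsert"] = page_stats(conn)

    conn.execute("DELETE FROM demo_result")
    conn.commit()
    conn.execute("VACUUM")  # manual shrink; must run outside a transaction
    stats["after_vacuum"] = page_stats(conn)

    conn.close()
    return stats


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for step, (pages, free) in demo_reuse_and_vacuum(os.path.join(tmp, "demo.sqlite")).items():
            print(f"{step}: pages={pages}, free={free}")
```

Before vacuuming a real SPE database, it is sensible to stop SPE and back up the database file first; that precaution is general advice, not something stated in the SPE documentation.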

Database structure and content

The SPE database consists of tables and views with the rest_ prefix (this comes from SPE’s predecessor, named Phonexia REST Server).
Based on the type of data they contain, these can be divided into the following groups:

  • SPE internal data
    • information about files and directories in SPE storage
    • internal data: resource types, resource locks, users, user roles, user sessions, technology models
  • user-created entities data
    • SID speaker models and their voiceprints, speaker groups, calibration sets, audio source profiles
    • LID language models
    • KWS keyword lists
  • cached processing results
    • if caching is enabled, processing results for each technology
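
To see which of these objects exist in a particular SQLite database file, the tables and views can be listed and sorted into the three groups above. The sketch below is an assumption-laden illustration: the grouping is derived only from the table-name prefixes listed in this article, the function names are made up, and the demo runs against an in-memory stand-in database with placeholder columns rather than a real SPE database.

```python
import sqlite3

# Name prefixes of the user-created entity tables listed in this article.
USER_ENTITY_PREFIXES = (
    "rest_model_", "rest_group_", "rest_voiceprint", "rest_calibset_", "rest_profile_",
)


def classify(name: str) -> str:
    """Map a rest_ table or view name to one of the three groups."""
    if name.startswith("rest_result_"):
        return "cached processing results"
    if name.startswith(USER_ENTITY_PREFIXES):
        return "user-created entities data"
    return "SPE internal data"


def list_rest_objects(conn: sqlite3.Connection) -> dict:
    """Return {group: [names]} for all tables and views with the rest_ prefix."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND substr(name, 1, 5) = 'rest_' "
        "ORDER BY name"
    ).fetchall()
    groups = {}
    for (name,) in rows:
        groups.setdefault(classify(name), []).append(name)
    return groups


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in; the single 'id' column is a placeholder."""
    conn = sqlite3.connect(":memory:")
    for table in ("rest_user", "rest_technology_model", "rest_model_sid",
                  "rest_voiceprint", "rest_result_stt", "unrelated_table"):
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
    return conn


if __name__ == "__main__":
    for group, names in list_rest_objects(make_demo_db()).items():
        print(group, names)
```

To inspect an actual database file without risking changes, it can be opened read-only with `sqlite3.connect("file:<path>?mode=ro", uri=True)`, where `<path>` is the location configured in phxspe.properties.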

SPE internal data

Tables containing SPE internal data:

rest_directory_type: list of internal directory types
rest_file_shadow: list of information about files registered in SPE – path, creation and modification timestamps, owner (SPE user), directory
rest_log: SPE log data, see above
rest_resource_type: list of internal resource types – file, SID speaker model, SID speaker group, SID calibration set, SID audio source profile, KWS keyword list, LID language pack
rest_resource_lock: list of resources locked during processing
rest_role: list of pre-defined SPE user roles
rest_user: list of SPE users and their settings and status – login, password, active/inactive, max. pending operations, current pending operations
rest_user_role: associations between users and roles
rest_user_session: list of active user API sessions
rest_technology_model: list of technology model names

User-created SPE entities data

Tables containing data about entities created by SPE users:

rest_model_sid: list of SID speaker models – name, owner (SPE user), modification timestamp
rest_model_sid_sources: list of files used as sources for SID speaker models creation
rest_model_sid_metafiles: list of files used as SID speaker models metafiles
rest_group_sid: list of SID speaker groups – name, owner (SPE user)
rest_group_sid_models: associations between SID speaker groups and speaker models
rest_voiceprint: SID voiceprints – voiceprint data, technology model used to create the voiceprint, speaker model to which the voiceprint belongs (speaker model voiceprints), calibration set to which the voiceprint belongs (FAR calibration set voiceprints)
rest_model_sid_calib_voiceprint: SID speaker model voiceprints calibrated to FAR – voiceprint data, speaker model, technology model used to create the voiceprint, max. FAR, calibration set used to calibrate the voiceprint
rest_calibset_sid: list of SID FAR calibration sets – name and modification timestamp, owner (SPE user)
rest_calibset_sid_sources: list of files used as sources for SID FAR calibration sets creation
rest_calibset_sid_metafiles: list of files used as SID FAR calibration sets metafiles
rest_calibset_sid_total_chunks: number of chunks in SID FAR calibration sets
rest_profile_sid4: list of SID4 Audio Source Profiles – name, owner (SPE user), technology model used to create the profile, file with the profile content, hash
rest_profile_sid4_metafiles: list of files used as SID4 Audio Source Profiles metafiles
rest_model_lid: list of LID language packs – name, owner (SPE user), technology model to which the language pack belongs (i.e. technology model used to create source languageprints/language models)
rest_model_lid_metafiles: list of LID language packs metafiles
rest_model_kws: KWS keyword lists – keyword list JSON data, keyword list name, owner (SPE user), technology model to which the keyword list belongs

Processing results data

Tables containing cached processing results (if results caching is enabled):

rest_result_age: AGE processing results – file, used technology model, results JSON data
rest_result_diar: DIAR processing results – file, used technology model, used processing parameters, results JSON data
rest_result_gid: GID processing results – file, used technology model, results JSON data
rest_result_kws: KWS processing results – file, used technology model, used keyword list, results JSON data
rest_result_lid: LID processing results – file, used technology model, used language pack, results JSON data
rest_result_phnrec: PHNREC processing results – file, used technology model, results JSON data
rest_result_sid: SID processing results – file, used technology model, used speaker model, used FAR calibration set, max. FAR, results JSON data
rest_result_sid4: SID4 processing results – file, used technology model, used speaker model, used file- and speaker model Audio Source Profile, results JSON data
rest_result_sqe: SQE processing results – file, used technology model, results JSON data
rest_result_stt: STT processing results – file, used technology model, results JSON data
rest_result_tae: TAE processing results – file, used technology model, results JSON data
rest_result_vad: VAD processing results – file, used technology model, results JSON data

SPE logging to database

Storing SPE logs in the database is available only for MySQL.
This is mainly for performance reasons: SQLite is not designed for high concurrency, so its locking mechanism would create a bottleneck, especially in setups where multiple SPE instances are configured to store their logging data in the same database.

Log data is stored in the rest_log table, which includes the following columns:

Source: identifier of SPE subsystem which created the log record
Name: identifier of source SPE which created the log record; can be set by server.identifier or server.logging.database.identifier configuration settings (see SPE configuration file explained for details)
ProcessId: numeric PID of the process which created the log record
Thread: identifier of thread which created the log record
ThreadId: numeric ID of thread which created the log record
Priority: priority of the operation which created the log record
Text: raw log text as it would be written into log file or console
DateTime: log record creation timestamp
Example of SPE log in MySQL database
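
When several SPE instances log into the same database, the Name and DateTime columns are the natural filters. The sketch below only builds a parameterized SELECT statement from the column names listed above; it does not connect to anything. The function name is made up, the exact column spelling and data types should be checked against your actual rest_log table, and the %s placeholders assume a Python MySQL driver that uses the "format" parameter style.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Column names as listed in this article; verify them against the real table.
LOG_COLUMNS = ("Source", "Name", "ProcessId", "Thread",
               "ThreadId", "Priority", "Text", "DateTime")


def build_log_query(instance_name: Optional[str] = None,
                    since: Optional[datetime] = None,
                    limit: int = 100) -> Tuple[str, List[object]]:
    """Return (sql, params) selecting the newest log records first."""
    columns = ", ".join(f"`{column}`" for column in LOG_COLUMNS)
    sql = f"SELECT {columns} FROM `rest_log`"
    conditions: List[str] = []
    params: List[object] = []
    if instance_name is not None:
        conditions.append("`Name` = %s")  # one particular SPE instance
        params.append(instance_name)
    if since is not None:
        conditions.append("`DateTime` >= %s")  # only records from this time on
        params.append(since)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    sql += " ORDER BY `DateTime` DESC LIMIT %s"
    params.append(int(limit))
    return sql, params


if __name__ == "__main__":
    # "spe-node-1" is a hypothetical instance identifier.
    query, arguments = build_log_query("spe-node-1", datetime(2024, 1, 1), limit=20)
    print(query)
    print(arguments)
```

The returned pair can be passed to a DB-API cursor as `cursor.execute(sql, params)`, which keeps the filter values out of the SQL text.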
