What is a user configuration file and how to use it

Advanced users with appropriate knowledge (gained, for example, by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to suit the nature of their audio data.
Modifying the original BSAPI configuration files directly can be dangerous: inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore a working state.
User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files.

WARNING: Inappropriate configuration changes may cause serious issues!
Make sure you really know what you are doing.

A user configuration file is a plain text file with the same name as the main configuration file, plus the additional extension .usr. For example:

Main configuration file name	User configuration file name
`stt_cs_cz_5_online.bs`	`stt_cs_cz_5_online.bs.usr`
`kws_nl_nl_5.bs`	`kws_nl_nl_5.bs.usr`
`phnrec_pashto.bs`	`phnrec_pashto.bs.usr`
`vpextract4_xl4.bs`	`vpextract4_xl4.bs.usr`
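
The naming rule in the table can be expressed as a small helper. This is only an illustrative sketch: the function name user_config_path is hypothetical and not part of BSAPI or Speech Engine; it simply appends .usr to the full main configuration file name.

```python
from pathlib import Path


def user_config_path(main_config):
    """Return the user configuration path for a main configuration file.

    Rule from the table: keep the full file name (including ".bs")
    and append the extra extension ".usr".
    """
    main = Path(main_config)
    return main.with_name(main.name + ".usr")


# Print the mapping for two of the file names listed in the table.
for name in ("stt_cs_cz_5_online.bs", "vpextract4_xl4.bs"):
    print(name, "->", user_config_path(name).name)
```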

During technology initialization (e.g. during Speech Engine startup), the initialization routine checks whether such a user configuration file exists. If it is found, it is loaded automatically after the main configuration file, and the settings from the user configuration file are applied over the settings from the main configuration file.
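
The "applied over" behavior can be pictured with the following sketch. It is not the actual BSAPI loader: it assumes a simplified format of [section] headers followed by key=value lines, and the parameter some_other_parameter is made up for illustration. The point is only that values from the user file replace matching values from the main file, while everything else stays unchanged.

```python
def parse_sections(text):
    """Parse '[section]' headers and 'key=value' lines into nested dicts.

    Assumption: a simplified format used only for this illustration.
    """
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections


def apply_overrides(main, user):
    """Return a copy of `main` with the values from `user` applied on top."""
    merged = {name: dict(params) for name, params in main.items()}
    for name, params in user.items():
        merged.setdefault(name, {}).update(params)
    return merged


# Stand-in for a main configuration file (some_other_parameter is hypothetical).
MAIN_TEXT = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=750
some_other_parameter=1
"""

# Stand-in for the matching .usr file: it lists only the value to override.
USER_TEXT = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=1500
"""

merged = apply_overrides(parse_sections(MAIN_TEXT), parse_sections(USER_TEXT))
print(merged)
```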

Usage example:

When using Czech STT on realtime streams, the results show that the system outputs end of segment too often: longer pauses between words are misidentified as the end of a sentence, while the speaker is in fact still talking. The goal is therefore to fine-tune the system to accept a longer delay between words without ending a sentence.

Following the article "How to configure STT realtime stream word detection parameters", we create a text file stt_cs_cz_5_online.bs.usr alongside the original stt_cs_cz_5_online.bs configuration file in the <SPE directory>/bsapi/stt/settings directory and put the following lines in it (changing the forward extension parameter from the default 750 ms to 1500 ms):

[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
forward_extensions_length_ms=1500
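
If the file is created by a script rather than by hand, a minimal sketch could look like the one below. The helper write_user_config is hypothetical; it only writes a text file next to an existing main configuration file and refuses to run if that file is missing. The demonstration uses a temporary directory in place of the real <SPE directory>/bsapi/stt/settings directory.

```python
import tempfile
from pathlib import Path

# The two override lines from the example above.
OVERRIDE = (
    "[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]\n"
    "forward_extensions_length_ms=1500\n"
)


def write_user_config(settings_dir, main_name, content):
    """Write `content` to '<main_name>.usr' next to the main configuration file.

    The main configuration file itself is never modified.
    """
    main_path = Path(settings_dir) / main_name
    if not main_path.is_file():
        raise FileNotFoundError(f"main configuration file not found: {main_path}")
    user_path = main_path.with_name(main_path.name + ".usr")
    user_path.write_text(content, encoding="utf-8")
    return user_path


# Demonstration in a temporary directory standing in for the settings directory.
with tempfile.TemporaryDirectory() as settings_dir:
    # Empty placeholder for the main configuration file.
    (Path(settings_dir) / "stt_cs_cz_5_online.bs").write_text("", encoding="utf-8")
    created = write_user_config(settings_dir, "stt_cs_cz_5_online.bs", OVERRIDE)
    print(created.name)
    print(created.read_text(encoding="utf-8"))
```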

After restarting SPE, and optionally checking in the SPE log that the user configuration file stt_cs_cz_5_online.bs.usr was really loaded (this information is available only at the 'trace' logging level), the STT results should show end of segment less frequently.
