The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model.
The language model is an important part of Phonexia Speech To Text. Put simply, it can be thought of as a large dictionary with accompanying statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio signals into the corresponding text.
Because spoken language is so diverse, the default generic language model may not give certain words the weight they deserve in a particular situation. Language model customization is a way to tell the system about these words.
The basic principle of the LMC tool is that it takes an existing STT model as a source and creates a new STT model, with your customizations included, as a target.
To see the results of the customizations, you need to use the new STT model for transcription.
Currently supported language model customizations are:
- adding new words and/or pronunciations
This is intended for adding client-, domain- or product-specific words such as company names, product names, component names, etc.
Note: LMC works only with 5th or newer generation STT models.
LMC is provided as a command line tool and is available from Phonexia either as part of the command line Speech To Text package or as a separate download.
Customizing STT language model
1) Creating word list
The word list is a UTF-8 encoded text file containing the words to be added to the STT language model, one word per line.
Note: LMC v3.30.0 (March 2020) or older requires the text file to be saved without a Byte Order Mark (BOM).
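Some editors silently prepend a BOM when saving UTF-8 files. The following is a minimal sketch of how such a file could be cleaned up before it is passed to an older LMC version. The function name strip_utf8_bom is only an illustration and is not part of LMC.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM from the file, if present.

    Returns True when a BOM was found and removed, False otherwise.
    """
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, the file is left untouched
    # Rewrite the file without the three BOM bytes (EF BB BF).
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling strip_utf8_bom(Path("words.txt")) on a hypothetical words.txt rewrites the file in place only when a BOM is actually present.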
If a pronunciation is not explicitly specified for a word, a default one is generated internally. To add multiple pronunciation variants for the same word, enter multiple word–pronunciation pairs, each on a separate line.
In the example word list below:
- the words iPhone and contract don’t have any specific pronunciations defined
- the word schneider has a specific pronunciation defined
- the abbreviation MIT has two alternative pronunciations defined

iPhone
contract
schneider sh n ay d er
MIT eh m ay t iy
MIT m ih t
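To make the file layout concrete, here is a small sketch that reads a word list and groups the pronunciation variants per word. It assumes that the first whitespace-separated token on a line is the word and that everything after it is the pronunciation; that reading is inferred from the example entries and is not a statement about how LMC itself parses the file. The names parse_word_list and WORD_LIST are illustrative.

```python
def parse_word_list(text: str) -> dict[str, list[str]]:
    """Map each word to the pronunciations explicitly listed for it.

    A word with no explicit pronunciation maps to an empty list.
    Assumption: word and pronunciation are separated by whitespace.
    """
    entries: dict[str, list[str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phonemes = fields[0], fields[1:]
        variants = entries.setdefault(word, [])
        if phonemes:
            variants.append(" ".join(phonemes))
    return entries


# The example entries from this section, one entry per line.
WORD_LIST = """\
iPhone
contract
schneider sh n ay d er
MIT eh m ay t iy
MIT m ih t
"""
```

With this input, parse_word_list(WORD_LIST) reports no explicit pronunciation for iPhone and contract, one for schneider, and two for MIT.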
2) Creating customized STT model using LMC tool
The LMC tool takes an existing model and creates a copy of it with the customizations added. The customized copy is marked by a name suffix to differentiate it from the source.
The word list file that was used is also copied, as a backup, to the target directory where the customized copy is created.
The customized model can NOT be used as a source for a subsequent customization (i.e. cascading customizations are not possible).
To accumulate customizations, create the customized model from the original source model using a cumulative word list that contains both the earlier and the new entries. This is where the word list backup in the target model directory comes in handy.
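A cumulative word list can be assembled by hand, but the idea is simple enough to sketch in code: take the backed-up word list from the previous customization, append the new entries, and drop exact duplicates. The helper below, merge_word_lists, is a hypothetical illustration and not an LMC feature. It treats two lines as duplicates only when they are identical after trimming surrounding whitespace.

```python
def merge_word_lists(previous: str, additions: str) -> str:
    """Combine two word lists, keeping the first occurrence of each line.

    Blank lines are dropped and the original order is preserved, so
    different pronunciation variants of the same word all survive.
    """
    seen: set[str] = set()
    merged: list[str] = []
    for line in (previous + "\n" + additions).splitlines():
        entry = line.strip()
        if entry and entry not in seen:
            seen.add(entry)
            merged.append(entry)
    # One entry per line, each terminated by a newline.
    return "".join(entry + "\n" for entry in merged)
```

The merged text can then be saved as a new UTF-8 word list and passed to LMC together with the original, uncustomized model.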
Basic LMC usage is
lmc -config <configuration_file> -add-words <wordlist_file> -model-suffix <model_name_suffix> -out-model-dir <directory_to_place_customized_output>
- <configuration_file> is the *.bs config file belonging to the existing model to be customized
- <wordlist_file> is the word list file created in the previous step
- <model_name_suffix> is text that will be appended as a suffix to the name of the modified model. For example, the default Polish model name is pl_pl_5, so specifying custom as the suffix produces a customized model whose name is pl_pl_5 with that suffix appended
- <directory_to_place_customized_output> is the output directory where the resulting customized model will be placed (together with a copy of the word list file, as a backup)
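When LMC is run from a script, it can help to assemble the command as an argument list rather than a single string, so that paths containing spaces survive intact. The sketch below uses only the four options described above. The function name build_lmc_command and all file and directory names in the usage note are hypothetical placeholders.

```python
import shlex


def build_lmc_command(config_file: str, wordlist_file: str,
                      model_suffix: str, out_model_dir: str) -> list[str]:
    """Return the LMC invocation as an argument list.

    Only the options documented in this section are used. The "lmc"
    executable is assumed to be on PATH.
    """
    return [
        "lmc",
        "-config", config_file,
        "-add-words", wordlist_file,
        "-model-suffix", model_suffix,
        "-out-model-dir", out_model_dir,
    ]


def format_command(command: list[str]) -> str:
    """Render the argument list as a shell-quoted string for logging."""
    return shlex.join(command)
```

For example, build_lmc_command("stt_pl_pl_5.bs", "words.txt", "custom", "custom_models") yields a list that can be passed to subprocess.run(command, check=True); the config file name here is assumed for illustration only.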
Using customized STT model in Speech Engine STT
To use a customized STT model in Speech Engine STT, you need to:
- place the customized model in the correct location, so that Speech Engine can find it
- register and enable the customized model in Speech Engine
1) Placing the customized STT model in correct location
To be recognized by Speech Engine, the customized STT model must be placed in the correct location. The location is
<SPE_directory>/bsapi/stt
The settings directories of the customized STT model should go here.
Either copy the customized STT model there manually, or let LMC place its output there directly:
lmc -config ... ... ... -out-model-dir <SPE_directory>/bsapi/stt
2) Registering the customized STT model in Speech Engine
First make sure that Speech Engine is not running.
Then, using the configure-tech parameter, select the STT technology and enable the customized model, which should be listed there.
Finally, launch Speech Engine.
3) Checking the customization result
You can then check that the customized STT model is listed in the GET /technologies list.
To use the customized STT model, put its name in the model parameter.
Using customized STT model in command line STT
To use a customized STT model in command line STT, simply specify the new configuration file belonging to the customized STT model in the -config parameter.
For example, assuming that the original pl_pl_5 model was customized with updated as the model suffix, the corresponding STT command line to use the customized model would look similar to this:
stt -config settings\stt_pl_pl_5_updated.bs -in-file <input_file> -out-file <output_file> ...