LID: Terminology and adaptation

Adaptation using command line

It is also possible to perform the LID adaptation tasks using the phxcmd command-line tool.

NOTE: Versions 3.37 and older do not contain phxcmd; they provide multiple separate tools instead. If you are using such a version, simply omit phxcmd from the commands, e.g.:

use lpextract instead of phxcmd lpextract for extracting languageprints from audio files
use lppack instead of phxcmd lppack for creating languageprint archives
use lid instead of phxcmd lid for language pack training
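
Scripts that must work with both newer and older versions can hide this difference behind a small helper. The sketch below is a hypothetical wrapper: the function name phx_tool and the idea of probing the installation directory for phxcmd are assumptions made for illustration, not part of the Phonexia tooling.

```shell
# Hypothetical helper: print the command prefix for a tool such as
# lpextract, lppack or lid.
# $1 = directory assumed to contain the Phonexia tools, $2 = tool name
phx_tool() {
  install_dir=$1
  tool=$2
  if [ -x "$install_dir/phxcmd" ]; then
    # Newer versions: the tools are subcommands of phxcmd
    printf '%s/phxcmd %s\n' "$install_dir" "$tool"
  else
    # Version 3.37 or older: each tool is a separate executable
    printf '%s/%s\n' "$install_dir" "$tool"
  fi
}
```

Called as phx_tool . lpextract from the tools directory, it prints ./phxcmd lpextract when phxcmd is present and ./lpextract otherwise. The printed prefix contains a space, so it only works unquoted in a script if the installation path itself has no spaces.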

Creating new language

STEP 1: Extract languageprints from recordings using the lpextract command.

The example below shows the commands for extracting languageprints from audio recordings in 2 languages, with each language stored in its own directory:

recordings in first language are located in /path/to/my/audio/MyLanguage directory
recordings in second language are located in /other/path/to/audio/MyOtherLanguage directory
created languageprints get stored to /path/to/my/languageprints, each language to its own separate subdirectory
we use the “L4” model, hence the _l4 configuration file suffix

./phxcmd lpextract -v -c /path/to/lid/settings/lpextract_l4.bs -d /path/to/my/audio/MyLanguage -e wav -D /path/to/my/languageprints/MyLanguage
./phxcmd lpextract -v -c /path/to/lid/settings/lpextract_l4.bs -d /other/path/to/audio/MyOtherLanguage -e wav -D /path/to/my/languageprints/MyOtherLanguage

etc. for more languages...

where:

-v parameter tells the tool to provide verbose console output
-c parameter specifies path to .bs BSAPI configuration file for lpextract (use suffix according to LID technological model you are using – “l4”, “l3”, “xl3”)
-d parameter specifies path to input directory with source audio files
-D parameter specifies path to output directory where you want to have the extracted languageprints stored
NOTE: the directory name will be used as the language name in next step
-e parameter specifies file extensions to be included in languageprint extraction (if you have raw files instead of wav, the extension would be e.g. “raw”)
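
With many languages, typing one command per language gets tedious. The sketch below prints one lpextract command per language directory, using only the parameters described above. It assumes, unlike the example above, that all language directories share a single parent directory; the function name print_lpextract_cmds is hypothetical.

```shell
# Sketch: print one lpextract command per language subdirectory.
# Assumption: every language has its own subdirectory under $1.
# $1 = audio parent directory, $2 = languageprint output root,
# $3 = path to the lpextract .bs configuration file
print_lpextract_cmds() {
  audio_root=${1%/}
  lp_root=${2%/}
  config=$3
  for dir in "$audio_root"/*/; do
    [ -d "$dir" ] || continue        # skip when the glob matches nothing
    lang=$(basename "$dir")          # the directory name becomes the language name
    printf './phxcmd lpextract -v -c %s -d %s -e wav -D %s\n' \
      "$config" "$audio_root/$lang" "$lp_root/$lang"
  done
}
```

The function only prints the commands, so they can be reviewed before being run. Paths are printed unquoted, so directories with spaces in their names would need manual quoting.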

NOTE:
If you want to enhance an existing language using your own audio files, as opposed to creating a new language from scratch, copy the existing pre-trained .lpa file into the directory with your languageprints before continuing to the next step.
Make sure to use the .lpa file from the correct LID technological model (“L4”, “L3”, “XL3”)!

STEP 2: Pack the individual languageprints into languageprint archives using the lppack command.
You need to specify the path to the parent directory of the directories holding the languageprints extracted in the previous step. The subdirectory name(s) will be used as the languageprint archive name(s).

In the example below:

we use the input directory /path/to/my/languageprints from the previous step, which contains the subdirectories MyLanguage and MyOtherLanguage with the extracted languageprints
we use the output directory bsapi/lid/lprints/l4, which is the directory containing the pre-created languageprint archives supplied by Phonexia; you can store the archives anywhere else, just make sure to use the correct path in the listfile when creating the language pack (see further below)

The lppack command automatically names the created languageprint archives using names of the subdirectories, i.e. in our example it will be MyLanguage.lpa and MyOtherLanguage.lpa.

./phxcmd lppack -v -d /path/to/my/languageprints -D bsapi/lid/lprints/l4

where:

-v parameter tells the tool to provide verbose console output
-d parameter specifies path to input parent directory
-D parameter specifies path to output directory where the output languageprint archive(s) will be created
NOTE: the archive(s) will be named using names of subdirectories under the input directory
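
Because each archive is named after its subdirectory, the result of lppack can be checked mechanically. The sketch below lists every language subdirectory that has no matching .lpa file in the output directory; the function name missing_archives is hypothetical.

```shell
# Sketch: list languages whose archive is missing after running lppack.
# $1 = input parent directory (the -d value)
# $2 = output directory (the -D value)
missing_archives() {
  lp_root=${1%/}
  out_dir=${2%/}
  for dir in "$lp_root"/*/; do
    [ -d "$dir" ] || continue
    lang=$(basename "$dir")
    # lppack names each archive after its subdirectory: <name>.lpa
    [ -f "$out_dir/$lang.lpa" ] || printf '%s\n' "$lang"
  done
}
```

Running missing_archives /path/to/my/languageprints bsapi/lid/lprints/l4 prints nothing when MyLanguage.lpa and MyOtherLanguage.lpa both exist.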

Resulting languageprint archive(s) can be used for creating custom language packs, see below for details.

Creating language pack

STEP 1: Prepare a listfile with list of languageprint archives corresponding to the languages you want to have in the language pack – each line starts with language name, followed by a TAB or SPACE character, and a path to the .lpa file.

Make sure that

paths are valid: relative paths must be relative to the location of your listfile (or simply use absolute paths)
paths lead to the correct technological model directory of your choice (l4, l3, xl3, …)
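
The first of these checks can be automated before training. The sketch below reads a listfile (language name, TAB or SPACE, path to the .lpa file), resolves relative paths against the listfile's own directory, and reports every entry whose archive does not exist. The function name check_listfile and the output format are hypothetical; the sketch does not verify that the archives belong to the right technological model.

```shell
# Sketch: report listfile entries whose .lpa file does not exist.
# Relative paths are resolved against the directory of the listfile.
# Returns 0 if all archives exist, 1 otherwise.
check_listfile() {
  listfile=$1
  base=$(dirname "$listfile")
  result=0
  # "|| [ -n "$lang" ]" keeps a final line that lacks a trailing newline
  while read -r lang path || [ -n "$lang" ]; do
    [ -n "$lang" ] || continue       # skip empty lines
    case $path in
      /*) full=$path ;;              # absolute path: use as-is
      *)  full=$base/$path ;;        # relative path: relative to the listfile
    esac
    if [ ! -f "$full" ]; then
      printf 'missing: %s -> %s\n' "$lang" "$path"
      result=1
    fi
  done < "$listfile"
  return "$result"
}
```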

The example below assumes that the listfile will be saved to the {SPE} directory (hence the relative paths to bsapi/...) and also assumes the “L4” model. Adjust the paths to match your setup.

cs-CZ           bsapi/lid/lprints/l4/cs-CZ.lpa
pl-PL           bsapi/lid/lprints/l4/pl-PL.lpa
en-GB           bsapi/lid/lprints/l4/en-GB.lpa
ru-RU           bsapi/lid/lprints/l4/ru-RU.lpa
MyLanguage      bsapi/lid/lprints/l4/MyLanguage.lpa
MyOtherLanguage bsapi/lid/lprints/l4/MyOtherLanguage.lpa

Save the listfile as e.g. MyLanguagePack.txt.
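
If the language pack should contain most of the archives in one directory, the listfile can be generated instead of typed by hand. The sketch below prints one line per .lpa file, using the archive name without its extension as the language name and a TAB as the separator. The function name make_listfile is hypothetical.

```shell
# Sketch: print a listfile line (language name, TAB, path) for every
# .lpa archive found directly in the directory given as $1.
make_listfile() {
  lpa_dir=${1%/}
  for lpa in "$lpa_dir"/*.lpa; do
    [ -f "$lpa" ] || continue        # skip when the glob matches nothing
    lang=$(basename "$lpa" .lpa)     # cs-CZ.lpa -> cs-CZ
    printf '%s\t%s\n' "$lang" "$lpa"
  done
}
```

Run from the {SPE} directory, make_listfile bsapi/lid/lprints/l4 > MyLanguagePack.txt writes relative paths like those in the example above. The output includes every archive in the directory, so remove the lines for languages you do not want in the language pack.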

STEP 2: Train the language pack using the listfile and the lid command.
The example below assumes that the MyLanguagePack.txt listfile is located in the {SPE} directory (as per the step above) and uses the “L4” model. l4_MyLanguagePack is the chosen name of the output directory where the trained language pack will be stored:

./phxcmd lid -v -c /path/to/lid/settings/lid_l4.bs -l MyLanguagePack.txt -train -M bsapi/lid/models/l4_MyLanguagePack

where:

-v parameter tells the tool to provide verbose console output
-c parameter specifies path to .bs BSAPI configuration file for lid (use suffix according to LID technological model you are using – “l4”, “l3”, “xl3”)
-l parameter specifies path to input listfile created in previous step
-train parameter tells the tool to train new language pack
-M parameter specifies path to output directory where you want to have the language pack created
NOTE: it is strongly recommended to use a subdirectory of “models” directory, to simplify the language pack registration to SPE
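
The configuration file suffix and the output directory both depend on the chosen model, so it is easy to mix them up. The sketch below assembles the training command from a model suffix and prints it. The function name lid_train_cmd is hypothetical, and the <model>_<name> naming of the output directory merely follows the example above; it is a convention assumed here, not a requirement.

```shell
# Sketch: print the lid training command for a given model suffix.
# $1 = model suffix (l4, l3 or xl3)
# $2 = directory containing the lid_<suffix>.bs configuration files
# $3 = listfile
# $4 = language pack name (output directory becomes <model>_<name>)
lid_train_cmd() {
  model=$1
  settings_dir=${2%/}
  listfile=$3
  pack_name=$4
  case $model in
    l4|l3|xl3) ;;
    *) printf 'unsupported model suffix: %s\n' "$model" >&2; return 1 ;;
  esac
  printf './phxcmd lid -v -c %s/lid_%s.bs -l %s -train -M bsapi/lid/models/%s_%s\n' \
    "$settings_dir" "$model" "$listfile" "$model" "$pack_name"
}
```

For example, lid_train_cmd l4 /path/to/lid/settings MyLanguagePack.txt MyLanguagePack prints the same command as shown above.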

STEP 3: Register the language pack in SPE and verify that it works as expected.
See the “Using custom LID language pack in Speech Engine” chapter for details.

Using custom LID language pack in Speech Engine

To use a customized LID language pack in Speech Engine, it is necessary to:

ensure that the language pack is placed in the correct location, so that Speech Engine can find it
register and enable the language pack in SPE using phxadmin

1) Put the language pack in correct location

In order to be recognized by Speech Engine, the language pack needs to be in a correct location. The location is <SPE_directory>/bsapi/lid/models – if you have followed the above instructions correctly, your language pack should be already in the right place.
If the directory with your language pack is not there, copy it there.
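
The copy step can be scripted so that it is safe to repeat. The sketch below copies a language pack directory into <SPE_directory>/bsapi/lid/models unless a directory of the same name already exists there. The function name install_pack and its messages are hypothetical.

```shell
# Sketch: copy a language pack into the SPE models directory if needed.
# $1 = language pack directory, $2 = SPE installation directory
install_pack() {
  pack_dir=${1%/}
  spe_dir=${2%/}
  target=$spe_dir/bsapi/lid/models/$(basename "$pack_dir")
  if [ -d "$target" ]; then
    # Already present (for example when -M pointed here during training)
    printf 'already in place: %s\n' "$target"
  else
    mkdir -p "$spe_dir/bsapi/lid/models"
    cp -R "$pack_dir" "$target"
    printf 'copied to: %s\n' "$target"
  fi
}
```

The function never overwrites an existing directory, so a retrained pack with the same name has to be removed from the models directory first.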

2) Register the language pack in Speech Engine

First make sure that Speech Engine is not running.
Then run the phxadmin tool with the add-language-pack parameter pointing to the language pack directory and the config parameter pointing to the appropriate Speech Engine configuration file.
Example (on Windows, use / instead of -- as the parameter delimiter):

./phxadmin --add-language-pack="bsapi/lid/models/l4_MyLanguagePack" --config="settings/phxspe.properties"

where:

--add-language-pack parameter specifies path to your language pack directory
--config parameter tells the tool which SPE configuration file to use
Default Speech Engine configuration file is settings/phxspe.properties.
However, when using Phonexia Browser in “SPE on localhost” mode (also known as “Embedded SPE”), the configuration file is settings/phxspe.browser.properties.
(Make sure to use the right configuration file; otherwise you might register the language pack in a different configuration, and it won’t be visible where you expect it.)
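
Since picking the wrong configuration file is an easy mistake, the choice can be made explicit in a script. The sketch below prints the phxadmin command for either the default Speech Engine setup or the Phonexia Browser “SPE on localhost” setup, using the two configuration file names mentioned above. The function name phxadmin_cmd and the mode names default and browser are hypothetical.

```shell
# Sketch: print the phxadmin registration command for a given setup.
# $1 = language pack directory
# $2 = "default" (standalone SPE) or "browser" (SPE on localhost mode)
phxadmin_cmd() {
  pack=$1
  mode=${2:-default}
  case $mode in
    default) config=settings/phxspe.properties ;;
    browser) config=settings/phxspe.browser.properties ;;
    *) printf 'unknown mode: %s\n' "$mode" >&2; return 2 ;;
  esac
  printf './phxadmin --add-language-pack="%s" --config="%s"\n' "$pack" "$config"
}
```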

phxadmin then asks under which LID model and under which user the added language pack should be registered.
In the example below we are registering it under “L4” model (since we used “l4” source files to create it) and under “admin” user:

List of supported LIDC models:
1) L4
2) XL3
Choose model of LIDC [1]: 1
Login: admin
Language pack 'l4_MyLanguagePack' has been added to user 'admin'.

Then launch Speech Engine.
If everything was done successfully, you should see the new language pack

listed in response to GET /technologies/languageid/languagepacks REST query
- i.e. available for use in model=... parameter in GET /technologies/languageid REST queries
listed in Language models pane in Phonexia Browser
- i.e. available for selection for processing by Language Identification in Browser
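
The REST check can also be scripted. The sketch below only tests whether the language pack name occurs anywhere in the response body, so it makes no assumption about the response format. The function name pack_listed is hypothetical, and the base URL, the credentials and the use of HTTP basic authentication in the commented example are placeholders that depend on your SPE setup.

```shell
# Sketch: succeed if the language pack name ($1) occurs in the response
# body read from standard input. Plain substring match only.
pack_listed() {
  grep -q -F -- "$1"
}

# Example call (SPE_URL, SPE_USER and SPE_PASSWORD are placeholders):
#   curl -s -u "$SPE_USER:$SPE_PASSWORD" \
#     "$SPE_URL/technologies/languageid/languagepacks" \
#     | pack_listed l4_MyLanguagePack && echo "language pack is listed"
```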
