
Language Identification (LID)

Phonexia Language Identification (LID) will help you distinguish the spoken language or dialect. It will enable your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis.

Application areas

  • Preselecting multilingual sources and routing audio files to language-dependent technologies (transcribing, indexing, etc.)
  • Analyzing network traffic media (language statistics)
  • Routing particular calls (languages) to human operators (language experts)

Scoring and results

The LID language pack defines a set of recognizable languages.
When identifying the language in an audio recording (or a languageprint), LID does the following:

  1. creates a languageprint of the recording (if the input is an audio recording)
  2. compares that languageprint with each language in the language pack
    • and calculates the probability that the two languages are the same

The final scores are returned as logarithms of these individual probabilities, i.e. as values from the (-inf, 0] interval, for each language in the language pack.
(To convert a raw LID score to a percentage, use the formula e^score * 100.)
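
The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name score_to_percentage and the raw_scores dictionary are hypothetical and not part of any Phonexia API, and the sample values are copied from the Turkish example on this page.

```python
import math


def score_to_percentage(raw_score: float) -> float:
    """Convert a raw LID score (natural-log probability) to a percentage."""
    if raw_score > 0:
        # A log-probability lies in (-inf, 0]; anything above 0 is invalid input.
        raise ValueError("a raw LID score cannot be greater than 0")
    return math.exp(raw_score) * 100


# Hypothetical container for LID output: language name -> raw score.
raw_scores = {"Turkish": -0.258, "Uzbek": -2.436, "Azerbaijani": -3.027}

percentages = {lang: score_to_percentage(s) for lang, s in raw_scores.items()}

# The exponential is monotonic, so the best raw score is also the best percentage.
best_language = max(raw_scores, key=raw_scores.get)
```

Because the published raw scores are rounded to three decimals, the computed percentages may differ from the published ones in the second decimal place.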

LID adaptation (custom language packs)

The scoring principle described above implies that the score is distributed among all languages in a language pack.
Every language receives a non-zero share, so the scores get diluted as they are spread among many languages.
Additionally, if the language pack contains too many unequally trained languages (i.e. trained on very different amounts of source audio), the entire system can be affected and may generate low scores even for matching languages.

Therefore, it is a good idea to create a language pack containing only a limited number of languages, e.g. by excluding rarely encountered ones, or by keeping only the few languages actually expected in your use case.

This process of tailoring the language pack to particular needs is called language pack adaptation and is described in the LID adaptation article.

Example usages of custom language packs

  • A law enforcement agency monitoring a network of criminals who use only a particular set of languages can keep only the languages expected to appear in the traffic.
    This can reduce the number of scored languages to as few as 3 to 5.
  • A multilingual call center serving the European market can exclude languages that will certainly not appear in its traffic, such as African ones (Afan, Hausa, …) or Asian ones (Chinese, Japanese, …), while still keeping languages that are less likely but still possible.
    This can reduce the number of scored languages from 60+ (included in the default out-of-the-box language pack) to 20 or even fewer.

In both cases, limiting the number of languages in a language pack means the scores are distributed among fewer languages: the score values get higher, with a clearer distinction between languages and a clearer gap between the best-scoring language and the others.
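
The arithmetic behind this effect can be illustrated with a toy calculation. The sketch below simply drops the unwanted languages and rescales the remaining percentages to sum to 100. This is an assumption made purely for illustration: it is not how Phonexia language pack adaptation works (that procedure is described in the LID adaptation article), and the function name renormalize is hypothetical. The input values are a partial excerpt of the default-pack results from the Turkish example on this page.

```python
def renormalize(percentages: dict[str, float], keep: set[str]) -> dict[str, float]:
    """Keep only the selected languages and rescale their shares to sum to 100."""
    kept = {lang: p for lang, p in percentages.items() if lang in keep}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("none of the kept languages has a non-zero share")
    return {lang: 100 * p / total for lang, p in kept.items()}


# Partial excerpt of the default-pack percentages (not the full 60+ languages).
default_pack = {
    "Turkish": 77.270,
    "Uzbek": 8.753,
    "Azerbaijani": 4.845,
    "Albanian": 0.586,
    "Swedish": 0.459,
    "Hungarian": 0.310,
}

# Toy "limited pack": keep only four of the languages above.
limited = renormalize(default_pack, {"Turkish", "Albanian", "Swedish", "Hungarian"})
```

With the share of Uzbek and Azerbaijani removed, the Turkish share rises from about 77% to about 98% in this toy setup. A real limited pack with 20 languages spreads the score more widely, so the gain is smaller.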

Here is an example of identifying the language of a Turkish phone call.

Note the much sharper score when using a language pack with only relevant languages (77.3% vs 93.3%):

Default language pack (60+ languages) | Limited language pack (20 European languages)
Language     Raw score   Percentage   | Language          Raw score   Percentage
Turkish      -0.258      77.270       | Turkish           -0.069      93.326
Uzbek        -2.436      8.753        | Albanian          -4.347      1.294
Azerbaijani  -3.027      4.845        | Hungarian         -4.657      0.949
Dari         -4.432      1.190        | Ukrainian         -5.037      0.649
Albanian     -5.139      0.586        | Swedish           -5.088      0.617
Tibetan      -5.270      0.515        | French            -5.168      0.570
Georgian     -5.277      0.511        | English_British   -5.316      0.491
Swedish      -5.384      0.459        | Macedonian        -5.443      0.433
Farsi        -5.737      0.323        | Greek             -5.698      0.335
Hungarian    -5.777      0.310        | Serbian           -6.002      0.247
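
One way to quantify the gap between the best-scoring language and the runner-up is the difference of their raw scores, which (since the scores are natural logarithms) is the log of the ratio of their probabilities. The sketch below applies this to the top three rows of each column above. The helper name top_two_margin is hypothetical, and using the margin as a confidence measure is a suggestion here, not a documented Phonexia feature.

```python
import math


def top_two_margin(raw_scores: dict[str, float]) -> float:
    """Return the raw-score difference between the best and second-best language."""
    ranked = sorted(raw_scores.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("at least two languages are needed to compute a margin")
    return ranked[0] - ranked[1]


# Top three rows of each column in the Turkish example.
default_pack = {"Turkish": -0.258, "Uzbek": -2.436, "Azerbaijani": -3.027}
limited_pack = {"Turkish": -0.069, "Albanian": -4.347, "Hungarian": -4.657}

default_margin = top_two_margin(default_pack)
limited_margin = top_two_margin(limited_pack)

# exp(margin) is how many times more probable the winner is than the runner-up.
default_ratio = math.exp(default_margin)
limited_ratio = math.exp(limited_margin)
```

In this example Turkish is roughly 9 times more probable than the runner-up with the default pack, and roughly 72 times more probable with the limited pack.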