Search: Language models

34 results

Language Identification – Languages

Recognized languages Languages pre-trained in the default language pack are listed in the table below, each LID generation is a separate column (in the 4th generation we switched to using language tags instead of names): L4 L3, XL3 S2, L2 (deprecated sq-AL Albanian Albanian Albanian am-ET Amharic Amharic Amharic ar-EG Arabic (Egypt) Arabic ar-KW Arabic (Gulf, Kuwait) Arabic_Gulf …

STT: What is Preferred Phrases feature and how to use it

…a decoder. The decoder takes the information from the acoustic model, combines it with information from the language model recognition network (which describes the statistics of word grouping and sentences in a given language), and produces the transcription output. (See the Speech To Text article for more details about speech transcription principles.) When using preferred phrases, we build additional language model…

Understand SPE technologies configuration file

…of XL4 model <?xml version="1.0"?> <technology_subsystem_settings> <technologies> <item> <name>STT</name> <models> <item> <name>SK_SK_5</name> <n_instances>8</n_instances> <config_file /> </item> </models> </item> <item> <name>STT_STREAM</name> <models> <item> <name>CS_CZ_6</name> <n_instances>2</n_instances> <config_file /> </item> </models> </item> <item> <name>SID4E</name> <models> <item> <name>L4</name> <n_instances>2</n_instances> <config_file /> </item> <item> <name>XL4</name> <n_instances>3</n_instances> <config_file /> </item> </models> </item> <item> <name>SID4C</name> <models> <item> <name>L4</name> <n_instances>2</n_instances> <config_file /> </item> <item> <name>XL4</name> <n_instances>3</n_instances> <config_file />…
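As a rough illustration of how a configuration shaped like the excerpt above could be read, here is a minimal Python sketch. The `SAMPLE` string is a shortened, fully closed fragment that reuses only element names and values visible in the snippet; it is not the complete file. The helper `list_model_instances` is a hypothetical name introduced for this example and is not part of any Phonexia tooling.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed fragment modelled on the excerpt above (assumption:
# the real file contains more technologies and closes the same way).
SAMPLE = """<?xml version="1.0"?>
<technology_subsystem_settings>
  <technologies>
    <item>
      <name>STT</name>
      <models>
        <item>
          <name>SK_SK_5</name>
          <n_instances>8</n_instances>
          <config_file />
        </item>
      </models>
    </item>
    <item>
      <name>SID4E</name>
      <models>
        <item><name>L4</name><n_instances>2</n_instances><config_file /></item>
        <item><name>XL4</name><n_instances>3</n_instances><config_file /></item>
      </models>
    </item>
  </technologies>
</technology_subsystem_settings>
"""


def list_model_instances(xml_text):
    """Map (technology name, model name) to the configured n_instances value."""
    root = ET.fromstring(xml_text)
    instances = {}
    # Each technology is an <item> under <technologies>;
    # each model is an <item> under that technology's <models>.
    for tech in root.findall("./technologies/item"):
        tech_name = tech.findtext("name")
        for model in tech.findall("./models/item"):
            key = (tech_name, model.findtext("name"))
            instances[key] = int(model.findtext("n_instances"))
    return instances


if __name__ == "__main__":
    for (tech, model), count in list_model_instances(SAMPLE).items():
        print(f"{tech} / {model}: {count} instance(s)")
```

A listing like this can be handy for checking which technology and model pairs are configured, and with how many instances, before editing the file by hand.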

Speech To Text / Keyword Spotting supported languages

…Standard Turkish (Turkey) TR_TR_6 2022-01 8th gen. Standard Ukrainian (Ukraine) UK_UA_6 2023-04 8th gen. Standard Vietnamese (Vietnam) VI_VN_6 2021-10 8th gen. Standard Deprecated languages/models (not supported, after end-of-life) Older/other languages or models not listed in the above table are no longer supported and have reached end-of-life. These are 1st, 2nd, 3rd or 4th generation models, typically marked with a number 1,…

STT: Language Model Customization tutorial

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. Put simply, it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio…

FAQs (Browser)

…Browser. Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages including English, French, German, Russian, Spanish and many more. Q: What…

Understand SPE benchmark

…if such a directory is found, audio files from that directory are used (expecting that the audio contains speech in the corresponding language). If it is not found, the benchmark falls back to the default directory. The reason for language-specific data is that processing audio in a language other than the one for which the model was trained negatively affects the processing speed (basically, the processing…
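The fallback rule described above can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not the benchmark's actual implementation: the function name `pick_audio_dir`, the one-subdirectory-per-language layout, and the `default` directory name are all assumptions made for this example.

```python
from pathlib import Path


def pick_audio_dir(base_dir, language, default_name="default"):
    """Return the language-specific audio directory if it exists,
    otherwise the default directory.

    Assumed (hypothetical) layout: <base_dir>/<language>/ and
    <base_dir>/<default_name>/.
    """
    base = Path(base_dir)
    candidate = base / language
    if candidate.is_dir():
        # Audio here is expected to contain speech in the model's language.
        return candidate
    # No language-specific data found: fall back to the default directory.
    return base / default_name
```

For example, `pick_audio_dir("data", "cs-CZ")` would return `data/cs-CZ` when that directory exists and `data/default` otherwise.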

Understand SPE user accounts

…not visible to SPE or to the account. A similar trick can be used with the data directory, allowing LID language models and language packs, SID speaker models, etc. to be shared between accounts. User accounts management SPE user accounts can be managed using the REST API (see the Administration section of the API documentation), or using command line administration utilities phxadmin or…

Arabic dialects in Phonexia LID and STT

The Arabic language has (a) one standardised variety, and (b) many non-standard varieties (dialects). In this article, our linguistic team explains the differences between Modern Standard Arabic and Arabic dialects in the context of Phonexia Arabic models. Standard variety: Modern Standard Arabic (MSA) All Arabs learn it at school (not from their parents, so we cannot say it is their native variety)…

STT: What is Words-To-Numbers feature and how to use it

This article explains the details of the new STT feature for native transcription of numbers and dates in the n-best output and gives some tips for fine-tuning the results. NOTE: The feature works out of the box in the following STT languages and models: English – EN_US_6 and EN_US_A_6; Spanish – ES_6; Polish – PL_PL_6; Czech – CS_CZ_5 and CS_CZ_6; Slovak – SK_SK_5 and SK_SK_6. You…

Understand SPE metafiles

…separate files. Another example would be the information about content of created LID language pack – if LID language pack is successfully created, SPE creates a metafile named report, which contains detailed information about the source files used for the language pack creation. See the LID language pack creation REST endpoint documentation for more details about the report metafile content….

Releases and Changelogs (VIN)

…the act and the Conclusion of the case can be edited from the case information table. Improved: Copying from chart values table is enhanced – added header to copied data, added Ctrl+C keyboard shortcut to copy table data. Improved: Report template is enhanced for all included languages – improved report layout, easier CSS styling via style.css, added new report variables…

Phonexia Speech Engine

…main binary file itself, SPE requires a database, which might be SQLite (delivered inside the Phonexia package) or MySQL. No other components are needed. Structure of Technologies and technology models: From the technical point of view, every technology can work with different technology models. These are various languages for STT (CS_CZ4, EN_US4), or various sizes for SID (L3, XL3). Technology can work…

Download Speech Platform

…only English models for Speech To Text and Keyword Spotting. Additional supported languages are available upon request. Speech Engine – technologies included: Speech To Text (STT) – model EN_US_6 (US English) Keyword Spotting (KWS) – model EN_US_6 (US English) Phoneme Recognizer (PHNREC) – model EN_US_6 (US English) Speaker Identification 4 (SID4) – model…

Gender Identification (GID)

Gender Identification is a language-, domain- and channel-independent technology that uses the acoustic characteristics of the recording to determine the gender of the speaker in question. This technology is able to distinguish between two genders: Male (M) and Female (F). Minimum of speech signal for identification: 7+ sec recommended with XL5, XL4 and L4 model (9+ sec for previous generation…