Search: Language Model

42 results

STT: Language Model Customization tutorial

…STT model, put its name in the model parameter, like this: GET /technologies/stt?path=foobar.wav&model=<customized_model_name>. Using customized STT model in command line STT: To use a customized STT model in command line STT, simply specify the new configuration file belonging to the customized STT model in the -config parameter. For example, assuming that the original pl_pl_5 model was customized, specifying updated as the model…
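The request in the snippet above can also be assembled programmatically. This is a minimal sketch, not the official client: the server address `http://localhost:8600` and the model name `PL_PL_5_CUSTOM` are hypothetical placeholders, and only the `/technologies/stt` endpoint with its `path` and `model` parameters comes from the snippet.

```python
from urllib.parse import urlencode


def build_stt_url(base_url: str, audio_path: str, model_name: str) -> str:
    """Return the GET URL that requests transcription with a given STT model."""
    # 'path' and 'model' are the query parameters named in the snippet.
    query = urlencode({"path": audio_path, "model": model_name})
    return f"{base_url.rstrip('/')}/technologies/stt?{query}"


# Hypothetical server address and customized model name, for illustration only.
url = build_stt_url("http://localhost:8600", "foobar.wav", "PL_PL_5_CUSTOM")
print(url)
```

Using `urlencode` rather than string concatenation keeps file paths containing spaces or slashes correctly escaped.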

Releases and Changelogs (SPE)

…need for command line tools… finally!) New: LID Language Packs allow storing meta-files. New: New entity “LID Language Model” (equivalent of *.lpa LanguagePrint Archive). Improved: Updated STT model RU_RU_A to version 4.6.0 (updated language model). Removed: Support for RLS-enforced licences in command line applications. Removed: FeaturePasterRepeat warning on null/empty repeat vector. Speech Engine 3.37 Speech Engine 3.37.1, DB…

LID: Terminology and adaptation

…languageprints created using model L4 can be combined into a languageprint archive and/or language model only with languageprints created using model L4… and a language pack for model L4 must consist only of language models created using languageprints/archives of model L4. Adaptation types overview: Creating a new language model from your own audio files, to add a new language not supported out-of-the-box at least…

Release Notes

…XL4 model (compatibility must be explicitly enabled) Speech Engine: Speech to Text (STT) We have several exciting new features relevant to STT and KWS technologies: Czech (Czech Republic) language model updated (tech. model name: CS_CZ_6): We added new words to the language model, so recent frequent words like “COVID” are correctly transcribed. Slovak (Slovakia) language model updated (tech. model name:…

STT: Adding words to language model on the fly

Adding words to the STT language model on the fly is possible in SPE 3.45 or newer as part of the preferred phrases feature. The POST /technologies/stt or POST /technologies/stt/input_stream API calls actually serve two purposes: (1) specify the actual preferred phrases (in the phrases part); (2) specify words to be added to the STT language model (in the dictionary part). Each part can be used independently,…
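The two independent parts of the request can be illustrated with a small body builder. This is only a sketch under stated assumptions: the snippet names a phrases part and a dictionary part but does not show the JSON schema, so the top-level key names, the shape of each entry, and the helper `build_preferred_phrases_body` are hypothetical.

```python
import json
from typing import Optional, Sequence


def build_preferred_phrases_body(
    phrases: Optional[Sequence] = None,
    dictionary: Optional[Sequence] = None,
) -> str:
    """Serialize a request body containing only the parts that were supplied."""
    body = {}
    if phrases is not None:
        body["phrases"] = list(phrases)        # assumed key name
    if dictionary is not None:
        body["dictionary"] = list(dictionary)  # assumed key name
    if not body:
        raise ValueError("provide at least one of: phrases, dictionary")
    return json.dumps(body)


# Each part can be sent on its own, or both together.
only_phrases = build_preferred_phrases_body(phrases=["language model"])
both_parts = build_preferred_phrases_body(
    phrases=["language model"], dictionary=["languageprint"]
)
```

Omitting an unused part entirely, rather than sending an empty list, mirrors the statement that each part can be used independently.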

Language Identification (LID)

…Routing particular calls (languages) to human operators (language experts). Scoring and results: The LID language pack defines a set of recognizable languages (represented by language models). When identifying the language in an audio recording (or languageprint), LID does the following: creates a languageprint of the recording (if the input is an audio recording); compares that languageprint with each language model in a…

Understand SPE database

…user), technology model used to create the profile, file with the profile content, hash. rest_profile_sid4_metafiles: list of files used as SID4 Audio Source Profiles metafiles. rest_model_lid: list of LID language packs – name, owner (SPE user), technology model to which the language pack belongs (i.e. technology model used to create source languageprints/language models). rest_model_lid_metafiles: list of LID language packs metafiles…

Speech to Text (STT)

…1 CPU core (e.g. a standard 8 CPU core server (8 instances of STT) can process 1010 hours of audio in 1 day of computing time (flat load, depends on technology model)). Supported languages: List of supported languages. Acoustic models: An acoustic model is created by training on training data. It includes characteristics of the voices of a set of speakers provided…
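The throughput figure quoted in this snippet can be converted into a per-core speed relative to real time. This is a rough sketch that takes the quoted numbers as-is and assumes a 24-hour day and linear scaling across the 8 STT instances; the function name is illustrative.

```python
def per_core_speedup(audio_hours: float, cores: int, wall_hours: float = 24.0) -> float:
    """Hours of audio processed per core for each hour of computing time."""
    return audio_hours / cores / wall_hours


# Figures from the snippet: 1010 hours of audio, 8 cores, 1 day.
speed = per_core_speedup(1010, 8)
print(f"{speed:.2f}x faster than real time per core")
```

Actual speed depends on the technology model and load, as the snippet itself notes.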

SPE and Browser installation: standalone SPE

…nr. 23) 1) Age Estimation [active model: XL5(1x)] 2) Denoiser Technology [active model: EN_US(1x)] 3) Diarization [active model: XL4(1x)] 4) Gender Identification [active model: XL5(1x)] 5) Keyword Spotting [active model: EN_US_6(1x)] 6) Phoneme Recognition [active model: EN_US_6(1x)] 7) Keyword Spotting Stream [active model: EN_US_6(1x)] 8) Language Identification LanguagePrint Comparator [active model: L4(1x)] 9) Language Identification LanguagePrint Extractor [active model: L4(1x)]…

STT: What is Preferred Phrases feature and how to use it

…a decoder. The decoder uses the information from acoustic model, combines it with information from language model recognition network (which describes the statistics about word grouping and sentences of a given language) and provides the transcription output. (See the Speech To Text article for more details about speech transcription principles) When using preferred phrases, we build additional language model…

Adding new language or technology model (Browser)

…our example, we are adding a new Spanish model (ES_6 technology model) of Speech to Text and Keyword Spotting (with Phoneme Recognizer). When you install new languages or models, they are turned off by default and need to be enabled in Phonexia Browser. To turn new models on, open Phonexia Browser: go to Settings; switch to the Speech Engine tab; open STT…

Releases and Changelogs (Browser)

…Compatibility with SPE 3.45 + all changes included in Feature Preview release 3.42 (see below) Phonexia Browser 3.42 Phonexia Browser 3.42.0, BSAPI 3.42.1 (2021-08-24) New: Server Information dialog New: Widget and dialog for managing language models New: Dialog for creating new language pack Improved: Language pack widget – add/remove language packs, show metafiles and language details Phonexia Browser 3.40 (Public…

FAQs (PSP)

…may use them for both training a new language pack and testing/comparing against an existing language pack. The language-prints need to be compatible only with the model of LID used for language-print extraction. Q: What are the recommendations for LID adaptation set? A: The following is recommended: For adding a new language to a language pack 20+…

Arabic dialects in Phonexia LID and STT

…for each – North Levantine (apc) and South Levantine (ajp). Our models were trained using data from both varieties, therefore we followed RFC 5646, section 2.2.4 and created a custom language code ar-XL, where the XL means “cross-Levantine” 😉 NOTE: To get the best STT results, use the model that corresponds to the given dialect. The AR_XL_* model is best suited for…

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr). There is one exception – LID – which has two additional directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. Schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…