Search: new language

32 results

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr). There is one exception – LID – which has two additional directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. The schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…

STT: Adding words to language model on the fly

Adding words to the STT language model on the fly is possible in SPE 3.45 or newer as part of the preferred phrases feature. The POST /technologies/stt or POST /technologies/stt/input_stream API calls actually serve two purposes: (1) specify the actual preferred phrases (in the phrases part); (2) specify words to be added to the STT language model (in the dictionary part). Each part can be used independently,…

Speech To Text / Keyword Spotting supported languages

Languages supported by Speech To Text and Keyword Spotting. Standard = Maintained until a newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”. Columns: Language (region) | Model name | Released | End of support | Maintenance. Rows: Arabic (Gulf, Kuwait) | AR_KW_6 | 2022-04 | 8th gen. | Standard; Arabic (Levantine) | AR_XL_6 | 2021-05 | 8th gen. | Standard; AR_XL_5 | 2020-08 | 7th…
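The snippet above says the language generation is given by the number in the model name. A minimal sketch of reading it programmatically, assuming the generation is always the numeric suffix after the last underscore (this holds for the names shown, such as AR_KW_6 and AR_XL_5, but is not confirmed as a general rule); the function name `model_generation` is hypothetical:

```python
import re


def model_generation(model_name: str) -> int:
    """Return the generation number from a model name such as 'AR_KW_6'.

    Assumption: the generation is the trailing digits after the final
    underscore. Names that do not follow this pattern raise ValueError.
    """
    match = re.search(r"_(\d+)$", model_name)
    if match is None:
        raise ValueError(f"no generation suffix in model name: {model_name!r}")
    return int(match.group(1))


# Model names taken from the snippet above.
generations = {name: model_generation(name) for name in ("AR_KW_6", "AR_XL_6", "AR_XL_5")}
```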

Speech Engine update

…technology models configuration. Usually introduces new features or major fixes, which may change communication between server and client, or other changes which may affect customer processes. Can also include new technology models; with such an update you can add only the new technology, without SPE installation. Upgrade changes the first version number (e.g. x.y.z to x+1) and is a major change…

Understand SPE connectors for external TTS

…from stdin is as follows: { "text": string, "voice": { "name": string, "languageCode": string } } Where: text is the text to be synthesized; name is a voice name to be used for synthesis (ref. to the voice names provided in the connector “info” data); languageCode is a language code defining the language to be used for synthesis (ref. to…
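As an illustration of the request shape quoted above, here is a minimal sketch of how a connector might parse and validate that JSON. Only the field names text, voice.name and voice.languageCode come from the snippet; the `SynthesisRequest` class, the `parse_request` helper and the example values are hypothetical, and a real connector would read the raw string from stdin:

```python
import json
from dataclasses import dataclass


@dataclass
class SynthesisRequest:
    text: str           # text to be synthesized
    voice_name: str     # voice name, as listed in the connector "info" data
    language_code: str  # language code to be used for synthesis


def parse_request(raw: str) -> SynthesisRequest:
    """Parse one JSON request and check that all three fields are strings."""
    data = json.loads(raw)
    voice = data["voice"]
    fields = (data["text"], voice["name"], voice["languageCode"])
    if not all(isinstance(value, str) for value in fields):
        raise ValueError("text, voice.name and voice.languageCode must be strings")
    return SynthesisRequest(*fields)


# Hypothetical example payload; the voice name and language code are placeholders.
example = '{"text": "Hello", "voice": {"name": "voice-a", "languageCode": "en-US"}}'
request = parse_request(example)
```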

SID: Speaker Identification: Results Enhancement

…the new profile is defined by specifying a directory of recordings to be used and calibration modes that should be performed when using the profile. Once the profile is created, its parameters cannot be changed. Instead, there is the possibility of creating a new “child” profile based on a previously created profile by specifying the original as “parent” during the…

Understand SPE benchmark

…if such a directory is found, audio files from that directory are used (expecting that the audio contains speech in the corresponding language). If not found, it falls back to the default directory. The reason for language-specific data is that processing audio in a different language than the one for which the model was trained negatively affects the processing speed (basically, the processing…

Arabic dialects in Phonexia LID and STT

…TEXT (used for STT language model training): MSA is used in all formal writing such as official correspondence, literature, newspapers and webpages, so it is easy to accumulate large amounts of text, but it will be more formal and far from spontaneous speech. Support for MSA in Phonexia products. Columns: Name | LID L4 | STT | Description. Row: Arabic (MSA) | arb | - | Modern Standard Arabic,…

Support Lifecycle Policy (PSP)

…supported by Speech To Text and Keyword Spotting. Standard = Maintained until a newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”. Columns: Language (region) | Model name | Released | End of support | Maintenance. Rows: Arabic (Gulf, Kuwait) | AR_KW_6 | 2022-04 | 8th gen. | Standard; Arabic (Levantine) | AR_XL_6 | 2021-05 | 8th gen. | Standard; AR_XL_5 | 2020-08 | 7th gen….

Understand SPE executable files

…add-language-pack=<path> – Add custom LID language pack from the specified directory. The language pack name will be the same as the directory name. delete-language-pack – Delete custom LID language pack. Support: hwgen[=<file>] – Create machine HW profile file. report – Create SPE report useful for troubleshooting and diagnostics. The report contains configuration, logs, licences and hardware profile of the current computer. Migration from legacy version: upgrade…

SPE and Browser installation: embedded SPE

…the cooperation. 3. Optional: add additional languages. If you are going to test additional languages besides the default English present in the Phonexia Evaluation package, you need to perform a simple operation of merging the contents of two packages into one. The additional languages are provided upon request by a Phonexia sales representative. If you do not have the languages you…

STT: Results explained

…with newly received ones. Hint: These corrections never go back beyond the end-of-segment boundary (</segment> token). In other words, they may happen only within the boundaries of a single segment. Realtime stream processing output: Historically, realtime stream processing provided only a single output type – one-best. The one-best results are updated continuously, i.e. as soon as a new speech element is recognized, it’s immediately…

Phonexia technology models EoL

Speech to Text (STT) and Keyword Spotting (KWS) models. Languages supported by Speech To Text and Keyword Spotting. Standard = Maintained until a newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”. Columns: Language (region) | Model name | Released | End of support | Maintenance. Rows: Arabic (Gulf, Kuwait) | AR_KW_6 | 2022-04 | 8th gen. | Standard; Arabic…

STT: What is Words-To-Numbers feature and how to use it

This article explains the details of the new STT feature for native transcription of numbers and dates in n-best output and gives some tips for fine-tuning the results. NOTE: The feature works out-of-the-box in the following STT languages and models: English – EN_US_6 and EN_US_A_6; Spanish – ES_6; Polish – PL_PL_6; Czech – CS_CZ_5 and CS_CZ_6; Slovak – SK_SK_5 and SK_SK_6. You…

Recommended OS and HW (PSP)

…Intel® Core Processor; RAM: 16 GB; Storage: 100 GB (depends on your audio retention policy), SSD strongly recommended for superior performance over HDD. Configuration includes: STT 6th generation – 2 languages (half load each), KWS 6th generation – 2 languages, LID L4, VAD, SQE. Voice Biometrics + Transcription System, basic 100 hours/day package (***) files processing: CPU: 14 physical cores,…