Search: language

60 results

Release Notes

…use of our SPE component. LID language models and language packs management in Browser: allows users to, for example, easily customize the set of languages in LID language packs. Customers will benefit from increased precision of results by lowering the false positive scores on customer data. Available for all LID technology models. See the Browser manual PDF for more details about…

Speech to Text (STT)

…language model to be used for transcription. As output, the transcription is provided in one of the formats. The technology extracts features from the voice and, using acoustic and language models together with pronunciation, all in a recognition network, creates hypotheses of transcribed words and "decodes" the most probable transcription. Based on requested output types one or more transcribed text…

STT: What is Preferred Phrases feature and how to use it

…a decoder. The decoder uses the information from the acoustic model, combines it with information from the language model recognition network (which describes the statistics about word grouping and sentences of a given language) and provides the transcription output. (See the Speech To Text article for more details about speech transcription principles.) When using preferred phrases, we build an additional language model…

SPE and Browser installation: standalone SPE

…merging the contents of two packages into one. The additional languages are provided upon request by a Phonexia sales representative. If you do not have the languages you want to test, contact our sales team to arrange the cooperation. Download the files with additional languages locally and unzip them. Then copy the additional languages over to where you saved the default Evaluation…

FAQs (PSP)

…may use them for both training a new language pack and testing/comparing against an existing language pack. The language-prints need to be compatible only with the LID model used for language-print extraction. Q: What are the recommendations for LID adaptation set? A: The following is recommended: For adding a new language to a language pack 20+…

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr). There is one exception – LID – which has two additional directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. The schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…

Key Features (PSP)

…in the Languages Available section: Speech To Text (STT) and Keyword Spotting (KWS) languages; Language Identification (LID) languages. Supported audio input: The Speech Engine server supports various audio formats as listed in API reference > Audio requirements. It also supports the RTP/HTTP stream processing as listed in API reference > RTP/HTTP streams. The Speech Engine allows the usage of some…

Understand SPE database

…user), technology model used to create the profile, file with the profile content, hash
rest_profile_sid4_metafiles: list of files used as SID4 Audio Source Profiles metafiles
rest_model_lid: list of LID language packs – name, owner (SPE user), technology model to which the language pack belongs (i.e. technology model used to create source languageprints/language models)
rest_model_lid_metafiles: list of LID language packs metafiles…

Understand SPE connectors for external TTS

…from stdin is as follows: { "text": string, "voice": { "name": string, "languageCode": string } } Where: text is the text to be synthesized; name is the voice name to be used for synthesis (ref. to the voice names provided in the connector "info" data); languageCode is a language code defining the language to be used for synthesis (ref. to…
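The request shape quoted in this snippet can be illustrated with a minimal Python sketch. Only the three field names (text, voice.name, voice.languageCode) come from the snippet; the function names, the sample voice name, and the sample language code are hypothetical, and how a real connector handles the input is not shown here.

```python
import json


def build_tts_request(text: str, voice_name: str, language_code: str) -> str:
    """Serialize a synthesis request in the shape quoted in the snippet.

    The helper name and arguments are illustrative assumptions, not part
    of any documented API.
    """
    payload = {
        "text": text,
        "voice": {"name": voice_name, "languageCode": language_code},
    }
    return json.dumps(payload, ensure_ascii=False)


def parse_tts_request(raw: str) -> tuple:
    """Parse one request, as a connector reading it from stdin might do."""
    data = json.loads(raw)
    voice = data["voice"]
    return data["text"], voice["name"], voice["languageCode"]


# Hypothetical sample values; real voice names would come from the
# connector "info" data mentioned in the snippet.
request = build_tts_request("Hello", "example-voice", "en-US")
print(request)
```

A connector would typically read such a string from standard input and pass it to a parser like `parse_tts_request`.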

Support Lifecycle Policy (PSP)

…supported by Speech To Text and Keyword Spotting
Standard = Maintained until a newer generation is released, or end of support is reached. Language generation is specified by the number in "Model name".
Language (region) | Model name | Released | End of support | Maintenance
Arabic (Gulf, Kuwait) | AR_KW_6 | 2022-04 | 8th gen. | Standard
Arabic (Levantine) | AR_XL_6 | 2021-05 | 8th gen. | Standard
 | AR_XL_5 | 2020-08 | 7th gen.…

Understand SPE executable files

…add-language-pack=<path> – Add a custom LID language pack from the specified directory. The language pack name will be the same as the directory name.
delete-language-pack – Delete a custom LID language pack.
Support
hwgen[=<file>] – Create a machine HW profile file.
report – Create an SPE report useful for troubleshooting and diagnostics. The report contains the configuration, logs, licences and hardware profile of the current computer.
Migration from legacy version
upgrade…

Q: What are the recommendations for LID adaptation set?

A: The following is recommended:
For adding a new language to a language pack:
- 20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech)
- Only 1 language per record
For adapting the existing language model (discriminative training):
- 10+ hours of audio for each language
- May be done on customer site.
- May be done in…
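The two figures for a new language (20+ hours, or 25+ hours containing 80% of speech) appear to describe the same amount of net speech, since 25 h × 0.8 = 20 h. The sketch below checks a candidate set against that reading. It is an assumption about how the numbers relate, not a documented rule, and the function names and threshold constant are hypothetical.

```python
import math

# Assumed reading of the recommendation: 20 hours of net speech per new language.
NEW_LANGUAGE_MIN_SPEECH_HOURS = 20.0


def net_speech_hours(total_audio_hours: float, speech_ratio: float) -> float:
    """Return the hours of actual speech in a recording set."""
    if not 0.0 <= speech_ratio <= 1.0:
        raise ValueError("speech_ratio must be between 0 and 1")
    return total_audio_hours * speech_ratio


def meets_new_language_recommendation(total_audio_hours: float,
                                      speech_ratio: float) -> bool:
    """Check the set against the assumed 20-hour net speech threshold."""
    net = net_speech_hours(total_audio_hours, speech_ratio)
    # isclose guards against floating-point rounding right at the threshold.
    return net > NEW_LANGUAGE_MIN_SPEECH_HOURS or math.isclose(
        net, NEW_LANGUAGE_MIN_SPEECH_HOURS
    )


print(meets_new_language_recommendation(25, 0.8))  # 25 h at 80% speech
print(meets_new_language_recommendation(20, 0.8))  # only 16 h of net speech
```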

Releases and Changelogs (Browser)

…Compatibility with SPE 3.45 + all changes included in Feature Preview release 3.42 (see below)
Phonexia Browser 3.42
Phonexia Browser 3.42.0, BSAPI 3.42.1 (2021-08-24)
New: Server Information dialog
New: Widget and dialog for managing language models
New: Dialog for creating new language pack
Improved: Language pack widget – add/remove language packs, show metafiles and language details
Phonexia Browser 3.40 (Public…

Arabic dialects in Phonexia LID and STT

The Arabic language has (a) one standardised variety, and (b) many non-standard varieties (dialects). In this article, our linguistic team explains the differences between Modern Standard Arabic and Arabic dialects in the context of Phonexia Arabic models. Standard variety: Modern Standard Arabic (MSA). All Arabs learn it at school (not from their parents, so we cannot say it is their native variety)…

SID: Speaker Identification: Results Enhancement

…language. We have never seen this data during SID training so it is a sensible thing to calibrate the system. Since there is only a single source of data (telephony) and only a single language (Wakandan), one can assume that it is enough to create a single profile and use it for both sides of the comparison. We are monitoring…