Skip to content Skip to main navigation Skip to footer

Search: speech to text languages

20 results

Understand SPE metafiles

Certain SPE entities – SID Speaker models, SID Audio source profiles, LID Language packs – can have additional information associated with them in the form of “metafiles”. This article explains the intended usage of metafiles. In general, SPE is intended as under-the-hood engine, focusing purely on the speech-related audio processing. Any additional functionality should be done on the application layer,…

Phonexia technology models EoL

Speech to Text (STT) and Keyword Spotting (KWS) models Languages supported by Speech To Text and Keyword Spotting Standard = Maintained until newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”. Language (region) Model name Released End of support Maintenance Arabic (Gulf, Kuwait) AR_KW_6 2022-04 8th gen. Standard Arabic…

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr) There is one exception – LID – which has additional two directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. Schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…

Download Semantic Search demo

…an Ubuntu-based Linux operating system with a GUI. Supported languages Supported languages ISO Name ISO Name ISO Name af Afrikaans ht Haitian_Creole pt Portuguese am Amharic hu Hungarian ro Romanian ar Arabic hy Armenian ru Russian as Assamese id Indonesian rw Kinyarwanda az Azerbaijani ig Igbo si Sinhalese be Belarusian is Icelandic sk Slovak bg Bulgarian it Italian sl Slovenian…

Understand SPE connectors for external TTS

…from stdin is as follows: { “text“: string, “voice”: { “name”: string, “languageCode”: string } } Where: text is the text to be synthesized name is a voice name to be used for synthesis (ref. to the voice names provided in the connector “info” data) languageCode is a language code defining the language to be used for synthesis (ref. to…

Adding new language or technology model (Browser)

…our example, we are adding new Spanish model (ES_6 technology model) of Speech to Text and Keyword Spotting (with Phoneme Recognizer). When you install new languages or models, they are turned off by default and need to be enabled in Phonexia Browser. To turn new models on, open Phonexia Browser: go to Settings Switch to Speech Engine tab Open STT…

Phoneme Recogniser (PHNREC)

…user can add to language model of speech-to-text technology (better accuracy of KWS technology). Input audio file (format details – see Speech Engine documentation); stream not supported, technology model name (i.e. language code) to be used for phoneme transcription. Output In the process of transcribing speech-to-phonemes, the Phoneme Recogniser usually identifies individual speech segments and convert it to pronunciation. Example…

Understand SPE technologies configuration file

…SQE_STREAM Speech Quality Estimation Stream STT Speech To Text STT_STREAM Speech To Text Stream TAE Time Analysis Extraction TAE_STREAM Time Analysis Extraction Stream VAD Voice Activity Detection VAD_STREAM Voice Activity Detection Stream SIDC Speaker Identification Voiceprint Comparator (legacy) SIDC_STREAM Speaker Identification Voiceprint Stream Comparator (legacy) SIDCALIBSET Speaker Identification VoicePrint Calibration (legacy) SIDCALIBSET_STREAM Speaker Identification VoicePrint Stream Calibration (legacy) SIDE Speaker…

Phonexia Speech Engine

Phonexia Speech Engine (SPE) is main part of Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing REST API to entire portfolio of Phonexia speech technologies. SPE capabilities overview: Audio files and stream processing Audio files RTP / HTTP streams Speaker Identification (SID) ✓ ✓ Speech To Text (STT) ✓ ✓ Keyword Spotting (KWS) ✓…

Speech To Text / Keyword Spotting supported languages

Languages supported by Speech To Text and Keyword Spotting Standard = Maintained until newer generation is released, or end of support is reached. Language generation is specified by the number in “Model name”. Language (region) Model name Released End of support Maintenance Arabic (Gulf, Kuwait) AR_KW_6 2022-04 8th gen. Standard Arabic (Levantine) AR_XL_6 2021-05 8th gen. Standard AR_XL_5 2020-08 7th…

Download Speech Platform

…only English models for Speech To Text and Keyword Spotting. Additional supported languages are available upon request. ⓘ Click to show/hide the package content Speech Engine – technologies included: Speech To Text (STT) – model EN_US_6 (US English) Keyword Spotting (KWS) – model EN_US_6 (US English) Phoneme Recognizer (PHNREC) – model EN_US_6 (US English) Speaker Identification 4 (SID4) – model…

FAQs (Browser)

…Browser. in FAQ Phonexia Browser, FAQ Speech Platform Permalink Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) for 20+ languages including English, French, German, Russian, Spanish and many more. in FAQ Phonexia Browser, FAQ Speech Platform Permalink Q: What…

SPE and Browser installation: standalone SPE

…merging the contents of two packages into one. The additional languages are provided upon request by Phonexia sales representative. If you do not have the languages you want to test, contact our sales to arrange the cooperation. Download the files with additional languages locally and unzip them. Then copy the additional languages over to where you saved the default Evaluation…