
Search: speech to text manual

19 results

Testing possibilities

…(GSM, VoIP,…); microphone placement (close-field vs. far-field); audio quality; formats; codecs; background noise; geographical locations; age distribution; style of speech; monolog vs. dialog; reading a text vs. live conversation. In some of the scenarios mentioned above, it is quite difficult to meet all of these requirements, which is why the best option for accuracy testing is definitely in…

Licensing (technical details)

…all speech technologies and products and may be required in order to use utilities and tools developed by Phonexia or partners. For technical purposes, the License agreement is represented by the license file, which describes the Phonexia technologies or products allowed to be used with that license file. License file: The license file is a plain text file named license.dat…

Understand SPE executable files

…(in octal format, e.g. 027); pidfile=<path> – Write the application’s process ID (PID) to the specified file. Windows-specific: registerService – Register the application as a Windows service; displayName=<text> – Specify the service friendly name (valid only with registerService); description=<text> – Specify the service description (valid only with registerService); startup=automatic|manual – Specify the service startup mode (valid only with registerService); unregisterService – Unregister the previously…

About Phonexia Orbis

…speakers and their corresponding recordings. Speech Transcription: In an Orbis edition that includes Speech to Text technology, the user can have audio transcribed automatically or on demand in a language chosen from the Phonexia Speech to Text portfolio. Network Map: The solution visualizes the relations between persons and assets based on time on a network map. Persons, Assets and Relations…

Arabic dialects in Phonexia LID and STT

TEXT (used for STT language model training): MSA is used in all formal writing, such as official correspondence, literature, newspapers and webpages, so it is easy to collect large amounts of text, but that text is more formal and far from spontaneous speech. Support for MSA in Phonexia products: Name: Arabic (MSA); LID L4: arb; STT: -; Description: Modern Standard Arabic,…

Understand SPE database

…Speech Engine is used together with Phonexia Browser in the so-called “embedded” mode (see details about “embedded SPE” mode in the Browser manual), Phonexia Browser creates its own separate SPE configuration file, and the SQLite database file is located in the SPE home directory and named phxserver.sqlite. This might be important in certain scenarios, e.g. when registering a LID language pack using phxadmin –…

Understand SPE configuration

…text-based, well commented and human-readable. Read these comments carefully, as there are some useful tips and tricks hidden inside. Let’s begin; pay attention to the comment about the variable notation format in the configuration preamble:

    # This is the default properties file for Phonexia Speech Engine
    #
    # Variables:
    # ${application.dir} path to application directory
    # ${system.env.<NAME>} system environment…

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr). There is one exception, LID, which has two additional directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. The schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…

Phoneme Recogniser (PHNREC)

…user can add to the language model of the speech-to-text technology (better accuracy of KWS technology). Input: audio file (format details: see Speech Engine documentation; stream not supported); technology model name (i.e. language code) to be used for phoneme transcription. Output: In the process of transcribing speech to phonemes, the Phoneme Recogniser usually identifies individual speech segments and converts them to pronunciation. Example…

STT: What is Preferred Phrases feature and how to use it

…e.g. “WiFi” vs. “HiFi”, “cell” vs. “sell”, “eighty machines” vs. “eight tea-machines” etc. Usually, the language model part of Speech To Text does its job and prefers the correct word in the context of a longer phrase or an entire sentence: × I’m going to cell my car. Hmmm, such a sentence does not sound like common English… √ I’m going to…

Understand SPE configuration file

In this article we explain the details of the Speech Engine configuration file phxspe.properties, located in the settings subdirectory of the SPE installation location. Settings in this configuration file affect the Speech Engine behavior and performance. The configuration file is usually created after SPE installation: on first use of phxadmin, the default configuration file phxspe.properties is created in the settings directory. The file…

FAQs (Browser)

…Browser. Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) cover 20+ languages, including English, French, German, Russian, Spanish and many more. Q: What…

Download Speech Platform

…only English models for Speech To Text and Keyword Spotting. Additional supported languages are available upon request. Speech Engine – technologies included: Speech To Text (STT) – model EN_US_6 (US English); Keyword Spotting (KWS) – model EN_US_6 (US English); Phoneme Recognizer (PHNREC) – model EN_US_6 (US English); Speaker Identification 4 (SID4) – model…

STT: Language Model Customization tutorial

The Language Model Customization tool (LMC) provides a way to improve Speech To Text performance by creating a customized language model. The language model is an important part of Phonexia Speech To Text. Put simply, it can be imagined as a large dictionary with multiple statistics. The Speech To Text technology uses this dictionary and statistical model to convert audio…