
Search: Language Model

44 results

Installation of Phonexia Browser

Some packages are distributed with only a limited set of speech technologies and languages, or without speech technologies. First installation: Our software is distributed as a ZIP file. The installation procedure is as simple as: unzip the archive; paste additional KWS, STT… models; paste the license.dat file into the root directory that contains the BROWSER folder and the run_browser(.exe) script; run the…

Understand SPE home directory

…Data: The data directory holds additional data files for entities created by that user, e.g. SID Speaker Models or LID language packs. If no such entities exist for that user, this directory is empty. Unlike the storage, the content of this directory is intended to be manipulated by SPE only and should not be manipulated directly at the filesystem level….

Phonexia Speech Engine

…main binary file itself, SPE requires a database, which can be SQLite (delivered inside the Phonexia package) or MySQL. No other components are needed. Structure of Technologies and technology models: From the technical point of view, every technology can work with different technology modules. These are various languages for STT (CS_CZ4, EN_US4), or various sizes for SID (L3, XL3). Technology can work…

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling. Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself. Output: Log file with processed information (age estimate). Processing speed: Approx. 20x faster than real time on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…
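The throughput figure in the snippet above follows from simple arithmetic. A minimal sketch, assuming processing scales linearly across CPU cores and the server runs for a full 24 hours (the function name and parameters are illustrative, not part of any Phonexia tool):

```python
def audio_hours_per_day(speed_factor, cpu_cores, wall_clock_hours=24.0):
    """Estimate hours of audio processed in a given wall-clock time.

    speed_factor: how many times faster than real time one core runs
    cpu_cores: number of cores, assumed to scale linearly (an assumption)
    """
    return speed_factor * cpu_cores * wall_clock_hours


# 20x real time on each of 8 cores, running for 24 hours
print(audio_hours_per_day(20, 8))  # 3840.0
```

The same function can be used to size a server for a different model speed, for example a technology quoted at a different faster-than-real-time factor.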

Gender Identification (GID)

…7+ sec recommended with XL4 and L4 models (9+ sec for the previous generation of XL3 and L3 models). Output scoring: likelihood ratio and percentage metric (0-100%). Typical use cases: filtering calls by gender, playing advertisements focused on a specific gender, getting a quick demographic analysis of the recordings. The speed of Gender Identification is up to 150 FtRT (depending on the model)….

Orbis 1.4.0 Release Notes

Newest generation of Speaker Identification technology added. Speaker identification technology verifies and authenticates speakers in seconds. The new generation has increased accuracy by 1 percentage point (a relative improvement of 33 %) – XL5 model vs. XL4 model that was previously in Orbis. The processing speed of the XL5 model is the same or faster than that of the XL4…

Download Semantic Search demo

…an Ubuntu-based Linux operating system with a GUI. Supported languages (ISO code, name): af Afrikaans; ht Haitian_Creole; pt Portuguese; am Amharic; hu Hungarian; ro Romanian; ar Arabic; hy Armenian; ru Russian; as Assamese; id Indonesian; rw Kinyarwanda; az Azerbaijani; ig Igbo; si Sinhalese; be Belarusian; is Icelandic; sk Slovak; bg Bulgarian; it Italian; sl Slovenian…

Speaker Identification (SID)

…signal captured in a recording are also more or less unique, thus the technology can be language-, accent-, text-, and channel-independent. Automatic speaker recognition systems are based on the extraction of the unique features from voices and their comparison. The systems thus usually comprise two distinct steps: Voiceprint Extraction (Speaker enrollment) and Voiceprint comparison. The processing speed depends on the…

STT: What is Words-To-Numbers feature and how to use it

…numbers conversion is based on a set of grammar rules describing how the conversion should work. Conversion rules are stored in the numeric.pegjs file, located in the grm subdirectory inside the STT model directory. For example: in Czech 6th generation STT it’s located in {SPE_directory}/bsapi/stt/data/models_cs_cz_6/grm; in Spanish 6th generation STT it’s located in {SPE_directory}/bsapi/stt/data/models_es_6/grm. Can it be extended or tuned? You can edit…
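The location pattern in the snippet above can be expressed as a small path helper. This is only a sketch: the function name is hypothetical, the `/opt/spe` install location in the usage line is an assumed placeholder for `{SPE_directory}`, and the model directory names are the two given in the snippet.

```python
from pathlib import PurePosixPath


def numeric_rules_path(spe_directory, model_dir):
    """Build the path to numeric.pegjs for one STT model directory.

    spe_directory: stands in for {SPE_directory} (value is an assumption)
    model_dir: e.g. "models_cs_cz_6" or "models_es_6" from the snippet
    """
    return (PurePosixPath(spe_directory) / "bsapi" / "stt" / "data"
            / model_dir / "grm" / "numeric.pegjs")


# "/opt/spe" is a hypothetical install location, used only for illustration
print(numeric_rules_path("/opt/spe", "models_cs_cz_6"))
```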

SID: Speaker Identification: Results Enhancement

language. We have never seen this data during SID training so it is a sensible thing to calibrate the system. Since there is only a single source of data (telephony) and only a single language (Wakandan), one can assume that it is enough to create a single profile and use it for both sides of the comparison. We are monitoring…

STT: Results explained

…a speaker does not pronounce a word correctly and the one-best results do not correspond to what the speaker actually said. Start and end times are in HTK units. 1 HTK unit equals 100 nanoseconds, so dividing the values by 10000 gives the time in milliseconds. Score is a rate of match with the acoustic and language model from…
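The unit conversion described above can be sketched in a few lines. The function names and the sample value are illustrative only; the only facts taken from the snippet are that 1 HTK unit is 100 nanoseconds and that dividing by 10000 yields milliseconds.

```python
HTK_UNITS_PER_MS = 10_000          # 1 ms = 1,000,000 ns = 10,000 x 100 ns
HTK_UNITS_PER_SECOND = 10_000_000  # 1 s = 1,000 ms


def htk_to_ms(htk_units):
    """Convert a time in HTK units (100 ns each) to milliseconds."""
    return htk_units / HTK_UNITS_PER_MS


def htk_to_seconds(htk_units):
    """Convert a time in HTK units (100 ns each) to seconds."""
    return htk_units / HTK_UNITS_PER_SECOND


# A made-up start time of 12,500,000 HTK units
print(htk_to_ms(12_500_000))       # 1250.0
print(htk_to_seconds(12_500_000))  # 1.25
```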

Understand SPE user accounts

…not visible to SPE or to the account. Similar trickery can be done with the data directory, allowing LID language models and language packs, SID speaker models, etc. to be shared between accounts. User accounts management: SPE user accounts can be managed using the REST API (see the Administration section of the API documentation), or using the command line administration utilities phxadmin or…

Understand SPE metafiles

…separate files. Another example would be the information about the content of a created LID language pack – if a LID language pack is successfully created, SPE creates a metafile named report, which contains detailed information about the source files used for the language pack creation. See the LID language pack creation REST endpoint documentation for more details about the report metafile content….