Search: model L

58 results

Understand SPE configuration file

…disabled by default. reporting.ssl.cipher_list # Set SSL cipher list (default: ALL:!COMPLEMENTOFDEFAULT:!eNULL) # For more details, see https://www.openssl.org/docs/man1.1.0/apps/ciphers.html # reporting.ssl.cipher_list = ALL:!COMPLEMENTOFDEFAULT:!eNULL # Set SSL cipher list (default: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH) # For more details, see https://www.openssl.org/docs/man1.0.2/apps/ciphers.html # reporting.ssl.cipher_list = ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH Sets list of ciphers to use when SSL is enabled for reporting. The list should use OpenSSL cipherlist format, see https://www.openssl.org/docs/man1.0.2/apps/ciphers.html (SPE…

Download Speech Platform

…XL5 Diarization (DIAR) – model XL4 Language Identification (LID) – model L4 Gender Identification (GID) – model XL5 Age Estimation (AGE) – model XL5 Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5 Speech Quality Estimation (SQE) Time Analysis Extraction (TAE) Waveform Denoiser (DENOISER) Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/) Step #2 – First start To get…

Releases and Changelogs (VIN)

…for Expert and Organization Voice Inspector 5.0 Voice Inspector 5.0.0, BSAPI 3.57.0 (2023-06-20) New: Speaker Identification XL5 technology model New: Data in lists/tables are now sorted alphabetically New: Enlarge the initial set of speakers included in examples; some of the speakers are multilingual ❗❗❗ Voice Inspector 5.0 requires a new license. To upgrade from version 4 or 3, please contact…

STT: Adding words to language model on the fly

Adding words to the STT language model on the fly is possible in SPE 3.45 or newer as part of the preferred phrases feature. The POST /technologies/stt or POST /technologies/stt/input_stream API calls actually serve two purposes: specifying the actual preferred phrases (in the phrases part) and specifying words to be added to the STT language model (in the dictionary part). Each part can be used independently,…

Arabic dialects in Phonexia LID and STT

…TEXT (used for STT language model training) MSA is used in all formal writing such as official correspondence, literature, newspapers and webpages, so it is easy to collect large amounts of text, but that text will be more formal and far from spontaneous speech Support for MSA in Phonexia products Name LID L4 STT Description Arabic (MSA) arb — Modern Standard Arabic,…

FAQs (PSP)

…FAQ Phonexia Browser, FAQ Speech Platform Permalink Q: What do LLR, LR and score mean? A: These abbreviations mean the following: LR – likelihood ratio, the result of a statistical test comparing two models. It returns a number which expresses how many times more likely the data are under one model than the other. LR takes values in the interval <0;+inf). LLR…

Understand SPE benchmark

…if such a directory is found, audio files from that directory are used (expecting that the audio contains speech in that corresponding language). If not found, it falls back to the default directory. The reason for language-specific data is that processing audio in a different language than the one for which the model was trained negatively affects the processing speed (basically, the processing…

Speech Engine update

…up to you, based on the actual content of the directory and your new package NOTE: If you created any user configuration files, or made any changes in configuration files, make sure to keep the respective .bs.usr or .bs files! If you created any customized STT language models using LMC, it’s recommended practice to recreate the STT model using the…

Understand SPE multithreaded technologies initialization

…of single-threaded initialization is that it may take longer to fully initialize the whole system, depending on the actual technologies configuration (number of initialized technologies and instances). In a multi-threaded configuration, instances of each technology are initialized in multiple parallel threads, one separate thread for each technology–model combination. This, in general, results in faster initialization of the whole system. On…

Speech To Text / Keyword Spotting supported languages

…(Ukraine) UK_UA_6 2023-04 8th gen. Standard Vietnamese (Vietnam) VI_VN_6 2021-10 8th gen. Standard Deprecated languages/models (not supported, after end-of-life) Older/other languages or models not listed in the above table are no longer supported and reached end-of-life. These are 1st, 2nd, 3rd or 4th generation models, typically marked with a number 1, 2, 3 or 4 in the model name. …

Understand SPE configuration

…server.port = 8600 # Server logging # Level (trace, debug, information, warning, error, fatal) server.logging.level = information # Destination (console, file, database) # Logging to database is supported only for MySQL server.logging.destination = file # Path to file where log is stored server.logging.file = ${application.dir}log/phxspe.log Did you notice the server.logging.file directive? It presents the first example of using variables in…
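The ${application.dir} reference in the snippet above can be illustrated with a minimal Python sketch of ${name} substitution. This is not SPE's own implementation: the function name expand_variables, the rule of leaving unknown names untouched, and the install directory /opt/spe/ are all assumptions made for illustration.

```python
import re

def expand_variables(value, variables):
    """Replace each ${name} in value with variables[name].

    Unknown names are left untouched (an assumption of this sketch,
    not documented SPE behaviour).
    """
    def substitute(match):
        name = match.group(1)
        return variables.get(name, match.group(0))
    return re.sub(r"\$\{([^}]+)\}", substitute, value)

# Hypothetical install directory, used only for this example.
variables = {"application.dir": "/opt/spe/"}
print(expand_variables("${application.dir}log/phxspe.log", variables))
# prints /opt/spe/log/phxspe.log
```

A regular expression is used here because Python's string.Template does not accept dots in placeholder names by default.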

KWS: Results explained

…before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – L_kw and L_bg (any speech = background). Raw score is calculated as log likelihood ratio (LLR): score = log_e(L_kw / L_bg) Confidence is calculated from the raw score using a sigmoid function: where:…
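The raw score described in the snippet above, the natural logarithm of the keyword likelihood divided by the background likelihood, can be sketched in Python as follows. The function name raw_score and the sample likelihood values are hypothetical. The score-to-confidence sigmoid is left out because its parameters are not shown in the snippet.

```python
import math

def raw_score(l_kw, l_bg):
    """Log likelihood ratio (natural log) of keyword vs. background likelihood."""
    if l_kw <= 0 or l_bg <= 0:
        raise ValueError("likelihoods must be positive")
    return math.log(l_kw / l_bg)

# Equal likelihoods give a score of zero.
print(raw_score(0.5, 0.5))  # 0.0

# A keyword model more likely than the background gives a positive score.
print(raw_score(0.02, 0.005) > 0)  # True
```

A positive score means the keyword model explains the audio better than the background model. A negative score means the opposite.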

Speaker Identification (SID)

…this Microsoft Excel sheet demonstrating the sigmoid function: Score-to-Confidence. SID evaluation Before implementing Speaker Identification, it’s important to evaluate its accuracy using real data from the production environment. To evaluate the SID system, you’ll need enough labeled data, i.e. recordings with speaker labels. The principle of SID system evaluation is to compare (voiceprints of) all the individual recordings against…

STT: What is Words-To-Numbers feature and how to use it

…numbers conversion is based on a set of grammar rules describing how the conversion should work. Conversion rules are stored in the numeric.pegjs file, located in the grm subdirectory inside the STT model directory. For example: in Czech 6th generation STT it’s located in {SPE_directory}/bsapi/stt/data/models_cs_cz_6/grm; in Spanish 6th generation STT it’s located in {SPE_directory}/bsapi/stt/data/models_es_6/grm Can it be extended or tuned? You can edit…

Language Identification (LID)

…Routing particular calls (languages) to human operators (language experts) Scoring and results The LID language pack defines a set of recognizable languages (represented by language models). When identifying the language in an audio recording (or languageprint), LID does the following: creates a languageprint of the recording (if the input is an audio recording) compares that languageprint with each language model in a…