FAQs (Browser)

…Browser. Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish, and many more. Q: What…

Key Features (PSP)

…– detects the audio part that contains voice; Speech Quality Estimation (SQE) – measures the quality of speech; Phoneme Recognizer (PHNREC) – several languages supported – converts speech into phonemes (written characters representing pronunciation); Waveform Denoiser (DENOISER) – automatically improves the audibility of speech for human listeners. Supported Languages: The LID, STT and KWS technologies support various languages as listed…

Speech to Text (STT)

…n-grams. Using this, the user can adapt a language model to a specific domain to get better results. Result types: During transcription, there are always several alternatives for a given speech segment. The technology can provide one or more results. The 1-best result type provides only the result with the highest score. Speech is returned in…
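The idea of picking the 1-best result from several alternatives can be sketched in a few lines of Python. This is only an illustration: the `Alternative` class, its `text` and `score` fields, and the `one_best` and `top_n` helpers are hypothetical names chosen for this example and are not part of the Speech Engine API or its output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alternative:
    """One hypothetical transcription alternative for a speech segment."""
    text: str
    score: float  # assumption: a higher score means a more likely alternative


def one_best(alternatives: List[Alternative]) -> Alternative:
    """Return only the alternative with the highest score (the 1-best result)."""
    if not alternatives:
        raise ValueError("no alternatives for this segment")
    return max(alternatives, key=lambda alt: alt.score)


def top_n(alternatives: List[Alternative], n: int) -> List[Alternative]:
    """Return up to n alternatives, ordered from highest to lowest score."""
    return sorted(alternatives, key=lambda alt: alt.score, reverse=True)[:n]


# Made-up alternatives for a single segment, for illustration only.
segment = [
    Alternative("recognize speech", 0.82),
    Alternative("wreck a nice beach", 0.11),
    Alternative("recognized speech", 0.07),
]
best = one_best(segment)
```

Keeping all alternatives and selecting at the end makes it easy to switch between a single result and several ranked results without re-processing the segment.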

Download Speech Platform

…issues and malfunctions, please take the free RAM requirement seriously. See also the additional information on the Recommended OS and HW page. While downloading, you can check the updates: Speech Engine changes and Browser changes. Speech Platform 3.60.1 for Windows 64-bit (4 GB); Speech Platform 3.60.1 for Linux 64-bit (4 GB). To keep the download size reasonable, the package includes…

Understand SPE benchmark

…SPE in the {SPE}/data/benchmark directory. The second option uses a single audio file of your choice uploaded to SPE storage, specified by the path parameter. The set of audio files supplied with SPE contains recordings of various lengths (from 30 seconds to 5 minutes) and with various speech/non-speech ratios. This is to account for the fact that both the length of…

STT: Language Model Customization tutorial

…copy of the word list file, as a backup) – see below for the best location for usage in Speech Engine. Using customized STT model in Speech Engine STT: To use a customized STT model in Speech Engine STT, it’s necessary to place the customized model in the correct location, so that Speech Engine can find it; register and enable the customized…

Recommended OS and HW (PSP)

Recommended operating systems: Windows 64-bit – Windows Server 2019 (*), latest version of Windows 10 (*); Linux 64-bit – latest version of RHEL/CentOS 7 (*). Compatible operating systems (**): 64-bit Windows 8.1, Windows Server 2016, and newer; 64-bit Linux with glibc >= 2.17, e.g. Ubuntu 20.04, Mint 19.3, RHEL/CentOS 8.2, … (*) Speech Platform components (e.g. Speech Engine) are…
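For the Linux requirement, one way to check the glibc version of a machine is sketched below. It relies on Python's standard `platform.libc_ver()`; the helper names `parse_version`, `glibc_is_compatible` and `local_glibc_version` are made up for this example, and the check covers only the glibc >= 2.17 condition quoted above, not any other requirement of Speech Platform.

```python
import platform
from typing import Tuple

MIN_GLIBC = (2, 17)  # minimum glibc version quoted in the requirement above


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '2.31' into a tuple such as (2, 31).

    Parsing stops at the first dot-separated piece that is not a plain number.
    """
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def glibc_is_compatible(version_text: str, minimum: Tuple[int, ...] = MIN_GLIBC) -> bool:
    """Return True if the given glibc version string meets the minimum."""
    version = parse_version(version_text)
    return bool(version) and version >= minimum


def local_glibc_version() -> str:
    """Return the glibc version reported for the running Python interpreter.

    platform.libc_ver() returns empty strings when glibc is not detected,
    for example on Windows, so the result may be ''.
    """
    _lib, version = platform.libc_ver()
    return version
```

Calling `glibc_is_compatible(local_glibc_version())` on the target machine gives a quick yes/no answer; an empty version string is treated as not compatible.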

Voice Inspector – supporting technologies

Phonexia Speech Engine

Phonexia Speech Engine (SPE) is the main part of Phonexia Speech Platform. SPE is a server application for 64-bit Linux or Windows, providing a REST API to the entire portfolio of Phonexia speech technologies.

SPE capabilities overview: audio files and stream processing

Technology                    Audio files   RTP / HTTP streams
Speaker Identification (SID)  ✓             ✓
Speech To Text (STT)          ✓             ✓
Keyword Spotting (KWS)        ✓…

Understand SPE technologies, instances and workers

Configuring Speech Engine to effectively utilize the full power of the underlying hardware can get challenging – one can easily get lost in all the strange terms like technologies, instances, slots, or workers… This article should shed some light on it. Speech Engine is like a post office: Thinking about Speech Engine, there is actually a very nice analogy with a post office…

Speech Quality Estimation (SQE)

Phonexia’s Speech Quality Estimation quantifies the acoustic quality of recordings. This helps the user to quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. In response to an SQE request, SPE returns a JSON/XML file. This file includes general information about the technology and statistics of all (one or two)…

LID: Terminology and adaptation

…Engine chapter for details. Using custom LID language pack in Speech Engine: To use a customized LID language pack in Speech Engine, it’s necessary to ensure that the language pack is placed in the correct location, so that Speech Engine can find it; register and enable the language pack in SPE using phxadmin. 1) Put the language pack in the correct location: In order…

Phonexia Partner Program for Government Partners

This partnership program rewards partners in the government sector for selling and integrating Phonexia’s speech recognition and voice biometrics product portfolio. Program Enrollment: If you aspire to become a Phonexia partner, you can enroll in the Phonexia Partner Program and complete a three-month onboarding period. During this period, you will enjoy the same…

Q: How do you calculate SNR in Speech Quality Estimation?

A: Signal-to-Noise Ratio (SNR) is an important indicator of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task. We use the fact that the statistical distribution of the frequencies in the waveform of speech follows a Gamma distribution. In contrast, noise…
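For orientation, the textbook definition of SNR in decibels is 10·log10(P_signal / P_noise). The sketch below computes only that definition from two already known power values. It is not Phonexia's estimator: the hard part described above, separating speech from noise automatically using their statistical distributions, is not shown, and the function name `snr_db` is chosen for this example.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Return the signal-to-noise ratio in decibels: 10 * log10(Ps / Pn).

    Both arguments are average powers in the same units and must be positive.
    """
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("signal_power and noise_power must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


# A signal 100 times more powerful than the noise corresponds to 20 dB.
example = snr_db(100.0, 1.0)
```

Equal signal and noise power gives 0 dB, and every tenfold increase in the power ratio adds 10 dB.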

SID: TUTORIAL: Speaker Identification – How to Do a Basic Test

…to download for commercial/research purposes under a Creative Commons 4.0 license. The data originates from the Oxford VGG VoxCeleb dataset, whose detailed license can be found here. SpeakerID Example Data Set v1.0 (83.89 MB). Publications: J. S. Chung, A. Nagrani, A. Zisserman, VoxCeleb2: Deep Speaker Recognition, INTERSPEECH, 2018. A. Nagrani, J. S. Chung, A. Zisserman, VoxCeleb: a large-scale speaker identification dataset, INTERSPEECH, 2017…