
Search: time unit

59 results

Phonexia Academy

About: The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies, so that they can sell more, deliver projects on time and at the highest quality, and support their clients effectively. We provide the following trainings: Phonexia technologies introduction (online video course); Technical Training Essentials (online video course); Technical Training Advanced – 2 courses: Voice Biometrics…

Q: What do LLR, LR and score mean?

A: These abbreviations mean the following: LR – likelihood ratio, the result of a statistical test comparing two models. It is a number that expresses how many times more likely the data are under one model than under the other. LR takes values in the interval [0, +inf). LLR – log-likelihood ratio, the logarithm of LR. LLR takes values in the interval…
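The relationship between LR and LLR described above can be sketched in a few lines of Python. This is only an illustration: it assumes the natural logarithm (the snippet does not say which base is used), and the function names are hypothetical, not part of any Phonexia API.

```python
import math

def lr_to_llr(lr: float) -> float:
    """Convert a likelihood ratio (LR >= 0) to a log-likelihood ratio.

    Assumption: natural logarithm; the source does not state the base.
    """
    if lr < 0:
        raise ValueError("LR must be non-negative")
    # math.log(0) raises ValueError, so map LR = 0 to -inf explicitly.
    return -math.inf if lr == 0 else math.log(lr)

def llr_to_lr(llr: float) -> float:
    """Convert a log-likelihood ratio back to a likelihood ratio."""
    return math.exp(llr)
```

An LR of 1 (the data are equally likely under both models) corresponds to an LLR of 0; LR values above 1 give a positive LLR and values below 1 give a negative LLR.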

Q: How do I get results for a pending operation?

…free time gaps until the server responds with a 303 – See Other status (the response body will contain the status “finished”). The “Location” HTTP header of this response holds the resource path for the operation. It is also possible to use the operation ID from the body of the response. The client then requests the resource “GET /done/{ID}”, where the final result…
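The polling flow described in this answer could look roughly like the Python sketch below. It is a minimal illustration under stated assumptions: `fetch` is a hypothetical callable standing in for whatever HTTP client is used (with automatic redirect following turned off), the path of the pending operation is whatever the server returned when the operation was started, and any status other than 303 is treated as "not finished yet", which the snippet does not confirm.

```python
import time
from typing import Callable, Dict, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Dict[str, str], str]

def wait_for_result(
    fetch: Callable[[str], Response],   # hypothetical HTTP GET helper
    pending_path: str,                  # path of the pending operation
    interval_s: float = 1.0,            # pause between polls
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Poll a pending operation until the server answers 303 See Other,
    then fetch the resource named in the Location header."""
    for _ in range(max_attempts):
        status, headers, _body = fetch(pending_path)
        if status == 303:
            # The Location header carries the path of the finished result.
            return fetch(headers["Location"])
        # Assumption: any other status means the operation is still running.
        sleep(interval_s)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of a particular HTTP library and makes it easy to exercise with a fake server.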

Download Speech Platform

…XL5; Diarization (DIAR) – model XL4; Language Identification (LID) – model L4; Gender Identification (GID) – model XL5; Age Estimation (AGE) – model XL5; Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5; Speech Quality Estimation (SQE); Time Analysis Extraction (TAE); Waveform Denoiser (DENOISER); Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/). Step #2 – First start: To get…

Phonexia technologies introduction

…and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender Identification (GID) Speech Analytics technologies 11:43 Speech Transcription (STT) 12:30 Keyword Spotting (KWS) 13:32 Phoneme Recognition (PHNREC) 13:54 Time Analysis…

STT: What is Preferred Phrases feature and how to use it

…from the preferred phrases and interpolate it in real time with the generic language model: P(word|history) = P_generic(word|history) + α·P_preferred(word|history). The preferred words and phrases are favored, while retaining the existing accuracy on common text. Preferred phrases in Speech Engine: Use the POST /technologies/stt or POST /technologies/stt/input_stream call to start transcription with a list of preferred phrases. To be precise, these actually…
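As a rough illustration of the interpolation formula quoted above, the following Python sketch combines the two model scores exactly as written in the snippet. The function name and the numbers are made up for the example, and the snippet does not say how α is chosen or whether the result is renormalized, so the sketch does neither.

```python
def interpolated_score(p_generic: float, p_preferred: float, alpha: float) -> float:
    """Combine generic and preferred-phrase model scores for one word.

    Implements P = P_generic + alpha * P_preferred as quoted in the text.
    Assumption: no renormalization is applied (the source does not mention one).
    """
    return p_generic + alpha * p_preferred

# Illustrative values only: a word that is rare in the generic model
# but present in the preferred phrases gets a noticeably higher score.
boosted = interpolated_score(p_generic=0.01, p_preferred=0.2, alpha=0.5)
unchanged = interpolated_score(p_generic=0.01, p_preferred=0.0, alpha=0.5)
```

Words that do not occur in the preferred phrases keep their generic score, which matches the statement that accuracy on common text is retained.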

STT: What is Words-To-Numbers feature and how to use it

…that would require retroactively changing text which was already output earlier… which is impossible. Alternatively, the output would have to be delayed somehow… which is, of course, undesirable in real-time stream processing. So, the best compromise is to keep the word-level outputs untouched and do the conversion only on the segment/sentence level. How does it work? The words to…

Waveform Denoiser (DENOISER)

…software cannot remove unwanted speech or music in the background. Denoiser is used to remove noise from the recording and, at the same time, to amplify the speech signal for: better intelligibility for human listeners (recommended use); achieving better results with automatic speech recognition technologies (this needs to be tested on customer data first). Input: audio file (format details – see…

Understand SPE user accounts

…other” accounts still need to register the file to be able to actually use it in SPE… otherwise, the file would be visible only to the account which originally uploaded it. This is because SPE keeps some file metadata (name, timestamps, …) in its database, and files without a database record (associating them with the SPE account) are…

Understand SPE audio converter

…WAVE file format” BSAPI exception is somewhat confusing here, since it’s actually a harmless error, meaning just that the format detection failed. However, since the converter is enabled, SPE called the converter, and the file was converted and successfully recognized afterwards – the response contains the converted file attributes. The second time, the file was uploaded to SPE with the converter disabled –…

Understand SPE metafiles

…i.e. should be handled by the application built on top of the SPE API. This includes handling of any metadata associated with the processed audio files, such as phone numbers, the source of the recording, the date/time the audio was recorded, references to the persons speaking in the recording (names, photos, …), languages spoken in the recording, etc. – all this data is expected…

STT: Adding words to language model on the fly

…i.e. you can specify only preferred phrases, or only add words to the dictionary, or use both features at the same time. Example of input for starting transcription, specifying two preferred phrases and two words to be added (one with explicitly specified pronunciation): { "preferred_phrases": { "phrases": [ { "phrase": "this is preferred phrase" }, { "phrase": "some other phrase" },…
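The visible part of this example input can be assembled programmatically, as in the Python sketch below. Only the `preferred_phrases` structure shown in the snippet is reproduced; the part of the body that adds words to the dictionary is cut off in the snippet and is deliberately left out rather than guessed. The helper name is hypothetical.

```python
import json
from typing import Dict, List

def build_preferred_phrases_body(phrases: List[str]) -> Dict:
    """Build the 'preferred_phrases' part of the request body.

    Mirrors the structure visible in the snippet:
    {"preferred_phrases": {"phrases": [{"phrase": "..."}, ...]}}
    The added-words section is truncated in the source and omitted here.
    """
    return {"preferred_phrases": {"phrases": [{"phrase": p} for p in phrases]}}

body = build_preferred_phrases_body(
    ["this is preferred phrase", "some other phrase"]
)
payload = json.dumps(body)  # JSON text to send as the request body
```

Generating the body with `json.dumps` also guarantees plain ASCII double quotes, which matters because typographic quotes copied from a web page are not valid JSON.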

Video – Speech Analytics technologies

MODULE 4: Speech Analytics technologies (23 min) Common generic rules for CLI, REST and GUI Speech To Text (STT) in CLI, REST and GUI Keyword Spotting (KWS) in CLI, REST and GUI Phoneme Recognizer (PHNREC) in CLI, REST and GUI Time Analysis Extraction (TAE) in CLI, REST and GUI Summary https://www.youtube.com/watch?v=-FAoRywqv7U…