Search: Browser

48 results

Releases and Changelogs (SPE)

…v1200, BSAPI 3.21.0
Added L4 model to GID and AGE technologies, i.e. they now also support SID4 L4 voiceprints [G#183]
Added silence detection in Dictate [G#182]
Added support for RLS capacities [G#137]
Added the possibility to specify multiple destinations in the server.logging.destination option [G#136]
Phonexia Browser configuration files are now included in data collected by the phxadmin –report command [G_BSAPI#401]
Fixed inability to…

Support Lifecycle Policy (PSP)

… S1
SID
XL3 | 2015-11 | 5th gen. SID
L3 | 2015-11 | 5th gen. SID
L2 | 2012-11 | 4th gen. SID
S2 | 2012-11 | 4th gen. SID
LID
L2 | 2012-11 | 4th gen. LID
S2 | 2012-11 | 4th gen. LID

Phonexia Browser
Version | Release Date | End of Support | Maintained Until | Release type
3.60 | 2023-12-05 | 2025-06-01 | n/a | Public…

Phonexia technologies introduction

…technologies
11:43 Speech Transcription (STT)
12:30 Keyword Spotting (KWS)
13:32 Phoneme Recognition (PHNREC)
13:54 Time Analysis Extraction (TAE)
14:22 Speech Platform architecture; Speech Engine, Phonexia Browser, Phonexia Voice Inspector brief
18:52 HW and SW requirements, typical deployment topologies
21:34 Supported file and stream formats, typical implementations and data flows
27:29 Licensing technical options
32:24 Summary, recommended next steps
https://youtu.be/DDu0Y1rgQ6k…

Key Features (PSP)

Phonexia Speech Platform is provided as a set of several components: The Speech Engine (SPE) component is a REST API that includes technologies for the automated processing of audio files and audio streams. This component is usually provided in a specific configuration that meets the customer’s use case. The Phonexia Browser component is an expert-level application (on the top of…

Q: What do LLR, LR and score mean?

A: These abbreviations mean the following: LR: likelihood ratio, the result of a statistical test comparing two models. It is a number that expresses how many times more likely the data are under one model than under the other. LR takes values in the interval [0, +inf). LLR: log-likelihood ratio, the logarithm of LR. LLR takes values in the interval…
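
The relationship between LR and LLR described in the answer above can be illustrated with a short Python sketch. It assumes the natural logarithm, since the snippet does not state the base, and the function names are hypothetical helpers, not part of any Phonexia API.

```python
import math


def lr_to_llr(lr: float) -> float:
    """Convert a likelihood ratio to a log-likelihood ratio (natural log assumed)."""
    if lr < 0:
        raise ValueError("LR must be non-negative")
    # math.log(0) raises an error, so map LR = 0 to negative infinity explicitly.
    return -math.inf if lr == 0 else math.log(lr)


def llr_to_lr(llr: float) -> float:
    """Invert the conversion: LR = exp(LLR)."""
    return math.exp(llr)
```

An LR of 1 (the data are equally likely under both models) maps to an LLR of 0. LR values above 1 give a positive LLR and values below 1 give a negative LLR.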

Q: What languages do you offer?

It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more…

Q: What are the supported audio formats?

Formats supported directly and natively are: WAVE (*.wav) container including any of: unsigned 8-bit PCM (u8), signed 16-bit PCM (s16le), IEEE float 32-bit (f32le), A-law (alaw), µ-law (mulaw), ADPCM; FLAC codec inside FLAC (*.flac) container; OPUS codec inside OGG (*.opus) container. Other audio formats must be converted to one of the natively supported formats using external tools. SPE server can be…
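
As an illustration of converting with an external tool, here is a minimal Python sketch that re-encodes a file as 16-bit PCM WAV. The choice of ffmpeg is an assumption, since the answer above does not name a tool, and both function names are hypothetical. Sample rate and channel count are left untouched because the snippet states no requirements for them.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes `src` as 16-bit PCM WAV."""
    if Path(dst).suffix.lower() != ".wav":
        raise ValueError("destination must be a .wav file")
    return [
        "ffmpeg",
        "-y",                  # overwrite the destination if it exists
        "-i", src,             # input in any format ffmpeg can decode
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM audio
        dst,
    ]


def convert_to_wav(src: str, dst: str) -> None:
    """Run the conversion; assumes ffmpeg is installed and on PATH."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Keeping the command builder separate from the call that runs it makes the arguments easy to inspect or log before any file is touched.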

Video – Getting started with SPE

MODULE 1: Getting started with Speech Engine (19 min)
Installation
Technologies configuration
Server and database configuration
Users configuration
Files processing
Synchronous and asynchronous requests, results polling
Stream processing
https://youtu.be/4qrB-GfFdWY…

FAQs (VIN)

…our Service Desk.
Q: I am getting the error message “Your license is not for this application.”
A: Check your license file (license.dat) by opening it in Notepad. Make sure the license contains records for all required modules. See the Licensing article for additional information.…

Q: What are the requirements for SID evaluation dataset?

To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated with a SID dataset. SID dataset (minimum requirements): To measure SID performance precisely, it is important to prepare the set of evaluation recordings very carefully. The requirements are: 50+ known speakers; 200+ recordings in total (i.e. 3 to 5 recordings per speaker*); 1+ minute of net speech…
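
The two requirements that are fully visible above (speaker count and total recording count) can be checked with a short Python sketch. The function name and the dictionary layout, which maps a speaker ID to that speaker's recordings, are assumptions for illustration. The net speech requirement is truncated in the snippet, so it is not checked here.

```python
from collections.abc import Mapping, Sequence

MIN_SPEAKERS = 50      # "50+ known speakers"
MIN_RECORDINGS = 200   # "200+ recordings in total"


def check_sid_dataset(recordings_by_speaker: Mapping[str, Sequence[str]]) -> list[str]:
    """Return the unmet requirements; an empty list means both checks pass."""
    problems = []
    n_speakers = len(recordings_by_speaker)
    n_recordings = sum(len(recs) for recs in recordings_by_speaker.values())
    if n_speakers < MIN_SPEAKERS:
        problems.append(f"only {n_speakers} speakers, need {MIN_SPEAKERS}+")
    if n_recordings < MIN_RECORDINGS:
        problems.append(f"only {n_recordings} recordings, need {MIN_RECORDINGS}+")
    return problems
```

For example, 50 speakers with 4 recordings each give exactly 200 recordings and pass both checks.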