Search: language detection

15 results

Phonexia technologies introduction

…and their usages Filtering and supporting technologies 04:32 Speech Quality Estimation (SQE) 05:27 Voice Activity Detection (VAD) 06:37 Diarization (DIAR) 07:41 Age Estimation (AGE) 08:14 Waveform Denoiser Voice Biometrics technologies 08:56 Speaker Identification (SID) 10:18 Language Identification (LID) 11:10 Gender Identification (GID) Speech Analytics technologies 11:43 Speech Transcription (STT) 12:30 Keyword Spotting (KWS) 13:32 Phoneme Recognition (PHNREC) 13:54 Time Analysis…

Speaker Identification

…in a recording are also unique, thus the technology can be language-, accent-, text-, and channel-independent. How does Speaker Identification work? Automatic speaker recognition systems extract the features from a voice to a voiceprint. A voiceprint is a small numerical representation of the voice, capturing the most unique characteristics of a speaker’s voice. The whole voice verification process consists of…

Download Speech Platform

…XL5 Diarization (DIAR) – model XL4 Language Identification (LID) – model L4 Gender Identification (GID) – model XL5 Age Estimation (AGE) – model XL5 Voice Activity Detection (VAD) – model GENERIC_3 and SID4_XL5 Speech Quality Estimation (SQE) Time Analysis Extraction (TAE) Waveform Denoiser (DENOISER) Phonexia Browser example audio (in ./BROWSER/example/ and ./SPE/bsapi/{technology}/example/) Step #2 – First start To get…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching Processing results can optionally be stored in a results cache database to speed up any later re-processing of the same recordings by the same technology…

Speaker Identification (SID)

…signal captured in a recording are also more or less unique, thus the technology can be language-, accent-, text-, and channel-independent. Automatic speaker recognition systems are based on the extraction of the unique features from voices and their comparison. The systems thus usually comprise two distinct steps: Voiceprint Extraction (Speaker enrollment) and Voiceprint comparison. The processing speed depends on the…

Releases and Changelogs (VIN)

…BSAPI 3.40.13 (2022-07-15) New: Added Romanian to the default set of available languages New: Added the ability to define custom language in the speaker metadata Fixed: When discarding a changed photo, the confirmation dialog “Do you want to save…” popped up infinitely Fixed: Missing file names when the SID Evaluator evaluates speakers from the workspace Fixed: Unwanted extra comparisons when…

Understand SPE technologies configuration file

…Diarization GID Gender Identification KWS Keyword Spotting KWS_STREAM Keyword Spotting Stream LIDC Language Identification Languageprint Comparator LIDE Language Identification Languageprint Extractor PHNREC Phoneme Recognition SID4C Speaker Identification 4 Voiceprint Comparator SID4C_STREAM Speaker Identification 4 Voiceprint Stream Comparator SID4CALIB Speaker Identification 4 VoicePrint Calibration SID4E Speaker Identification 4 Voiceprint Extractor SID4E_STREAM Speaker Identification 4 Voiceprint Stream Extractor SQE Speech Quality Estimation…

Voice Activity Detection (VAD)

Voice Activity Detection is a language-, domain- and channel-independent technology that distinguishes the parts of an audio recording that contain speech from those that do not. It creates labels for speech and other signals in the recording; these can then serve as a decision point for whether to process the recording by other technologies. VAD is usually part of a rapid filtration process in…

Releases and Changelogs (Browser)

…Compatibility with SPE 3.45 + all changes included in Feature Preview release 3.42 (see below) Phonexia Browser 3.42 Phonexia Browser 3.42.0, BSAPI 3.42.1 (2021-08-24) New: Server Information dialog New: Widget and dialog for managing language models New: Dialog for creating new language pack Improved: Language pack widget – add/remove language packs, show metafiles and language details Phonexia Browser 3.40 (Public…

Key Features (PSP)

…in the Languages Available section. Speech To Text (STT) and Keyword Spotting (KWS) languages Language Identification (LID) languages Supported Audio input The Speech Engine server supports various audio formats as listed in API reference > Audio requirements. It also supports the RTP/HTTP stream processing as listed in API reference > RTP/HTTP streams. The Speech Engine allows the usage of some…

STT: What is Preferred Phrases feature and how to use it

…a decoder. The decoder uses the information from the acoustic model, combines it with information from the language model recognition network (which describes the statistics of word grouping and sentences in a given language) and provides the transcription output. (See the Speech To Text article for more details about speech transcription principles.) When using preferred phrases, we build an additional language model…

SPE and Browser installation: standalone SPE

…merging the contents of two packages into one. Additional languages are provided upon request by a Phonexia sales representative. If you do not have the languages you want to test, contact our sales team to arrange cooperation. Download the files with the additional languages locally and unzip them. Then copy the additional languages over to where you saved the default Evaluation…

Release Notes

…use of our SPE component. LID language models and language packs management in Browser It allows users, for example, to easily customize the set of languages in LID language packs. Customers will benefit from increased precision of results by lowering the false positive scores on customer data. Available for all LID technological models. See Browser manual PDF for more details about…

Releases and Changelogs (SPE)

…when using JSON as input. Speech Engine 3.38 Speech Engine 3.38.0, DB v1700, BSAPI 3.38.0 (2021-02-25) New: Training of LID Language Packs (no more need for command line tools… finally!) New: LID Language Packs can now store meta-files New: New entity “LID Language Model” (equivalent of *.lpa LanguagePrint Archive) Improved: Updated STT model RU_RU_A to version 4.6.0 of (updated language