Search: language print

16 results

LID: Terminology and adaptation

…your audio files using GET /technologies/languageid/extractlp endpoint. Create a new (still empty) language model using POST /technologies/languageid/languagemodels/{name} endpoint. Upload a languageprint or languageprint archive file to the language model using POST /technologies/languageid/languagemodels/{name}/file endpoint; repeat this upload for all necessary files – e.g. when creating a completely new language model from your own audio files, this would be hundreds or thousands of files (see…
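The request sequence in this snippet can be sketched as a small planning helper. This is only an illustration under stated assumptions: the three endpoint paths come from the snippet, while the base URL, the function name, and the idea of one extraction and one upload per audio file are hypothetical. No request parameters or bodies are shown, because the snippet does not give them, and nothing is sent over the network.

```python
from urllib.parse import quote

# Path prefix taken from the endpoints quoted in the snippet.
LID_PREFIX = "/technologies/languageid"


def plan_language_model_requests(base_url, model_name, file_count):
    """Return the ordered (method, url) pairs for building a language model.

    Assumption: one extractlp call and one upload per audio file.
    """
    root = base_url.rstrip("/") + LID_PREFIX
    model_url = f"{root}/languagemodels/{quote(model_name, safe='')}"

    plan = []
    # Step 1: extract a languageprint from each audio file.
    plan += [("GET", f"{root}/extractlp")] * file_count
    # Step 2: create the new, still empty, language model.
    plan.append(("POST", model_url))
    # Step 3: upload each languageprint (or archive) to the model.
    plan += [("POST", f"{model_url}/file")] * file_count
    return plan


if __name__ == "__main__":
    # "http://localhost:8600" and "my_language" are placeholder values.
    for method, url in plan_language_model_requests(
        "http://localhost:8600", "my_language", 2
    ):
        print(method, url)
```

Keeping the plan separate from any HTTP client makes the ordering easy to inspect before real requests, with whatever authentication and parameters the server actually requires, are issued.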

Releases and Changelogs (SPE)

…messages Workers and streams info in debug log messages New: Possibility to obtain information about input RTP connection (see GET /input_stream/rtp/info) New: Endpoint to get languageprint information (see POST /technologies/languageid/lpinfo) Improved: Result of languageprint extraction now contains speech length for each languageprint (see GET /technologies/languageid/extractlp) Improved: Output RTP packet payload size changed from 480 to 160 bytes Fixed: SSRC in…

Language Identification (LID)

…Routing particular calls (languages) to human operators (language experts). Scoring and results: The LID language pack defines a set of recognizable languages (represented by language models). When identifying the language in an audio recording (or languageprint), LID does the following: creates a languageprint of the recording (if the input is an audio recording); compares that languageprint with each language model in a…

SPE and Browser installation: standalone SPE

…Keyword Spotting Stream [disabled] 8) Language Identification LanguagePrint Comparator [disabled] 9) Language Identification LanguagePrint Extractor [disabled] 10) Speaker Identification 4 VoicePrint Extractor [disabled] 11) Speaker Identification 4 VoicePrint Comparator [disabled] 12) Speaker Identification 4 VoicePrint Calibration [disabled] 13) Speaker Identification 4 VoicePrint Stream Extractor [disabled] 14) Speaker Identification 4 VoicePrint Stream Comparator [disabled] 15) Speech Quality Estimation [disabled] 16) Speech…

FAQs (PSP)

…may use them for both training a new language pack and testing/comparing against an existing language pack. The language-prints need to be compatible only with the model of LID used for language-print extraction. Q: What are the recommendations for LID adaptation set? A: The following is recommended: For adding a new language to a language pack: 20+…

Understand SPE technologies configuration file

…Diarization; GID: Gender Identification; KWS: Keyword Spotting; KWS_STREAM: Keyword Spotting Stream; LIDC: Language Identification Languageprint Comparator; LIDE: Language Identification Languageprint Extractor; PHNREC: Phoneme Recognition; SID4C: Speaker Identification 4 Voiceprint Comparator; SID4C_STREAM: Speaker Identification 4 Voiceprint Stream Comparator; SID4CALIB: Speaker Identification 4 VoicePrint Calibration; SID4E: Speaker Identification 4 Voiceprint Extractor; SID4E_STREAM: Speaker Identification 4 Voiceprint Stream Extractor; SQE: Speech Quality Estimation…

Release Notes

…use of our SPE component. LID language models and language packs management in Browser It allows users to e.g. easily customize the set of languages in LID language packs. Customers will benefit from increased precision of results by lowering the false positive scores on customer data. Available for all LID technological models. See Browser manual PDF for more details about…

Speaker Identification (SID)

…technological model and can range from 5 to 50 times faster than real time on 1 server CPU core. Voiceprint extraction is the most time-consuming part of the process. Voiceprint comparison, on the other hand, is extremely fast – millions of voiceprint comparisons can be done in 1 second. Voiceprint extraction (speaker enrollment): Speaker enrollment starts with the extraction…

Q: Do the language-prints (LPs) extracted from audio sources depend on the currently available language pack?

A: The language-prints do not depend on the current language pack used. You may use them for both training a new language pack and testing/comparing against an existing language pack. The language-prints need to be compatible only with the model of LID used for language-print extraction….

Understand SPE database

…voiceprints – voiceprint data, technology model used to create the voiceprint, speaker model to which the voiceprint belongs (speaker model voiceprints), calibration set to which the voiceprint belongs (FAR calibration set voiceprints) rest_model_sid_calib_voiceprint SID speaker model voiceprints calibrated to FAR – voiceprint data, speaker model, technology model used to create the voiceprint, max. FAR, calibration set used to calibrate the…

SID: Speaker Identification: Results Enhancement

…(identified by its hash) and a voiceprint (identified by the SHA-256 hash). If the same voiceprint with the same profile is compared against any other voiceprint(s) again, the coefficients are loaded from the cache instead of being computed again. If the capacity of the cache is reached, the least recently used coefficients are removed from the cache. The cache is…
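The caching behaviour described in this snippet (lookup by profile hash plus voiceprint SHA-256, with least-recently-used eviction at capacity) can be illustrated with a minimal sketch. It is not the product's implementation: the class name, the `compute` callback, and the use of raw bytes for the voiceprint are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict


class CoefficientCache:
    """LRU cache keyed by (profile hash, SHA-256 of the voiceprint bytes)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get_or_compute(self, profile_hash, voiceprint, compute):
        """Return cached coefficients, or call compute(voiceprint) on a miss."""
        key = (profile_hash, hashlib.sha256(voiceprint).hexdigest())
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = compute(voiceprint)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value

    def __len__(self):
        return len(self._entries)
```

A second call with the same profile hash and the same voiceprint bytes returns the stored value without invoking `compute` again; once more than `capacity` distinct keys have been stored, the entry that was used longest ago is dropped first.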

Understand SPE directory structure

…for individual models settings BSAPI configuration files (*.bs) and optionally manually created user configs (*.bs.usr). There is one exception – LID – which has two additional directories containing pre-built languageprint archives (*.lpa) and language packs: lprints and models. Schemes below show examples of directories for GID (Gender Identification), STT (Speech To Text) and LID (Language Identification): – GID and LID…

Q: What are the recommendations for LID adaptation set?

A: The following is recommended: For adding new language to language pack: 20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech); only 1 language per record. For adapting the existing language model (discriminative training): 10+ hours of audio for each language. May be done on customer site. May be done in…

Releases and Changelogs (Browser)

…Compatibility with SPE 3.45 + all changes included in Feature Preview release 3.42 (see below) Phonexia Browser 3.42 Phonexia Browser 3.42.0, BSAPI 3.42.1 (2021-08-24) New: Server Information dialog New: Widget and dialog for managing language models New: Dialog for creating new language pack Improved: Language pack widget – add/remove language packs, show metafiles and language details Phonexia Browser 3.40 (Public…

Voice Inspector – supporting technologies
