Search Results for: language model customization

Q: How can I add a new language to LID?

Relevance: 16%      Posted on: 2017-06-27

A: There are multiple methods for training a new language; please see the article in Components > Speech Technologies > LID.

Speech Intelligence Resolver v1

Relevance: 15%      Posted on: 2017-05-18

About
Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically visualizes recordings and efficiently filters the speech metadata extracted from them.
Speech technologies implemented:
  Phonexia Speaker Identification (SID2)
  Phonexia Language Identification (LID2)
  Phonexia Gender Identification (GID)
  Phonexia Voice Activity Detection (VAD)
  Phonexia Speaker Diarization (DIAR)
  Phonexia Keyword Spotting (KWS)
  Phonexia Speech Quality Estimator (SQE)
  Phonexia Speech Transcription (STT)
SIR is a client application cooperating with REST servers. It can be used as a standalone application thanks to the integrated local REST server. It was…

Arabic dialects in Phonexia LID and STT

Relevance: 10%      Posted on: 2021-01-18

The Arabic language has (a) one standardised variety, and (b) many non-standard varieties (dialects). In this article, our linguistic team explains the differences between Modern Standard Arabic and Arabic dialects in the context of Phonexia Arabic models.
Standard variety: Modern Standard Arabic (MSA)
  All Arabs learn it at school (not from their parents, so we cannot say it is their native variety)
  It is the lingua franca (common language) of the Arabic world – like English for Europeans; however, Arabs speak it much better since they are schooled in MSA from an early age
  MSA is more similar to some dialects (e.g. Levantine), but…

Speech To Text results explained

Relevance: 8%      Posted on: 2019-05-27

This article gives more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides various output types which show either a single transcription or multiple transcription alternatives. For processing realtime streams, two result modes are supported – one mode provides the complete transcription, the other provides incremental results. Output types…

Q: Please give me a recommendation for an LID adaptation set.

Relevance: 7%      Posted on: 2017-06-27

A: The following is recommended:
For adding a new language to a language pack:
  20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech)
  Only 1 language per record
For adapting an existing language model (discriminative training):
  10+ hours of audio for each language
  May be done on the customer site.
  May be done at Phonexia using anonymized data (= language-prints extracted from .wav audio)
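
The recommendation above can be turned into a quick pre-check of a collected data set. The following is a minimal sketch, not a Phonexia tool: the class and function names are hypothetical, and it assumes that "20+ hours" for a new language means net speech (which matches 25 hours of audio at 80% speech), while the "10+ hours of audio" for adaptation is taken literally as total audio duration.

```python
from dataclasses import dataclass

# Thresholds copied from the recommendation above.
NEW_LANGUAGE_MIN_SPEECH_HOURS = 20.0   # "20+ hours", read here as net speech (assumption)
ADAPTATION_MIN_AUDIO_HOURS = 10.0      # "10+ hours of audio for each language"


@dataclass
class LanguageData:
    """Hypothetical summary of the audio collected for one language."""
    language: str
    audio_hours: float   # total duration of all recordings
    speech_ratio: float  # fraction of the audio that is speech (0.0 to 1.0)

    @property
    def speech_hours(self) -> float:
        # Net speech = total audio scaled by the speech fraction.
        return self.audio_hours * self.speech_ratio


def enough_for_new_language(data: LanguageData) -> bool:
    """True if the set reaches the net-speech amount assumed for a new language."""
    return data.speech_hours >= NEW_LANGUAGE_MIN_SPEECH_HOURS


def enough_for_adaptation(data: LanguageData) -> bool:
    """True if the set reaches the audio amount recommended for adaptation."""
    return data.audio_hours >= ADAPTATION_MIN_AUDIO_HOURS


if __name__ == "__main__":
    sample = LanguageData("xx", audio_hours=30.0, speech_ratio=0.8)
    print(sample.language, enough_for_new_language(sample), enough_for_adaptation(sample))
```

The "only 1 language per record" requirement cannot be checked from durations alone and is left out of this sketch.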

Browser3 – Releases and Changelogs

Relevance: 5%      Posted on: 2020-10-23

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases.
Releases Changelogs
Phonexia Browser v3.35.2, BSAPI 3.35.2 - Oct 21 2020
  Public release
  Fixed: Speaker identification dialog in WaveEditor which did not work for SID4
  Fixed detection of certain USB license tokens
Phonexia Browser v3.35.0, BSAPI 3.35.0 - Oct 02 2020
  Public release
  New: Compatibility with SPE 3.35
Phonexia Browser v3.30.12, BSAPI 3.30.11 - Aug 20 2020
  Public release
  Fixed: Transcription results intermittently displays words in wrong…

Speaker Identification (SID)

Relevance: 5%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or to two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations.
Basic use cases and application areas
The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Speaker Identification: Results Enhancement

Relevance: 5%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by using Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence.
Audio Source Profile
An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Keyword Spotting results explained

Relevance: 5%      Posted on: 2019-06-12

This article gives more details about Keyword Spotting outputs and hints on how to tailor Keyword Spotting to best suit your needs.
Scoring
Keyword Spotting works by calculating the likelihood that a keyword occurs at a given spot and the likelihood that just any other speech occurs there, and comparing those two likelihoods. The following scheme shows a Background model for anything before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – Lkw and Lbg (any speech = background). Raw score is calculated as log likelihood…
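
The comparison of the two likelihoods can be illustrated with a short sketch. The snippet above is cut off before it gives the actual formula, so the log-likelihood ratio used here, log(Lkw / Lbg), is an assumption about a common way to compare two likelihoods and not necessarily the exact Phonexia definition; the function name is hypothetical.

```python
import math


def raw_score(l_kw: float, l_bg: float) -> float:
    """Assumed raw score: log-likelihood ratio of keyword vs. background.

    l_kw is the likelihood from the keyword model (Lkw), l_bg the likelihood
    from the parallel background model (Lbg). Both must be positive.
    """
    if l_kw <= 0.0 or l_bg <= 0.0:
        raise ValueError("likelihoods must be positive")
    # log(Lkw / Lbg), computed as a difference of logs for numerical stability.
    return math.log(l_kw) - math.log(l_bg)


if __name__ == "__main__":
    # A positive value means the keyword model explains the audio better
    # than the background model; a negative value means the opposite.
    print(raw_score(0.02, 0.005))
    print(raw_score(0.001, 0.01))
```

Under this assumption, a score of zero means both models explain the audio equally well.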

Packages, Updates vs. Upgrades

Relevance: 4%      Posted on: 2018-04-15

Our packages follow the bugfix / updates / upgrades approach. Some packages are distributed with a limited set of speech technologies or without speech technologies.
Packages
Our software is distributed as a ZIP file. Installation is a matter of unzipping the archive, reconfiguring, and starting the software. The SPE and VIN packages contain speech technologies (note: SPE might contain only selected technologies). PhxBrowser does not contain speech technologies and needs to be combined with SPE. The software is activated by a license file.
Updates vs. Upgrades
Bugfix
By bugfix we understand a fix of a known problem without changing components or technology models. Bugfix changes only…