Search Results for: score transformation

Browser3 – Releases and Changelogs

Relevance: 8%      Posted on: 2019-10-09

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs Phonexia Browser v3.18.0, BSAPI 3.22.0 - Oct 03 2019 New: Waveform editor can now process a stereo file by Diarization in per-channel mode New: Added Gender balance and Score sharpness in Settings -> Scoring New: Multiple columns in Result pane can be turned on/off at once using context menu New: Minimum speech length changed to 7 seconds Fixed: LID results information chart is not updated…

Terminology

Relevance: 8%      Posted on: 2017-06-15

A document that briefly describes processes and relations in Phonexia technologies, with attention to correct word usage. SID - Speaker Identification technology (about SID technology), which recognizes the speaker in the audio based on the input data (usually a database of voiceprints). XL3, L3, L2, S2 - Technology models of SID. Speaker enrollment - The process in which the speaker model is created (usually a new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…

VIN3 – Releases and Changelogs

Relevance: 8%      Posted on: 2018-04-08

Phonexia Voice Inspector v3 (VIN) is developed as a desktop application on top of Phonexia BSAPI. This page lists changes in VIN releases. Releases Changelogs Voice Inspector v3.2.2, BSAPI 3.15.0 - Jun 5 2018 - Fixed possible application crash on Windows - Added phoneme type 'affricate' and fixed phoneme types: * phoneme 'C' changed from 'fricative' to 'affricate' * phoneme 'D' changed from 'fricative' to 'plosive' * phoneme 'T' changed from 'fricative' to 'plosive' * phoneme 'c' changed from 'plosive' to 'affricate' Voice Inspector v3.2.1, BSAPI 3.15.0 - Mar 16 2018 - Export of Speakers/Populations allows export only voiceprints -…

Language Identification (LID)

Relevance: 8%      Posted on: 2019-05-20

Phonexia Language Identification (LID) helps you distinguish the spoken language or dialect. It enables your system to automatically route valuable calls to your experts in the given language or to send them to other software for analysis. Phonexia uses state-of-the-art language identification (LID) technology based on iVectors that were introduced by NIST (National Institute of Standards and Technology, USA) during the 2010 evaluations. The technology is independent of any text, language, dialect, or channel. This highly accurate technology uses the power of voice biometrics to automatically recognize spoken language. Application areas Preselecting multilingual sources and routing audio streams/files…

Age Estimation

Relevance: 8%      Posted on: 2018-04-12

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording. The process of voiceprint extraction is similar to that of SID, but different features are extracted; therefore, the voiceprints extracted by AGE and SID are not mutually compatible. Technology Trained with emphasis on spontaneous telephony conversation The technology is language-, accent-, text-, and channel-independent Compatibility with the widest range of audio sources possible (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Input format for processing: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8kHz+ sampling…

Time Analysis

Relevance: 8%      Posted on: 2018-04-15

Time Analysis Extraction (TAE) by Phonexia extracts basic information from the dialogue in a recording, providing essential knowledge about conversation flow. That makes it easy to identify long reaction times, crosstalk, or responses of speakers in both channels. This technology is only meaningful when used on recordings with 2 channels. In response to a TAE request, SPE returns a JSON/XML file. This file includes general information about the technology and details of the time analysis. The technology can work either with a closed recording or with a stream. Monologue Describes the statistics of a recording related to one channel. channel…
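
The reaction-time and crosstalk notions mentioned in this excerpt can be illustrated with a minimal sketch. It is not Phonexia's implementation and does not use the SPE API or its output format: the `Segment` type, both function names, and the assumption that each channel is given as a list of non-overlapping speech segments in seconds are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: one (start_s, end_s) pair per speech segment.
Segment = Tuple[float, float]


def crosstalk_seconds(a: List[Segment], b: List[Segment]) -> float:
    """Total time during which both channels contain speech.

    Assumes segments within one channel do not overlap each other.
    """
    total = 0.0
    for a_start, a_end in a:
        for b_start, b_end in b:
            overlap = min(a_end, b_end) - max(a_start, b_start)
            if overlap > 0:
                total += overlap
    return total


def reaction_times(a: List[Segment], b: List[Segment]) -> List[float]:
    """For each segment end in channel a, the gap until channel b next starts speaking.

    Segment ends in a that are not followed by any segment start in b are skipped.
    """
    gaps = []
    for _, a_end in a:
        later_starts = [b_start for b_start, _ in b if b_start >= a_end]
        if later_starts:
            gaps.append(min(later_starts) - a_end)
    return gaps


channel_a = [(0.0, 2.0), (5.0, 7.0)]
channel_b = [(2.5, 4.0), (6.5, 9.0)]
print(crosstalk_seconds(channel_a, channel_b))  # 0.5
print(reaction_times(channel_a, channel_b))     # [0.5]
```

In this example the second speaker answers 0.5 s after the first segment ends, and the two channels overlap for 0.5 s between 6.5 s and 7.0 s.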

SPE3 – Releases and Changelogs

Relevance: 8%      Posted on: 2019-10-02

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). This page lists changes in SPE releases. Releases Changelogs == SPE v3.18.x == Speech Engine 3.18.2 (10/14/2019) - DB v1300, BSAPI 3.22.1 Fixed: Customized STT model fails on Windows with "Request for next state but ending state reached." error message Speech Engine 3.18.1 (10/01/2019) - DB v1300, BSAPI 3.22.0 New: DICTATE technology has been renamed to STT_STREAM (/technologies/dictate -> /technologies/stt/stream) (for backward compatibility, the /technologies/dictate endpoint is internally redirected) New: SID/SID4…

Speaker Identification (SID)

Relevance: 8%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or to two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Speaker Identification: Results Enhancement

Relevance: 8%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by the use of Audio Source Profiles, which represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust to such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Speech Quality Estimation

Relevance: 8%      Posted on: 2018-04-02

Speech Quality Estimation (SQE) is a language-, domain- and channel-independent technology that quantifies the quality of an audio recording. The two most important statistics on which it bases its score are the SNR (signal-to-noise ratio) and the bitrate of the recording. SQE is usually part of a rapid filtration process in deployment. SQE also measures over 20 other properties of the recording, all of which can be found in the output file and further processed. See description in SPE documentation. Typical use cases are: verification of recordings' quality on the input, searching based on quality of the recording, noise of environment or speaker's…
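
As background for the SNR statistic named in this excerpt, the sketch below shows the conventional decibel definition, 10 · log10(P_signal / P_noise). It illustrates the general formula only and makes no claim about how SQE measures or weights SNR; the function name `snr_db` and its power arguments are hypothetical.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels from mean signal and noise power.

    Generic textbook definition; not taken from the SQE implementation.
    """
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("both powers must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


# A signal 100 times more powerful than the noise corresponds to 20 dB.
print(snr_db(100.0, 1.0))
```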