Search Results for: STT performance

Sizing of the computing units for speech technologies

Relevance: 8%      Posted on: 2018-02-02

Best practices for sizing hardware for Phonexia technologies depend on a few facts. Intensive work with large data sets requires good performance and bandwidth between RAM and CPU. Much depends on the size of the files holding the technology model data, which are usually loaded into RAM and used intensively for computing operations. Always count only the physical CPU cores (features such as HT and VT do not improve performance). Also look for CPUs with a large L3 cache: the better CPUs are those with a higher l3_cache_size/#_of_physical_CPU_cores ratio. We currently assume that CPUs from the current Intel Xeon Family in the 4th generation…
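The L3-cache-per-physical-core ratio mentioned above can be compared across candidate CPUs with a few lines of Python. This is only a rough sketch: the `CpuSpec` type, the `rank_by_l3_per_core` helper, and the two CPU entries are hypothetical placeholders, not real products or Phonexia recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    """Hypothetical description of a candidate CPU (values are made up)."""
    name: str
    l3_cache_mb: float
    physical_cores: int  # physical cores only; logical (HT) cores are ignored

    @property
    def l3_per_core_mb(self) -> float:
        """L3 cache size divided by the number of physical cores."""
        if self.physical_cores <= 0:
            raise ValueError("physical_cores must be positive")
        return self.l3_cache_mb / self.physical_cores


def rank_by_l3_per_core(cpus):
    """Return the CPUs sorted from highest to lowest L3-per-core ratio."""
    return sorted(cpus, key=lambda cpu: cpu.l3_per_core_mb, reverse=True)


# Placeholder candidates, for illustration only.
candidates = [
    CpuSpec(name="cpu-a", l3_cache_mb=32.0, physical_cores=16),
    CpuSpec(name="cpu-b", l3_cache_mb=24.0, physical_cores=8),
]

for cpu in rank_by_l3_per_core(candidates):
    print(f"{cpu.name}: {cpu.l3_per_core_mb:.2f} MB of L3 per physical core")
```

In this made-up comparison the CPU with fewer cores ranks first, because each of its cores has more L3 cache available.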

Browser3 – Releases and Changelogs

Relevance: 8%      Posted on: 2020-08-21

Phonexia Browser v3 (Browser3) is developed as a client on top of Phonexia Speech Engine v3. Phonexia Browser is the successor of Phonexia Speech Intelligence Resolver v1 (SIR1). This page lists changes in Browser releases. Releases Changelogs
Phonexia Browser v3.30.12, BSAPI 3.30.11 - Aug 20 2020, Public release. Fixed: Transcription results intermittently display words in wrong order. Versions 3.30.9, 3.30.10 and 3.30.11 were skipped.
Phonexia Browser v3.31.3, BSAPI 3.30.11 - Aug 20 2020, Non-public Feature Preview release. Fixed: Transcription results intermittently display words in wrong order.
Phonexia Browser v3.31.2, BSAPI 3.31.0 - Jul 24 2020, Non-public Feature Preview release. Fixed: STT

Speech Intelligence Resolver v1

Relevance: 7%      Posted on: 2017-05-18

About: Phonexia Speech Intelligence Resolver v1 (SIR1) combines the power of speech technologies within a single application. The application automatically visualizes recordings and efficiently filters the speech metadata extracted from them. Speech technologies implemented: Phonexia Speaker Identification (SID2), Phonexia Language Identification (LID2), Phonexia Gender Identification (GID), Phonexia Voice Activity Detection (VAD), Phonexia Speaker Diarization (DIAR), Phonexia Keyword Spotting (KWS), Phonexia Speech Quality Estimator (SQE), Phonexia Speech Transcription (STT). SIR is a client application that cooperates with REST servers. It can be used as a standalone application thanks to the integrated local REST server. It was…

How to configure Speech Engine workers

Relevance: 7%      Posted on: 2020-03-28

A worker is a working thread that performs the actual processing of files or realtime streams in Speech Engine. This article explains the Speech Engine workers and how to configure them for optimal performance and server utilization. The default workers configuration in settings/phxspe.properties is shown below: 8 workers for file processing and 8 workers for realtime stream processing. These numbers are the maximum numbers of simultaneously running tasks.
    # Multithread settings
    server.n_workers = 8
    server.n_realtime_workers = 8
Requests for additional file processing tasks are put in a queue and processed according to their order and priorities. Requests for…
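To check the worker limits programmatically, the two properties quoted above can be read with a small parser. The following is a minimal sketch, not part of Speech Engine: `parse_properties` is a hypothetical helper that handles only simple `key = value` lines and comments, and the sample text mirrors the default values from the snippet.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple 'key = value' lines, skipping blanks and comment lines."""
    props: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        props[key.strip()] = value.strip()
    return props


# Default values as quoted from settings/phxspe.properties in the article.
SAMPLE = """\
# Multithread settings
server.n_workers = 8
server.n_realtime_workers = 8
"""

props = parse_properties(SAMPLE)
file_workers = int(props["server.n_workers"])
realtime_workers = int(props["server.n_realtime_workers"])

print(f"file processing workers:   {file_workers}")
print(f"realtime stream workers:   {realtime_workers}")
```

To inspect a real installation, the same function could be fed the contents of settings/phxspe.properties instead of `SAMPLE`, assuming the file contains only lines in this simple form.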

Speech To Text results explained

Relevance: 6%      Posted on: 2019-05-27

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides various output types which show either a single transcription alternative or multiple alternatives. For processing realtime streams, two result modes are supported: one mode provides the complete transcription, the other provides incremental results. Output types…

What is a user configuration file and how to use it

Relevance: 6%      Posted on: 2020-03-28

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to adapt to the nature of their audio data. Modifying the original BSAPI configuration files directly can be dangerous: inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore a working state. User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files. WARNING: Inappropriate configuration changes may cause serious issues! Make sure you really know what you are doing. A user configuration file is a…

Terminology

Relevance: 6%      Posted on: 2017-06-15

A document that briefly describes processes and relations in Phonexia Technologies, with attention to correct word usage.
SID - Speaker Identification Technology (about SID technology), which recognizes the speaker in the audio based on the input data (usually a database of voiceprints).
XL3, L3, L2, S2 - Technology models of SID.
Speaker enrollment - The process in which the speaker model is created (usually a new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…

Voice Biometrics Course (technical training)

Relevance: 5%      Posted on: 2017-05-18

The Voice Biometrics course consists of the following modules. Please ask your Phonexia contact for a detailed description. (YES = this part of the course is mandatory)
VBS course | Required time [h] | Block name | Block description
YES | 0,5 | Intro & Phonexia Portfolio | Intro & Phonexia Portfolio
YES | 0,5 | Project focus - Explain basic needs | Partner project related discussion focused mainly on finalizing training topics and agenda
YES | 0,75 | Apps Designing and Developing - Licensing | Gives the trainee knowledge about types of licensing, and how to use the license file
YES | 0,75 | Technologies - Data gathering and Quality measurement - basic | Data gathering…

Speech Analytics Course (technical training)

Relevance: 5%      Posted on: 2017-05-18

The Speech Analytics course consists of the following modules. Please ask your Phonexia contact for detailed description. (YES = this part of the course is obligatory)
SAL course | Required time [h] | Block name | Block description
YES | 0,5 | Intro & Phonexia Portfolio | Intro & Phonexia Portfolio
YES | 0,5 | Project focus – Explain basic needs | Discussion of partner project focused mainly on finalizing the training topics and agenda.
YES | 0,75 | Application Design & Development – Licensing | Presentation of types of licensing, and how to use the license file.
YES | 0,75 | Technologies – Data gathering and Quality measurement – basic | Description of…

SPE configuration

Relevance: 5%      Posted on: 2018-02-02

Basic explanation of configuration directives for SPE with hints & tips. Overview of phxspe.properties for beginners.