Search: train

42 results

Speech to Text (STT)

…1 CPU core (e.g. a standard 8-core CPU server running 8 instances of STT can process 1010 hours of audio in 1 day of computing time, assuming flat load and depending on the technology model). Supported languages: List of supported languages. Acoustic models: An acoustic model is created by training on training data. It includes characteristics of the voices of a set of speakers provided…

LID: Terminology and adaptation

…/path/to/lid/settings/lid_l4.bs -l MyLanguagePack.txt -train -M bsapi/lid/models/l4_MyLanguagePack where: the -v parameter tells the tool to provide verbose console output; the -c parameter specifies the path to the .bs BSAPI configuration file for LID (use the suffix matching the LID technology model you are using: “l4”, “l3”, “xl3”); the -l parameter specifies the path to the input listfile created in the previous step; the -train parameter tells the tool to train…

Arabic dialects in Phonexia LID and STT

…difficulty in collecting spontaneous speech in dialect. It might be tricky to create annotations for STT training: dialect speakers write words down as they hear them but, given the missing standard for writing, different speakers can write the same words in different ways, i.e. annotations in dialect need to be double-checked and unified. TEXT (used for STT language model training)…

Phonexia Academy

About: The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies. Sell more, deliver your projects on time and at the highest quality, and support your clients effectively. We provide the following trainings: Phonexia technologies introduction (online video course); Technical Training Essentials (online video course); Technical Training Advanced – 2 courses: Voice Biometrics…

Phonexia Partner Program for Government Partners

…the Starter Kit during the onboarding period? Yes, the Starter Kit can be purchased anytime during our cooperation. Can I purchase the Starter Kit multiple times? Yes, for each project, proof of concept, and product line, you can purchase a Starter Kit again. Phonexia consultants can’t wait to support your business. How do you deliver technical training? Phonexia technical training…

Understand SPE technologies, instances and workers

…post office staffing and the Speech Engine workers configuration: Some post office workers are trained only for certain types of services (e.g. postal services), while others are trained for other services (e.g. financial services). Speech Engine has separate workers for file processing and for realtime stream processing. They cannot provide other types of services than those which they were trained…

Age Estimation (AGE)

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording or a voiceprint. Technology: Trained with emphasis on spontaneous telephony conversation. The technology is language-, accent-, text-, and channel-independent. Compatible with the widest possible range of audio sources (applies channel compensation techniques): GSM/CDMA, 3G, VoIP, landlines, etc. Input Audio: WAV or RAW (8 or 16 bits linear…

Download Speech Platform

…Standalone mode – the recommended setup, requiring some manual steps using the command line. Further information resources: Speech Engine REST API documentation (online: https://download.phonexia.com/docs/spe/ offline: {SPE_directory}/doc/api_reference.html or http://{SPE_address:port}/doc); Speech Engine technical documentation (check the Speech Engine section and the “Understand…” articles listed in the left menu); tutorials and training videos (see the technologies introduction video below and the SPE Training videos section) https://youtu.be/DDu0Y1rgQ6k…

Releases and Changelogs (SPE)

…compatibility with SID4 XL5 voiceprints) Fixed: Incorrect timestamp values in STT N-best results of stream transcription Fixed: Training of LID may get stuck in infinite loop in some cases Speech Engine 3.55 (Public release) Speech Engine 3.55.1, DB v1901, BSAPI 3.55.1 (2022-11-09) New: Added 6th generation models for STT and KWS KK_KZ_6 (Kazakh) BN_6 (Bengali) Fixed: phxcmd lpextract with -archive…

SID: Speaker Identification: Results Enhancement

…language. We have never seen this data during SID training so it is a sensible thing to calibrate the system. Since there is only a single source of data (telephony) and only a single language (Wakandan), one can assume that it is enough to create a single profile and use it for both sides of the comparison. We are monitoring…

FAQs (PSP)

…In that case you must pre-process the audio recording before uploading it to the Phonexia SPE or using it in the Phonexia Browser. Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT)…

Waveform Denoiser (DENOISER)

Phonexia Waveform Denoiser (DENOISER) provides automatic dereverberation (removal of echoes caused by sound reflecting in rooms) and automatic noise reduction of the speech signal. The data model is usually trained for various types of noise using the latest generation of algorithms based on neural networks. The noises removed automatically are mainly those similar to the ones the software was trained on. Conversely, the…

Phonexia technologies introduction

Core objective: Basic understanding of Phonexia speech technologies and products; typical use cases, implementations and deployment topologies. Duration: 35 minutes. Intended for idea makers and product designers; assumes generic knowledge of Phonexia and speech technologies in general. Content: 00:00 Introduction: What information can we get from speech? Overview of basic use cases. Phonexia Speech Platform brief. 4:21 Phonexia technologies overview…

Documentation (VIN)

Partners and customers are encouraged to read the Voice Inspector End User Manual, available as VIN-manual.pdf in the application’s installation directory. The manual can also be accessed from within the application by pressing F1 or by selecting “Help > User guide” in the menu bar. You might be interested in reading the following information in the manual: Introduction, Technical Requirements…

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in a single mono-channel audio recording based on the individual speaker’s voice. It is a language-, domain- and channel-independent technology. It segments not only speakers but also technical signals and silence. The outputs of the technology can be log files with labels and/or split audio files/one new…