Search: time unit

59 results

Understand SPE connectors for external TTS

…capabilities of the TTS service is not a good idea as it might become incorrect over time, leading to obscure issues in the application relying on the info. Required capabilities information JSON structure: { "apiVersion": 2, "vendor": string, "author": string, "version": string, "voices": [ { "name": string, "languageCodes": [string, string, …], "naturalSampleRateHertz": number }, . . . ]…
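As a rough illustration of the structure quoted above, the following Python sketch checks a capabilities object against the fields visible in the snippet only. The snippet is truncated, so the real structure may require more fields. The function name `check_capabilities`, the sample values, and the reading of `apiVersion` as the literal value 2 are assumptions made for this example, not part of the connector specification.

```python
import json
from typing import Any, List


def check_capabilities(info: Any) -> List[str]:
    """Return a list of problems found; an empty list means the visible fields look fine."""
    if not isinstance(info, dict):
        return ["top level must be a JSON object"]
    problems: List[str] = []
    # Assumption: the literal 2 in the snippet is the required value.
    if info.get("apiVersion") != 2:
        problems.append("apiVersion must be 2")
    for key in ("vendor", "author", "version"):
        if not isinstance(info.get(key), str):
            problems.append(f"{key} must be a string")
    voices = info.get("voices")
    if not isinstance(voices, list):
        problems.append("voices must be a list")
        return problems
    for i, voice in enumerate(voices):
        if not isinstance(voice, dict):
            problems.append(f"voices[{i}] must be an object")
            continue
        if not isinstance(voice.get("name"), str):
            problems.append(f"voices[{i}].name must be a string")
        codes = voice.get("languageCodes")
        if not (isinstance(codes, list) and all(isinstance(c, str) for c in codes)):
            problems.append(f"voices[{i}].languageCodes must be a list of strings")
        rate = voice.get("naturalSampleRateHertz")
        if isinstance(rate, bool) or not isinstance(rate, (int, float)):
            problems.append(f"voices[{i}].naturalSampleRateHertz must be a number")
    return problems


# Hypothetical sample values, for illustration only.
EXAMPLE = json.loads("""
{
  "apiVersion": 2,
  "vendor": "Example Vendor",
  "author": "Example Author",
  "version": "1.0.0",
  "voices": [
    {"name": "example-voice", "languageCodes": ["en-US"], "naturalSampleRateHertz": 22050}
  ]
}
""")

if __name__ == "__main__":
    print(check_capabilities(EXAMPLE))
```

A check of this kind could run whenever the capabilities information is fetched from the TTS service, so that a malformed response is reported early.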

Understand SPE multithreaded technologies initialization

The server.technology_multithread_initialization setting in SPE configuration allows SPE to initialize instances of technologies during startup using multiple parallel threads. The default setting is OFF, i.e. instances of technologies are initialized using a single thread, one by one. This allows easier tracking of possible issues during SPE startup and better readability of technology initialization log messages (only a single initialization happens at a time). The downside…

Q: I can’t manage to run Phonexia Browser software. I always get an error.

…happen if the initialization of the SPE engine takes too long. Phonexia Browser software treats it as an initialization failure and kills the server. You can fix this by doing the following: Increase the timeout in Settings > Speech Engine tab > First connection timeout. Use fewer instances of technologies, thus letting the Speech Engine start faster. Use smaller models of technologies…

Licensing (technical details)

…machine for processing (64-bit required) and make sure its OS, CPU, HDD and ETH configuration will not change over time (this is especially important in virtual or cloud environments). Create a TXT file containing the HW profile: download the HW-GEN tool and run it (choose below, based on your operating system). Linux 64-bit: https://download.phonexia.com/utils/hw-gen64 (or in ZIP). Windows 64-bit: https://download.phonexia.com/utils/hw-gen64.exe…

Speech Quality Estimation (SQE)

…of bits used by the waveform absolute value; if less than 8, the signal has insufficient quality. wfilter_technical_signal_length – the length of technical signals (tones, wide-band noise, etc.), measured in seconds. Processing speed: Approx. 2,000x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 384,000 hours of audio in 1 day of computing time.
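The throughput figure above follows from simple multiplication. The sketch below reproduces it, assuming that processing speed scales linearly with the number of CPU cores; the function name `audio_hours_per_day` is made up for this example.

```python
def audio_hours_per_day(speed_factor: float, cpu_cores: int,
                        compute_hours: float = 24.0) -> float:
    """Hours of audio processed in the given computing time.

    Assumes throughput scales linearly with the number of CPU cores.
    """
    return speed_factor * cpu_cores * compute_hours


if __name__ == "__main__":
    # 2,000x real time on 1 core, 8 cores, 24 hours of computing time
    print(audio_hours_per_day(2000, 8))  # 384000.0
```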

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling. Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself. Output: Log file with processed information (age estimate). Processing speed: Approx. 20x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…

Designing and Developing Application

Before designing and developing the application, we encourage the Partner to find clear answers to the following questions: Customer requirements: Do my customers need file processing (audio) or stream processing in real time? What human resources does the customer have to analyze the results? How many minutes per day or streams in parallel do my customers need to process?…

STT: Configuring word detection parameters for stream transcription

One of the improvements implemented since Speech Engine 3.24 is neural-network based VAD, used for word and segment detection. This article describes the segmenter configuration parameters and how they affect the realtime stream STT results. The default segmenter parameters are as shown below: [vad.online_segmenter:SOnlineVoiceActivitySegmenterI] backward_extensions_length_ms=150 forward_extensions_length_ms=750 speech_threshold=0.5 Backward and forward extensions are intervals in milliseconds, which extend the part…
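To make the default values easier to inspect, the sketch below reads the segmenter section quoted above with Python's standard `configparser` module. It assumes the configuration text is INI-compatible, which the snippet suggests but does not confirm; the helper name `load_segmenter_params` is made up for this example.

```python
import configparser

# Default segmenter section, copied from the snippet above.
SEGMENTER_CONFIG = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5
"""

SECTION = "vad.online_segmenter:SOnlineVoiceActivitySegmenterI"


def load_segmenter_params(text: str) -> dict:
    """Parse the segmenter section and return its values as numbers."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[SECTION]
    return {
        "backward_extensions_length_ms": section.getint("backward_extensions_length_ms"),
        "forward_extensions_length_ms": section.getint("forward_extensions_length_ms"),
        "speech_threshold": section.getfloat("speech_threshold"),
    }


if __name__ == "__main__":
    print(load_segmenter_params(SEGMENTER_CONFIG))
```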

What is User configuration file and how to use it

…example: When using Czech STT on realtime streams, the results show that the system outputs end of segment too often, i.e. longer pauses between words made by the speakers are misidentified as end of sentence, while in fact the speakers continue to speak. So it is desirable to fine-tune the system to accept a longer delay between words without ending a…

Phonexia Partner Program for Government Partners

…the Starter Kit during the onboarding period? Yes, the Starter Kit can be purchased anytime during our cooperation. Can I purchase the Starter Kit multiple times? Yes, for each project, proof of concept, and product line, you can purchase a Starter Kit again. Phonexia consultants can’t wait to support your business. How do you deliver technical training? Phonexia technical training…

Q: What are the requirements for SID evaluation dataset?

…recordings in order to meet the criteria of at least 3 recordings for each speaker is not the right way to proceed. This way you are not adding any details. You are essentially analyzing details of a single recording five times. In contrast, by using 5 unique recordings coming from different audio environments or even different times of the day,…

Understand SPE technologies configuration file

…SQE_STREAM: Speech Quality Estimation Stream; STT: Speech To Text; STT_STREAM: Speech To Text Stream; TAE: Time Analysis Extraction; TAE_STREAM: Time Analysis Extraction Stream; VAD: Voice Activity Detection; VAD_STREAM: Voice Activity Detection Stream; SIDC: Speaker Identification Voiceprint Comparator (legacy); SIDC_STREAM: Speaker Identification Voiceprint Stream Comparator (legacy); SIDCALIBSET: Speaker Identification VoicePrint Calibration (legacy); SIDCALIBSET_STREAM: Speaker Identification VoicePrint Stream Calibration (legacy); SIDE: Speaker…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching: Processing results can optionally be stored in a results cache database to speed up possible re-processing of the same recordings by the same technology…