
Search: speed

20 results

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

The Faster than Real Time (FtRT) metric was developed to define a reference point for software performance. Using this metric, you can collect “benchmark” data on the real processing speed of the reviewed software, which should be measured, and be reproducible, on exactly defined HW. Then, by comparing the results of various benchmarks, you can compare the performance of the specified software and its parts on different HW…

Releases and Changelogs (SPE)

…Models for STT and KWS with features of CS_CZ_6 (new VAD generation, dynamic adding of words in preferred phrases, increased transcription precision via updated decoder) IT_IT_6 RU_RU_6
Fixed: Significantly improved speed of diarization model XL4 on longer audios
Removed: Removed old STT/KWS model for IT_IT
Speech Engine 3.51
Speech Engine 3.51.0, DB v1901, BSAPI 3.51.0 (2022-06-14)
New: Added option to…

Understand SPE benchmark

…the audio, and the amount of actual speech in the audio affect the processing speed… because the non-speech parts are stripped from the audio before processing. The processing speed is then calculated as follows: FtRT = sum_of_speech_lengths_in_all_recordings ÷ sum_of_processing_times_of_all_recordings. When using the option with your specified file, only that single recording is used… so, to account for various audio…
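The formula in this snippet can be sketched in Python. This is a minimal illustration and not part of SPE: the function name `ftrt`, the representation of each recording as a `(speech_seconds, processing_seconds)` pair, and the sample numbers are all assumptions made for the example.

```python
from typing import Iterable, Tuple


def ftrt(recordings: Iterable[Tuple[float, float]]) -> float:
    """Return total speech length divided by total processing time.

    Each item is a hypothetical (speech_seconds, processing_seconds) pair.
    """
    total_speech = 0.0
    total_processing = 0.0
    for speech_seconds, processing_seconds in recordings:
        total_speech += speech_seconds
        total_processing += processing_seconds
    if total_processing <= 0:
        raise ValueError("total processing time must be positive")
    return total_speech / total_processing


# Made-up sample: 120 s and 60 s of speech, processed in 3 s and 1.5 s.
sample = [(120.0, 3.0), (60.0, 1.5)]
print(ftrt(sample))  # 180 / 4.5 = 40.0
```

Because the sums are taken before dividing, long recordings weigh more than short ones, which is different from averaging per-recording FtRT values.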

Understand SPE configuration

…It may increase speed of logging but in case of server crash or if server is killed,
# some logs may be lost. Default is false.
server.logging.enable_async = false
# Name of server used in log. If it is not specified hostname is used instead.
#server.logging.database.identifier = spe
# Set umask value for server (Linux only)
# server.umask =…

Time Analysis Extraction (TAE)

Technology description: Time Analysis Extraction (TAE) by Phonexia extracts base information from dialogue in a recording, providing essential knowledge about conversation flow. That makes it easy to identify: long reaction time, crosstalk, responses of speakers in both channels, speed of speech measured in phonemes per second. Typical usage domain: It is typically used in contact centers for indicating weak moments in…

Understand SPE configuration file

…It may increase speed of logging but in case of server crash or if server is killed,
# some logs may be lost. Default is false.
server.logging.enable_async = false
Sets the log writing strategy – whether logging is done in a separate thread or not. Logging in a separate thread may increase performance, but in case of SPE crash or when SPE…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓
Results caching: Processing results can optionally be stored in the results cache database to speed up eventual re-processing of the same recordings by the same technology…

Speech to Text (STT)

…including discriminative training and neural network-based features. Output: One-best transcription – i.e. a file with a time-aligned speech transcript (start and end time of each word). Variants for transcriptions – i.e. hypotheses for words at each moment (confusion network) or hypotheses for utterances at each slot (n-best transcription). Processing speed – several versions available: from 8x faster than real-time processing on…

Sizing of the computing units for speech technologies

…cores = 64 GB. Conclusion: The best computing performance can be expected from a CPU with: l3_cache_size / #_of_physical_CPU_cores >= 2.5 MB. Memory bandwidth and speed are more important than CPU base frequency. Intel's TLB fixes for the Meltdown and Spectre issues affect performance. Important notice (valid for SPE3) – due to internal SPE3 requirements you must multiply the required number of…
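The cache guideline in this snippet can be checked with a few lines of Python. This is an illustrative sketch only: the function names and the two example CPUs are hypothetical, and the 2.5 MB threshold is simply the value quoted in the snippet.

```python
def l3_per_core_mb(l3_cache_mb: float, physical_cores: int) -> float:
    """Return the L3 cache size per physical core, in MB."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return l3_cache_mb / physical_cores


def meets_cache_guideline(l3_cache_mb: float, physical_cores: int,
                          threshold_mb: float = 2.5) -> bool:
    """True if L3 cache per physical core is at least the threshold."""
    return l3_per_core_mb(l3_cache_mb, physical_cores) >= threshold_mb


# Hypothetical CPUs, not taken from the sizing article.
print(meets_cache_guideline(32.0, 8))  # 4.0 MB per core -> True
print(meets_cache_guideline(16.0, 8))  # 2.0 MB per core -> False
```

Note that the check uses physical cores, as in the snippet, so hyper-threaded logical cores should not be counted.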

Speaker Identification (SID)

…signal captured in a recording are also more or less unique, thus the technology can be language-, accent-, text-, and channel-independent. Automatic speaker recognition systems are based on the extraction of the unique features from voices and their comparison. The systems thus usually comprise two distinct steps: Voiceprint Extraction (Speaker enrollment) and Voiceprint comparison. The processing speed depends on the…

Gender Identification (GID)

…7+ sec recommended with the XL4 and L4 models (9+ sec for the previous generation of XL3 and L3 models). Output scoring: likelihood ratio and percentage metric (0-100%). Typical use cases: filtering calls by gender, playing advertisements focused on a specific gender, getting a quick demographic analysis of the recordings. The speed of Gender Identification is up to 150 FtRT (depending on the model)…

Terms of Service

…of this content. 4. Use of PHONEXIA Services 4.1. Equipment Requirements. PHONEXIA Members must provide all equipment required to use the PHONEXIA service including but not limited to a computer and a phone, as well as the respective services such as high speed Internet connection, and telephone service (land-line or cellular phone) through a third-party provider. PHONEXIA does not provide…

Voice Activity Detection (VAD)

…VAD is usually part of a rapid filtration process in deployment. Typical use cases are: detection of present or absent human speech for voice processing, filtering non-speech parts of the recording, filtering out recordings with not enough net speech to be processed by other technologies, voice-activated processes, etc. The speed of Voice Activity Detection is 140 FtRT per instance…

Speech Quality Estimation (SQE)

…of bits used by the waveform absolute value; if less than 8, the signal has insufficient quality. wfilter_technical_signal_length – the length of technical signals (tones, wide-band noise, etc.), measured in seconds. Processing speed: approx. 2,000x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 384,000 hours of audio in 1 day of computing time…
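The throughput figure in this snippet follows from simple arithmetic, sketched here in Python. The function name is hypothetical, and the calculation assumes that speed scales linearly with the number of CPU cores, which is what the snippet's numbers imply but may not hold on every machine.

```python
def audio_hours_per_day(ftrt_per_core: float, cores: int,
                        computing_hours: float = 24.0) -> float:
    """Hours of audio processed, assuming linear scaling across cores."""
    return ftrt_per_core * cores * computing_hours


# 2,000x real time per core, 8 cores, 24 hours of computing time.
print(audio_hours_per_day(2000, 8))  # 2000 * 8 * 24 = 384000.0
```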

Releases and Changelogs (Browser)

…wizard can’t create a report if server doesn’t support Diarization
[G#21] Unified SID terminology
Phonexia Browser v3.10.1, BSAPI 3.14.0 – Dec 6 2017
[#5068] Speed up preparing of calibration set
[#5036] Use own configuration file for local SPE – original configuration file of SPE is not changed anymore
[#4542] Better error message when calibration set contains invalid recordings
[#5195] Added…