
Search: spe-3.11.1-win64

127 results

Sizing of the computing units for speech technologies

…cores = 64 GB Conclusion: The best computing performance can be expected from a CPU with: l3_cache_size/#_of_physical_CPU_cores >= 2.5 MB. Memory bandwidth and speed are more important than CPU base frequency. Intel's TLB fixes for the Meltdown and Spectre issues affect performance. Important notice (valid for SPE3) – due to internal SPE3 requirements you must multiply the required number of…
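
The cache rule of thumb in the snippet above can be checked with a few lines of Python. This is only a sketch: the function and constant names are made up for illustration, and the 2.5 MB threshold is taken directly from the snippet, not from any other sizing data.

```python
# Threshold quoted in the sizing article snippet: L3 cache per physical core, in MB.
MIN_L3_PER_CORE_MB = 2.5


def l3_cache_per_core_mb(l3_cache_mb: float, physical_cores: int) -> float:
    """Return the amount of L3 cache (in MB) available per physical core."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be a positive integer")
    return l3_cache_mb / physical_cores


def meets_l3_guideline(l3_cache_mb: float, physical_cores: int) -> bool:
    """Check the 'L3 cache / physical cores >= 2.5 MB' rule of thumb."""
    return l3_cache_per_core_mb(l3_cache_mb, physical_cores) >= MIN_L3_PER_CORE_MB


if __name__ == "__main__":
    # Hypothetical CPUs, used only to exercise the check.
    for l3_mb, cores in [(32, 8), (16, 8)]:
        ratio = l3_cache_per_core_mb(l3_mb, cores)
        print(f"{l3_mb} MB L3 / {cores} cores = {ratio} MB per core ->",
              "OK" if meets_l3_guideline(l3_mb, cores) else "below guideline")
```

The ratio covers only the cache criterion; memory bandwidth and the other factors mentioned in the snippet are not modelled here.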

Speech Quality Estimation (SQE)

Phonexia’s Speech Quality Estimation quantifies the acoustic quality of recordings. This helps the user quickly determine whether the acoustic quality of a recording is good enough for processing with other speech technologies. In response to an SQE request, the SPE returns a JSON/XML file. This file includes general information about the technology and statistics of all (one or two)…

STT: Results explained

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea…

Licensing (technical details)

…its startup and read its content. Alternatively, you can: specify the license file location in a configuration file (only for SPE and RLS) start the product executable with a license (SPE and RLS) or l parameter (command line), specifying license file location set the license file location in BS_LICENSE environment variable (only for command line) License types NET license NET…

Phoneme Recogniser (PHNREC)

…user can add to the language model of speech-to-text technology (better accuracy of KWS technology). Input: audio file (format details – see Speech Engine documentation); stream not supported, technology model name (i.e. language code) to be used for phoneme transcription. Output: In the process of transcribing speech to phonemes, the Phoneme Recogniser usually identifies individual speech segments and converts them to their pronunciation. Example…

STT: Adding words to language model on the fly

Adding words to the STT language model on the fly is possible in SPE 3.45 or newer as part of the preferred phrases feature. The POST /technologies/stt or POST /technologies/stt/input_stream API calls actually serve two purposes: to specify the actual preferred phrases (in the phrases part) and to specify words to be added to the STT language model (in the dictionary part). Each part can be used independently,…

Speaker Diarization (DIAR)

…silence as well. The outputs of the technology can be both log files with labels and/or split audio files/one new multichannel audio file. Typical use cases: Preprocessing for other speech recognition technologies, labeling the parts of the utterance according to the speakers, splitting telephone conversations recorded in mono into several channels, identifying how many speakers are speaking in the recording….

Documentation (SPE)

Partners and customers are encouraged to read the Speech Engine (PhxSpe | PhxSpe.exe) software API reference and the various manuals available as files in [SPE]/doc in the standard software package and installation. You can also find the REST API reference (Speech Engine) documentation online. You might be interested in reading the following information in the manual: REST API reference Structure of API queries Asynchronous request…

Q: How to fix the Error 1013: Unsupported: Server does not support authentication with token?

…would like to play with a “pure” daemon installation, then the phxspe.properties file should exist in the ./settings subdirectory. The phxspe.properties file is created by the phxadmin utility or can be created from the ./data/phxspe.properties.default template file. Copy the template file to the ./settings directory. Rename it to phxspe.properties. Check the server.enable_authentication_token directive and set it as needed. Restart phxspe. Basic installation steps are described in the ./doc/INSTALL.html document….
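
The copy-and-rename steps in the snippet above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the function name and the `install_dir` argument are hypothetical, the file is assumed to be a plain text file with one directive per line, and the script only reports the lines that mention server.enable_authentication_token rather than changing them. Restarting phxspe is left to the operator.

```python
import shutil
from pathlib import Path

# Directive named in the snippet; its accepted values are not assumed here.
DIRECTIVE = "server.enable_authentication_token"


def prepare_properties(install_dir: str) -> list[str]:
    """Create settings/phxspe.properties from the default template if it is
    missing, then return every line of the file that mentions the directive."""
    root = Path(install_dir)
    template = root / "data" / "phxspe.properties.default"
    target = root / "settings" / "phxspe.properties"

    # Copy + rename in one step; an existing settings file is never overwritten.
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        shutil.copyfile(template, target)

    lines = target.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if DIRECTIVE in line]


if __name__ == "__main__":
    # "." assumes the script is run from the SPE installation directory.
    for line in prepare_properties("."):
        print(line)
```

Printing the matching lines makes it easy to review the current setting before editing the file by hand and restarting the server.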

Q: How to fix Error 1007: Unsupported audio format?

…ffmpeg utility, powerful and well documented. Please find your distribution package at http://ffmpeg.org Then continue as described below: Using Phonexia Browser with embedded SPE: Open the Browser configuration dialog by clicking the “Settings” button located in the tool ribbon. Select the “Speech Engine” tab and configure SPE as described in the documentation. Don’t forget to select the “Enable audio converter” checkbox. Using SPE as service/daemon…

Phonexia technologies introduction

Core objective: Basic understanding of Phonexia speech technologies and products; typical use cases, implementations and deployment topologies Duration: 35 minutes intended for idea makers and product designers assumes generic knowledge of Phonexia and speech technologies in general Content 00:00 Introduction What information can we get from speech? Overview of basic use cases Phonexia Speech Platform brief 4:21 Phonexia technologies overview…

Arabic dialects in Phonexia LID and STT

…a bit, but you won’t understand Moroccan. Data acquisition: AUDIO (used for LID and STT training): MSA is used in formal speaking situations such as sermons, lectures, news broadcasts, and speeches, so it is pretty difficult/impossible to find recordings of spontaneous phone conversations in MSA; available MSA recordings are usually from broadcasting (microphone) or rather formal scripted speeches (also microphone)…

Understand SPE processing priority

…enabled in the SPE configuration file (enabled by default, see the server.task_priorities_enable option) and default priority value set; the prioritize role enabled for the SPE user creating the processing task. If prioritization is enabled and a processing task is started by a user without the “prioritize” role, the task is started with the default priority. Task priority is defined by a number from (highest priority) to…