Search: engine mono stereo

70 results

FAQs (Browser)

…localhost: giving up and kill the localhost. A: This error may happen if the initialization of the Speech Engine (SPE) takes too long. Phonexia Browser treats this as an initialization failure and kills the server. You can fix this by doing the following: increase the timeout in Settings > Speech Engine tab > First connection timeout; use fewer instances of technologies, thus letting…

Speech Engine update

…of software and/or API (for example REST Server 2.1 -> SPE 3.0). It includes changes in components or technology models. Speech Engine update procedure: The update procedure is purely manual and relies heavily on your own detailed knowledge of your Speech Engine installation and its internal functionality and structures. This knowledge is crucial for tuning the Speech Engine for maximum…

Download Speech Platform

…started, please follow one of the two methods: Embedded mode – most effortless setup, but some options are unavailable; Standalone mode – the recommended setup, requiring some manual steps using the command line. Further information resources: Speech Engine REST API documentation: online: https://download.phonexia.com/docs/spe/, offline: {SPE_directory}/doc/api_reference.html or http://{SPE_address:port}/doc; Speech Engine technical documentation: check the Speech Engine section and the “Understand…” articles listed…

Q: What is the difference between on-the-fly and off-line type of speech to text transcription (STT)?

A: Like a human, the ASR (STT) engine adapts to the acoustic channel, environment and speaker. The ASR (STT) engine also learns more about the content over time, and this information is used to improve recognition. The dictate engine, also known as on-the-fly transcription, does not look into the future and has information about just a few…

Understand SPE configuration

…database. Supported MySQL engines are based on original MySQL v5.6+ or MariaDB v10+. # Type of database # Supported are SQLite and MySql server.db.engine = SQLite The database is mainly used as the working cache: details about SPE user accounts are saved in the database as permanent objects, while speech processing results are stored inside as dependent objects: the results are…

SID: TUTORIAL: Speaker Identification – How to Do a Basic Test

…Evaluation Package: The evaluation package (download page) consists of Phonexia Browser and Phonexia Speech Engine, including all necessary technologies. 2. Data: We prepared a dataset for your testing. The package contains data for both speaker model creation and speaker spotting. The testing process is the same for a dataset collected by the user. The dataset is available to download…

Speech Engine

To create the SPE report: 1. Go to the SPE installation directory. 2. Open command line/terminal (in Ubuntu Linux Right click + press E, in Windows type cmd in the address bar). 3. Run ./phxadmin –report (Linux) or phxadmin.exe /report (Windows). 4. Zip up the created directory with the report and attach the ZIP file to your issue description. The Report functionality is not present…

SPE and Browser installation: embedded SPE

…PhxBrowser (on Linux) You should see the following information window. Click OK to start the configuration. (You can come back later to alter the configuration by going to the Settings -> Speech Engine tab.) In the Settings dialog, on the Speech Engine tab, enable all the technologies and hit Apply. Make sure to hit Apply to apply the changes and…

Speaker Identification (SID)

…of net speech in each recording; only one speaker in each recording; recordings as similar to the target use case as possible; mono lin16 format, 8 kHz+ sample rate. This video shows how to perform SID system evaluation using Phonexia Browser: https://youtu.be/PfpOP90WC34 SID calibration: The raw score must be calibrated to allow correct statistical interpretation. For example, in a…

STT: Results explained

…outputs The outputs can contain the following special tokens:

Token (5th STT generation and newer) | Token (legacy STT generations) | Meaning
<segment> | <s> | start of utterance
</segment> | </s> | end of utterance
<silence/> | _SILENCE_ or <sil/> | silent part (or no speech detected)
<null/> | _DELETE_ | time slot should not go to one-best output

Realtime stream processing output modes NOTE: Only single-channel (mono) audio…

Q: What are the requirements for SID evaluation dataset?

…in each recording (i.e. usually 2+ minutes recording length); only one speaker in each recording; wide variety of gender and age is recommended; recordings should be as similar to the target use case as possible (device, channel, distance from mic, languages distribution); audio files should be mono, lin16 format, 8 kHz+ sample rate. *Note: splitting single recording into multiple shorter…
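
The audio format requirement in this snippet (mono, lin16, 8 kHz+ sample rate) can be checked before a dataset is assembled. The following is a minimal sketch using only Python's standard `wave` module. It assumes the recordings are PCM WAV files and that "lin16" means 16-bit linear PCM (2 bytes per sample). The function name `check_wav_for_sid` and its parameter are hypothetical and are not part of any Phonexia tool.

```python
import wave


def check_wav_for_sid(path, min_rate_hz=8000):
    """Return a list of format problems; an empty list means the file
    is mono, 16-bit linear PCM, and sampled at min_rate_hz or higher.

    Hypothetical helper for illustration; wave.open() raises wave.Error
    for files that are not PCM WAV.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()          # samples per second

    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    if sample_width != 2:
        problems.append(f"expected 16-bit samples, found {sample_width * 8}-bit")
    if rate < min_rate_hz:
        problems.append(f"expected {min_rate_hz} Hz or more, found {rate} Hz")
    return problems
```

Calling `check_wav_for_sid("recording.wav")` on each file and skipping those with a non-empty result is one way to filter a dataset; the other listed requirements (speech length, single speaker, similarity to the use case) cannot be verified this way.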

Understand SPE database

…kept in the database at all. Supported databases: SPE supports the SQLite and MariaDB 10.x (SPE 3.46+) or MySQL 5.x (SPE up to 3.45) database engines. The database engine is configured in the phxspe.properties SPE configuration file; see the Database section of the SPE configuration file article for more details. SQLite: SQLite is the out-of-the-box SPE default database type. By its nature, SQLite…

Keyword Spotting (KWS)

…a numerical expression of the probability that a word was said in a specified time frame. Keywords: Keywords are not dependent on any dictionary. This makes it possible to define specific, foreign or even nonexistent words, such as product names. However, only graphemes (symbols) from a supported list can be used to define keywords. This list can be easily obtained by Speech Engine and…

Q: I can’t manage to run Phonexia Browser software. I always get an error.

…happen if the initialization of the Speech Engine (SPE) takes too long. Phonexia Browser treats this as an initialization failure and kills the server. You can fix this by doing the following: increase the timeout in Settings > Speech Engine tab > First connection timeout; use fewer instances of technologies, thus letting the Speech Engine start faster; use smaller models of technologies…

Understand SPE workers configuration

A worker is a working thread that performs the actual processing of files or realtime streams in Speech Engine. This article explains Speech Engine workers and describes how to configure them for optimal performance and server utilization. Starting from SPE 3.51, new defaults in settings/phxspe.properties make SPE configure workers automatically according to local conditions (physical CPU cores, configured…