Search: results

43 results

STT: Results explained

…transcription is started using the result_mode=incremental parameter in the request. In this mode, each request for transcription results returns only the changes since the last request for results. In incremental mode, the received results may correct results received previously, e.g. when one request was sent in the middle of a word, the next request contains a correction, i.e. the correct entire word….

KWS: Results explained

…file article for more details. Example of user configuration file: [score_calib:SKeywordScoreCalibrationI] confidence_shift=0.0 confidence_sharpness=0.3 Results: Keyword Spotting results contain a list of detected keywords, each with the start and end time of the time slot where the keyword was detected, plus a score and a confidence. Each keyword is listed in the results with a numeric suffix. This number is a 0-based index of…

Voice Inspector – Interpretation of results

This part requires higher (and non-anonymous) access level.

Understand SPE database

…results JSON data rest_result_gid GID processing results – file, used technology model, results JSON data rest_result_kws KWS processing results – file, used technology model, used keyword list, results JSON data rest_result_lid LID processing results – file, used technology model, used language pack, results JSON data rest_result_phnrec PHNREC processing results – file, used technology model, results JSON data rest_result_sid SID processing…

SID: Speaker Identification: Results Enhancement

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by use of Audio Source Profiles, that represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system…

Releases and Changelogs (SPE)

…to the change in GID results content, all GID results will be removed from cache (database) during update! Speech Engine 3.17.3 (08/22/2019) – DB v1200, BSAPI 3.21.3 [G_#191] Fixed: KWS getting phonemes/graphemes in specific circumstances returns unknown error [G_BSAPI#413] Fixed: duplicated output from KWS Speech Engine 3.17.2 (08/02/2019) – DB v1200, BSAPI 3.21.2 [G_BSAPI#300] Fixed: KWS stream results are displayed…

STT: How to properly convert Confusion Network results to One-best

Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual time slots of the processed speech signal. Therefore many applications want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and…
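
The snippet above is cut off before the article's recommended procedure, so the following is only a generic sketch of the usual idea behind a one-best conversion: keep the highest-scoring alternative in each time slot and drop slots where "no word" wins. The data shape (a list of slots, each a list of `(word, posterior)` pairs, with `None` standing for the empty alternative) and the name `one_best` are assumptions made for illustration, not the Speech Engine output format or API.

```python
from typing import List, Optional, Tuple

# Hypothetical shape: one slot = list of (word, posterior) alternatives.
# A word of None stands for the "no word" alternative (an assumption).
Slot = List[Tuple[Optional[str], float]]


def one_best(slots: List[Slot]) -> List[str]:
    """Pick the most probable alternative per slot, skipping empty winners."""
    words: List[str] = []
    for slot in slots:
        if not slot:
            continue  # nothing recognized in this slot
        word, _score = max(slot, key=lambda alt: alt[1])
        if word is not None:
            words.append(word)
    return words


example = [
    [("hello", 0.7), ("yellow", 0.3)],
    [(None, 0.6), ("a", 0.4)],
    [("world", 0.9), ("word", 0.1)],
]
print(" ".join(one_best(example)))
```

The full article should be consulted for how time slots and empty alternatives are actually represented and handled.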

Releases and Changelogs (Browser)

…column in Results pane New: Added “Minimum confidence to display” setting for Keyword Spotting in Settings dialog -> Scoring tab (affects number of hits displayed in Results pane) Phonexia Browser 3.51 Phonexia Browser 3.51.0, BSAPI 3.51.0 (2022-06-14) New: Compatibility with SPE 3.51 (e.g. option to set number of workers automatically) Changed: Show also numbers for AGE results Phonexia Browser 3.50…

Release Notes

…BROWSER Update We finished small but important improvements: The Age column in Results pane now shows the numeric results instead of age groups; column name changed to Age (±10 years) to emphasize the results tolerance Added the Keyword Spotting highest confidence column in Results pane, showing the highest confidence value of all detected keywords in a recording (allowing to judge…

Understand SPE configuration file

…of technologies in database server.db.save_results = true Controls whether processing results are cached in the SPE database. Results caching is enabled by default. If results caching is enabled, result of processing by each technology is saved to database for each file. Repeated request for processing the same file using the same technology, model, etc. then simply returns the cached result,…
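
As a rough illustration of the caching behaviour described in this snippet, the sketch below returns a stored result when the same file is processed again with the same technology and model. The class `ResultCache`, its in-memory dictionary, and the three-part key are hypothetical; the snippet says only that results are saved per file and technology and matched on "technology, model, etc.".

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical cache key: (file, technology, model).
CacheKey = Tuple[str, str, str]


class ResultCache:
    """In-memory stand-in for a results cache; not the SPE database."""

    def __init__(self, save_results: bool = True) -> None:
        # Mirrors the idea of server.db.save_results (enabled by default).
        self.save_results = save_results
        self._store: Dict[CacheKey, Any] = {}

    def get_or_process(self, file: str, technology: str, model: str,
                       process: Callable[[], Any]) -> Any:
        key = (file, technology, model)
        if self.save_results and key in self._store:
            return self._store[key]  # repeated request: cached result
        result = process()
        if self.save_results:
            self._store[key] = result
        return result
```

With `save_results=False` every request triggers processing again, which matches the intent of disabling result caching.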

Q: How do I get results for a pending operation?

A: If the server responds to the pending request with status 200 – OK, the body of the response contains the result (the server already has the result in cache memory and there is no need to process the file again). If the server responds to the pending request with status 202 – Accepted, the server will create a task and will begin to…
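
The two status codes in this answer can be handled with a small loop such as the sketch below. It assumes the client is given a `fetch` callable that performs the request and returns `(status_code, body)`, and that a 202 response can be followed by simply asking again later; the answer is truncated before it describes the actual follow-up procedure, so the retry behaviour, the function name `wait_for_result`, and its parameters are all assumptions.

```python
import time
from typing import Any, Callable, Tuple


def wait_for_result(fetch: Callable[[], Tuple[int, Any]],
                    interval_s: float = 1.0,
                    max_attempts: int = 30) -> Any:
    """Call fetch() until it reports 200, treating 202 as 'not ready yet'."""
    for _ in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body  # result is in the response body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        time.sleep(interval_s)  # task accepted; wait before asking again
    raise TimeoutError(f"result not ready after {max_attempts} attempts")
```

Passing the request as a callable keeps the sketch independent of any particular HTTP library or endpoint.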

Understand SPE configuration

…available as long as the audio recording is present in the home directory; whenever audio files are deleted from the storage using the appropriate RESTful API call, all related results are erased from database. # Store results of technologies in database server.db.save_results = true # Set SQLite database file server.db.sqlite.data_source = ${application.dir}phxspe.sqlite When you decide for more advanced database usage,…

Phonexia Speech Engine

…✓ Voice Activity Detection (VAD) ✓ ✓ Time Analysis Extraction (TAE) ✓ ✓ Speech Quality Estimation (SQE) ✓ ✓ Language Identification (LID) ✓ Gender Identification (GID) ✓ Age Estimation (AGE) ✓ Speaker Diarization (DIAR) ✓ Results caching Processing results can be optionally stored in results cache database to speed up eventual re-processing of the same recordings by the same technology…

Speaker Identification (SID)

…each other and check the results of all the comparisons. Since it’s known which comparison is which – which compares the same speaker (called target trial) and which compares different speakers (called non-target trial) – it’s also known which comparison should give a high score and which should give a low score. In the process of voiceprint comparison, two types…

Designing and Developing Application

…specific hardware (mainly CPU, virtualized infrastructure vs. HW) or are you going to buy specific HW for the customer? What are the short/long-term storage requirements (i.e. audio and results availability, desktop vs. distributed system)? Is there any synchronization required (i.e. voiceprint database to clients)? What is the topology of the solution/app (i.e. where to store audio, voiceprints, results, …)? How to…