Search: escore *100 formula

53 results

Q: How to fix Error 1007: Unsupported audio format?

The Phonexia Browser application may return the error “1007: Unsupported audio format” while uploading an audio file. First check whether your audio files are in one of the formats listed in Q: What are the supported audio formats? If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. Recommended is…

Understand SPE configuration

…of MySQL database connections at the time. Default is 32
# server.db.mysql.max_connections = 32
# Maximum size of in-memory cache for calibrated voice-prints of speaker models. Default is 100
# server.db.sid_model_calib_vp_cache_size = 100

Sizing of the system

The selection of speech technologies and the number of instances per technology which are instantiated when starting the SPE is configured by the…

KWS: Results explained

…the detected pronunciation. Start and end times are in HTK units. 1 HTK unit is 100 nanoseconds, so dividing the times by 10000 gives the time in milliseconds. Score is a log likelihood ratio from the (-inf, +inf) interval. Confidence is a probability from the [0, 1] interval. To convert it to a percentage, multiply the confidence value by 100. Example This example of Keyword Spotting…
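The two conversions described in the KWS snippet above can be sketched as follows. This is a minimal illustration only: the function names and sample values are hypothetical and are not part of any Phonexia API.

```python
HTK_UNITS_PER_MS = 10_000  # 1 HTK unit = 100 ns, so 10,000 units = 1 ms


def htk_to_ms(htk_time: int) -> float:
    """Convert a time in HTK units (100 ns each) to milliseconds."""
    return htk_time / HTK_UNITS_PER_MS


def confidence_to_percent(confidence: float) -> float:
    """Convert a confidence from the [0, 1] interval to a percentage."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return confidence * 100.0


# Hypothetical values, chosen only for illustration.
print(htk_to_ms(12_500_000))       # 1250.0 (milliseconds)
print(confidence_to_percent(0.5))  # 50.0 (percent)
```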

FAQs (Browser)

…Browser. Q: What languages do you offer? It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more. Q: What…

SID: Speaker Identification: Results Enhancement

…User Calibration Mean Normalization Requirements: 100+ audio recordings from different speakers representing the source data, minimum 60 seconds net speech in each. Ideally, the set shouldn’t contain duplicates or target speaker recordings. Mean Normalization makes data coming from different domains comparable by compensating for the differences in channel, language etc. Mean Normalization is extremely lightweight and has little to no…

Licensing (technical details)

…by default. It describes all conditions and parameters that maintain the validity of the license itself, like product or technology name, unique license ID, license expiration, number of instances covered for each technology separately, etc. License file example: # Phonexia license file # generated 2017-08-10 20:18:49 UTC SERVER license.phonexia.com/lic USE_SERVER PRODUCT SPE_v3 D8091C4EA03C6A78455772A77BACC6FE 4521BD22 ED14A573 [email protected] # crc:121 slots:4 until:2017-12-10…

Speech Quality Estimation (SQE)

…of an empty recording SNR would divide by zero => is_valid would be false. waveform_snr – the signal-to-noise ratio (SNR) describes the ratio of the useful signal to the noise signal. It is measured in dB and calculated from the waveform distribution (silence has a Gaussian distribution, voice has a Gamma distribution); SNR = 20 * log10(S/N). technical signal…
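The formula SNR = 20 * log10(S/N) from the snippet above can be worked through in a short sketch. It assumes S and N are amplitude-like quantities (the factor 20 is the usual one for amplitude ratios) and mimics the is_valid idea by refusing to divide by zero. The function name and return shape are hypothetical, not the SQE output format.

```python
import math
from typing import Optional, Tuple


def snr_db(signal: float, noise: float) -> Tuple[Optional[float], bool]:
    """Return (snr_in_db, is_valid) using SNR = 20 * log10(S/N).

    Assumption: non-positive inputs (e.g. an empty recording) are treated
    as invalid instead of dividing by zero or taking the log of zero.
    """
    if signal <= 0.0 or noise <= 0.0:
        return None, False
    return 20.0 * math.log10(signal / noise), True


# Hypothetical amplitudes, chosen only for illustration.
print(snr_db(10.0, 1.0))  # (20.0, True)
print(snr_db(0.0, 0.0))   # (None, False)
```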

Phonexia Partner Program for Government Partners

…to receive additional support for your important PoCs and demanding projects. Partnership Benefits Silver Partner Gold Partner Starter Kit Dedicated Technical Consultant X Up to 20 hours of consultation Basic Partner Portal Access X X X Advanced Partner Portal Access X X NFR License X 3 months NFR License Maintenance and Support X 3 months 2-day Live Technical Training X…

STT: What is Words-To-Numbers feature and how to use it

…variants are provided), for both file and stream transcription. The reason for not having it available in the word-level outputs (One-best, Confusion Network) is that it would create difficulties in stream transcription: as new words keep coming, they may potentially change the previous output: “two…” → 2, “two thousand…” → 2000, “two thousand twenty…” → 2020, “two thousand twenty one” → 2021. And…
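The streaming difficulty described above can be illustrated with a toy converter. This is only a sketch with a small hypothetical vocabulary; it is not the Words-To-Numbers implementation. It shows how each newly arrived word changes the number produced so far.

```python
from typing import List

# Small illustrative vocabulary (an assumption, not the product's word list).
SMALL = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
}


def words_to_number(words: List[str]) -> int:
    """Convert a list of English number words to an integer."""
    total = 0    # value of completed "thousand" groups
    current = 0  # value of the group still being built
    for word in words:
        if word in SMALL:
            current += SMALL[word]
        elif word == "hundred":
            current *= 100
        elif word == "thousand":
            total += current * 1000
            current = 0
        else:
            raise ValueError(f"unknown number word: {word}")
    return total + current


# Simulate words arriving one by one in a stream.
stream = ["two", "thousand", "twenty", "one"]
for i in range(1, len(stream) + 1):
    prefix = stream[:i]
    print(" ".join(prefix), "->", words_to_number(prefix))
```

Each prefix yields a different number (2, 2000, 2020, 2021), which is why a word-level output that has already been emitted would have to be revised as the stream continues.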

SPE and Browser installation: standalone SPE

…Quality Estimation Stream [disabled]
17) Speech To Text [disabled]
18) Speech To Text Input Stream [disabled]
19) Time Analysis [disabled]
20) Time Analysis Stream [disabled]
21) Voice Activity Detection [disabled]
22) Voice Activity Detector Stream Technology [disabled]
23) Enable all
24) Disable all
0) Quit
Choose technology to configure [0]:23

Select the option to Enable all technologies (usually the option…

Language Identification (LID)

…LID score to percentage, use the e^score * 100 formula) LID adaptation (custom language packs) The scoring principle described above implies that the score is distributed among all languages in a language pack. It means that every language has to score with a non-zero value… i.e. the scores may get diluted as they get spread among many languages. Additionally, if the…
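The score-to-percentage conversion above can be sketched as follows. The sketch assumes the LID score is a natural-log value, so that e^score lies in (0, 1]. The language names and scores are made up for illustration and do not come from a real language pack.

```python
import math


def lid_score_to_percent(score: float) -> float:
    """Convert a log-domain LID score to a percentage via e^score * 100."""
    return math.exp(score) * 100.0


# Hypothetical scores for a three-language pack (assumption: natural logs
# of values that sum to 1, i.e. the score is spread across all languages).
scores = {
    "english": math.log(0.7),
    "german": math.log(0.2),
    "czech": math.log(0.1),
}

for language, score in scores.items():
    print(f"{language}: {lid_score_to_percent(score):.1f} %")
```

With more languages in the pack, the same total is spread over more entries, which is the dilution effect the snippet describes.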

SID4 performance on Intel® Xeon® Platinum 8124M

…32GB RAM, 30GB SSD based storage, 1000 I/O·s⁻¹ reserved per core Benchmark data setup Data set statistics: Number of files: 32 [300 seconds each] RAW recordings total length: 9600 seconds Net speech total length: 4224.77 seconds Data set contains 44% of speech signal, 56% of silence or technical signal Statistics computed by Phonexia VAD 3.22.1, “vad_2.bs” settings (AKA strict VAD,…

Q: How do I get results for a pending operation?

A: If the server responds to a pending request with status 200 – OK, the body of the response contains the result (the server already has the result in cache memory and there is no need to process the file again). If the server responds to a pending request with status 202 – Accepted, the server will create a task and will begin to…
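The two status codes described above could be handled on the client side roughly as follows. This is a sketch under assumptions: the function name, the state labels, and the idea of asking again later after a 202 are illustrative and are not taken from the SPE API documentation.

```python
from typing import Optional, Tuple

HTTP_OK = 200        # result is already available in the response body
HTTP_ACCEPTED = 202  # server created a task; result is not ready yet


def interpret_pending_response(
    status: int, body: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Map a response to a pending request onto (state, result).

    Hypothetical helper: "finished" carries the response body as the result,
    "processing" means the caller should ask again later (an assumption).
    """
    if status == HTTP_OK:
        return "finished", body
    if status == HTTP_ACCEPTED:
        return "processing", None
    raise ValueError(f"unexpected status code: {status}")


# Hypothetical responses, chosen only for illustration.
print(interpret_pending_response(200, '{"result": "..."}'))
print(interpret_pending_response(202, None))
```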

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

The Faster than Real Time (FtRT) metric was developed to define a reference point for software performance. Using this metric, you can collect “benchmark” data on the real processing speed of the reviewed software, which should be measured – and reproducible – on exactly defined HW. Then, by comparing the results of various benchmarks, you can compare the performance of the specified software and its parts on different HW…