Search: log

54 results

Language Identification – Languages

…Croatian Croatian sh Serbo-Croat-Bosnian Bosnian Bosnian sn Shona Shona Shona sk-SK Slovak Slovak Slovak sl-SI Slovenian so Somali Somali Somali es-XA Spanish (America) es-ES Spanish (Europe) Spanish Spanish sw Swahili Swahili Swahili sv-SE Swedish Swedish tl-PH Tagalog Tagalog ta Tamil Tamil Tamil te-IN Telugu th-TH Thai Thai Thai bo Tibetan Tibetan Tibetan…

Understand SPE connectors for external TTS

…should log a successful TTS connector initialization: TTSSubsystem: Retrieving external connector info from ……./external/technologies/tts/acapela TTSSubsystem: External connector ‘acapela’ from ……./external/technologies/tts/acapela has been registered. If an error occurs, SPE logs the problem: TTSSubsystem: Retrieving external connector info from ……./external/technologies/tts/acapela TTSSubsystem: Cannot retrieve external connector info! ERROR: Loading configuration from “……./external/technologies/tts/acapela/connector.properties”;Error: acapela server is not running or address and ports are misconfigured;…

Phonexia Browser

When reporting an issue with Phonexia Browser, please attach both the SPE report and the Browser log. To create the SPE report: Go to the SPE installation directory Open command line/terminal (in Ubuntu Linux Right click + press E, in Windows type cmd in the address bar) Run ./phxadmin –report (Linux) or phxadmin.exe /report (Windows) Zip up the created directory with report and…

Understand SPE audio converter

…the subsystem again failed to detect the format (BSAPI exception) and SPE couldn’t call the converter because it’s disabled – the upload fails with Unsupported audio format error response. SPE log ======= 2021-01-30 20:49:26 [Debug] server: Incoming request: [RID=2] from=127.0.0.1:52762, method=POST, URI=/audiofile?path=%2Ftest1.wav&format=json 2021-01-30 20:49:26 [Trace] ConverterSubsystem: Request stream saved to temporary file: C:\TMP\tmp9408aaaaaa 2021-01-30 20:49:26 [Error] ConverterSubsystem: Error during detecting…

SPE and Browser installation: standalone SPE

…start PhxBrowser.exe (on Windows) or PhxBrowser (on Linux) You should see the following information window. Click OK: In the Settings dialog, on the Speech Engine tab, clear the “Enable Speech Engine on localhost” check box and click OK If you receive the following error message, click “No”: Now, right-click into “Sources area” and click “Add new server” In the next…

Phonexia Speech Engine

…for steps 3 to 5 are described in doc/INSTALL.html included in the distribution package. REST API documentation is in the doc/api_reference.html file and is also available online at https://download.phonexia.com/docs/spe/. Speech Engine is actively developed and continually improved – check the SPE changelog for the latest news. Architecture and components SPE is an application run from the command line or as a service. Apart from running…

Keyword Spotting (KWS)

…generated automatically by the grapheme-to-phoneme mechanism (for keywords not included in the dictionary). Pronunciations generated by the grapheme-to-phoneme mechanism are assigned a probability value that indicates how confident the system is in the generated pronunciation. The value is a logarithm of probability from the (-inf, 0] interval. Searched keywords may be spoken in the actual recordings using different pronunciation(s) than Keyword Spotting expects. This can…

Language Identification (LID)

…language pack and calculates the probability that these two languages are the same. For an explanation of the terms languageprint, language model and language pack, refer to the LID: Terminology and adaptation article. The final scores are returned as logarithms of these individual probabilities – i.e. as values from the (-inf, 0] interval – for each language in the language pack. (to convert raw…
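
The log-score convention described in this snippet can be illustrated with a short sketch that maps log-probabilities back to probabilities. It assumes the scores use the natural logarithm (the snippet does not state the base), and the language codes and score values are made up for illustration, not real LID output.

```python
import math


def log_scores_to_probabilities(log_scores):
    """Map each log-probability (a value <= 0) to a probability in [0, 1].

    Assumption: scores are natural logarithms; the base is not stated
    in the snippet above.
    """
    return {lang: math.exp(score) for lang, score in log_scores.items()}


# Hypothetical scores for illustration only.
example_scores = {"en": -0.105, "de": -2.996, "cs": float("-inf")}
probabilities = log_scores_to_probabilities(example_scores)

for lang, prob in probabilities.items():
    print(f"{lang}: {prob:.4f}")
```

A score of 0 corresponds to probability 1, and a score of -inf corresponds to probability 0.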

Q: Which authentication options are allowed by the server and how does it work?

A: The following options are supported: HTTP basic authorization – The client asks for a session via the resource “post /login” with HTTP basic authorization in the query header. If the server responds with error 405, the server doesn’t support authorization by sessions and it is necessary to use basic authorization. Authorization by session – Authorization by session is done by adding the parameter “X-SessionID” into HTTP…
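
As a rough sketch of the two options in this answer, the code below builds (but does not send) a login request with HTTP basic authorization and shows what a session header might look like. The base URL, user name and password are placeholders. Treating “X-SessionID” as an HTTP request header is an assumption, since the snippet is cut off at that point, and the shape of the /login response is not shown here, so it is left out.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build a standard HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_login_request(base_url, username, password):
    """Prepare a POST /login request; nothing is sent over the network."""
    return urllib.request.Request(
        url=f"{base_url}/login",
        method="POST",
        headers=basic_auth_header(username, password),
    )


def session_header(session_id):
    """Assumption: the session ID travels as an X-SessionID HTTP header."""
    return {"X-SessionID": session_id}


# Placeholder server address and credentials.
request = build_login_request("http://localhost:8600", "user", "secret")
print(request.get_method(), request.full_url)
```

If the server answers the login request with error 405, the client would keep sending the basic authorization header on every request instead of a session header.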

Download Speech Platform

Step #1 – Download Try and evaluate all Phonexia speech technologies either via REST API using Speech Engine, or using the demo/testing GUI application named Phonexia Browser. Hardware requirements – recommended: Intel Core i7 or better, 32 GB free RAM, 10+ GB storage (SSD preferred); minimum: Intel Core i5, 16 GB free RAM, 10 GB storage (SSD preferred) To prevent various…

Gender Identification (GID)

…generation of XL3 and L3 models) Output scoring: log-likelihood ratio (LLR) and score (0-1). Score can be interpreted as percentage by multiplying the score by 100. Typical use cases: filtering calls by gender, playing advertisement focused on specific gender, getting quick demographic analysis of the recordings. The speed of Gender Identification is up to 150 FtRT (depending on the model)….

Terms of Service

…Rights & Intellectual Property 7.1. PHONEXIA’s Proprietary Rights. Member recognizes and agrees that all legal rights and title to the Services are owned by PHONEXIA or its licensors, including all intellectual property rights contained therein. 7.2. PHONEXIA’s Intellectual Property. The use of PHONEXIA’s brand names, logos, domain names, trademarks, service marks, copyrighted materials, patents or any other brand elements unique…

Speaker Diarization (DIAR)

Speaker Diarization labels segments of the same voice(s) in one mono-channel audio recording based on the individual speaker’s voice. It is a language-, domain- and channel-independent technology. It performs not only the segmentation of speakers but of technical signals and silence as well. The outputs of the technology can be both log files with labels and/or split audio files/one new…

Speech Quality Estimation (SQE)

…of an empty recording SNR would divide by zero => is_valid would be false; waveform_snr – the signal-to-noise ratio (SNR) describes the ratio of the useful signal to the noise signal; it is measured in dB and calculated from the waveform distribution (silence has a Gaussian distribution, voice has a Gamma distribution); SNR = 20 * log10(S/N); technical signal…
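
The formula quoted in this snippet can be written as a small function. This is a minimal sketch: S and N are read here as signal and noise amplitudes (the usual meaning of the factor 20), which is an assumption, and the way SPE estimates them from the waveform distribution is not shown. Returning None for a non-positive value mirrors the divide-by-zero case mentioned above only loosely.

```python
import math


def snr_db(signal_amplitude, noise_amplitude):
    """Return SNR in dB as 20 * log10(S / N).

    Returns None when either value is not positive, because the ratio
    or its logarithm is undefined in that case.
    """
    if signal_amplitude <= 0 or noise_amplitude <= 0:
        return None
    return 20.0 * math.log10(signal_amplitude / noise_amplitude)


# Illustrative values: a signal ten times stronger than the noise.
print(snr_db(10.0, 1.0))
# No measurable noise: the ratio is undefined.
print(snr_db(1.0, 0.0))
```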

Age Estimation (AGE)

…coding), A-law or Mu-law, PCM, 8kHz+ sampling. Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself. Output: Log file with processed information (age estimate). Processing speed: Approx. 20x faster than real-time processing on 1 CPU core, i.e. a standard 8 CPU core server processes 3,840 hours of audio in 1 day of computing…
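
The throughput figure in this snippet follows from simple arithmetic: 20 (speed factor) x 8 (cores) x 24 (hours) = 3,840 hours of audio per day. The sketch below reproduces it, assuming throughput scales linearly with the number of cores; the function name is made up for illustration.

```python
def audio_hours_per_day(speed_factor, cpu_cores, wall_clock_hours=24):
    """Hours of audio processed, assuming linear scaling across cores."""
    return speed_factor * cpu_cores * wall_clock_hours


hours = audio_hours_per_day(speed_factor=20, cpu_cores=8)
print(hours)
```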