Q: What do LLR, LR and score mean?

A: These abbreviations mean the following: LR (likelihood ratio) is the result of a statistical test that compares two models. It is a number expressing how many times more likely the data are under one model than under the other. LR takes values in the interval [0, +∞). LLR (log-likelihood ratio) is the logarithm of LR. LLR takes values in the interval…
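
The relationship between LR and LLR described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the natural logarithm is an assumption, since the answer does not state which base is used.

```python
import math


def llr_from_lr(lr: float) -> float:
    """Convert a likelihood ratio to a log-likelihood ratio.

    Assumes the natural logarithm; the base is not stated in the source.
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if lr == 0:
        # log(0) is undefined in floating point, so return the limit.
        return -math.inf
    return math.log(lr)


def lr_from_llr(llr: float) -> float:
    """Inverse conversion: exponentiate the log-likelihood ratio."""
    return math.exp(llr)
```

With this convention, an LR of 1 (both models equally likely) maps to an LLR of 0, LR values above 1 give a positive LLR, and LR values below 1 give a negative LLR.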

Q: What are the recommendations for LID adaptation set?

A: The following is recommended: For adding a new language to the language pack: 20+ hours of audio for each new language model (or 25+ hours of audio containing 80% of speech); only 1 language per record. For adapting an existing language model (discriminative training): 10+ hours of audio for each language. May be done on customer site. May be done in…

Q: How to choose answer format from server (xml/json)?

A: Either via the HTTP “Accept” header (application/json or application/xml) or via the request query parameter “format=json” or “format=xml”. If the format is not defined (or the “Accept” header has one of these values: application/*, */*, *), the server returns JSON…
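
As a rough sketch of the two options, the following Python builds a request either with an “Accept” header or with a “format” query parameter. The base URL, the path used in the usage note and the helper name are hypothetical placeholders, not part of the documented API; only the header values and the “format” parameter come from the answer above. The request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address, used only for illustration.
BASE_URL = "http://localhost:8600"


def build_request(path: str, fmt: str = "json", via_query: bool = False) -> Request:
    """Build a request that asks the server for JSON or XML.

    via_query=False sets the HTTP "Accept" header;
    via_query=True appends "format=<fmt>" to the query string instead.
    """
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    url = BASE_URL + path
    if via_query:
        return Request(url + "?" + urlencode({"format": fmt}))
    return Request(url, headers={"Accept": "application/" + fmt})
```

For example, `build_request("/example", "xml")` carries the header `Accept: application/xml`, while `build_request("/example", "json", via_query=True)` targets `/example?format=json`.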

Q: Which authentication options are allowed by the server and how does it work?

A: The following options are supported: HTTP basic authorization – The client requests a session via the resource “POST /login” with HTTP basic authorization in the request header. If the server responds with error 405, it does not support authorization by sessions and basic authorization must be used. Authorization by session – Authorization by session is done by adding the parameter “X-SessionID” into the HTTP…
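
The login step could look roughly like the sketch below. The base URL, credentials and helper names are hypothetical; only the “POST /login” resource, the use of HTTP basic authorization and the meaning of a 405 response are taken from the answer above. The request is built but not sent, and how the session ID is passed afterwards is left out because the answer is truncated at that point.

```python
import base64
from urllib.request import Request

# Hypothetical server address, used only for illustration.
BASE_URL = "http://localhost:8600"


def build_login_request(user: str, password: str) -> Request:
    """Build a "POST /login" request carrying HTTP basic authorization."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        BASE_URL + "/login",
        headers={"Authorization": "Basic " + token},
        method="POST",
    )


def sessions_unsupported(status_code: int) -> bool:
    """A 405 reply to the login request means the server has no session
    support, so every request must carry basic authorization instead."""
    return status_code == 405
```

A client would send the login request once, check the status code with `sessions_unsupported`, and fall back to attaching the same `Authorization` header to each request if it returns `True`.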

Q: What languages do you offer?

A: It depends on the technology. Phonexia Language Identification (LID) is pre-trained for 60+ languages. Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more…

Voice Activity Detection (VAD)

Voice Activity Detection is a language-, domain- and channel-independent technology that identifies which parts of an audio recording contain speech and which do not. It creates labels for speech and other signals in the recording; these labels can then serve as a decision point for whether to process the recording with other technologies. VAD is usually part of a rapid filtration process in…
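
To illustrate how such labels can act as a decision point, here is a minimal sketch that computes the share of speech in a recording and compares it with a threshold. The label format (label, start, end in seconds), the function names and the 20% threshold are all assumptions made for this example; they do not describe the actual output or defaults of Phonexia VAD.

```python
def speech_ratio(segments, total_duration):
    """Fraction of the recording labeled as speech.

    segments: iterable of (label, start_sec, end_sec) tuples (assumed format).
    """
    if total_duration <= 0:
        return 0.0
    speech = sum(end - start for label, start, end in segments if label == "speech")
    return speech / total_duration


def should_process(segments, total_duration, min_ratio=0.2):
    """Decide whether the recording is worth passing to other technologies.

    min_ratio is an arbitrary illustrative threshold.
    """
    return speech_ratio(segments, total_duration) >= min_ratio
```

For a 10-second recording with 3 seconds labeled as speech, `should_process` returns `True` at the default threshold and `False` if `min_ratio` is raised to 0.5.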