Search Results for: partner program

Phonexia Partner Program for Government Partners

Relevance: 100%      Posted on: 2020-08-25

Phonexia Partner Program for Government Partners This partnership program rewards partners in the government sector for selling and integrating Phonexia’s speech recognition and voice biometrics product portfolio. Program Enrollment If you aspire to become a Phonexia partner, you can enroll in the Phonexia Partner Program and complete a three-month onboarding period. During this period, you will enjoy the same partnership benefits as our Silver partners. Your assigned Phonexia Account Manager will take you through all necessary legal documents, highlight every business aspect of our cooperation, and organize two calls with a pre-sales person to ensure that you understand the…

SPE3 – Releases and Changelogs

Relevance: 36%      Posted on: 2021-04-16

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI. SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x). Releases Changelogs Speech Engine 3.40.1, DB v1700, BSAPI 3.40.1 (2021-04-16) Public release Fixed: 6th generation STT/KWS stream result may start with words from end of previous stream Fixed: Some licensing error messages are not shown in log Fixed: Missing file names in log messages in SID and SID4 tasks Fixed: Keyword list may not work if XML is used as input and optional fields threshold or pronunciations are used Fixed: phxdamin2…

Site Map

Relevance: 20%      Posted on: 2017-06-23

Phonexia Speech Platform Phonexia Speech Platform for Enterprise Phonexia Speech Analytics (SAL) Phonexia Voice Biometrics (VBS) Phonexia Speech Platform for Government Phonexia Speech Analytics GOV (SAL.gov) Phonexia Voice Biometrics GOV (VBS.gov) Components and Tools Phonexia Speech Engine v3 Speech technologies available Phonexia Browser v3 Phonexia Voice Inspector v3 Speech Intelligence Resolver v1 End of Life Components & Tools Phonexia Voice Inspector v1 Knowledge Base Blog Case Studies Demos Frequently Asked Questions (FAQ) How To… Lifetime Support Policies Manuals Presale Whitepapers and Presentations Product Briefs Developer Corner Code Examples Hints for App Design Hints for App Development List of Resources Phonexia…

Voice Biometrics

Relevance: 13%      Posted on: 2018-04-07

Overview Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform which allows you to understand the nature of audio without having to listen to it. The product helps people use the power of voice biometrics to verify speakers or identify crimes. The technologies automatically reveal WHO is speaking, what GENDER and what LANGUAGE, and many other metadata. Voice Biometrics - Typical Use-Cases The Speaker Verification use case is tailored to banks, insurance companies, money lending companies, and others that need to confirm whether the caller or the voice in an audio file is the same person who is known to the customer. For this use…

Speech Analytics

Relevance: 13%      Posted on: 2018-04-06

Overview Phonexia Speech Analytics allows you to understand the content of audio without having to listen to it. The results help both commercial entities and security/defense forces make immediate, precise decisions and responses. The technologies automatically reveal WHAT content, TOPIC and KEY PHRASES are spoken, and many other metadata. Speech Analytics - Typical Use-Cases Speech transcription is used in various applications. Knowing the content of a whole call brings business value to the customer compared to having an analyst or supervisor listen to the audio files. Reading the text is also faster than listening to the audio. Speech Analytics output…

Phonexia Speech Engine

Relevance: 12%      Posted on: 2020-11-19

About Phonexia Speech Engine v3 (SPE3) is the main executive part of the Phonexia Speech Platform. It is a server application with a REST API interface through which you can access all available speech technologies. Both Linux 64-bit and Windows 64-bit operating systems are supported. Phonexia Speech Engine (SPE3) is an adjustable server component which houses all speech technologies. SPE3 provides a RESTful application programming interface to access the various technologies. Aside from the technologies themselves, SPE implements various other functionality supporting work with speech technologies, recordings, streams, and others. Features The main purpose of SPE is to work as a processing unit for…

Phonexia Speech Platform for Commerce

Relevance: 9%      Posted on: 2017-05-18

Phonexia Speech Analytics is a special edition of Phonexia Speech Platform COM which allows you to boost analysis of your call traffic. It is an effective solution for contact centers in the commercial, telecom, utilities, financial, and other sectors. It provides 4 main parts: Dialog Analysis, Demographic Information, Script Alignment, Speech Transcription (automatic). Phonexia Voice Biometrics is a special edition of Phonexia Speech Platform COM which allows you to boost security and enhance customer experience with voice biometrics technologies. It is an effective solution for the commercial and financial sectors, especially for banks, insurance companies, and call centers. It covers both use cases: Fraud Detection…

Phonexia Speech Platform for Government

Relevance: 9%      Posted on: 2017-05-18

Phonexia Voice Biometrics GOV is a special edition of Phonexia Speech Platform for Government which allows you to understand the nature of audio without having to listen to it. The product helps people to utilize the power of voice biometrics to filter audio and prevent or identify crimes. The technologies reveal automatically WHO, what GENDER, what LANGUAGE is speaking, and many other metadata. The product can be used typically for investigation support, SIGINT or other types of operations. It serves 4 main use-cases: Voice Biometrics - Speaker Search in Archive (Investigation) Voice Biometrics - Speaker Spotting Tactical Voice Biometrics -…

Speaker Identification (SID)

Relevance: 9%      Posted on: 2019-06-13

Phonexia Speaker Identification uses the power of voice biometry to recognize speakers by their voice, i.e. to decide whether the voice in two recordings belongs to the same person or two different people. The high accuracy of Speaker Identification, Phonexia's flagship technology, has been validated in NIST Speaker Recognition Evaluations. Basic use cases and application areas The technology can be used for various speaker recognition tasks. One basic distinction is based on the kind of question we want to answer. Speaker Identification is the case when we are asking "Whose voice is this?", such as in fake emergency calls.…

Account

Relevance: 9%      Posted on: 2018-03-21

Registered info: GDPR tools: Full name: Login name: E-mail: Change profile Change password Phonexia Partner Portal documents access level: Hints: General rules Registration for Phonexia Partner Portal is free. However, different user access levels apply to the articles, and some of them are available only to Phonexia Partners and Certified members. You may request a higher access level by asking for business support at info@phonexia.com Legal documents By registering, logging in to, and using this website you agree with the Privacy Policy and Terms of Service.

Voice Inspector – supporting technologies

Relevance: 8%      Posted on: 2019-06-28

Automatic Speaker Identification (SID) is the most important but not the only Phonexia technology that is implemented in Voice Inspector (VIN). Apart from SID, forensic experts, users of VIN, can benefit from automatic Signal-to-Noise Ratio calculation, Voice Activity detection, Phoneme search, and a Wave editor which incorporates the waveform, spectrum and power panel. Let's have a look at how to use the individual technologies. Signal-to-Noise Ratio Recording quality can strongly influence the reliability of SID results and so the outcome of a forensic case. Therefore, VIN uses a module of Phonexia Speech Quality Estimation (SQE) to calculate the Signal-to-Noise Ratio (SNR)…

Save Your Time

Relevance: 8%      Posted on: 2017-06-22

If you are just starting, the following posts might be interesting for you: Phonexia Speech Platform is defined as an umbrella concept for all our products and services related to speech technologies. The main packages are Voice Biometrics and Speech Analytics. Phonexia Browser (PhxBrowser) - application for quick tests and visualization of speech technology results. Speech Engine SPE3 - RESTful API - an adjustable server component which houses all speech technologies. Other "good to start" pages: Academy helps partners understand the market and Phonexia’s products and technologies. Manuals Glossary

Licensing (technical details)

Relevance: 7%      Posted on: 2018-03-02

This document describes all licensing types for Phonexia product licensing available to our partners and customers. Each partner/customer can choose the licensing variant which best fits the current project or infrastructure. The document does not describe business conditions of Phonexia licensing. What is the License? The License is a formal agreement regarding “The Product Usage Rights” between Phonexia s.r.o. and a user of any Phonexia technology or Phonexia product. Licenses are issued by the Business Department for all speech technologies and products, and may be required in order to use utilities and tools developed by Phonexia or partners. For technical…

LID adaptation

Relevance: 7%      Posted on: 2021-03-02

This article describes various ways of adapting Language Identification. Basic terminology Languageprint (*.lp file) – numeric representation of the audio, extracted from an audio file for the purpose of language identification (similar to “voiceprint”, but representing the spoken language, not the speaking person) Languageprint archive (*.lpa file) – multiple languageprints combined into a single archive Creation of languageprint archives is not supported by SPE; they are supported as input only. Language model – digital characteristics of a specific language A language model can be trained from languageprints (*.lp), languageprint archives (*.lpa), or a combination of both. LID language model should not be…

Speech Engine configuration file explained

Relevance: 6%      Posted on: 2021-02-19

In this article we explain the details of the Speech Engine configuration file phxspe.properties, located in the settings subdirectory of the SPE installation location. Settings in this configuration file affect the Speech Engine behavior and performance. The configuration file is usually created after SPE installation – on first use of phxadmin, a default configuration file phxspe.properties is created in the settings directory. The file is loaded during SPE startup, i.e. you need to restart SPE to apply any changes made in the file. If Speech Engine is used together with Phonexia Browser in so-called "embedded" mode (see details about "embedded SPE" mode in Browser…

Components and Tools

Relevance: 6%      Posted on: 2017-05-18

This section collects information about specific components and tools of our Speech Platform. API RESTful API - Phonexia Speech Engine v3 (SPE3) - recommended Apps and Tools Phonexia Browser v3 (Browser3) Voice Inspector v4 (VIN4) Voice Inspector v3 (VIN3) You might also be interested in the Product Portfolio or End of Life Components & Tools. You can also browse our product support lifecycle policy to see which of our versions are supported and maintained.

Voice Inspector – Interpretation of results

Relevance: 6%      Posted on: 2019-06-24

Introduction Phonexia Voice Inspector (VIN) is a tool for forensic automatic speaker identification, compliant with the Methodological Guidelines for Best Practice in Forensic Semiautomatic and Automatic Speaker Recognition, published by the European Network of Forensic Science Institutes.  This post explains individual SID score types and ways to visualize the results in a speaker identification case implemented in Voice Inspector. Evidence In VIN, the term evidence has two meanings. In general, it refers to any SID score that the system calculates for any pair of recordings in the case. These scores are the output of the Phonexia SID technology which runs…

Phonexia Academy

Relevance: 5%      Posted on: 2017-05-18

About The main idea of the Phonexia Academy is to help partners understand the market and Phonexia’s products and technologies, so that they can sell more, deliver projects on time and at the highest quality, and support their clients effectively. We provide the following training: Phonexia technologies introduction (online video course) Technical Training Essentials (online video course) Technical Training Advanced - 2 courses: Voice Biometrics Course (in-person, 2 days) Speech Analytics Course (in-person, 2 days) In the Technical Training Advanced courses, we share best practices, detailed use-case analysis, and hands-on exercises. Both courses are adjusted to our partners’ requests, considering their typical projects. You might be…

Performance of the Speaker Identification 4th generation (SID4): Intel® Xeon® Platinum 8124M

Relevance: 5%      Posted on: 2019-10-30

Benchmark goals Find realistic performance using total recording length Find FTRT based exactly on net_speech (engineering sizing data) Find system performance using all physical cores Find system performance using all logical cores Infrastructure setup Intel® Xeon® Platinum 8124M is used in a virtual machine with 8 physical cores reserved exclusively for this VM, Hyper-Threading enabled [16 logical cores available], 32GB RAM, 30GB SSD-based storage, 1000 I/O·s⁻¹ reserved per core Benchmark data setup Data set statistics: Number of files: 32 [300 seconds each] RAW recordings length ∑: 9600 [sec] Net speech length ∑: 4224.77 [sec] In the data set…

How to convert STT confusion network results to one-best

Relevance: 5%      Posted on: 2020-04-06

Confusion Network output is the most detailed Speech Engine STT output, as it provides multiple word alternatives for individual timeslots of the processed speech signal. Therefore many applications want to use it as the main source of speech transcription and perform any conversion to less verbose output formats internally. This article provides the recommended way to do the conversion. Time slots and word alternatives: The recommended algorithm for converting Confusion Network (CN) to One-best is as follows: loop through all CN timeslots from start to end in each timeslot, get the input alternative with highest score and if it's not <null/> or…
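
The part of the algorithm visible in the snippet above can be sketched in Python. This is a minimal illustration, not the article's own code: the data layout (a list of timeslots, each a list of `(word, score)` pairs), the function name `cn_to_one_best` and the `skip` parameter are assumptions, and they do not reflect the actual Speech Engine output format. The snippet is cut off after `<null/>`, so only that token is skipped by default; any further tokens the full article excludes would have to be passed in via `skip`.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical layout: one alternative is a (word, score) pair.
Alternative = Tuple[str, float]


def cn_to_one_best(
    timeslots: Iterable[Sequence[Alternative]],
    skip: frozenset = frozenset({"<null/>"}),
) -> List[str]:
    """Pick the highest-scoring alternative in each timeslot, in order."""
    words: List[str] = []
    for alternatives in timeslots:
        if not alternatives:
            continue  # nothing to choose from in an empty timeslot
        best_word, _score = max(alternatives, key=lambda alt: alt[1])
        if best_word not in skip:
            words.append(best_word)
    return words


# Made-up confusion network with three timeslots.
example = [
    [("eighty", 0.6), ("eight", 0.4)],
    [("<null/>", 0.7), ("tea", 0.3)],
    [("machines", 0.9), ("machine", 0.1)],
]
print(" ".join(cn_to_one_best(example)))
```

Keeping the skipped tokens in a parameter rather than hard-coding them makes it easy to follow whatever the complete article specifies.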

Knowledge Base

Relevance: 5%      Posted on: 2017-05-18

This section collects the information we consider most important or most frequently discussed. Best Practices Frequently Asked Questions (FAQ) Manuals Glossary Terminology Open Source Acknowledgement

Phonexia Speech Platform

Relevance: 5%      Posted on: 2017-05-18

Phonexia Speech Platform (Speech Platform) provides partners with a complete portfolio of speech technologies with an easy-to-use design. The platform allows users to design and deploy a wide range of speech processing systems in a short time and without extensive knowledge of the technologies' background. Products On top of Speech Platform, several products are provided: for commercial market Phonexia Speech Analytics Phonexia Voice Biometrics for government market Phonexia Speech Analytics GOV Phonexia Voice Biometrics GOV Characteristics Completeness – all speech technologies in one place Simple to use – RESTful API for rapid development Modularity – build your own specific process workflow…

Product Portfolio

Relevance: 5%      Posted on: 2018-04-02

Phonexia Speech Platform is an umbrella concept for all Phonexia’s products and services related to speech technologies. It gives us the ability to customize various products to a wide range of customer needs. Platform Edition is an encapsulation of specific setup of speech technologies, modules, applications, utilities and services designed for a specific market segment. We distinguish Speech Analytics (SAL) and Voice Biometrics (VBS) as most common domain of usage. It is also a tool for marketing and sales. Voice Biometrics is focused more on identifying speaker, gender, language spoken and more. Speech Analytics focuses on gathering information about content…

What are STT preferred phrases and how to use them

Relevance: 4%      Posted on: 2020-11-26

Speech Engine version 3.32 and later includes a new STT feature called Preferred phrases. This article explains what the feature is good for and how it works internally, and gives some tips for practical implementation. What are preferred phrases In speech transcription tasks, there may be situations where similar-sounding words get confused, e.g. "WiFi" vs. "HiFi", "route" vs. "root", "cell" vs. "sell", etc. Normally, the language model part of Speech To Text does its job here and, in the context of a longer phrase or entire sentence, prefers the correct word:  ×    I'm going to cell my car. Hmmm, such…

Voice Biometrics Course (technical training)

Relevance: 4%      Posted on: 2017-05-18

The Voice Biometrics course consist of the following modules. Please ask your Phonexia contact for detailed description. (YES = this part is mandatory for course)   VBS course Required time [h] Block name Block description YES 0,5 Intro & Phonexia Portfolio Intro & Phonexia Portfolio YES 0,5 Project focus - Explain basic needs Partner project related discussion focused mainly to finalizing training topics and agenda YES 0,75 Apps Designing and Developing - Licensing Gives trainee knowledge about type of licensing, and how to use the license file YES 0,75 Technologies - Data gathering and Quality measurement - basic Data gathering…

Speech Analytics Course (technical training)

Relevance: 4%      Posted on: 2017-05-18

The Speech Analytics course consists of the following modules. Please ask your Phonexia contact for detailed description. (YES = this part of the course is obligatory)   SAL course Required time [h] Block name Block description YES 0,5 Intro & Phonexia Portfolio Intro & Phonexia Portfolio YES 0,5 Project focus – Explain basic needs Discussion of partner project focused mainly on finalizing the training topics and agenda. YES 0,75 Application Design & Development – Licensing Presentation of types of licensing, and how to use the license file. YES 0,75 Technologies – Data gathering and Quality measurement – basic Description of…

Voice Inspector

Relevance: 3%      Posted on: 2017-05-18

About Phonexia Voice Inspector (VIN) provides police forces and forensic experts with a highly accurate speaker identification tool during investigation of criminal matters. It uses the power of voice biometry to automatically recognize speakers by their voice. Main features of the VIN application: Automatic speaker identification tool to strengthen results of the standard phonetics-based approaches Scoring in likelihood ratio (LR) – Result from statistical test for two models comparison. It gives back number which expresses how many times more likely the data are under one model than the other. LnLR or LogLR meets numbers in interval <-∞;+∞>...), and verbal presentation…

How to prepare for course?

Relevance: 3%      Posted on: 2017-05-18

The Partner is encouraged to ask their Phonexia contact person to send the Training Preparation Questionnaire, which will help the Partner (and Phonexia) adjust the content of the technical training (see available courses here), and to provide a download link for the Phonexia products together with the evaluation license. The Partner (and Phonexia) can manage expectations together based on the following questions: 1. Training expectations What are your expectations regarding this training? What content do you expect? Looking at the schedule - what are top priority topics? What format do you expect? (ppt, hands-on, discussion) Do you prefer paper copies for…

Phonexia Ethical Code

Relevance: 3%      Posted on: 2018-03-24

Application of the Code It is the policy of Phonexia, s.r.o. (“Phonexia”, “we”) to maintain the highest level of ethical standards in the conduct of our business affairs. Our values guide our actions in all cases. The actions and conduct of our officers, directors and employees (collectively, “Phonexia personnel”), as well as others acting on our behalf, are essential to maintain these standards and promote highly ethical reputation of Phonexia. To that end, all our personnel including agents, consultants and contractors as well as distribution partners involved in Phonexia´s international business activities must read, become familiar and comply with this…

Measuring of a software processing speed – what is the FtRT (Faster than Real Time)

Relevance: 3%      Posted on: 2019-10-30

Faster Than Real Time (FTRT) is a metric developed to define a reference point for software performance. Using this metric you can collect "benchmark" data on the real processing speed of the reviewed software, which should be measured - and be reproducible - on exactly defined HW. Then, by comparing various benchmark results, you can compare the performance of the specified software and its parts on different HW configurations. And vice versa - using the same metric you can compare software from different vendors on the same HW configuration and for the same processing task. We recognize two measurable metrics: Recording based FTRT is calculated from real…
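
As a rough illustration of the metric, the sketch below assumes FTRT is the ratio of audio length to wall-clock processing time. The snippet is truncated before it gives the exact definition, so the formula and the function name `ftrt` are assumptions. The two audio lengths are the data set totals quoted in the SID4 benchmark entry in these results (9600 s of raw recordings, 4224.77 s of net speech); the 96 s processing time is a made-up value, not a measured result.

```python
def ftrt(audio_seconds: float, processing_seconds: float) -> float:
    """Assumed definition: how many times faster than real time the audio was processed."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds


RAW_LENGTH_S = 9600.0      # total recording length from the benchmark entry
NET_SPEECH_S = 4224.77     # net speech length from the benchmark entry
PROCESSING_S = 96.0        # hypothetical wall-clock time, for illustration only

recording_based = ftrt(RAW_LENGTH_S, PROCESSING_S)
net_speech_based = ftrt(NET_SPEECH_S, PROCESSING_S)
print(f"recording-based FTRT:  {recording_based:.1f}")
print(f"net-speech-based FTRT: {net_speech_based:.1f}")
```

Under this assumed definition, the same run yields a lower figure when measured against net speech than against total recording length, so the two values should not be compared with each other.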

Phonexia – introduction

Relevance: 3%      Posted on: 2018-03-14

What we believe in At Phonexia, we find joy in pushing the boundaries of innovation in the field of speech technology by automating and simplifying solutions for many of today’s complex communication and security-strategic challenges. By providing our partners and customers with state-of-the art speech-technology software, we leverage the power, and data, in their voices. Who we are Phonexia is the only speech technology software manufacturer that reveals and leverages the most data in speech for enterprising trailblazers across the globe who want to discover and develop powerful new skills in a knowledge-based economy. We have more than 19 years…

Phonexia End User License Agreement

Relevance: 3%      Posted on: 2019-02-27

Please read the terms and conditions of this End User License Agreement (the “Agreement”) carefully before you use the Phonexia proprietary software providing speech solutions, technologies and accompanying services (the “Software”) delivered and marketed by Phonexia s.r.o.

Speech To Text

Relevance: 3%      Posted on: 2019-05-27

Phonexia Speech To Text – also known as a voice-to-text or speech recognition – converts speech signals into plain text. After the conversion, text can be easily read, edited, searched, processed by text-based data mining tools or archived. Phonexia Speech To Text is optimized for noisy recordings and colloquial speech, can process audio files as well as audio streams and can provide results in several output formats. Typical use cases look for specific information in large call archives (e.g., claims inspection) get additional value by advanced analysis of call traffic (e.g., topic detection) maintain short reaction times by routing calls…

SPE3 – Administration and Backup

Relevance: 3%      Posted on: 2018-04-15

Each Partner has its own administration and backup policy. Here, we highlight the most important SPE3 components to be administered and backed up. Administration It is strongly recommended to describe your own administration approach for the following components SPE users (accounts) - the Partner should maintain a list of SPE users (accounts). Only a few persons should have the “admin” role. All others should have the “user” role (cannot see the content of other “user” accounts) and/or the “vbs” role (enables/disables use of the VoiceBiometry plugin) the SPE database and/or VBSplugin database administration – where the (temporary) results are stored user.home - where the…

Designing and Developing Application

Relevance: 3%      Posted on: 2018-04-15

Before designing and developing the application, we encourage the Partner to find clear answers to the following questions: Customer requirements: Do my customers need file processing (audio) or stream processing in real time? How much staff does the customer have to analyze the results? How many minutes per day or streams in parallel do my customers need to process? What are the real benefits for the customer (finding the needle in a haystack, gaining new information, processing only a small amount of data with the highest possible accuracy)? How does the solution match the current processes and infrastructure of the customer? How many false alarms are acceptable…

Keyword Spotting results explained

Relevance: 3%      Posted on: 2019-06-12

This article aims to give more details about Keyword Spotting outputs and hints on how to tailor Keyword Spotting to best suit your needs. Scoring Keyword Spotting works by calculating the likelihoods that a keyword, or just any other speech, occurs at a given spot, and comparing those two likelihoods. The following scheme shows Background model for anything before the keyword (1), the Keyword model (2) and a Background model of any speech parallel with the keyword model (3). Models 2 and 3 produce two likelihoods – Lkw and Lbg (any speech = background). Raw score is calculated as log likelihood…
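
The comparison of the two likelihoods can be illustrated with a short sketch. The snippet is truncated right after "log likelihood", so the exact formula is not visible here. The code below assumes the raw score is the log-likelihood ratio log(Lkw / Lbg), which is a common way to compare two likelihoods but is not confirmed by the text. The function name `raw_score` and the input values are hypothetical.

```python
import math


def raw_score(l_kw: float, l_bg: float) -> float:
    """Assumed raw score: log-likelihood ratio of keyword model vs. background model."""
    if l_kw <= 0 or l_bg <= 0:
        raise ValueError("likelihoods must be positive")
    # log(Lkw / Lbg) written as a difference of logs for numerical stability
    return math.log(l_kw) - math.log(l_bg)


# Made-up likelihoods: the keyword model explains the audio better than background.
print(raw_score(0.02, 0.005) > 0)   # positive score favours the keyword
print(raw_score(0.005, 0.02) < 0)   # negative score favours background speech
```

Under this assumption a score of zero means both models explain the audio equally well, and the sign alone shows which model is favoured.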

Terminology

Relevance: 3%      Posted on: 2017-06-15

Document which briefly describes processes and relations in Phonexia Technologies, with attention to correct word usage. SID - Speaker Identification Technology (about SID technology) which recognizes the speaker in the audio based on the input data (usually a database of voiceprints). XL3, L3, L2, S2 - Technology models of SID. Speaker enrollment - Process in which the speaker model is created (usually a new record in the voiceprint database). Speaker model: 1/ should reach recommended minimums (net speech, audio quality), 2/ should be made with more net speech and thus be more robust. The test recordings (payload) are then compared to the model (see…

End of Life Components

Relevance: 3%      Posted on: 2017-05-18

This section archives information about products, components and tools for which we have ended distribution and active technical support. End of Life Products & Tools: Speech Intelligence Resolver v1 (SIR1) Voice Inspector v1 (VIN1) API for C++ (BSAPI2) You can also browse our product support lifecycle policy to see which of our versions are supported and maintained.

Phonexia Voice Inspector v3

Relevance: 3%      Posted on: 2021-04-09

About Phonexia Voice Inspector v3 (VIN3) provides police forces and forensic experts with a highly accurate speaker identification tool during investigation of criminal matters. It uses the power of voice biometry to automatically recognize speakers by their voice. Main features of the VIN3 application: Automatic speaker identification tool to strengthen results of the standard linguistics- and phonetics-based approach Scoring in Likelihood Ratio (LR) – result from a statistical test for a comparison of two hypotheses. The system returns a number from the interval <0, +∞>, which expresses how many times more likely the data are under one hypothesis than the…

Keyword Spotting

Relevance: 3%      Posted on: 2019-06-03

Phonexia Keyword Spotting (KWS) identifies occurrences of keywords and/or keyphrases in audio recordings. It can help you to get valuable information from huge quantities of speech recordings. You only need to specify the keywords or phrases you wish to find. This technology identifies all recordings with keyword occurrences and allows you to automatically route important recordings or calls to your experts. Typical use cases Call centers increase operator and supervisor efficiency by searching calls identify inappropriate expressions from operators check marketing campaigns with automatic script-compliance control Mass media and web search servers index and search multimedia by keyword route multimedia…

H2020

Relevance: 2%      Posted on: 2018-02-01

Horizon 2020 - EU program for innovation and development

PHR

Relevance: 2%      Posted on: 2018-02-01

Phoneme recognizer – currently part of Keyword Spotting (Phonexia Keyword Spotting - acoustics based ASR, several tec...) technology in Phonexia Speech Engine  (REST Application Program Interface)

Software Vetting

Relevance: 2%      Posted on: 2018-04-06

The purpose of this document is to help clients satisfy their high security standards when integrating Phonexia software into their critical infrastructure. The vetting ensures that Phonexia software is not dangerous to the client’s infrastructure in any way. It means there are no backdoors, viruses, worms, Trojan horses, spyware, adware, critical bugs, or unwanted functionality, and no information is sent outside the client’s infrastructure. Vetting context Speech technology is a very dynamic area with very fast development. For example the speaker identification error rate decreases to half between each two evaluations organized by National Institute of Standards and Technology,…

Packages, Updates vs. Upgrades

Relevance: 2%      Posted on: 2018-04-15

Our packages follow the bugfix / update / upgrade approach. Some packages are distributed with a limited set of speech technologies or without speech technologies. Packages Our software is distributed as a ZIP file. Installation is a matter of unzipping the archive, reconfiguring, and starting the software. SPE and VIN packages contain speech technologies (note: SPE might contain only selected technologies). PhxBrowser does not contain speech technologies and needs to be combined with SPE. The software is activated by a license file. Updates vs. Upgrades Bugfix By bugfix we understand a fix of a known problem without changing components or technology models. Bugfix changes only…

Speaker Identification: Results Enhancement

Relevance: 2%      Posted on: 2019-05-29

Speaker Identification (SID) Results Enhancement is a process that adjusts the score threshold for detecting/rejecting speakers by removing the effect of speech length and audio quality. This is achieved by use of Audio Source Profiles, that represent as closely as possible the source of the speech recording (device, acoustic channel, distance from microphone, language, gender, etc.). Although the out-of-the-box system is robust in such factors, several result enhancement procedures can provide even better results and stronger evidence. Audio Source Profile An Audio Source Profile is a representation of the speech source, e.g., device, acoustic channel, distance from microphone, language, gender,…

Speech To Text results explained

Relevance: 2%      Posted on: 2019-05-27

This article aims to give more details about Speech To Text outputs and hints on how to tailor Speech To Text to best suit your needs. In the process of transcribing speech, the Speech To Text technology usually identifies multiple alternatives for individual speech segments, as multiple phrases can have similar pronunciations, possibly with different word boundaries, e.g. “eight tea machines” vs. “eighty machines”. The technology provides various output types which show either a single transcription or multiple transcription alternatives. For processing realtime streams, two result modes are supported – one mode provides complete transcription, the second mode provides incremental results. Output types…

How to configure STT realtime stream word detection parameters

Relevance: 2%      Posted on: 2020-03-28

One of the improvements implemented since Speech Engine 3.24 is neural-network based VAD, used for word and segment detection. This article describes the segmenter configuration parameters and how they affect the realtime stream STT results. The default segmenter parameters are as shown below: [vad.online_segmenter:SOnlineVoiceActivitySegmenterI] backward_extensions_length_ms=150 forward_extensions_length_ms=750 speech_threshold=0.5 Backward and forward extensions are intervals in milliseconds which extend the part of the signal going to the decoder. The decoder is a component which determines what a particular part of the signal contains (speech, silence, etc.). Based on that, the decoder also decides whether a segment has finished or not. Unlike in file processing…
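
To make the effect of the two extension parameters concrete, the sketch below reads the default values quoted above with Python's standard `configparser` and applies them to a detected speech interval. Treating the section as INI-style text, the helper name `extend_segment`, and the rule "subtract the backward extension from the start, add the forward extension to the end" are assumptions for illustration. They are one reading of the snippet and not a description of how Speech Engine implements the segmenter.

```python
import configparser

# Default values as quoted in the snippet, laid out one key per line.
DEFAULTS = """
[vad.online_segmenter:SOnlineVoiceActivitySegmenterI]
backward_extensions_length_ms=150
forward_extensions_length_ms=750
speech_threshold=0.5
"""

parser = configparser.ConfigParser()
parser.read_string(DEFAULTS)
section = parser["vad.online_segmenter:SOnlineVoiceActivitySegmenterI"]

backward_ms = section.getint("backward_extensions_length_ms")
forward_ms = section.getint("forward_extensions_length_ms")
speech_threshold = section.getfloat("speech_threshold")


def extend_segment(start_ms: int, end_ms: int) -> tuple:
    """Hypothetical helper: widen a detected speech interval by both extensions."""
    return (max(0, start_ms - backward_ms), end_ms + forward_ms)


# A detected speech interval from 1.0 s to 2.0 s of the stream.
print(extend_segment(1000, 2000))
```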

What is a user configuration file and how to use it

Relevance: 2%      Posted on: 2020-03-28

Advanced users with appropriate knowledge (gained e.g. by taking the Phonexia Academy Advanced Training) may want to fine-tune the behavior of the technologies to adapt to the nature of their audio data. Modifying the original BSAPI configuration files directly can be dangerous – inappropriate changes may cause unpredictable behavior, and without a backup of the unmodified file it is difficult to restore a working state. User configuration files provide a way to override processing parameters without modifying the original BSAPI configuration files. WARNING: Inappropriate configuration changes may cause serious issues! Make sure you really know what you are doing. User configuration file is a…

Arabic dialects in Phonexia LID and STT

Relevance: 2%      Posted on: 2021-01-18

Arabic language has (a) one standardised variety, and (b) many non-standard varieties (dialects). In this article, our linguistic team explains differences between Modern Standard Arabic and Arabic dialects in the context of Phonexia Arabic models. Standard variety:  Modern Standard Arabic (MSA) All Arabs learn it at school (not from their parents, so we cannot say it is their native variety) It is lingua franca (common language) for the Arabic world – like English for Europeans; however, Arabs speak it much better since they are schooled in MSA from early age MSA is more similar to some dialects (e.g. Levantine), but…