Phonexia Speech Platform (PSP) is an umbrella concept for all Phonexia software components and services related to speech technologies, with broad language support. It allows the various products to be customized to a wide range of customer needs.
The platform is an encapsulation of speech technologies (available in the Speech Engine component), expert-level applications (the Browser component), and utilities (the Reporting and Licensing Server component) grouped together for a specific market segment.
We distinguish two fundamental domains: Government and Commerce (Enterprise). They differ so much from both the business and the user perspective that they need to be treated separately. We have converted our market expertise into several predefined configurations that are typical of the most common usage domains.
Phonexia Speech Engine Component (34 Articles)
Phonexia Speech Engine (SPE) is the main part of Phonexia Speech Platform.
SPE is a server application for 64-bit Linux or Windows that provides a REST API to the entire portfolio of Phonexia speech technologies.
SPE capabilities overview:
- Audio files and stream processing
Technology                        Audio files   RTP / HTTP streams
Speaker Identification (SID)      ✓             ✓
Speech To Text (STT)              ✓             ✓
Keyword Spotting (KWS)            ✓             ✓
Voice Activity Detection (VAD)    ✓             ✓
Time Analysis Extraction (TAE)    ✓             ✓
Speech Quality Estimation (SQE)   ✓             ✓
Language Identification (LID)     ✓
Gender Identification (GID)       ✓
Age Estimation (AGE)              ✓
Speaker Diarization (DIAR)        ✓
- Results caching
Processing results can optionally be stored in a results cache database to speed up any later re-processing of the same recording by the same technology. The results are then returned immediately from the cache instead of the audio file being processed again in full.
- Own persistent data storage
SPE keeps uploaded audio files in its own persistent storage space, so the original source files can be archived or deleted after upload.
- Data privacy
SPE keeps information about an audio file or stream only as long as the file or stream exists. Once the recording is deleted from SPE storage, or the stream ends, SPE removes all related information, metadata and technology results from the database.
- Basic user management
SPE allows multiple users to be defined with different user roles and user rights. Each SPE user has access only to their own data storage, files, metadata and processing results.
- Load management
SPE manages its own queue of incoming REST requests and serves them according to the available capacity of the current installation. This means that the application layer can send any number of requests and then simply wait until they are processed.
- Processing priority management
To allow off-queue high-priority or low-priority processing, SPE also allows a priority to be set for individual REST requests.
- Basic audio manipulation
SPE has built-in basic audio file manipulation functionality, such as separating individual channels from stereo recordings, cutting one audio file into several files, saving audio from an incoming stream to a file, and others.
- Stream audio player
To support voicebot scenarios, SPE can play audio files directly to an output RTP stream.
- External Text-to-speech (TTS) integration
SPE integrates easily with external TTS providers through a simple plugin-like connector interface.
- Flexible integration
SPE can provide results in JSON or XML format. Results can be obtained by polling, via WebSockets, or via webhooks (callbacks).
- Status information
SPE can provide various status information to the application layer, e.g. license status, configuration info, current overall load, pending operations status, …
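The load management, processing priority and results caching items above can be pictured with a small toy model. This is only an illustrative sketch, not SPE's implementation or its REST API: the class `RequestQueue`, its methods, the `process` callback and the convention that a lower number means higher priority are all hypothetical names and assumptions introduced for this example.

```python
import heapq
import itertools
from typing import Callable, Dict, List, Tuple


class RequestQueue:
    """Toy model (hypothetical, not SPE code) of a priority request queue
    combined with a results cache keyed by (file, technology)."""

    def __init__(self, process: Callable[[str, str], str]) -> None:
        self._process = process  # stands in for the actual speech technology
        self._heap: List[Tuple[int, int, str, str]] = []
        self._order = itertools.count()  # tie-breaker: FIFO within a priority
        self._cache: Dict[Tuple[str, str], str] = {}
        self.processed_count = 0  # how many requests needed real processing

    def submit(self, file_id: str, technology: str, priority: int = 0) -> None:
        # Assumption for this sketch: a lower number is served earlier.
        entry = (priority, next(self._order), file_id, technology)
        heapq.heappush(self._heap, entry)

    def run(self) -> List[Tuple[str, str, str]]:
        """Serve all queued requests in priority order and return the results."""
        results = []
        while self._heap:
            _, _, file_id, technology = heapq.heappop(self._heap)
            key = (file_id, technology)
            if key not in self._cache:
                # Cache miss: process the file and remember the result.
                self._cache[key] = self._process(file_id, technology)
                self.processed_count += 1
            # Cache hit or freshly computed: either way, read from the cache.
            results.append((file_id, technology, self._cache[key]))
        return results
```

In this model the caller submits any number of requests and then calls `run()`. A request with a smaller priority number overtakes earlier ones, and a repeated (file, technology) pair is answered from the cache without invoking `process` again.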
Phonexia Browser Component (5 Articles)
Phonexia Browser is a client application for Phonexia Speech Engine (SPE), designed to test speech technologies using client’s recordings and visualize the results.
The Browser helps with:
- getting familiar with speech technologies,
- deploying Phonexia Speech Engine into the client’s infrastructure,
- configuring the technologies according to the client’s needs,
- calibrating and evaluating the specific deployment.
Languages Supported (2 Articles)
How-to Guides (16 Articles)
SPE Training videos (4 Articles)
Core objective: Understanding the technical essentials of using Phonexia technologies and products
Duration: ~94 minutes (7 + 19 + 22 + 23 + 23 min chapters)
- intended for product architects or developers
- assumes you have already watched the Phonexia technologies introduction video
- assumes understanding of
  - working in the command line
  - REST API principles
  - processing JSON or XML
Introduction (7 min)
- technologies recap
- CLI, REST and GUI interfaces overview
Best Practices (5 Articles)
Policy and particular dates for the Phonexia products Support Lifecycle