How to use a checksum – Linux
The system is inoperative, and this has a critical effect on the End User’s operations that cannot be resolved by the End User’s or Partner’s IT/technical administrator. This condition is generally characterized by system instability and requires immediate correction. Phonexia’s software stops functioning due to an internal error, and it fails again on a different data input after the software is restarted. If the issue does not appear in the latest version of Phonexia’s software, it will not be considered a Critical Issue. A Critical Issue is fixed on a best-effort basis, and a fixed version of the Product is delivered in the next Bug fix release or Software Update.
An Issue that renders the Product partially functional, the use of which in a production environment is substantially reduced. The Issue contains an error that impairs the ability of the system to process a majority of audio files or audio streams, or that renders the setup and maintenance of the system inoperable.
Any scenario that does not fall under the Critical or Severe Issue definitions above.
The Product is still operable but contains Issues that occur in a minority of audio files or audio streams or that are of a minor nature.
This error usually happens when you try to access Orbis before it has been fully initialized. When you start the Orbis virtual machine, the Webserver component responsible for rendering the login screen starts quickly, but some other Orbis components take a little longer to initialize.
Try to wait for a while, then refresh the page and try again.
If that doesn’t help, contact Orbis support on [email protected].
To create the SPE report, open a terminal in the directory containing phxadmin (in Windows, type cmd in the address bar) and run:
./phxadmin --report (Linux) or phxadmin.exe /report (Windows)
The Report functionality is not present in old SPE versions (3.10 and older).
When reporting an issue with Phonexia Browser, please attach both the SPE report and the Browser log.
To create the SPE report, open a terminal in the directory containing phxadmin (in Windows, type cmd in the address bar) and run:
./phxadmin --report (Linux) or phxadmin.exe /report (Windows)
Also, run the following command in the same way to create the Phonexia Browser log:
./PhxBrowser &>report.txt (Linux) or PhxBrowser.exe >report.txt (Windows)
To create the Phonexia Voice Inspector log, open a terminal in the directory containing the executable (in Windows, type cmd in the address bar) and run:
./VoiceInspector &>report.txt (Linux) or VoiceInspector.exe >report.txt (Windows)
We can prepare a testing package for you with full functionality of all technologies.
The license validity is 90 days to allow you to test the technologies.
Note: by default, a NET license is provided for testing. This license needs an active Internet connection to a Phonexia licensing server in order to function.
Rest assured that no data (audio, metafiles or even analytical files) are ever sent to phonexia.com.
Yes, your data are imported into the system and remain only on the computer on which Phonexia Orbis is running.
The level of security thus depends on the administrator of the computer on which the system is running.
Our technologies are prepared to run on both Windows and Linux OS.
For more details of the supported operating systems as well as recommended HW setup, see Recommended OS and HW
Formats supported directly and natively are:
Other audio formats must be converted to one of those natively supported using external tools.
The SPE server can be configured to do this conversion automatically in the background; see the Understand SPE audio converter article.
Good tools for converting unsupported formats to supported ones are FFmpeg (http://www.ffmpeg.org) and SoX (http://sox.sourceforge.net/). Both are multiplatform tools available for Microsoft Windows, Linux and Apple OS X. Example of usage:
FFmpeg
ffmpeg -i <source_audio_file_name> <output_audio_base_name>.wav
This command converts an audio file in any format/codec supported by FFmpeg to WAV with 16-bit little-endian PCM audio, which is FFmpeg's default encoding for WAV output. For more parameters, please check the FFmpeg manual pages.
SoX
sox <source_audio_file_name> -b 16 <output_audio_base_name>.wav
The number of bits, defined by the -b parameter, must be specified.
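The two commands above can also be driven from a script. The following is a minimal Python sketch, assuming ffmpeg or sox is installed and available on the PATH. The function names build_convert_command and convert, and the "_converted" suffix, are illustrative only and are not part of any Phonexia package.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, tool="ffmpeg"):
    """Return (command, output_path) for converting `source` to 16-bit WAV."""
    source = Path(source)
    output = source.with_suffix(".wav")
    if output == source:
        # Illustrative choice: never overwrite an input that already ends in .wav.
        output = source.with_name(source.stem + "_converted.wav")
    if tool == "ffmpeg":
        # Same form as the FFmpeg example above.
        command = ["ffmpeg", "-i", str(source), str(output)]
    elif tool == "sox":
        # Same form as the SoX example above; -b 16 sets the number of bits.
        command = ["sox", str(source), "-b", "16", str(output)]
    else:
        raise ValueError(f"unsupported tool: {tool}")
    return command, output


def convert(source, tool="ffmpeg"):
    """Run the conversion and return the path of the WAV file."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} was not found on the PATH")
    command, output = build_convert_command(source, tool)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return output
```

Calling convert("recording.mp3") would then produce recording.wav next to the source file, provided the tool can decode the input.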
Yes, Phonexia Orbis is a standalone application running only on your HW.
Also, no files (audio, metafiles or analytical) are ever sent to Phonexia or elsewhere.
The Phonexia Browser application may return the error “1007: Unsupported audio format” while uploading an audio file. Please check whether your audio files are in one of the formats listed under Q: What are the supported audio formats?
If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. We recommend the ffmpeg utility, which is powerful and well documented. You can find the package for your distribution at http://ffmpeg.org
Then continue as described below:
Open the Browser configuration dialog by clicking the “Settings” button located in the tool ribbon. Select the “Speech Engine” tab and configure SPE as described in the documentation. Don’t forget to select the “Enable audio converter” checkbox.
Open the file settings\phxspe.properties using a standard text editor. Then change the following line in “phxspe.properties” to enable background conversion:
audio_converter.enabled = false # change it to 'true'
Please check that the conversion tools configured below this line in phxspe.properties are set up properly. Here is an example configuration for ffmpeg:
# Set converter command
# %1 is for input file
# %2 is for output file
# ffmpeg example:
audio_converter.command = ffmpeg -loglevel warning -y -i %1 %2
# sox example:
# audio_converter.command = sox %1 %2
Important note: By design, and to save computing resources, the audio converter is not used if the input file name ends with the extension .wav. In that case, you must pre-process the audio recording yourself before uploading it to the Phonexia SPE or using it in the Phonexia Browser.
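To check how a given phxspe.properties would treat an input file, the rules above can be sketched in Python. This is an illustration under stated assumptions, not SPE's own logic: it assumes simple "key = value" lines with "#" comments, a case-insensitive match on the .wav extension, and plain text substitution of %1 and %2. The function names are hypothetical.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, ignoring blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop full-line and inline comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def converter_command_for(settings, input_path, output_path):
    """Return the expanded converter command, or None if no conversion would run."""
    if settings.get("audio_converter.enabled", "false").lower() != "true":
        return None  # background conversion is switched off
    if input_path.lower().endswith(".wav"):
        return None  # .wav inputs bypass the converter (case handling is an assumption)
    template = settings.get("audio_converter.command")
    if template is None:
        return None
    # %1 is the input file, %2 is the output file.
    return template.replace("%1", input_path).replace("%2", output_path)
```

For example, with audio_converter.enabled = true and the ffmpeg command shown above, an input named call.mp3 yields a full ffmpeg command line, while call.wav yields None and therefore needs manual pre-processing.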
It depends on the technology.
Phonexia Language Identification (LID) is pre-trained for 60+ languages.
Phonexia Keyword Spotting (KWS) and Phonexia Speech Transcription (STT) are available for 20+ languages, including English, French, German, Russian, Spanish and many more.
A: Please see List of supported LID Languages. For more details, see LID technology documentation.
A: Please check the SPE subdirectory ./settings for configuration files.
The relevant directive is:
server.enable_authentication_token = false
The phxspe.properties file is generated by the phxadmin utility or can be created from the ./data/phxspe.properties.default template file.
Locate the server.enable_authentication_token directive and set it as needed.
Basic installation steps are described in the ./doc/INSTALL.html document.
A: Much like a human listener, the ASR (STT) engine adapts to the acoustic channel, the environment and the speaker. The engine also learns more about the content over time, and this information is used to improve recognition. The dictate engine, also known as on-the-fly transcription, does not look into the future and has information about just a few seconds of speech at the beginning of a recording. Because the output is requested immediately while the audio is being processed, the engine cannot predict what will come in the next seconds of speech.
When the whole recording is available, as in off-line transcription, the speech engine can correct the result before it is printed out by also taking the subsequent segments into account. The beginning of the recording can then be recognized with high accuracy too.
A: Signal-to-Noise Ratio (SNR) is an important metric of whether a recording is worth further processing by other speech technologies, so it is part of our Speech Quality Estimation. However, calculating SNR automatically is not a trivial task.
We use the fact that the amplitude values of a speech waveform approximately follow a Gamma distribution, whereas noise follows a Gaussian distribution. So we can estimate the SNR by looking at the amplitude distribution in individual frames.
This approach to SNR estimation is based on the article by Chanwoo Kim and Richard M. Stern, “Robust Signal-to-Noise Ratio Estimation Based on Waveform Amplitude Distribution Analysis”, Interspeech 2008.
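To illustrate why the shape of the distribution carries information, the sketch below compares one simple shape statistic, ln(mean|x|) minus mean(ln|x|), on synthetic Gaussian and Gamma-distributed samples. This is not Phonexia's implementation and it does not compute an SNR. The Gamma shape value 0.4, the sample count and the synthetic data are assumptions chosen only for illustration.

```python
import math
import random


def log_mean_minus_mean_log(samples):
    """Return ln(mean|x|) - mean(ln|x|) over the nonzero samples."""
    magnitudes = [abs(x) for x in samples if x != 0.0]
    mean_abs = sum(magnitudes) / len(magnitudes)
    mean_log = sum(math.log(m) for m in magnitudes) / len(magnitudes)
    return math.log(mean_abs) - mean_log


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Synthetic stand-ins, not real audio.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]          # Gaussian "noise"
speech = [rng.gammavariate(0.4, 1.0) for _ in range(n)]  # Gamma "speech" magnitudes (shape 0.4 assumed)

g_noise = log_mean_minus_mean_log(noise)
g_speech = log_mean_minus_mean_log(speech)
print(f"Gaussian samples: {g_noise:.2f}")
print(f"Gamma samples:    {g_speech:.2f}")
```

The statistic does not depend on the overall signal level, only on the shape of the distribution, and it comes out clearly larger for the Gamma samples than for the Gaussian ones. A mixture of speech and noise would be expected to fall in between, which is the kind of relationship an estimator can exploit.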
A: Please see List of supported KWS Languages. For more details, see KWS technology documentation.
A: Please see List of supported STT Languages. For more details, see STT technology documentation.
A: Yes, you can use Language Model Customization (LMC). For more details please read STT Language Model Customization tutorial.
A: There are multiple methods to train a new language. Please see the article in Components > Speech Technologies > LID.
A: Yes. Documentation is here: https://download.phonexia.com/docs/spe.
A: Yes, the system comes as an API (for the production license).
A: You can find it by running ffprobe <file_name>*, which prints information about the file.
*The “ffprobe” utility is not included in our package(s). It is part of ffmpeg (https://ffmpeg.org/ffprobe.html) and must be installed separately.
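If the file information is needed in a script, ffprobe can print it as JSON. The following is a minimal Python sketch, assuming ffprobe is installed separately and available on the PATH. The function names are illustrative, and the three fields picked out are just an example selection.

```python
import json
import subprocess


def build_ffprobe_command(file_name):
    """Build an ffprobe call that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-show_format", "-show_streams",
        "-of", "json",
        file_name,
    ]


def summarize_streams(probe_output):
    """Pick codec, sample rate and channel count of each audio stream."""
    info = json.loads(probe_output)
    return [
        {
            "codec": stream.get("codec_name"),
            "sample_rate": stream.get("sample_rate"),
            "channels": stream.get("channels"),
        }
        for stream in info.get("streams", [])
        if stream.get("codec_type") == "audio"
    ]


def probe(file_name):
    """Run ffprobe on the file and return the audio stream summary."""
    result = subprocess.run(
        build_ffprobe_command(file_name),
        capture_output=True, text=True, check=True,
    )
    return summarize_streams(result.stdout)
```

The summary can then be compared with the natively supported formats to decide whether a recording needs conversion first.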
A: The language-prints do not depend on the current language pack used. You may use them for both training a new language pack and testing/comparing against an existing language pack.
The language-prints need to be compatible only with the model of LID used for language-print extraction.
A: The following is recommended:
For adding new language to language pack
For adapting the existing language model (discriminative training)
Username: admin
Password: phonexia
To evaluate Phonexia Speaker Identification technology in a real-life scenario, the system needs to be calibrated using a SID dataset.
SID dataset (minimum requirements):
To measure SID performance precisely, it’s important to prepare evaluation recordings set very carefully.
The requirements are:
*Note: splitting a single recording into multiple shorter recordings in order to meet the criterion of at least 3 recordings for each speaker is not the right way to proceed. This way you are not adding any details. You are essentially analyzing details of a single recording five times.
In contrast, by using 5 unique recordings coming from different audio environments or even different times of the day, additional details can be analyzed leading to better results.
Warning: Any human error in evaluation set preparation (in speaker uniqueness, placing recordings into wrong folder, etc.) affects the evaluation results, so it’s very important to prepare the data carefully.
See SID Evaluation for more details
A: You can get the list of running/configured technologies by running the query GET /technologies or by using the phxadmin utility with the parameter configure-tech.
A:
If the format is not defined (or the HTTP “Accept” header has one of these values: application/*, */*, *), the server returns JSON.
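A client can also ask for JSON explicitly through the Accept header. The sketch below only builds such a request with Python's standard library. The base URL is a placeholder, the helper name is hypothetical, and a real server may additionally require authentication, which is not shown.

```python
import urllib.request


def build_request(base_url, path, accept="application/json"):
    """Build a GET request that states the desired response format."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")


# Placeholder host; replace it with the address of your own SPE server.
request = build_request("http://spe.example.com", "/technologies")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would send it. Leaving the header out, or sending */*, should give JSON as well according to the rule above.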
A:
Windows:
PhxBrowser.exe /spe-debug /spe-output
Linux:
./PhxBrowser --spe-debug --spe-output
A: The score threshold isn’t set up correctly. Adjust the speaker score sharpness value to calibrate the recalculation.
Please see Calibration in technology documentation.
A: These abbreviations mean the following:
I always get the same error messages:
A: This error may happen if the initialization of the SPE engine takes too long. Phonexia Browser treats this as an initialization failure and kills the server.
You can fix this by doing the following:
A: Please proceed by doing the following:
A: Check whether the HW configuration or the operating system of the machine has changed.
Please ask your Phonexia contact if the issue still occurs.
When running SPE, the following error occurs:
[Error] ApplicationStartup: Unhandled exception: BsapiException: SWaveformSegmenterI(/mnt/phxspe/home/phx/storage/dfs/a1cabcf7-c761-49f1-a9bc-0a8209a09fd9.opus Requested segment (78056, 102056) is out of waveform range (0,91840).
A: It means that this opus file was created improperly and its header declares much more audio than the file actually contains.
Please check that your audio source/originator works properly.
Alternatively, use the ffmpeg or sox utility as a preprocessor and normalize the audio by converting it from opus to opus before the recordings are processed by SPE.
A:
If the server responds to a request with status 200 (OK), the body of the response contains the result (the server already has the result in its cache memory and there is no need to process the file again).
If the server responds to a request with status 202 (Accepted), the server creates a task and begins to process the file. The “Location” header of the response contains the path of the pending resource, and the body contains the ID of the pending operation.
Example of HTTP header:
GET /pending/ec563083-3d9b-457d-a0ac-24b197bc222f HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Version: 13
X-SessionID: 258f505c-a6fa-4c3f-8a87-b048874ac6aa
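On the client side, the two status codes described above can be told apart with a small helper. This is a minimal Python sketch with hypothetical names. It only classifies a response that has already been received, and it makes no assumption about how the pending resource behaves afterwards.

```python
def handle_response(status, headers, body):
    """Classify a server reply: 200 carries the result, 202 points to a pending task."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 200:
        return {"state": "done", "result": body}
    if status == 202:
        return {
            "state": "pending",
            "pending_path": lowered.get("location"),  # path of the pending resource
            "body": body,                             # contains the pending operation ID
        }
    raise RuntimeError(f"unexpected HTTP status {status}")
```

A caller would return the result immediately in the "done" case and keep the pending path for following the task in the "pending" case.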
A: We don’t provide USB without memory storage. Possible solutions are:
Phonexia Speech Engine with its technologies is distributed as a REST API.
For evaluation and testing purposes, a graphical user interface (GUI) called Phonexia Browser is provided.
Upon request, the technologies can also be provided in the form of a command-line interface (CLI).
REST API documentation: https://download.phonexia.com/docs/spe/
A: Please attach the licensing file (license.dat) to the support ticket at our Service Desk.
A: Please check that the licensing file (license.dat) is stored next to the VoiceInspector.exe application.
If license.dat is already in the same directory as VoiceInspector.exe and you are still receiving this error, please contact technical support at our Service Desk.
A: Check your license file (license.dat) by opening it in Notepad.
Make sure the license contains records for all required modules.
See Licensing article for additional information
A: The following options are supported:
You can set this in ./settings/phxspe.properties
Currently I’m trying to install the provided binaries for Linux, but I get the following when running phxadmin:
./phxadmin: error while loading shared libraries: libasound.so.2: cannot open shared object file: No such file or directory
I’m trying to run this under CentOS 7.
A: Please install the libraries required for working with audio files from the official repository of your OS.
For example for CentOS you may use:
sudo yum install alsa-utils alsa-lib
Hint: A useful tool for finding the required Red Hat/Fedora/CentOS libraries is https://www.rpmfind.net/linux/RPM/index.html
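Before starting phxadmin, you can check from Python whether the system can locate a shared library such as libasound. This is a minimal sketch using the standard library's ctypes.util.find_library. The helper name is hypothetical, and the lookup depends on the system tools available (for example ldconfig), so a missing result is a hint rather than proof.

```python
import ctypes.util


def missing_libraries(names, finder=ctypes.util.find_library):
    """Return the library names that the system lookup cannot locate."""
    return [name for name in names if finder(name) is None]


# "asound" corresponds to libasound.so.2 from the error message above.
missing = missing_libraries(["asound"])
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All requested libraries were found.")
```

If "asound" is reported as not found, installing the ALSA packages as shown above should resolve the loading error.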