
Search: cti audio stream

77 results

Releases and Changelogs (SPE)

…Added utterance_length to SID/SID4 voiceprint results New: Added /output_stream and audio file player (/utils/player/output_stream) endpoints New: Added 5th generation of AR_XL (Arabic Levantine) (Beta version) of STT, KWS and PHNREC (combines both North- and South Levantine, hence the custom code AR_XL) Changed: Endpoints, results and properties using the term ‘stream’ now use ‘input_stream’; check the SPE REST API documentation…

Understand SPE configuration file

…server.bsapi_comparator_fa_cache_size Runtime server.enable_authentication_token server.enable_resource_locker server.upload_max_filesize server.max_metadata_size server.tcp.queue server.tcp.threads server.cors_enable Tasks server.n_workers, server.n_realtime_workers server.n_task_limit server.task_priorities_enable server.task_default_priority server.finished_task_timeout Streams stream.http.enable input_stream.http.timeout stream.websocket.enable input_stream.websocket.max_payload_size stream.rtp.enable stream.rtp.bind_ip stream.rtp.min_port, stream.rtp.max_port input_stream.rtp.stream_limit input_stream.rtp.timeout output_stream.rtp.timeout Audio formats server.audio_formats.opus.enabled server.audio_formats.flac.enabled audio_converter.enabled audio_converter.command Reporting reporting.urls reporting.ssl.enabled reporting.ssl.ca_file reporting.ssl.certificate_file reporting.ssl.private_key_file reporting.ssl.private_key_password reporting.ssl.cipher_list External external.technologies.tts_connectors Generic settings server.bind_ip, server.port # IP address and port for server listening server.bind_ip = 0.0.0.0 server.port…

Understand SPE technologies configuration file

SQE_STREAM Speech Quality Estimation Stream STT Speech To Text STT_STREAM Speech To Text Stream TAE Time Analysis Extraction TAE_STREAM Time Analysis Extraction Stream VAD Voice Activity Detection VAD_STREAM Voice Activity Detection Stream SIDC Speaker Identification Voiceprint Comparator (legacy) SIDC_STREAM Speaker Identification Voiceprint Stream Comparator (legacy) SIDCALIBSET Speaker Identification VoicePrint Calibration (legacy) SIDCALIBSET_STREAM Speaker Identification VoicePrint Stream Calibration (legacy) SIDE Speaker…

Understand SPE configuration

…timeout for HTTP stream in seconds. # If the stream doesn’t receive any data for the given time, the stream is closed. stream.http.timeout = 30 # Enable RTP stream subsystem stream.rtp.enable = true # IP address for creating RTP sessions stream.rtp.bind_ip = 0.0.0.0 # Sets starting port for creating RTP sessions stream.rtp.min_port = 10000 stream.rtp.max_port = 11000 # Number of max opened…
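The settings in this excerpt follow a simple `key = value` layout with `#` comments. As a rough illustration, a minimal Python sketch can read such lines and sanity-check the RTP port range. The helper names `parse_settings` and `rtp_port_range` are hypothetical and are not part of SPE; the sample text only reuses the keys and values quoted above.

```python
# Sample text reusing the keys and values quoted in the excerpt above.
SAMPLE = """\
# Enable RTP stream subsystem
stream.rtp.enable = true
# IP address for creating RTP sessions
stream.rtp.bind_ip = 0.0.0.0
stream.rtp.min_port = 10000
stream.rtp.max_port = 11000
"""


def parse_settings(text: str) -> dict[str, str]:
    """Hypothetical helper: collect 'key = value' pairs, skipping comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, separator, value = line.partition("=")
        if separator:  # ignore lines without '='
            settings[key.strip()] = value.strip()
    return settings


def rtp_port_range(settings: dict[str, str]) -> tuple[int, int]:
    """Hypothetical helper: return (min_port, max_port) after a basic check."""
    low = int(settings["stream.rtp.min_port"])
    high = int(settings["stream.rtp.max_port"])
    if not (0 < low <= high <= 65535):
        raise ValueError(f"invalid RTP port range: {low}-{high}")
    return low, high


if __name__ == "__main__":
    print(rtp_port_range(parse_settings(SAMPLE)))
```

A check like this only verifies that the range is well formed; how SPE itself interprets the range is described in the linked article.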

SPE and Browser installation: standalone SPE

…Quality Estimation Stream [disabled] 17) Speech To Text [disabled] 18) Speech To Text Input Stream [disabled] 19) Time Analysis [disabled] 20) Time Analysis Stream [disabled] 21) Voice Activity Detection [disabled] 22) Voice Activity Detector Stream Technology [disabled] 23) Enable all 24) Disable all 0) Quit Choose technology to configure [0]:23 Select the option to Enable all technologies (usually the option…

Understand SPE executable files

…SID4C (SID4 extractor and SID4 comparator) with both L4 and XL4 models, depending on actual availability of the technologies/models in that SPE installation. Due to the “…single character” pattern definition, the list won’t include SID4E_STREAM, SID4C_STREAM and SID4CALIB technologies. phxadmin2: example 3 ./phxadmin2 technology enable sid?_stream:*l?=3 sid4?_stream:*l?=1 enable 3 instances of technologies with names matching “sid followed by single character,…
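The effect of the single-character `?` wildcard can be illustrated with Python's standard `fnmatch` module. This is only a sketch: it assumes phxadmin2 patterns behave like shell-style globs and are case-insensitive, which the excerpt suggests but does not confirm, and the helper `matching_names` is hypothetical.

```python
from fnmatch import fnmatchcase

# Technology names taken from the excerpts on this page.
TECHNOLOGIES = [
    "SIDC_STREAM",
    "SIDCALIBSET_STREAM",
    "SID4E_STREAM",
    "SID4C_STREAM",
    "SID4CALIB",
]


def matching_names(pattern: str, names: list[str]) -> list[str]:
    """Hypothetical helper: glob-match names, ignoring case (an assumption)."""
    return [name for name in names if fnmatchcase(name.lower(), pattern.lower())]


if __name__ == "__main__":
    # '?' stands for exactly one character, so 'sid?_stream' cannot match
    # 'sid4e_stream', which has two characters between 'sid' and '_stream'.
    print(matching_names("sid?_stream", TECHNOLOGIES))
    print(matching_names("sid4?_stream", TECHNOLOGIES))
    # The model part '*l?' would match both 'L4' and 'XL4' under the same rules.
    print(matching_names("*l?", ["L4", "XL4"]))
```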

Release Notes

…which can be edited by users. Speech Engine: Speaker Identification (SID4) New “floating window” feature for realtime stream processing (since 3.60.0) This new floating_window parameter makes it possible to identify a speaker or extract a voiceprint from only the last X seconds (default 5) of speech in the realtime stream… as opposed to using speech from the entire stream audio when the parameter is not used. Speech Engine:…
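The idea of keeping only the last X seconds can be sketched with a bounded buffer. This is a conceptual illustration, not SPE's implementation: the class `FloatingWindow`, the 8000 Hz sample rate and the treatment of every sample as speech are all assumptions made for the example.

```python
from collections import deque
from collections.abc import Iterable


class FloatingWindow:
    """Hypothetical buffer that retains only the most recent samples."""

    def __init__(self, seconds: float = 5.0, sample_rate: int = 8000) -> None:
        # A deque with maxlen silently discards the oldest items when full.
        self._samples: deque[int] = deque(maxlen=int(seconds * sample_rate))

    def push(self, chunk: Iterable[int]) -> None:
        """Append newly received samples to the window."""
        self._samples.extend(chunk)

    def snapshot(self) -> list[int]:
        """Return the samples currently inside the window, oldest first."""
        return list(self._samples)


if __name__ == "__main__":
    window = FloatingWindow(seconds=1.0, sample_rate=4)  # tiny window for demo
    window.push(range(10))
    print(window.snapshot())  # only the 4 most recent samples remain
```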

Understand SPE audio converter

…tool, you can upload essentially any audio or even video file to SPE and it will be automatically converted to an audio format supported natively by SPE. ⓘ NOTE: The automatic conversion is done only when uploading audio files to SPE; it is not done when registering files! For more info about uploading/registering audio files, see the Understanding SPE home directory article. Converter installation As a…

FAQs (PSP)

…Browser, FAQ Speech Platform Permalink Q: How to fix Error 1007: Unsupported audio format? Phonexia Browser application may return error “1007: Unsupported audio format” while uploading an audio file. Please check whether your audio files are in one of the formats listed under Q: What are the supported audio formats?. If you need to use audio recordings in other formats as input, you can configure SPE…

Phonexia Speech Engine

audio manipulation SPE has built-in basic audio file manipulation functionality, such as separating individual channels from stereo recordings, cutting one recording into several files, saving audio from an incoming stream to a file, and others. Stream audio player To support voicebot scenarios, SPE can play audio files directly to an output RTP stream. External Text-to-speech (TTS) integration Easy integration with external TTS…

Understand SPE workers configuration

…CPU cores in the server. Example: Czech STT on stream is approx. 4 times faster than realtime, i.e. 1 CPU core can process 4 realtime streams simultaneously. So a server with 8 CPU cores running only STT stream can be configured as follows: keep 1 core dedicated to the operating system and SPE; the remaining 7 cores can handle 28 realtime workers…
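The arithmetic in this example can be written out as a short Python sketch. The function `realtime_worker_count` is a hypothetical helper for illustration, not an SPE tool; the speed factor of 4 and the single reserved core are the figures quoted above and will differ for other technologies and hardware.

```python
import math


def realtime_worker_count(total_cores: int, reserved_cores: int,
                          realtime_factor: float) -> int:
    """Hypothetical helper: estimate how many realtime streams fit on a server.

    realtime_factor is how many times faster than realtime one core
    processes a stream (approx. 4 for the Czech STT example above).
    """
    usable_cores = max(total_cores - reserved_cores, 0)
    # Round down so the estimate never exceeds the available capacity.
    return math.floor(usable_cores * realtime_factor)


if __name__ == "__main__":
    # 8 cores, 1 kept for the operating system and SPE, factor 4: (8 - 1) * 4
    print(realtime_worker_count(total_cores=8, reserved_cores=1,
                                realtime_factor=4))
```

The result is an upper-bound estimate of the kind that might inform a setting such as server.n_realtime_workers; actual limits depend on the technologies enabled and should be confirmed by measurement.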

Input audio quality

audio codec, heavy compression, too low bitrate, etc. can damage or even completely destroy essential parts of the audio signal required by speech technologies. Commonly used audio compression methods exploit the perceptual limitations of human hearing and can remove frequencies that are masked by other frequencies, etc. Therefore, to get satisfactory results from speech technologies, use an appropriate audio format. ⓘ…

LID: Terminology and adaptation

…20 hours of audio is required, see requirements below Enhancing existing language model by adding your own audio files to existing built-in language at least 5 hours of audio is required, see requirements below Creating custom language pack consisting of your chosen set of languages, both pre-trained or created from your audio files Audio recordings requirements Format: WAV, FLAC, RAW…

Understand SPE database

…unregistering files after processing (if using the file registering technique instead of uploading the audio files – see the Understanding SPE home directory article). This causes the file information AND the cached processing results to be kept in the database. Or, you may be saving stream data to file, but not deleting the created stream audio files using the REST API…

FAQs (Browser)

audio format” while uploading an audio file. Please check whether your audio files are in one of the formats listed under Q: What are the supported audio formats?. If you need to use audio recordings in other formats as input, you can configure SPE for automated audio conversion. As a prerequisite, install an external tool for audio conversion. The recommended tool is the ffmpeg utility, which is powerful and well documented. Please find…