SPE3 – Releases and Changelogs

Speech Engine (SPE) is developed as a RESTful API on top of Phonexia BSAPI.
SPE was formerly known as BSAPI-rest (up to v2.x) or as Phonexia Server (up to v3.2.x).

This page lists changes in SPE releases.

Releases

 

Version  Release Date  End of Support  Maintained Until  Release type
3.30     2020-03-27    2022-09-27      3.35              Public
3.26     2020-03-02    2022-09-02      3.30              Feature
3.25     2020-01-31    2022-07-31      3.26              Feature
3.24     2019-12-18    2022-06-18      3.25              Feature
3.23     2019-11-01    2022-05-01      3.24              Feature
3.18     2019-10-01    2022-04-01      3.19              Public
3.17     2019-06-28    2021-12-28      3.18              Public
3.16     2019-04-26    2021-10-26      3.17              Public
3.15     2019-02-28    2021-08-28      3.16              Public
3.14     2018-12-21    2020-06-21      3.15              Public
3.13     2018-11-19    2020-05-19      3.14              Public
3.12     2018-08-17    2020-02-17      3.13              Public
3.11     2018-03-15    2019-09-15      3.12              Public
3.10     2017-12-06    2019-06-06      3.11              Public
3.9      2017-09-08    2019-03-08      3.10              Public
3.8      2017-06-26    2018-12-26      3.9               Public
3.7      2017-03-27    2018-09-27      3.8               Public
3.6      2016-12-14    2018-06-14      3.7               Public
3.5      2016-10-04    2018-04-04      3.6               Public
3.4      2016-09-19    2018-03-19      3.5               Public
3.3      2016-07-11    2018-02-11      3.4               Public
3.2      2016-04-22    2017-10-22      3.3               Public
3.1      2016-02-15    2017-08-15      3.2               Public
3.0      2016-02-09    2017-08-09      3.1               Public
2.1      2015-09-16    2017-09-16      2017-09-16        Public
2.0      2015-01-06    2016-07-06      2.1               Public

 

Changelogs

Speech Engine 3.30.0 (03/25/2020) – DB v1400, BSAPI 3.30.0
Public release

  • New: Added 5th generation of FR_FR of STT, KWS and PHNREC
  • New: Updated and significantly improved phonemes document for STT and KWS (see doc directory)
  • New: Added n-best output to all 5th generation STT stream results
  • New: Added support for native numbers and dates notation in n-best output in 5th generation CS_CZ and SK_SK STT (in both file- and stream processing)
  • New: Each request in the SPE log now gets a unique ID, allowing better request tracing. The HTTP status and REST error code are also logged in case of an error
  • New: Updated STT model RU_RU_A to version 4.3.0
  • Changed: All utterance_length parameters (introduced in 3.24) renamed to speech_length in endpoints returning voiceprint
  • Changed: Parameters languageCode and languageCodes (introduced in 3.25) renamed to language_code and language_codes in TTS endpoints
  • Changed: Parameter target (introduced in 3.25) in POST /external/technologies/tts query renamed to path
  • Improved: Better error message on upload/registering of new file when file cannot be opened
  • Fixed: Processing long files results in premature end without error message
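
For client code written against 3.25, the TTS parameter renames listed above amount to a simple key mapping. The following is a minimal sketch in Python, assuming the client keeps its TTS query parameters in a plain dict. The helper name migrate_tts_query and the example values are hypothetical and not part of SPE.

```python
# Old -> new TTS query parameter names, taken from the 3.30.0 entries above.
TTS_QUERY_RENAMES = {
    "languageCode": "language_code",
    "languageCodes": "language_codes",
    "target": "path",
}


def migrate_tts_query(params):
    """Return a copy of a 3.25-style TTS query dict using the 3.30 names.

    Hypothetical client-side helper; keys that were not renamed pass through.
    """
    return {TTS_QUERY_RENAMES.get(name, name): value for name, value in params.items()}


# Placeholder values, for illustration only.
old_query = {"languageCode": "en-US", "target": "/tts/hello.wav"}
print(migrate_tts_query(old_query))
```

The mapping is deliberately limited to TTS queries, because the changelog renames target only in POST /external/technologies/tts.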

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively
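
When transcripts from older models need to be compared with 5th generation output, the token changes above can be applied as a text substitution. This is a sketch only, assuming the transcript is available as a flat string in which the tokens appear literally. The function name is hypothetical.

```python
import re

# Legacy token -> 5th generation token, as listed in the note above.
TOKEN_MAP = {
    "_DELETE_": "<null/>",
    "_SILENCE_": "<silence/>",
    "<sil/>": "<silence/>",
    "<s>": "<segment>",
    "</s>": "</segment>",
}

# One alternation over all legacy tokens, so the text is scanned only once.
_LEGACY_TOKEN = re.compile("|".join(re.escape(token) for token in TOKEN_MAP))


def to_5th_generation_tokens(text):
    """Replace legacy STT tokens with their 5th generation equivalents."""
    return _LEGACY_TOKEN.sub(lambda match: TOKEN_MAP[match.group(0)], text)


print(to_5th_generation_tokens("<s> hello _SILENCE_ world <sil/> </s>"))
```

Text that already uses the new tokens is left unchanged, so the function can be applied to mixed input.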

Speech Engine 3.26.0 (02/28/2020) – DB v1400, BSAPI 3.26.0
Non-public Feature Preview release
  • New: Added new SID4 XL4 model

Speech Engine 3.25.1 (02/07/2020) – DB v1400, BSAPI 3.25.0
Non-public Feature Preview release
  • New: Improved handling of “Accept” HTTP header for better CORS support
  • Fixed: TTS saves raw file and returns internal server error
  • Fixed: TTS connector gets stuck when recoding takes a long time

Speech Engine 3.25.0 (01/30/2020) – DB v1400, BSAPI 3.25.0
Non-public Feature Preview release
  • New: Added input stream statistics to result of DELETE /input_stream/rtp call
  • New: Added support for CORS (can be enabled by server.cors_enable property)
  • New: Added Acapela TTS integration, see External Text To Speech
    • currently supported only in Linux SPE builds!

Speech Engine 3.24.0 (12/10/2019) – DB v1400, BSAPI 3.24.0
Non-public Feature Preview release
  • New: Significantly improved 5th generation STT stream performance
    • Added neural network based voice activity detection – improves the end-of-utterance detection
    • Decoder is now restarted after each segment – i.e. “word corrections” never go beyond a segment boundary
    • Added per-segment confidence, computed as an average of all word confidences in a sentence – helps in judging the credibility of the results
    • Reduced delay of obtaining results in output – allows for faster detection of barge-in, e.g. in voicebot application
  • New: All 5th generation STT models now use Minimum Bayes-Risk Decoding for Confusion Network construction
    • Confusion Network results now contain precise start- and end times for each individual alternative word
  • New: KWS confidence value calculation can be modified using confidence_shift and confidence_sharpness values (see KWS results explained article for more details)
  • New: Added utterance_length to SID/SID4 voiceprint results
  • New: Added /output_stream and audio file player (/utils/player/output_stream) endpoints
  • New: Added 5th generation of AR_XL (Beta version) of STT, KWS and PHNREC
    (combines both North- and South Levantine, hence the custom code AR_XL)
  • Changed: Endpoints, results and properties using the term ‘stream’ now use ‘input_stream’
  • Changed: Technology models named DEFAULT are renamed to GENERIC
    • stop SPE and then run phxadmin --configure-tech to automatically update affected technologies configuration
    • modify accordingly SPE REST API calls in your application, if applicable
  • Fixed: STT doesn’t work with models customized using LMC
  • Fixed: Incorrect end times for <segment/> token in STT results
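
The per-segment confidence introduced in this release is described as an average of the word confidences. A minimal sketch of that calculation follows, assuming a plain arithmetic mean and a list of per-word confidence values as input. The function name and input shape are illustrative and do not reflect the actual result structure.

```python
def segment_confidence(word_confidences):
    """Arithmetic mean of the word confidences in one segment.

    Returns None for a segment without words, where no average is defined.
    """
    if not word_confidences:
        return None
    return sum(word_confidences) / len(word_confidences)


print(segment_confidence([0.5, 1.0]))
```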

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.23.0 (11/01/2019) – DB v1300, BSAPI 3.23.0
Non-public Feature Preview release
  • Changed version to 3.23.0 to synchronize with BSAPI
  • Fixed: SPE sends IP address in Host: HTTP header instead of hostname
  • Fixed: SPE sometimes outputs “[ERRFMT]” string to log messages instead of actual value

Speech Engine 3.18.3 (12/09/2019) – DB v1300, BSAPI 3.22.2
Public release

  • Fixed: STT on stream may cause assert violation when waiting for stream timeout on no input data
  • Fixed: SPE sends IP address in Host: HTTP header instead of hostname
  • Fixed: SPE sometimes outputs “[ERRFMT]” string to log messages instead of actual value

Speech Engine 3.18.2 (10/14/2019) – DB v1300, BSAPI 3.22.1
Public release

  • Fixed: Customized STT model fails on Windows with the error message “Request for next state but ending state reached.”

Speech Engine 3.18.1 (10/01/2019) – DB v1300, BSAPI 3.22.0
Public release

  • New: DICTATE technology has been renamed to STT_STREAM (/technologies/dictate -> /technologies/stt/stream)
    (for backward compatibility, the /technologies/dictate endpoint is internally redirected)
  • New: SID/SID4 stream now allows gradually getting voiceprint from the stream (see /technologies/speakerid4/stream/voiceprint)
  • New: Unicode characters in file names are now supported on Windows platform
  • New: Added LLR score to GID result (as score_llr value, see /technologies/genderid)
  • New: Added ‘per_channel’ parameter to Diarization for processing multi-channel recordings
  • New: Added configuration option to not start SPE if some technology doesn’t start (server.require_all_configured_technologies)
  • Fixed: Random SIGSEGV crashes in CS_CZ_5 STT
  • Fixed: KWS CS_CZ_5 ignores keyword thresholds
  • Fixed: Duplicated output from KWS
  • Fixed: KWS online configurations for models CS_CZ_5 and NL_NL_5
  • Fixed: phxadmin increases number of instances in configuration instead of setting it
  • Fixed: phxclient is streaming slower than expected
  • Fixed: Redefinition of block in used configuration causes segmentation faults
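
Although /technologies/dictate is still redirected, clients may prefer to switch to the new path. The following sketch rewrites the old path prefix on the client side. It assumes that anything after the prefix carries over unchanged, which the changelog does not state explicitly. The function name is hypothetical.

```python
OLD_PREFIX = "/technologies/dictate"
NEW_PREFIX = "/technologies/stt/stream"


def migrate_dictate_path(path):
    """Rewrite a Dictate path to the STT_STREAM path; other paths are returned as-is."""
    rest = path[len(OLD_PREFIX):]
    # Only rewrite when the prefix ends at a path boundary, not inside a longer name.
    if path.startswith(OLD_PREFIX) and rest[:1] in ("", "/", "?"):
        return NEW_PREFIX + rest
    return path


print(migrate_dictate_path("/technologies/dictate"))
```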

NOTE: Due to the change in GID results content, all GID results will be removed from cache (database) during update!


Speech Engine 3.17.3 (08/22/2019) – DB v1200, BSAPI 3.21.3

  • [G_#191] Fixed: KWS getting phonemes/graphemes in specific circumstances returns unknown error
  • [G_BSAPI#413] Fixed: duplicated output from KWS

Speech Engine 3.17.2 (08/02/2019) – DB v1200, BSAPI 3.21.2

  • [G_BSAPI#300] Fixed: KWS stream results are displayed with a delay

Speech Engine 3.17.1 (07/22/2019) – DB v1200, BSAPI 3.21.1

  • Added 5th generation of ES_ES of STT/Dictate/KWS/PHNREC

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.17.0 (06/27/2019) – DB v1200, BSAPI 3.21.0

  • Added L4 model to GID and AGE technologies, i.e. they now support also SID4 L4 voiceprints
  • [G#183] Added silence detection in Dictate
  • [G#182] Added support for RLS capacities
  • [G#137] Added possibility to specify multiple destinations in server.logging.destination option
  • [G#136] Phonexia Browser configuration files are now included in data collected by the phxadmin --report command
  • [G_BSAPI#401] Fixed inability to define phrases in some KWS 5th generation models (caused by missing sil phoneme)

Speech Engine 3.16.3 (06/06/2019) – DB v1200, BSAPI 3.20.3

  • [G#180] Fixed regression from 3.16.2: SID4 voiceprint comparator produces inconsistent results

Speech Engine 3.16.2 (06/03/2019) – DB v1200, BSAPI 3.20.2

  • [G#178] Added 5th generation of RU_RU and EN_US of STT/Dictate/KWS/PHNREC

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.16.1 (05/17/2019) – DB v1200, BSAPI 3.20.1

  • [G#173] Fixed: Symbols with diacritics in file names (and also in speaker model names, group names, etc.) cause errors when using MySQL
  • [G_BSAPI#397] Fixed: SID4 voiceprint comparator produces inconsistent results

NOTE: Due to an issue in the SID4 comparator, all SID4 results related to Audio Source Profiles will be deleted!

Speech Engine 3.16.0 (04/26/2019) – DB v1101, BSAPI 3.20.0

  • [G#146] Default value of server.n_realtime_workers changed from 0 to 8
  • [G#141] File size limit server.upload_max_filesize is now taken into account also when registering new file
  • [G#156] Added SID4 streams
  • [G#157] Added endpoint for updating existing Audio Source Profile
  • [G#160] SID4 calibration technology renamed: SID4CALIBSET -> SID4CALIB
  • [G#161] Mean normalization support in Audio Source Profiles
  • [G#169] Added cache for Audio Source Profiles, see server.audio_source_profiles_cache_size property
  • [G#170] Added False Acceptance Calibration cache, see server.bsapi_comparator_fa_cache_size
  • [G#149] Fixed: phxclient prints help if running without parameters
  • [G#150] Fixed: UTF-8 symbols are not escaped in phxclient output anymore
  • [G#164] Fixed: names of languages in custom language pack don’t contain \r character anymore
  • [G#166] Fixed: wrong parameter for stopping server in init.d script template

Speech Engine 3.15.6 (03/14/2019) – DB v1101, BSAPI 3.19.2

  • [BSAPI#370] Added SK_SK 5th generation of STT, Dictate, KWS and PHNREC

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.15.5 (03/08/2019) – DB v1101, BSAPI 3.19.1

  • [#147] Fixed SID4 result cache is not invalidated when speaker model is changed
  • [#145] Add ‘prioritize’ role to the default ‘admin’ user

Speech Engine 3.15.4 (02/28/2019) – DB v1100, BSAPI 3.19.0

  • [G#131] Added SID v4 technology
  • [G#133] Resource lock for language pack didn’t work with MySQL database
  • Removed SID L2 model

Speech Engine 3.14.3 (01/29/2019) – DB v1000, BSAPI 3.18.0

  • [#130] Fixed phxadmin exiting with error with some argument combinations

Speech Engine 3.14.2 (12/21/2018) – DB v1000, BSAPI 3.18.0

  • [#125] Speed up phxadmin technology listing
  • [#93] Fixed getting of Dictate’s and KWS’s results may sometimes take a long time
  • [#124] Fixed license error causes all already initialized instances of a technology with the same model to be lost
  • [#116] Fixed command line options with wrong prefix are not ignored anymore
  • [BSAPI#225] Added KWS/STT NL_NL 5th generation
  • [BSAPI#264] Added KWS/STT CS_CZ 5th generation
  • [BSAPI#287] Added PHNREC PL_PL 5th generation
  • [BSAPI#242] Upgraded Time Analysis Extractor Technology (switched to STT 5th gen VAD, set cross talk threshold to 0.5 sec)
  • [BSAPI#291] Fixed PHNREC segmentation goes beyond recording length
  • [BSAPI#292] Fixed WAV with no speech causes error
  • [BSAPI#310] Fixed Spanish and English KWS return incorrect timestamps
  • [BSAPI#284] Fixed pronunciation of keyword may not be generated

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.13.3 (11/28/2018) – DB v1000, BSAPI 3.17.0

  • [G#118] Fixed KWS stream is not reinitialized after usage anymore
  • [G#115] Fixed stream saves data to a file without a name if parameter path is empty

Speech Engine 3.13.2 (11/19/2018) – DB v1000, BSAPI 3.17.0

  • [G#110] Loading of plugins is configurable, disabled by default
  • [G#36] Fixed database query may return old data – only MySQL was affected
  • [G#105] KWS now supports phrases in keyword list
  • [G#109] Added endpoint for self-compare voiceprint set (/technologies/speakerid/comparevpset)
  • [G#57] Support for Phonexia RLS
  • [G#50] Added prioritization of tasks
  • [G_BSAPI#106] Added wfilter_speech_signal_length output item into the SQE output

Speech Engine 3.12.2 (09/25/2018) – DB v900, BSAPI 3.16.1

  • [G#96] Fixed phxclient use websocket instead of polling
  • [G_BSAPI#219] Fixed bug: some corrupted recordings may lead to crash
  • [G_BSAPI#101] Fixed bug: silence and voice may overlap in VAD segmentation

Speech Engine 3.12.1 (08/17/2018) – DB v900, BSAPI 3.16.0

  • [#81] Fixed an apostrophe in a file name may cause server error
  • [#80] Fixed server may bind to an already bound port on Linux
  • [#76] Fixed cached result is sent to webhook target
  • [#70] Added EULA to the production package
  • [#59] Added Denoiser technology
  • [#69] Allow comparing voiceprint with speaker model/group
  • [#41] Fixed /technologies/diarization/split fails if parameter target doesn’t contain wav suffix or if the suffix is missing
  • [#67] GID and AGE technologies accept also SID voiceprint as an input
  • [#60] Getting voiceprints for all speaker models for given speaker group
  • [#23] Minimum speech length for extracting SID calibration voiceprint is 60s for newly created calibration sets
  • [#83] Lower case keyword causes error with some models (cs_CZ)
  • [BSAPI] Added a new STT and KWS PL_PL model version 5.0.0 (the first model of 5th generation)
  • [BSAPI] Added more accurate G2P (5th generation only)
  • [BSAPI#72] Fixed phoneme recognizer doesn’t make phonemes for phnrec_ru_ru.bs
  • [BSAPI#99] Fixed phoneme recognizer with configuration phnrec_cs_cz.bs doesn’t transcribe short recordings
  • [BSAPI#82] Fixed missing configuration of phnrec for HR_HR4
  • [BSAPI#78] Fixed STT segmentation – a segment doesn’t break on a long silence, creates false crosstalks
  • [BSAPI#148] Phoneme recognizer – all phonemes have channel 0 in multi channel recording in some models (cs_CZ)

NOTE: STT output format has changed in 5th generation:

  • _DELETE_ token was changed to <null/>
  • _SILENCE_ and <sil/> tokens were changed to <silence/>
  • <s> and </s> tokens were changed to <segment> and </segment> respectively

Speech Engine 3.11.3 (06/19/2018) – BSAPI 3.15.0

  • [G#77] Update from SPE 3.9 deletes all files from SID models and calibration sets when using SQLite database

Speech Engine 3.11.2 (06/06/2018) – BSAPI 3.15.0

  • [G#65] Fixed empty keyword list produced internal server error
  • [G#71] Better recording format detection
  • [G#73] Fixed possible server crash on Windows

Speech Engine 3.11.1 (03/15/2018) – BSAPI 3.15.0

  • [G#43] Fixed SIDCalib and KWS technologies were not reinitialized if error occurs
  • [G#3] Restart MySQL DB transaction when deadlock occurs
  • [G#26] Added webhooks for asynchronous requests
  • [G#46] Changed default log verbosity level to ‘debug’
  • [G#32] Speaker models and groups can now be prepared with calibration
  • [G#21] Dictate now supports incremental mode
  • [G#9] Added resource for compare voiceprint sets
  • [G#42] Optimized SID speed, use DB cache for calibrated voiceprints of speaker models (removed option server.db.sid_model_calib_vp_cache_size)
  • [G#56] Fixed data may leak between one RTP stream to another
  • [G#55] Fixed error when client doesn’t send whole samples to stream
  • [G#63] Phxadmin now checks immediately that user already exists during adding user
  • [G#64] Fixed premature access to the result of VBS stream may lead to error
  • [G#52] Update to BSAPI 3.15.0
  • [G_BSAPI#53] Added support for 64bit float wav format
  • [G_BSAPI#3] Fixed BSAPI may crash when recording’s header is invalid
  • [G_BSAPI#5] Fixed Dictate produces different results on second and next run
  • [G_BSAPI#4] Fixed Dictate CS_CZ last segment of transcription has negative end time
  • [G_BSAPI#68] Fixed Phoneme Recognizer with configuration phnrec_pl_pl.bs not working
  • [G_BSAPI#75] Fixed bug: Dictate EN not working properly with a random input buffer size

Speech Engine 3.10.3 (01/18/2018) – BSAPI 3.14.0

  • [G#22] Fixed audio converter race condition
  • [G#4] Added configuration option “server.db.sid_model_calib_vp_cache_size”
  • [G#27, G#30, G#37, G#40] Documentation and manual update

Speech Engine 3.10.2 (12/06/2017) – BSAPI 3.14.0

  • [#4981] Saving logs to database (MySQL only)
  • [#4999] Added generating of reports (phxadmin with parameter ‘report’)
  • [#5055] Added possibility to prepare only one file in calibration set (see API changes)
  • [#5035] Speed up SID when calibration is used
  • [#5161] Use MariaDB connector instead of MySQL connector
  • [#5178] Updated systemd service template – added dependency on network-online.target
  • [#5070] Added voice-print merge resource (/technologies/speakerid/vpmerge)
  • [#5099] Added resource which returns tasks of all users (/tasks)
  • [#5132] Added version of technology model to resource /technologies
  • [#5134] Added version of BSAPI to resource /server/info
  • [#5135] Added groups which speaker model is member of to resource /technologies/speakerid/speakermodels/{name}
  • [#5133] Login of a user can contain any characters except these: \/:*?"<>|
  • [#5150] Fixed connection to MySQL database may be lost in case of high load
  • [#5191] Fixed SID Stream requires calibration technology even if parameter ‘calibset’ was not specified
  • [#5203] Fixed premature access to the result of SID stream may lead to error
  • [#5192] Update to BSAPI 3.14.0
  • [Redmine #5130] Renamed PL -> PL_PL models for KWS and STT and updated to version 4.0.0
  • [GitLab #17] Updated STT RU_RU_A model to version 4.1.0
  • [GitLab #35] Updated KWS and STT DE_DE models to version 4.0.0
  • [Redmine #4678] Updated STT CS_CZ model to version 4.1.0
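
The login rule from #5133 can be checked on the client side before a user is created. This is a minimal sketch, assuming the quote character in the forbidden list is the plain ASCII double quote. It checks only the character rule stated above. The function name is hypothetical, and SPE may apply further checks that are not listed in this changelog.

```python
# Characters a user login must not contain, per the #5133 entry above.
FORBIDDEN_LOGIN_CHARS = set('\\/:*?"<>|')


def has_only_allowed_login_chars(login):
    """Return True if the login contains none of the forbidden characters."""
    return not (set(login) & FORBIDDEN_LOGIN_CHARS)


print(has_only_allowed_login_chars("john.doe"))
print(has_only_allowed_login_chars("john/doe"))
```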

Speech Engine 3.9.3 (10/23/2017) – BSAPI 3.13.0

  • [#5138] Fixed capital letters in file suffix may cause errors if the file is registered
  • [#5090] Fixed PHNREC may return error for some audio files
  • [#5043] Fixed utils resources allow to create file without suffix. Suffix “.wav” is automatically added if the file has no suffix

Speech Engine 3.9.2 (09/08/2017) – BSAPI 3.13.0

  • [#4899] Fixed possible deadlock in MySQL database when moving files to calibration set
  • [#4946] Fixed time ranges don’t work properly for multichannel recordings and for FLAC and OPUS
  • [#4946] Fixed parameter “from_time” may cause corruption of processing data
  • [#4950] Fixed STT may produce incorrect time stamps in confusion network result for multichannel recordings
  • [#4985] Fixed Removing recording from Speaker model does not invalidate SID result in cache – only on MySQL
  • [#4955] Fixed concurrent access may cause errors on MySQL database
  • [#4993] Fixed typo in VBS resource path “/vbs/watchlists/[name]/verify/stream” (there was “wachlist”)
  • [#5038] Fixed stream returns error when no data was sent
  • [#4910] Fixed extraction of calibration voiceprint takes into account only the last channel in multichannel recording
  • [#4945] Resource “/technologies” doesn’t require authentication anymore
  • [#4952] Added possibility to distinguish BSAPI errors from SPE errors in response header
  • [#4971] phxadmin supports generation of hardware profile (parameter “hwgen”) same as hwgen tool
  • [#4971] phxadmin doesn’t require license anymore
  • [#4974] Added list of result versions (doc/result_versions.txt)
  • [#4983] Added STT_TR model
  • [#4151] Added KWS benchmark
  • [#4862] Added PHNREC benchmark
  • [#4533] Benchmark data are versioned
  • [#4840] Added checking validity of keyword list
  • [#4896] SID calibration sets can now store metafiles
  • [#4909] Added possibility to get calibration voice-print from calibration set
  • [#4986] Update BSAPI to v3.13.0
  • [#4679] Lower STT memory consumption
  • [#4800] Added new STT HR_HR model 4.0.0
  • [#4805] Added new STT AR_KW model 4.0.0 (replacing old AR model)
  • [#4900] Updated STT DE_DE model to version 4.0.0
  • [#4664] Fixed STT may return empty segmentation and crash without error message
  • [#4799] Updated KWS CS_CZ model to version 4.0.0
  • [#4800] Added new KWS HR_HR model 4.0.0
  • [#4987] Added stream KWS NL_NL model
  • [#4940] Fixed configuration file for PHNREC AR contains wrong IID
  • [#4942] Fixed unable to initialize PHNREC ZH
  • [#4970] Fixed PHNREC with model SLOVAK does not work
  • [#4968] Fixed KWS with model SLOVAK returns invalid pronunciation
  • [#4966] Fixed wrong IID in configuration of PHNREC PL
  • [#4571] Updated Dictate CS_CZ model to version 4.0.0
  • [#4965] Fixed SID stream extractor with model L3, XL3 does not work
  • [#4994] Fixed SID stream with model L3 / XL3 throw error after processing of multiple streams

Speech Engine 3.8.3 (06/26/2017) – BSAPI 3.12.0

  • [#4784] Fixed it is possible to create speaker model or calibration set with character that is invalid for file system
  • [#4783] Fixed removing an RTP stream (created with parameter “path”) without sending any data may stop processing of all RTP streams
  • [#4781] Fixed server may get stuck during shutdown
  • [#4778] Fixed unable to initialize MySQL database with init.sql script if database has not set default engine to InnoDB
  • [#4755] Added new technology Phoneme Recognition (PHNREC) – /technologies/phnrec
  • [#4605] Added new command line parameter “version” to phxspe
  • [#4713] Added new RTP payloads 35 (Lin16, 8000Hz, 2ch) and 36 (Lin16, 8000Hz, 1ch)
  • [#4714] Voice-print extractor and comparator now supports calibration
  • [#4742] Checking audio-file format during registration
  • [#4812] Update to BSAPI 3.12.0
  • [#3699] Add missing configuration for stream mode in SID models L3, XL3
  • [#4527] Update voice-print format for SID models L2 and S (added i-vector to VP). It is forward and backward compatible with previous version.
  • [#4568] Added KWS TR_TR and AR_KW models
  • [#4606] Fixed KWS ZH calibration
  • [#4564] Updated KWS PS model v1.2.0
  • [#4720] Updated STT NL_NL model v4.1.0
  • [#4770] Updated STT CS_CZ_FIN model v4.1.0
  • [#4705] Fixed STT doesn’t transcribe file with model SK_TELCO3
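
Together with payloads 0 (PCMU) and 8 (PCMA), which were added in 3.7.2, the RTP payload types mentioned in this changelog can be summarized in a small lookup table. This is a sketch for client-side validation only. The table covers just the types named on this page and may not be the complete set SPE accepts. The 8000 Hz mono values for PCMU and PCMA come from the standard RTP audio profile (RFC 3551), not from this changelog.

```python
# Payload type -> (encoding, sample rate in Hz, channels).
# 0 and 8: standard PCMU/PCMA (RFC 3551); 35 and 36: as listed for 3.8.3 above.
CHANGELOG_RTP_PAYLOADS = {
    0: ("PCMU", 8000, 1),
    8: ("PCMA", 8000, 1),
    35: ("Lin16", 8000, 2),
    36: ("Lin16", 8000, 1),
}


def describe_payload(payload_type):
    """Return a short description of a payload type named in the changelog."""
    try:
        encoding, rate, channels = CHANGELOG_RTP_PAYLOADS[payload_type]
    except KeyError:
        raise ValueError(f"payload type {payload_type} is not listed in the changelog") from None
    return f"{encoding}, {rate} Hz, {channels} ch"


print(describe_payload(35))
```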

Speech Engine 3.7.3 (04/21/2017) – BSAPI 3.11.0

  • [#4661] Remove old models for STT and KWS
  • [#4662] Fixed SPE 3.7.2 contains wrong version of BSAPI that may cause some errors

Speech Engine 3.7.2 (03/27/2017) – BSAPI 3.11.0

  • [#4579] Fixed registering VAD stream returns HTTP code 500 if realtime workers limit exceeded
  • [#2807] RTP streams now support payload 0 (PCMU) and 8 (PCMA)
  • [#4536] Added new configuration option “stream.http.timeout”
  • [#4588] Update BSAPI to 3.11.0
  • [#4529] Added French stream KWS
  • [#4305] Added new model STT DE_DE 3.0.0
  • [#4565] Added nonspeech segment to VAD output
  • [#4531] Fixed STT SK_TELCO returns empty transcription
  • [#4513] Fixed STT FR transcription of second channel was shifted
  • [#4543] Fixed KWS Pashto needs Dutch data
  • [#4378] Fixed STT ES_AMER1 may return empty transcription
  • [#4377] Updated models STT RU_RU, RU_RU_FIN, RU_RU_A to 4.0.0
  • [#4306] Updated models STT CS_CZ, CS_CZ_FIN, CS_CZ_ENERGY, CS_CZ_TELCO, CS_CZ_IT to 4.0.0
  • [#4305] Updated KWS DE_DE model to version 3.0.0
  • [#4377] Updated KWS RU_RU model to version 4.0.0
  • [#4306] Updated KWS CS_CZ model to version 3.0.0

Speech Engine 3.6.5 (03/22/2017) – BSAPI 3.10.2

  • [#4586] All benchmark requests without optional parameter “path” end with an error

Speech Engine 3.6.4 (03/10/2017) – BSAPI 3.10.2

  • [#4516] Processing file with SID with huge calibration set may take a long time

Speech Engine 3.6.3 (02/23/2017) – BSAPI 3.10.2

  • [#4363] Fixed stream may be deleted by garbage collector immediately after creation
  • [#4404] Fixed Utils and Benchmarks may cause resource lock error
  • [#4498] Update BSAPI to 3.10.2
  • [#4322] Fixed Time analysis extractor sometimes crash
  • [#4333, #4347] Fixed STT EN 4.0.0 and NL_NL 4.0.0 return <s>, <sil/> and “silence” segments
  • Fixed stream KWS EN configuration

Speech Engine 3.6.2 (01/05/2017) – BSAPI 3.10.1

  • [#4338] Fixed error handling when using websockets

Speech Engine 3.6.1 (12/14/2016) – BSAPI 3.10.1

  • [#4290] Fixed unable to remove HTTP stream if stream was configured to store data to a file and no data was sent
  • [#4295] Fixed unable to find license file if path contains special characters [Windows]
  • [#4145] Added VAD benchmark
  • [#4146] Added SQE benchmark
  • [#4148] Added keyword threshold to keyword list
  • [#3797] Added stream TAE
  • [#4199] Fixed websocket may not be correctly closed in some cases
  • [#4216] Changed result for SQE (see API documentation)
  • [#4188] CPU information in benchmark results does not contain processor codename anymore (it may be inaccurate)
  • [#4150] Stream technologies VAD and KWS now support incremental mode (query parameter “result_mode” in POST /technologies/*/stream)
  • [#4313] Support for logging in separate thread (configuration parameter “server.logging.enable_async”), disabled by default
  • [#4320] Renamed and updated KWS models: ITALIAN -> IT_IT, DUTCH -> NL_NL
  • [#4320] Added Dictate model CZ_PROMPT
  • [#4320] Added STT models: IT_IT, NL_NL (based on DNN), RU_FIN, CZ_PROMPT
  • [#4320] Updated STT models: AR, CZ, CZ_ENERGY, CZ_FIN, CZ_IT, CZ_TELCO, EN (based on DNN), ZH
  • [#4320] Updated KWS model ZH
  • [#4320] Updated VAD model DEFAULT
  • [#4332] Update BSAPI to 3.10.1
  • [#4319] New default file logging destination (“log” folder) with daily file rotation and purge after 5 days
  • [#4319] VBS plugin now supports log file rotation

Speech Engine 3.5.3 (10/25/2016) – BSAPI 3.9.1

  • Fixed starting several SID tasks at the same time with newly created SID model may cause database inconsistency

Speech Engine 3.5.2 (10/21/2016) – BSAPI 3.9.1

  • Added French STT
  • Fixed “is_last” flag was not properly set in results of stream technologies SID, KWS, VAD
  • Fixed stream VAD used wrong configuration file, which caused the technology not to work
  • Fixed wrong stream VAD result name (SpeakerIdentificationStreamMultiResult -> VoiceActivityDetectionStreamResult)

Speech Engine 3.5.1 (10/06/2016) – BSAPI 3.9.1

  • Update BSAPI to 3.9.1

Speech Engine 3.5.0 (10/04/2016) – BSAPI 3.9.0

  • Added global confidence to one best result in STT
  • Update BSAPI to 3.9.0

Speech Engine 3.4.4 (09/23/2016) – BSAPI 3.8.0

  • Fixed server requires old database schema (v100)
  • Sped up MySQL database requests for file search
  • Added API changes for version 3.4.x to API documentation

Speech Engine 3.4.3 (09/20/2016) – BSAPI 3.8.0

  • Fixed server returns error for KWS phoneme request (/technologies/keywordspotting/phonemes) if only KWS or Stream KWS was running

Speech Engine 3.4.2 (09/19/2016) – BSAPI 3.8.0

  • Added stream VAD (/technologies/vad/stream)
  • Added stream KWS (/technologies/keywordspotting/stream)
  • Added technology benchmarks for AGE, DIAR, GID, LID, SID, STT (/technologies/{TECHNOLOGY}/benchmark)
  • Added request to get voice-print info (/technologies/speakerid/vpinfo)
  • Added usage examples to API documentation
  • Add configuration options for TCP connection settings
  • Added VAD segmentation to Time Analysis technology
  • Support to acquire and compare language-prints
  • LID technology was separated to LIDC (comparator) and LIDE (extractor)
  • Support websockets for pending operations
  • Added server health check request (GET /status)
  • Update BSAPI to 3.8.0
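
The health check request added in this release can be used by monitoring scripts. The following is a minimal sketch that only builds the request. The host and port are placeholders, and the changelog does not describe the response body or whether authentication is required, so the example does not rely on either.

```python
import urllib.request


def build_status_request(base_url):
    """Build (but do not send) a GET request for the /status health check."""
    return urllib.request.Request(base_url.rstrip("/") + "/status", method="GET")


# Hypothetical address; replace with the address of your SPE instance.
request = build_status_request("http://localhost:8600")
print(request.get_method(), request.full_url)

# To actually send it and read the HTTP status code:
#   with urllib.request.urlopen(request, timeout=5) as response:
#       print(response.status)
```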

Speech Engine 3.3.2 (08/23/2016) – BSAPI 3.6.1

  • Added configuration option to disable OPUS and FLAC files in storage

Speech Engine 3.3.1 (08/19/2016) – BSAPI 3.6.1

  • Fixed resource stays locked for some time after task is finished
  • Minor fixes in documentation

Speech Engine 3.3.0 (07/11/2016) – BSAPI 3.6.1

  • Phonexia Server renamed to Speech Engine
  • Fixed some pending operations are not processed until new pending operation is created
  • Fixed early access to stream SID result may cause server crash
  • Fixed check if user is active during authentication process
  • Fixed custom pronunciation in keyword list does not take effect
  • Added parallel starting of technologies (configuration parameter ‘server.technology_multithread_initialization’) – default is disabled
  • Added resource locking (configuration parameter ‘server.enable_resource_locker’) – default is enabled
  • Added request POST /technologies/diarization/split to create multi-channel recording by diarization – each channel corresponds to one speaker
  • Added request GET /technologies/keywordspotting/phonemes to get supported phonemes
  • Added log files rotation (configuration parameters ‘server.logging.file.rotation’ and ‘server.logging.file.purge_count’)
  • Added support for FLAC and OPUS files – it is possible to upload and process these files, but requests which produce new files always produce WAV files
  • Added request GET /admin/roles to list user roles
  • Added VBS (Voice Biometry Server) plugin
  • Added result of GET /server/info contains information about plugins
  • 32-bit architecture (i386) is not supported anymore
  • Updated BSAPI to 3.6.1