Release 1.8; September 11, 2019

  • Feature The KeenASR SDK now provides language support for Spanish.
  • Feature The KIOSRecognizer provides a static teardown method, which allows swapping of recognizers in the app in the event your app needs to support both English and Spanish languages.

  • Enhancement The KeenASR SDK is now built as a dynamic framework. You add it to your app as an Embedded Binary under Project Settings > Target > General settings. The size of the SDK has been reduced from 170MB to 105MB.
  • Enhancement In previous versions of the SDK, setting enableBluetoothA2DPOutput: to ‘True’ would enable the AVAudioSessionCategoryOptionAllowBluetoothA2DP option AND disable the AVAudioSessionCategoryOptionAllowBluetooth (“HFP Bluetooth”) option of the AVAudioSession. Now these two options can co-exist: AVAudioSessionCategoryOptionAllowBluetooth is always set, and AVAudioSessionCategoryOptionAllowBluetoothA2DP is set to the value passed through the KIOSRecognizer enableBluetoothA2DPOutput: method.
  • Enhancement The createJSONMetadata property of the KIOSRecognizer now defaults to ‘False’. If you are using JSON metadata files, make sure you explicitly set this property to ‘True’.
  • Enhancement Repeated calls to activateAudioStack or deactivateAudioStack, made when the audio stack is already in the requested state, are now ignored.
  • Enhancement Uploads to Dashboard through KIOSUploader now use the HTTPS protocol.

  • Bug Fix Fixed a bug that prevented archive builds when bitcode compilation was enabled.
  • Bug Fix Occasional log warnings about JSON file removal during trigger phrase listening no longer appear.
  • Bug Fix Fixed an issue where audio stack deactivation occasionally did not work correctly.
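
The change in Bluetooth option handling described above can be summarized with a small model. This is only an illustrative sketch in Python, not SDK code: the CategoryOption enum and both functions are hypothetical stand-ins for the two AVAudioSession category options, and the pre-1.8 behavior when A2DP output is disabled is an assumption (the notes only describe the case where it is enabled).

```python
from enum import Flag, auto


class CategoryOption(Flag):
    """Hypothetical stand-ins for the two AVAudioSession category options."""
    ALLOW_BLUETOOTH = auto()       # models AVAudioSessionCategoryOptionAllowBluetooth
    ALLOW_BLUETOOTH_A2DP = auto()  # models AVAudioSessionCategoryOptionAllowBluetoothA2DP


def options_before_1_8(a2dp_output):
    """Before 1.8: enabling A2DP output switched the AllowBluetooth option off."""
    if a2dp_output:
        return CategoryOption.ALLOW_BLUETOOTH_A2DP
    # Assumption: with A2DP output disabled, AllowBluetooth stayed enabled.
    return CategoryOption.ALLOW_BLUETOOTH


def options_since_1_8(a2dp_output):
    """Since 1.8: AllowBluetooth is always set; A2DP follows the passed value."""
    options = CategoryOption.ALLOW_BLUETOOTH
    if a2dp_output:
        options |= CategoryOption.ALLOW_BLUETOOTH_A2DP
    return options
```

In this model, options_since_1_8(True) contains both options, whereas options_before_1_8(True) contains only the A2DP option.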

Release 1.71; March 21, 2019

  • Bug Fix Fixed a regression bug introduced in v1.7, where pronunciations of words not present in the lexicon (~200k words) were not automatically created.

Release 1.7; March 8, 2019

  • Feature The SDK now supports always-on listening with a trigger phrase (for example, “Hey computer”). This functionality uses the ASR engine in a way that does not require custom training for the trigger phrase. We are not using a separate small model for the wake-word, which would wake up the larger model when the wake-word is detected. Note that the current trigger phrase implementation does not support 24x7 listening because of battery constraints and thermal concerns. However, a few hours of always-on listening can be safely supported with this approach.

  • Bug Fix The KIOSRecognizer instance was not returning the name of the current decoding graph matching the name passed to the method that creates the decoding graph. For example, if the decoding graph was named ‘words’, the currentDecodingGraphName property would return ‘words-keenB2mQT-nnet3chain-en-us’ as a name, that is, it would append the ASR Bundle name.

  • Enhancement Cleaned up log levels for the KeenASR logger.

  • Enhancement Support for handling acronyms when passing a list of phrases to the methods that build the decoding graph.

Release 1.63; Dec 26, 2018

  • Enhancement Exposing additional methods for audio stack management. Primary use of these methods is for integration with higher-level frameworks, for example, Unity.

Release 1.62; Jun 18, 2018

  • Enhancement Support for new ASR Bundles (keenB2).

Release 1.61; Feb 12, 2018

  • Enhancement Provide longer running time for the trial SDK when running against files.

  • Enhancement Minor updates to how the licensed SDK handles app bundle IDs.

Release 1.6; Jan 23, 2018

  • Enhancement More robust checking for prior decoding graph existence.

  • Bug Fix SDK initialization failed when using the initWithASRBundleAtPath method. This method is meant to be used when you are downloading the ASR Bundle from the internet rather than bundling it with your app.

  • Bug Fix More robust handling of errors due to missing resources on initialization, decoding graph creation, etc.

  • Enhancement Increased maximum duration of files for file-based processing from 100 seconds to 200 seconds.

  • Enhancement Renamed the Bluetooth output method to enableBluetoothA2DPOutput to better reflect the underlying functionality.

Release 1.5; Dec 22, 2017

  • Feature Added functionality to disable the SDK notification handling (for example, for audio interrupts or the app moving to the background or foreground) by configuring the KIOSRecognizer’s handleNotifications property. If this property is set to NO, it is the developer’s responsibility to deactivate and reactivate the audio stack on audio interrupts using the KIOSRecognizer methods activateAudioStack and deactivateAudioStack. For most use cases we recommend keeping this property set to YES. If you would like to run the SDK when the app is in the background mode, setting this property to NO will allow you to do so.

  • Bug Fix When the SDK manages audio interrupts and foreground/background notifications, it performs audio stack activation on the applicationDidBecomeActive notification; this may introduce a slight delay when the app comes to the foreground, but it provides the most reliable way to handle notifications and interrupts. Prior to this fix, the callback was occasionally not received when the app moved to the foreground.

  • Feature Added support in KIOSRecognizer to enable bluetooth output through the enableBluetoothOutput method. Calling this method will restart AVAudioSession with AVAudioSessionCategoryOptionAllowBluetoothA2DP category option set. This option is supported in iOS 10 or later.

  • Enhancement The SDK is now compiled with bitcode compilation enabled.

Release 1.4; Nov 2, 2017

  • Feature You can now use the KIOSUploader class to automatically upload audio recordings and speech recognition metadata to Dashboard, our cloud tool. We envision this tool being handy during the app development phase and in some cases in production too.

Release 1.3; Sep 20, 2017

  • Enhancement We reduced the memory footprint of the recognizer by about 20%. In addition, version 1.3 of the SDK supports quantized ASR Bundles, which take 3-3.5x less disk space. Currently, the 3-3.5x reduction in ASR Bundle size only applies to on-disk storage, that is, it affects your app download size. Similar reductions in memory footprint are planned for future releases.

  • Enhancement When running recognition from the audio file using the startListeningFromAudioFile: method, we now verify that the sampling frequency, bits per sample, and number of channels match expected values (audio files should always be mono with 16 bits per sample, and the sampling frequency should match Fs from the ASR Bundle).

  • Enhancement Introduced a new recognizerState readonly property for the KIOSRecognizer instances. This property replaces the listening property, which was not providing sufficient information. For more information, review the different KIOSRecognizerState values recognizerState can take.

  • Feature You can now direct the KIOSRecognizer to perform echo cancellation if this is supported on the device. This feature removes audio played by the app speaker from the signal captured by the microphone. For more information, refer to the performEchoCancellation and echoCancellationAvailable methods of the KIOSRecognizer. This feature is experimental.

  • Bug Fix Redesigned the audio interruption logic that was causing crashes on older iOS versions and prevented compilation with older versions of Xcode. The current implementation takes complete control of AVAudioSession management and provides two new callback methods through the KIOSRecognizerDelegate protocol: 1) The first one supports cleaning up audio playing resources (such as stopping audio playback and saving app state as necessary) before the app goes to the background. This allows deactivation of AVAudioSession before the app goes to the background. 2) The second method supports setting up the UI after an audio interrupt has finished and the app comes back to the foreground. For more information, refer to unwindAppAudioBeforeAudioInterrupt and recognizerReadyToListenAfterInterrupt. unwindAppAudioBeforeAudioInterrupt is not optional, which means that you will have to update your controllers so that they conform to the KIOSRecognizerDelegate protocol.

  • Bug Fix Fixed a bug that caused the SDK to crash when prepareForListeningWithCustomDecodingGraphWithName or prepareForListeningWithCustomDecodingGraphAtPath: was called while the app was listening.

  • Bug Fix Fixed a bug that occasionally caused the SDK to crash when using the startListeningFromAudioFile: method to perform recognition from audio files.
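
The audio file requirements listed above (mono, 16 bits per sample, sampling frequency matching the ASR Bundle) can also be checked before files are handed to the SDK, for example when preparing an evaluation set. The following is a minimal sketch using Python's standard wave module; it is not part of the SDK, the function names are hypothetical, and the 16000 Hz default is a placeholder for whatever sampling frequency your ASR Bundle expects.

```python
import io
import wave


def check_wav(source, expected_rate=16000):
    """Return a list of format problems; an empty list means the file looks usable.

    `source` is a path or a binary file-like object. `expected_rate` is an
    assumption and should be set to the sampling frequency of your ASR Bundle.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"expected 16 bits per sample, got {8 * wav.getsampwidth()}")
        if wav.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wav.getframerate()} Hz")
    return problems


def make_wav(channels, sample_width, rate, frames=160):
    """Build a small silent in-memory WAV file, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * channels * sample_width * frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav(make_wav(1, 2, 16000)))  # conforming file
    print(check_wav(make_wav(2, 2, 44100)))  # stereo, wrong sampling frequency
```

In practice you would pass the path of each WAV file to check_wav and fix or skip any file for which the returned list is not empty.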

Release 1.2; Jun 20, 2017

  • Enhancement Cleaned up logging and reduced log level (info->debug) for some messages.

  • Bug Fix Fixed a bug with audio interruption handling where recognition was not automatically stopped on audio interrupt or when the app goes to the background.

Release 1.1; Jun 15, 2017

  • Enhancement Further improvements to logging so that the format is consistent across the SDK.

  • Enhancement Introduced another method for initialization, initWithASRBundleAtPath:. If you don’t want to embed the ASR Bundle with your app to reduce its size, you can instead download the ASR Bundle after the app has been installed and use this method to initialize the SDK.

  • Enhancement Improved handling of out-of-vocabulary words, primarily for non-public ASR bundles.

  • Enhancement Minor improvements in CPU utilization. A future release will focus on CPU and memory optimization.

  • Bug Fix Improved handling of audio interruption (phone calls, app going to the background, etc.). The app will now stop listening as soon as the interrupt occurs without triggering any callbacks due to lack of time. Once the interrupt has finished and the app resumes, there is a callback method [KIOSRecognizerDelegate recognizerReadyToListenAfterInterrupt:] you can implement to get notified; this is where you would update any UI elements or start listening if your app is listening all the time.

  • Bug Fix Resolved an issue where partial results may not have been reported if they are the same as the last partial result from the previous recognition run.

Release 1.0; Apr 27, 2017

  • Announcement Keen Research changed the name of the framework to KeenASR. The previous name occasionally created confusion, and we are now working on features that are not specific to Kaldi. This change is reflected in two places: 1) the framework has been renamed to KeenASR.framework, and 2) the main include file has been renamed to KeenASR.h. There are no changes to classes and method names, so the switchover should be simple. To install release 1.0 of the SDK you will need to: 1) remove the KaldiIOS.framework from your project, 2) add the KeenASR.framework to your project, and 3) replace includes of “KaldiIOS/KaldiIOS.h” with “KeenASR/KeenASR.h” in your source code.

  • Bug Fix Fixed a bug that caused partial results not to be reported when they were the same as the last partial result from the previous recognition run. For example, if you ran a recognition session and said “doctor”, the result would be reported as a partial result and then as a final result. If you then ran another session and said “doctor” again, a partial result callback would not be triggered. However, the final result callback would still trigger properly. Now, partial results are always provided, even if they do not differ from the previous one.

  • Enhancement Improved logging of messages with non-standard formatting to implement more consistent formatting. Note that log messages from one of the modules are still reported via NSLog, which uses a different format than the rest of the log messages. This issue will be addressed in a future release.

  • Bug Fix Release 0.9 introduced a bug which made some custom ASR Bundles incompatible with the SDK. You would experience this bug only if you received custom ASR bundles from Keen Research.

Release 0.9; Mar 11, 2017

  • Announcement We switched to the libc++ standard library. When you replace the framework in your Xcode project, you must perform the following steps: 1) Replace the libstdc++.6.0.9.tbd library with the libc++.tbd library. Under Targets, choose your target, select the Build Phases tab, open the Link Binary With Libraries section, click the Plus sign (+), select libc++.tbd, and click Add. Then delete libstdc++.6.0.9.tbd. 2) Under Targets, choose your target, select the Build Settings tab, search for “c++”, and make sure that the C++ Language Dialect is set to C++11 and that the C++ Standard Library is set to libc++.

  • Enhancement Added support for lattice rescoring when the decoding graph is bundled with the app. The process is opaque and driven by the existence of a rescoring const arpa file in the decoding graph directory. We plan to provide additional documentation and make the command line tool for creating decoding graphs available in the near future. If large vocabulary recognition is of interest to you, contact us. The SDK currently supports real-time recognition of about 80k words on iPhone 6s with this approach.

  • Bug Fix On certain occasions confidence scores and timings were not being provided in the final result. This bug would sometimes occur in situations when one of the words in the decoding graph was not in the lexicon (words.txt file in the ASR Bundle) and its pronunciation was automatically derived by the SDK.

Release 0.8; Feb 27, 2017

  • Feature Introduced a new method, stopListeningAndReturnFinalResult in the KIOSRecognizer class that allows you to stop listening and obtain the final recognition result.

  • Enhancement Log messages from the framework now include a KeenASR label to enable easier filtering in apps with numerous log entries.

  • Enhancement Cleaned up log messages to remove extraneous messages and clarify the text for some messages.

  • Enhancement Audio capture will now work with Bluetooth devices. NOTE: the quality of audio captured through Bluetooth devices can vary significantly, depending on the quality of the Bluetooth device and the type of noise-cancellation algorithms that may be used. Low-quality Bluetooth devices may adversely impact recognition accuracy.

  • Bug Fix Only responses with high confidence will be used to update the adaptation state. This bug may have affected performance of your apps in cases where a significant number of words were incorrectly recognized with low confidence, for example, because of noisy environments.

Release 0.7; Jan 17, 2017

  • Enhancement Removed the startListeningWithDecodingGraph method from the KIOSRecognizer class and added a few new methods that are used to prepare for recognition with either custom decoding graphs built in the app or custom decoding graphs bundled with the app. This enhancement makes this release not backward compatible with previous versions of the framework.

  • Enhancement The KIOSDecodingGraph class has gone through some refactoring. Most methods are now class-based.

  • Feature By the end of January, Keen Research plans to release a command line tool for Mac OS X, called dgBuilder. The tool will allow you to build custom decoding graphs in your development sandbox, which you can then bundle with your app. Creating decoding graphs on mobile devices takes too long and is memory intensive. The new tool will make it easier to create decoding graphs for large vocabulary recognition tasks. Contact us if you are interested in beta testing this tool.

  • Enhancement Documentation updates to a number of pages (such as Quick Start, Decoding Graphs, etc.) based on customer feedback.

  • Enhancement In addition to providing versioning information in the application log file, the framework now includes versioning information in a text file called VERSION.txt. The documentation also specifies the framework version, both in the header and the footer.

Release 0.6; Dec 8, 2016

  • Announcement This release requires the updated ASR Bundles; when you update the framework, you must also download and install the updated ASR Bundle(s).

  • Feature Recognition from the audio file. KIOSRecognizer now provides a few methods to perform recognition from stored wav files. While we envision the framework being primarily used for real-time audio, the file-based recognition can be useful for controlled evaluation purposes.

  • Feature Added a more detailed KIOSResult class to support recognition data including overall confidence, word confidences, start times, and duration for each word.

  • Feature Refactored and partially optimized creation of the decoding graph.

  • Enhancement Modified decoding graph creation to dynamically create various resources that were previously stored in the ASR Bundles; the new bundles no longer contain these unnecessary files. Version 0.6 of the framework will not work with the older ASR Bundles.

  • Deprecated The KIOSDecodingGraph method createDecodingGraphFromBigramURL:andSaveWithName: has been renamed to createDecodingGraphFromArpaURL:andSaveWithName: to better reflect the underlying functionality. The old method is deprecated and will be removed in a future release.

Release 0.5; Oct 20, 2016

  • Feature Added support for Kaldi NNet3 models, including chain models. With chain models, the SDK can now support larger decoding graphs and language models with over 30k words on the iPhone 6.

  • Feature Added support for weighting down silence phones when doing adaptation through iVectors. This tends to improve recognition performance for NNet recognizers.

  • Enhancement Simplified initialization of the SDK. The type of the recognizer is determined from the ASR Bundle name and no longer needs to be passed to the init method.

  • Enhancement Increased the timeout for the trial version of the framework from 5min to 10min.

  • Bug Fix Fixed issue with the adaptation state not being carried over to subsequent interactions.

  • Deprecated KIOSRecognizer initialization methods that expect KIOSRecognizerType to be passed. Per the above comment, recognizer type is now determined based on the ASR Bundle name.

  • Other: The configuration file for the DNN ASR Bundle (librispeech-nnet-en-us) has been updated with the parameters that enforce down-weighting of the silence phones for iVector computation.

Release 0.4.2; Sep 28, 2016

  • Bug Fix Fixed a bug that occasionally caused crashes during stopListening.

  • Enhancement Remove hard-coded bundle ID from the trial version of the framework. From now on, the trial version of the framework will work with any app bundle ID, but the app will timeout (exit) after 5 minutes. If you need a fully functioning version of the framework, tied to your app bundle ID, contact us at info@keenresearch.com.

Release 0.4.1; Sep 11, 2016

  • Bug Fix Fixed a bug that caused the decoding graph creation code to trigger a crash in low-memory conditions.

  • Bug Fix Removed non-public API for logging, which caused iTunes Connect to complain and reject apps.

  • Announcement Support for iOS 10 and Xcode 8.

Note that iOS 10 requires apps to specify a “Privacy - Microphone Usage Description” key in the Info.plist file when microphone access is required by the app.
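
One way to catch a missing microphone usage description before submitting a build is a small check script. The sketch below is not part of the SDK and its function name is hypothetical; it uses Python's standard plistlib module and relies on NSMicrophoneUsageDescription being the raw key name that Xcode displays as “Privacy - Microphone Usage Description”.

```python
import plistlib

# Raw Info.plist key shown in Xcode as "Privacy - Microphone Usage Description".
MIC_KEY = "NSMicrophoneUsageDescription"


def has_microphone_usage_description(plist_bytes):
    """Return True if the plist data contains a non-empty microphone usage string."""
    info = plistlib.loads(plist_bytes)
    value = info.get(MIC_KEY, "")
    return isinstance(value, str) and bool(value.strip())


if __name__ == "__main__":
    # Demonstration with in-memory plists; a real check would read Info.plist from disk.
    good = plistlib.dumps({MIC_KEY: "Used for speech recognition."})
    bad = plistlib.dumps({"CFBundleName": "Demo"})
    print(has_microphone_usage_description(good))
    print(has_microphone_usage_description(bad))
```

To check a real project, read the contents of your app's Info.plist in binary mode and pass the bytes to the function.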

Release 0.4; Jul 20, 2016

  • Bug Fix Fixed a bug that caused the creation of a custom decoding graph to fail intermittently.

  • Bug Fix The custom decoding graph directory was not properly cleaned up after creation failed.

  • Feature KIOSRecognizer now supports Speaker Adaptation control through the following new methods: adaptToSpeakerWithName, resetSpeakerAdaptation, saveSpeakerAdaptationProfile, and others. For details, see the Speaker Adaptation section of the KIOSRecognizer class.

Release 0.3.2; Jul 11, 2016

  • Improved handling of sentences passed to KIOSDecodingGraph methods (removal of irrelevant punctuation, reducing accented words to their ascii representation, better number interpretation).

  • Fixed bug for 8000Hz models (the audio was still sampled at 16000Hz).

  • A few “under-the-hood” bug fixes that may have caused the framework to fail when building the decoding graph.
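
To give a feel for the kind of sentence clean-up mentioned above, here is a rough sketch in Python. It is purely illustrative and does not reproduce the SDK's actual rules: the function name, the set of characters that are kept, and the omission of number interpretation are all assumptions.

```python
import re
import unicodedata


def normalize_sentence(sentence):
    """Reduce accented characters to ASCII and strip punctuation (illustrative only)."""
    # Split accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", sentence)
    # Drop everything that has no ASCII representation (including the combining marks).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Assumption: keep letters, digits, apostrophes, and spaces; replace the rest with a space.
    no_punct = re.sub(r"[^A-Za-z0-9' ]+", " ", ascii_only)
    # Collapse runs of whitespace.
    return " ".join(no_punct.split())


if __name__ == "__main__":
    print(normalize_sentence("Café, s'il vous plaît!"))
```

Number interpretation (for example, turning digits into words) is left out of this sketch because the notes do not describe how the SDK performs it.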

Release 0.3.1; Jun 20, 2016

  • Support for controlling Voice Activity Detection parameters through the KIOSRecognizer’s setVADParameter:toValue: method.

  • Added new initWithRecognizerType:andASRBundle: method in KIOSRecognizer that allows initialization of KIOSRecognizer without passing a relative path to the decoding graph. If you are not bundling the decoding graphs with your app, you will most likely use this method to initialize the engine.

  • setLogLevel: is now a class method (it used to be an instance method) of the KIOSRecognizer class.

  • The audio sampling frequency is no longer hard-coded to 16kHz. That is the default value, but if the mfcc.conf file specifies a different value through the sample-frequency parameter, the engine will use that value. This is only relevant if you are using the framework with your own acoustic models.

  • Several performance enhancements and bug fixes in different methods of the KIOSDecodingGraph class.

  • Updated the proof of concept app on Github with several demos that show how to create custom decoding graphs.
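
The sampling frequency lookup described above can be sketched as follows. This is not the SDK's implementation: it is a minimal Python illustration that assumes mfcc.conf follows the Kaldi-style format of one --name=value option per line with # comments, and the function name is hypothetical.

```python
DEFAULT_SAMPLE_RATE_HZ = 16000  # default value stated in the release notes


def sample_rate_from_mfcc_conf(conf_text, default=DEFAULT_SAMPLE_RATE_HZ):
    """Return the sample-frequency value from mfcc.conf text, or the default if absent."""
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line.startswith("--sample-frequency="):
            # Go through float() so that values written as "8000.0" are also accepted.
            return int(float(line.split("=", 1)[1]))
    return default


if __name__ == "__main__":
    print(sample_rate_from_mfcc_conf("--use-energy=false\n--sample-frequency=8000\n"))
    print(sample_rate_from_mfcc_conf("--use-energy=false\n"))
```

When the option is missing or commented out, the function falls back to the 16 kHz default.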

Release 0.3; Jun 12, 2016

  • Support for dynamic creation of decoding graph. Using an instance of the KIOSDecodingGraph class you can now create the decoding graph in your app by providing a list of sentences or a bigram language model (see QuickStart and class documentation for more information).

  • Consolidated logging; you can now control the level of logging from the framework (defaults to WARN). See the logLevel property of the KIOSRecognizer class.

  • Several bug fixes and performance enhancements.

Note that you will also need to download and update ASR Bundles in order to use the dynamic creation of the decoding graph (the ASR Bundles now contain additional information that allows building of decoding graphs).