KeenASR SDK version 2.1

Author: Ognjen Todic | November 17, 2025

We’re excited to announce that KeenASR SDK version 2.1 is now generally available! This release introduces a major new feature along with several enhancements across all supported platforms.

Audio Quality Estimation

Speech recognition systems often face challenging audio environments: a user speaking next to a loud TV, wind noise from an open car window, or a quiet child speaking far from the device. While our models are trained to handle a wide variety of scenarios, extremely poor audio quality can still degrade recognition performance.

Version 2.1 introduces AudioQualityResult, available in an instance of the Response class. This new structure provides developers with detailed metrics to help evaluate the quality of the captured audio, diagnose potential issues, and provide relevant feedback to their users.

AudioQualityResult contains the following metrics:

Signal to Noise Ratio (SNR)

Indicates the ratio of speech signal to background noise.

Currently optimized for stationary noise rather than transient noise.
May be unreliable for very short speech segments.
SNR is not computed if no speech was recognized in the audio.
Low SNR values generally correlate with reduced ASR accuracy.
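To make the metric concrete, here is a minimal sketch of a textbook SNR estimate in Python. It is not the SDK's estimator: the function name, the per-frame energy input, and the speech/non-speech mask are all assumptions made for illustration. It treats non-speech frames as the noise estimate, which is why an approach like this suits stationary noise better than transient noise.

```python
import math

def estimate_snr_db(frame_energies, is_speech):
    """Illustrative SNR estimate in dB (hypothetical helper, not the SDK API).

    frame_energies: mean energy per frame.
    is_speech: booleans marking which frames contain speech.
    Returns None when SNR cannot be estimated (no speech or no noise frames).
    """
    speech = [e for e, s in zip(frame_energies, is_speech) if s]
    noise = [e for e, s in zip(frame_energies, is_speech) if not s]
    if not speech or not noise:
        return None
    noise_power = sum(noise) / len(noise)
    # Speech frames contain speech plus noise, so subtract the noise estimate.
    signal_power = sum(speech) / len(speech) - noise_power
    if noise_power <= 0.0 or signal_power <= 0.0:
        return None
    return 10.0 * math.log10(signal_power / noise_power)
```

With only a handful of speech frames the averages become noisy, which is consistent with the caveat about very short speech segments.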

Clipped Samples Count

Counts the number of audio samples that reached maximum amplitude (positive or negative):

Often caused by extremely loud speech close to the microphone or sudden loud transients (e.g. tapping on the microphone).
Introduces nonlinear distortion that can negatively affect recognition.
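As an illustration of what such a count measures, the sketch below counts full-scale samples in Python. It assumes 16-bit signed PCM input, and the function name is hypothetical; the SDK computes this value for you and may define clipping differently.

```python
# Full-scale limits for 16-bit signed PCM (assumed sample format).
INT16_MAX = 32767
INT16_MIN = -32768

def count_clipped_samples(samples):
    """Count samples that sit at the positive or negative full-scale limit."""
    return sum(1 for s in samples if s >= INT16_MAX or s <= INT16_MIN)
```

A few clipped samples from a single tap may be harmless, while long runs of clipped samples usually mean the speaker is too loud or too close to the microphone.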

Peak Speech RMS Value

Estimated peak RMS (root-mean-square) level of speech (in dB), computed as the 98th percentile of speech RMS levels:

Low values indicate faint speech or a user being far from the microphone.
May be less reliable for very short utterances or very short speech segments.
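The percentile idea can be sketched as follows. This is an illustrative Python version using a nearest-rank percentile over per-frame speech RMS values in dB; the function names and the choice of nearest-rank interpolation are assumptions, not the SDK's implementation.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Ceiling of pct * n / 100 using integer arithmetic.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]

def peak_speech_rms_db(speech_frame_rms_db):
    """98th percentile of speech-frame RMS levels, or None without speech."""
    if not speech_frame_rms_db:
        return None
    return percentile_nearest_rank(speech_frame_rms_db, 98)
```

Using a high percentile rather than the maximum keeps a few isolated loud frames from dominating the estimate once enough speech frames are available. With only a few frames the percentile collapses to the maximum, which is one reason short utterances give less reliable values.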

Initial Segment RMS Warning Flag

A flag indicating unusually high RMS levels at the start of the recording, possibly due to:

Audio played from the device being picked up by the microphone
User beginning to speak before the recognizer started listening
High general noise levels
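One way an app might turn these metrics into user feedback is sketched below in Python. Everything here is hypothetical: the AudioQualitySnapshot class is a stand-in and does not mirror the real AudioQualityResult fields, and the thresholds are placeholders that would need tuning against your own recordings.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AudioQualitySnapshot:
    """Hypothetical stand-in for the metrics described above."""
    snr_db: Optional[float]              # None when SNR was not computed
    clipped_samples: int
    peak_speech_rms_db: Optional[float]  # None when no speech was found
    initial_segment_rms_warning: bool

# Placeholder thresholds (assumptions, not SDK recommendations).
MIN_SNR_DB = 10.0
MAX_CLIPPED_SAMPLES = 50
MIN_PEAK_SPEECH_RMS_DB = -40.0

def feedback_messages(q: AudioQualitySnapshot) -> List[str]:
    """Map audio quality metrics to short, user-facing hints."""
    messages = []
    if q.snr_db is not None and q.snr_db < MIN_SNR_DB:
        messages.append("It is noisy here. Try moving to a quieter place.")
    if q.clipped_samples > MAX_CLIPPED_SAMPLES:
        messages.append("Try speaking a little further from the microphone.")
    if (q.peak_speech_rms_db is not None
            and q.peak_speech_rms_db < MIN_PEAK_SPEECH_RMS_DB):
        messages.append("Try speaking louder or closer to the device.")
    if q.initial_segment_rms_warning:
        messages.append("Wait for the prompt to finish before speaking.")
    return messages
```

Keeping the thresholds in one place makes it easier to adjust them per device or per audience, for example for young children who tend to speak quietly.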

Frame RMS Value

An array of RMS values (in dB) for each 25 ms frame (10 ms shift):

Computed as 20 log₁₀(√(∑ s_i²)).
Frames with zero signal report approximately -100 dB.
Useful for custom analysis, visualization, and debugging.
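For readers who want to reproduce comparable values offline, the framing and formula above can be sketched in Python as follows. The 16 kHz sample rate, the sample scaling, and the energy floor of 1e-10 (chosen so that silent frames come out at -100 dB) are assumptions; the SDK's exact conventions may differ.

```python
import math

def frame_rms_db(samples, sample_rate=16000, frame_ms=25, shift_ms=10):
    """Per-frame level in dB using 20 * log10(sqrt(sum of squared samples)).

    Frames are frame_ms long and advance by shift_ms; a trailing partial
    frame is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    energy_floor = 1e-10  # assumed floor; maps all-zero frames to -100 dB
    values = []
    for start in range(0, len(samples) - frame_len + 1, shift):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame)
        values.append(20.0 * math.log10(math.sqrt(max(energy, energy_floor))))
    return values
```

Because consecutive frames overlap by 15 ms, a one-second recording yields roughly 98 values, which is convenient for plotting a level contour alongside the recognized words.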

WordPronunciation Improvements

When building decoding graphs, the KeenASR SDK allows developers to supply alternative pronunciations for words. This is particularly useful when:

A word is missing from the lexicon (common in languages with non-deterministic grapheme-to-phoneme rules, or when working with made-up words in early literacy apps).
A word has multiple valid pronunciations, such as regional variations.
Common mispronunciations need to be modeled. This is relevant in EdTech, to detect specific mispronunciations, and in frontline worker applications, to avoid interrupting workflows for non-native speakers.

What’s New in 2.1?

The AlternativePronunciation class has been renamed to WordPronunciation to better reflect its purpose.
New IsValid method to verify that a word pronunciation conforms to the constraints defined in the documentation, given the data provided by the developer.
Decoding graph creation now fails when invalid pronunciations are supplied, instead of silently ignoring them with a log warning.
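The validate-then-fail pattern described above can be sketched in Python. This is a language-neutral illustration only: PronunciationEntry, is_valid, build_graph_inputs, and the specific checks (non-empty word, phones drawn from a known phone set) are hypothetical and do not reproduce the SDK's WordPronunciation class or the constraints listed in its documentation.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass(frozen=True)
class PronunciationEntry:
    """Hypothetical stand-in for a word pronunciation."""
    word: str
    phones: Tuple[str, ...]

    def is_valid(self, phone_set: Set[str]) -> bool:
        # Assumed constraints: non-empty word, at least one phone,
        # and every phone known to the recognizer.
        return (bool(self.word.strip())
                and len(self.phones) > 0
                and all(p in phone_set for p in self.phones))

def build_graph_inputs(entries: List[PronunciationEntry],
                       phone_set: Set[str]) -> List[PronunciationEntry]:
    """Fail fast on invalid entries instead of silently skipping them."""
    invalid = [e for e in entries if not e.is_valid(phone_set)]
    if invalid:
        words = ", ".join(e.word or "<empty>" for e in invalid)
        raise ValueError(f"invalid pronunciations for: {words}")
    return list(entries)
```

Checking entries up front lets an app surface a precise error at content-authoring time, rather than discovering at runtime that a word is never recognized.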

These enhancements provide clearer error handling and improve developer experience when customizing pronunciations.

Other Updates

clean_text is back! The Result object once again includes clean_text, the recognition output without special tokens such as <SPOKEN_NOISE>. (This was available before SDK v2.0.)
Improved logging in keenasr-web. Logging now uses Emscripten’s native logger methods, resulting in cleaner and more readable console output.
New public developer demo. We are releasing a browser-based developer demo that lets you experiment with the SDK via a graphical interface. It’s a great way to explore the API concepts hands-on. Check out all demos.
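To illustrate what clean_text saves you from doing by hand, here is a minimal Python sketch that removes special tokens from a recognition string. The helper name and the token set are assumptions; only <SPOKEN_NOISE> is named in this post, and the SDK may emit other special tokens.

```python
# Assumed set of special tokens; extend it if your models emit others.
SPECIAL_TOKENS = {"<SPOKEN_NOISE>"}

def strip_special_tokens(text):
    """Return text with special tokens removed and whitespace normalized."""
    return " ".join(w for w in text.split() if w not in SPECIAL_TOKENS)
```

With clean_text available on the Result object again, a helper like this is only needed when post-processing transcripts stored from SDK v2.0.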

You can browse the developer documentation, download a trial SDK, or try our web demos to get started.