KeenASR is an offline speech recognition SDK for iOS and Android. This Unity plugin provides a C# API that wraps the native SDK, giving you on-device speech recognition without requiring a network connection.
Quick Start
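KeenASR.onInitializedReceived += OnInit;
KeenASR.Initialize("your-asr-bundle"); // placeholder: name of your bundle in StreamingAssets
void OnInit(bool success) {
    if (!success) return;
    KeenASR.Instance.onFinalResponseReceived += OnResponse;
    string[] phrases = { "YES", "NO", "HELLO", "GOODBYE" };
    KeenASR.Instance.CreateDecodingGraphFromPhrases("commands", phrases); // "commands" is an example graph name
    KeenASR.Instance.PrepareForListeningWithDecodingGraph("commands");
    KeenASR.Instance.StartListening();
}
void OnResponse(ASRResponse response) {
    Debug.Log(response.result.text);
    response.Dispose(); // the caller owns the response and must release it
}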
Key Concepts
ASR Bundle
An ASR bundle is a language-specific asset that contains a pre-trained speech recognition model. It is stored in Assets/StreamingAssets/ and specified by name when calling KeenASR.Initialize().
Decoding Graph
A decoding graph defines what the recognizer can recognize. Create one from a list of phrases using CreateDecodingGraphFromPhrases(). Graphs are saved on the device and can be referenced by name and reused across sessions.
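Because graphs persist on the device, an app can try to load a saved graph first and build it only when that fails. The sketch below shows one way to do this. `IGraphRecognizer`, `DecodingGraphHelper`, and `FakeRecognizer` are hypothetical names introduced here so the code compiles without Unity; the two interface methods mirror the plugin's `CreateDecodingGraphFromPhrases` and `PrepareForListeningWithDecodingGraph` with optional parameters omitted. It also assumes, without confirmation from the plugin documentation, that `PrepareForListeningWithDecodingGraph` returns false when no graph with that name has been saved.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in exposing the two plugin methods used below, so the
// sketch compiles without Unity. In a project, call them on KeenASR.Instance.
public interface IGraphRecognizer
{
    bool CreateDecodingGraphFromPhrases(string dgName, string[] phrases);
    bool PrepareForListeningWithDecodingGraph(string dgName);
}

public static class DecodingGraphHelper
{
    // Reuses a saved graph when possible; otherwise builds it once, then loads it.
    // Assumption: preparing returns false when no graph with that name exists.
    public static bool EnsureGraphReady(IGraphRecognizer asr, string dgName, string[] phrases)
    {
        if (asr.PrepareForListeningWithDecodingGraph(dgName))
            return true; // a graph saved in an earlier session was reused

        return asr.CreateDecodingGraphFromPhrases(dgName, phrases)
            && asr.PrepareForListeningWithDecodingGraph(dgName);
    }
}

// In-memory fake used only to exercise the helper outside Unity.
public sealed class FakeRecognizer : IGraphRecognizer
{
    private readonly HashSet<string> saved = new HashSet<string>();
    public int CreateCalls { get; private set; }

    public bool CreateDecodingGraphFromPhrases(string dgName, string[] phrases)
    {
        CreateCalls++;
        if (phrases == null || phrases.Length == 0) return false;
        saved.Add(dgName);
        return true;
    }

    public bool PrepareForListeningWithDecodingGraph(string dgName)
    {
        return saved.Contains(dgName);
    }
}
```

One consequence of this design is that a saved graph is reused even if the phrase list has changed since it was built, so an app that edits its phrases would need to rebuild the graph or use a new graph name.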
Recognizer Lifecycle
The recognizer progresses through these states (see RecognizerState):
- NeedsDecodingGraph — initialized, but no decoding graph loaded
- ReadyToListen — decoding graph loaded, ready to start recognition
- Listening — actively capturing and decoding audio
- FinalProcessing — transient state, audio capture stopped, computing final result
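The state list above can be read as a small state machine. The following self-contained sketch models it; `SketchState`, `SketchEvent`, and `LifecycleSketch` are hypothetical names, and `SketchState` is only a local copy of the four state names (the plugin defines its own `RecognizerState`). The return from FinalProcessing to ReadyToListen is an assumption, and the sketch rejects every transition not implied by the list, which may be stricter than the SDK.

```csharp
using System;

// Local copy of the four documented state names, kept separate from the
// plugin's RecognizerState so the sketch compiles on its own.
public enum SketchState { NeedsDecodingGraph, ReadyToListen, Listening, FinalProcessing }

// Hypothetical event names for the things that move the recognizer along.
public enum SketchEvent { GraphPrepared, ListeningStarted, VadTriggered, FinalResultDelivered }

public static class LifecycleSketch
{
    // Returns the state that follows `current` when `evt` occurs, or throws
    // if the sketch does not allow that event in that state.
    public static SketchState Next(SketchState current, SketchEvent evt)
    {
        if (evt == SketchEvent.GraphPrepared && current == SketchState.NeedsDecodingGraph)
            return SketchState.ReadyToListen;

        if (evt == SketchEvent.ListeningStarted && current == SketchState.ReadyToListen)
            return SketchState.Listening;

        if (evt == SketchEvent.VadTriggered && current == SketchState.Listening)
            return SketchState.FinalProcessing;

        // Assumption: after the final result is delivered the recognizer is ready again.
        if (evt == SketchEvent.FinalResultDelivered && current == SketchState.FinalProcessing)
            return SketchState.ReadyToListen;

        throw new InvalidOperationException($"{evt} is not valid in state {current}");
    }
}
```

Modeling the lifecycle this way makes the documented precondition explicit: listening can only start from ReadyToListen, so a start attempt before a decoding graph is prepared is rejected.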
How Listening Stops
Listening stops in one of three ways:
- VAD threshold triggered — e.g. end-silence or max duration (delivers final result via onFinalResponseReceived)
- Audio interrupt — phone call, notification, app backgrounded (no callback)
- Explicit StopListening() call — stops audio processing but does not trigger onFinalResponseReceived. To get the final result, set short VAD timeouts instead so the recognizer stops naturally.
Recognition Results
Results are delivered through two events:
- onPartialASRResultReceived — called repeatedly during recognition with intermediate text
- onFinalResponseReceived — called when recognition completes via VAD, with an ASRResponse containing the full result, per-word timing, phoneme-level detail, and audio quality metrics
Response Lifecycle
Each ASRResponse holds a reference to native resources. The caller owns the response and must call Dispose() when done with it. If Dispose() is not called, the native memory stays allocated until the garbage collector runs the finalizer, which can happen much later.
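The ownership rule is easiest to follow with a try/finally block, so the handle is released even when result processing throws. The sketch below illustrates the pattern with a stand-in class. `NativeBackedResponse` and `ResponseHandling` are hypothetical and are not the plugin's `ASRResponse`; the stand-in only models a handle that is released exactly once, whether by `Dispose()` or, as a late fallback, by the finalizer.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for an object that owns a native handle. It is not the
// plugin's ASRResponse; it only models the ownership rule described above.
public sealed class NativeBackedResponse : IDisposable
{
    private static int liveHandles;   // handles acquired but not yet released
    private int released;             // 0 = handle held, 1 = handle released

    public static int LiveHandles { get { return liveHandles; } }
    public string Text { get; private set; }

    public NativeBackedResponse(string text)
    {
        Text = text;
        Interlocked.Increment(ref liveHandles); // models acquiring the handle
    }

    // Releases the handle once; later calls do nothing (idempotent).
    private void Release()
    {
        if (Interlocked.Exchange(ref released, 1) == 0)
            Interlocked.Decrement(ref liveHandles);
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // the finalizer no longer has work to do
    }

    // Safety net only: runs at an unpredictable time if Dispose() was skipped.
    ~NativeBackedResponse() { Release(); }
}

public static class ResponseHandling
{
    // Copies out what the caller needs, then releases the handle even if
    // the processing step throws.
    public static string Consume(NativeBackedResponse response)
    {
        try
        {
            return response.Text.ToUpperInvariant();
        }
        finally
        {
            response.Dispose();
        }
    }
}
```

In an onFinalResponseReceived handler the same shape applies: read what you need from the response inside the try block and call its Dispose() in the finally block.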
Voice Activity Detection (VAD)
VAD parameters control when the recognizer automatically stops listening:
- TimeoutForNoSpeech — stop if no speech detected within this many seconds
- TimeoutEndSilenceForGoodMatch — seconds of trailing silence for a confident match
- TimeoutEndSilenceForAnyMatch — seconds of trailing silence for any match
- TimeoutMaxDuration — maximum total listening duration
For practical purposes, TimeoutEndSilenceForGoodMatch and TimeoutEndSilenceForAnyMatch can be treated the same way.
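To make the interaction between the four timeouts concrete, the sketch below models a stop decision in plain C#. It is an illustration of the parameter descriptions above, not the SDK's VAD implementation: `VadTimeouts`, `VadSketch`, and `StopReason` are hypothetical names, the order of the checks is an assumption, and the timeout values are illustrative rather than SDK defaults.

```csharp
public enum StopReason { None, NoSpeech, EndSilenceGoodMatch, EndSilenceAnyMatch, MaxDuration }

// Holds the four timeouts, in seconds. Values are illustrative, not SDK defaults.
public sealed class VadTimeouts
{
    public double TimeoutForNoSpeech = 5.0;
    public double TimeoutEndSilenceForGoodMatch = 1.0;
    public double TimeoutEndSilenceForAnyMatch = 1.2;
    public double TimeoutMaxDuration = 20.0;
}

public static class VadSketch
{
    // elapsed:         seconds since listening started
    // speechHeard:     whether any speech has been detected so far
    // trailingSilence: seconds of silence since speech last ended
    // hasMatch:        the audio so far matches something in the decoding graph
    // goodMatch:       that match is a confident one
    public static StopReason Evaluate(VadTimeouts t, double elapsed, bool speechHeard,
                                      double trailingSilence, bool hasMatch, bool goodMatch)
    {
        // Hard cap on total listening time.
        if (elapsed >= t.TimeoutMaxDuration) return StopReason.MaxDuration;

        // Nobody has spoken yet: only the no-speech timeout applies.
        if (!speechHeard)
            return elapsed >= t.TimeoutForNoSpeech ? StopReason.NoSpeech : StopReason.None;

        // Speech was heard: stop after enough trailing silence for the match quality.
        if (goodMatch && trailingSilence >= t.TimeoutEndSilenceForGoodMatch)
            return StopReason.EndSilenceGoodMatch;
        if (hasMatch && trailingSilence >= t.TimeoutEndSilenceForAnyMatch)
            return StopReason.EndSilenceAnyMatch;

        return StopReason.None;
    }
}
```

With these example values, one second of trailing silence ends listening for a confident match but not for a weaker one, which needs 1.2 seconds.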
Goodness of Pronunciation (GoP)
When computeGoP is set to true in PrepareForListeningWithDecodingGraph(), the final result includes phoneme-level pronunciation scores, ranging from 0 to 1, in each ASRWord.phones array. This is useful for pronunciation assessment applications. GoP scoring also requires an ASR bundle that supports it.
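A typical use of these scores is to flag the phonemes a speaker should practice. The sketch below shows that filtering step on stand-in types. `ScoredWord`, `PhoneScore`, and `GopSketch` are hypothetical, and their field names are assumptions; the plugin's `ASRWord` and its phone entries have their own member names, which should be taken from the API reference. The 0.6 threshold in the usage note is illustrative, not a recommended value.

```csharp
using System.Collections.Generic;

// Hypothetical minimal shapes for a phone score and a word; field names are
// assumptions made for this sketch, not the plugin's actual members.
public sealed class PhoneScore
{
    public string Phone;
    public float Score; // 0 to 1, higher means closer to the expected pronunciation

    public PhoneScore(string phone, float score) { Phone = phone; Score = score; }
}

public sealed class ScoredWord
{
    public string Text;
    public PhoneScore[] Phones; // null when no GoP data is available

    public ScoredWord(string text, PhoneScore[] phones) { Text = text; Phones = phones; }
}

public static class GopSketch
{
    // Returns "WORD:phone" labels for every phone scoring below `threshold`.
    public static List<string> WeakPhones(IEnumerable<ScoredWord> words, float threshold)
    {
        var weak = new List<string>();
        foreach (var word in words)
        {
            if (word.Phones == null) continue; // GoP was not computed for this word
            foreach (var phone in word.Phones)
            {
                if (phone.Score < threshold)
                    weak.Add(word.Text + ":" + phone.Phone);
            }
        }
        return weak;
    }
}
```

For example, calling `GopSketch.WeakPhones(words, 0.6f)` on a word "HELLO" whose phones score 0.9, 0.3, 0.8, and 0.55 flags the second and fourth phones.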
Dashboard Upload
To upload recognition responses to the KeenASR Dashboard for analysis, call the static KeenASR.StartDataUploader(appKey) once to start a background thread that uploads queued responses. Then call QueueForUpload() on each ASRResponse you want to send; this queues the audio and JSON from that response for upload.
Platform Notes
| Feature | iOS | Android |
|---|---|---|
| Initialization | Synchronous | Asynchronous |
| Speaking task graphs | Supported | Supported |
| Echo cancellation | Supported | Not yet implemented |
Teardown
To fully release the SDK and all resources, call the static KeenASR.Teardown() method, which tears down the recognizer and releases all associated resources. The API reference for Teardown() adds that all audio playback should be stopped.
After teardown, KeenASR.Instance returns null. The SDK can be re-initialized by calling KeenASR.Initialize() again.