KeenASR Unity Plugin 2.1.3
Unity plugin for KeenASR offline speech recognition SDK (iOS & Android)
KeenASR Unity Plugin

KeenASR is an offline speech recognition SDK for iOS and Android. This Unity plugin provides a C# API that wraps the native SDK, giving you on-device speech recognition without requiring a network connection.

Quick Start

// 1. Subscribe to initialization event
KeenASR.onInitializedReceived += OnInit;
// 2. Initialize with an ASR bundle from StreamingAssets
KeenASR.Initialize("keenA1m-nnet3chain-en-us");

void OnInit(bool success) {
    if (!success) return;
    // 3. Subscribe to results
    KeenASR.Instance.onFinalResponseReceived += OnResponse;
    // 4. Create a decoding graph with expected phrases
    //    ("quickstart" is an arbitrary placeholder graph name)
    string[] phrases = { "YES", "NO", "HELLO", "GOODBYE" };
    KeenASR.Instance.CreateDecodingGraphFromPhrases("quickstart", phrases);
    // 5. Load the decoding graph and prepare the recognizer
    KeenASR.Instance.PrepareForListeningWithDecodingGraph("quickstart");
    // 6. Start listening
    KeenASR.Instance.StartListening();
    // Steps 4-6 do not have to run here; they can happen at any later time
}

void OnResponse(ASRResponse response) {
    Debug.Log("Recognized: " + response.result.text);
    // do something with the response/result
    // Always dispose the response to release native resources
    response.Dispose();
}

Key Concepts

ASR Bundle

An ASR bundle is a language-specific asset containing a pre-trained speech recognition model. It is stored in Assets/StreamingAssets/ and is referenced by name when calling KeenASR.Initialize().

Decoding Graph

A decoding graph defines what the recognizer can recognize. Create one from a list of phrases using CreateDecodingGraphFromPhrases(). Graphs are saved on the device and can be referenced by name and reused across sessions.
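As a minimal sketch of creating a graph, assuming the SDK has already initialized successfully: the class name DecodingGraphExample and the graph name "yesno" are placeholders, and treating a false return value as failure is an assumption. Only the method name, its parameters, and its bool return type come from the API reference.

```csharp
using UnityEngine;

// Hypothetical helper; the class name and the graph name are illustrative.
public static class DecodingGraphExample
{
    public const string GraphName = "yesno"; // placeholder name

    // Call only after initialization has succeeded (KeenASR.Instance is non-null).
    public static bool CreateYesNoGraph()
    {
        KeenASR asr = KeenASR.Instance;
        if (asr == null)
        {
            Debug.LogWarning("KeenASR is not initialized yet.");
            return false;
        }

        string[] phrases = { "YES", "NO" };

        // The optional parameters (task, alternativePronunciations,
        // spokenNoiseProbability) keep their default values here.
        bool created = asr.CreateDecodingGraphFromPhrases(GraphName, phrases);
        if (!created)
        {
            // Assumption: a false return value signals that creation failed.
            Debug.LogError("Could not create decoding graph " + GraphName);
        }
        return created;
    }
}
```

Because graphs are stored on the device under their name, the same name can later be passed to PrepareForListeningWithDecodingGraph().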

Recognizer Lifecycle

The recognizer progresses through these states (see RecognizerState):

  1. NeedsDecodingGraph — initialized, but no decoding graph loaded
  2. ReadyToListen — decoding graph loaded, ready to start recognition
  3. Listening — actively capturing and decoding audio
  4. FinalProcessing — transient state, audio capture stopped, computing final result
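The first three states map onto two calls, which could be sketched as follows. This assumes a decoding graph with the given name already exists; the class and method names are hypothetical, and reading a false return value as failure is an assumption.

```csharp
using UnityEngine;

// Hypothetical helper that walks the recognizer through its states.
public static class ListeningExample
{
    public static bool PrepareAndStart(string graphName)
    {
        KeenASR asr = KeenASR.Instance;
        if (asr == null)
        {
            // Instance is null until the SDK has been initialized.
            return false;
        }

        // NeedsDecodingGraph -> ReadyToListen
        if (!asr.PrepareForListeningWithDecodingGraph(graphName))
        {
            Debug.LogError("Could not load decoding graph " + graphName);
            return false;
        }

        // ReadyToListen -> Listening
        if (!asr.StartListening())
        {
            Debug.LogError("Could not start listening.");
            return false;
        }
        return true;
    }
}
```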

How Listening Stops

Listening stops in one of three ways:

  • VAD threshold triggered — e.g. end-silence or max duration (delivers final result via onFinalResponseReceived)
  • Audio interrupt — phone call, notification, app backgrounded (no callback)
  • Explicit StopListening() call — stops audio processing but does not trigger onFinalResponseReceived. To get the final result, set short VAD timeouts instead so the recognizer stops naturally.

Recognition Results

Results are delivered through two events:

  • onPartialASRResultReceived — called repeatedly during recognition with intermediate text
  • onFinalResponseReceived — called when recognition completes via VAD, with an ASRResponse containing the full result, per-word timing, phoneme-level detail, and audio quality metrics

Response Lifecycle

Each ASRResponse holds a reference to native resources. The caller owns the response and must call Dispose() when done with it. An undisposed response keeps its native memory allocated until the garbage collector runs the finalizer.

void OnResponse(ASRResponse response) {
    // Use the response...
    string text = response.result.cleanText;
    // Save audio/JSON if needed
    response.SaveAudioFile(Application.persistentDataPath);
    response.SaveJsonFile(Application.persistentDataPath);
    // Or queue for Dashboard upload
    response.QueueForUpload();
    // Always dispose when done
    response.Dispose();
}
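If the handler body can throw, one way to make sure the response is still released is a try/finally block. This is a sketch rather than a pattern prescribed by the plugin; the class name is hypothetical, and interpreting a false return from SaveAudioFile() as failure is an assumption.

```csharp
using UnityEngine;

// Hypothetical component; subscribe OnResponse to
// KeenASR.Instance.onFinalResponseReceived after initialization.
public class SafeResponseHandler : MonoBehaviour
{
    public void OnResponse(ASRResponse response)
    {
        try
        {
            Debug.Log("Recognized: " + response.result.cleanText);

            bool saved = response.SaveAudioFile(Application.persistentDataPath);
            if (!saved)
            {
                // Assumption: false means the file was not written.
                Debug.LogWarning("Audio file was not saved.");
            }
        }
        finally
        {
            // Runs even if the code above throws, so the native
            // handle is always released.
            response.Dispose();
        }
    }
}
```

Since Dispose() is documented as idempotent, calling it in the finally block is safe even if another code path has already disposed the response.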

Voice Activity Detection (VAD)

VAD parameters control when the recognizer automatically stops listening:

  • TimeoutForNoSpeech — stop if no speech detected within this many seconds
  • TimeoutEndSilenceForGoodMatch — seconds of trailing silence for a confident match
  • TimeoutEndSilenceForAnyMatch — seconds of trailing silence for any match
  • TimeoutMaxDuration — maximum total listening duration

For practical purposes, TimeoutEndSilenceForGoodMatch and TimeoutEndSilenceForAnyMatch can be treated the same way.
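To illustrate how the four timeouts interact, here is a simplified mental model in plain C#. It is not the SDK's implementation and uses no SDK types: the VadModel class, the Timeouts struct, and the exact comparison rules are assumptions made for the sketch.

```csharp
using System;

// Illustrative model only; the real VAD logic lives inside the native SDK.
public static class VadModel
{
    // All values are in seconds.
    public struct Timeouts
    {
        public double NoSpeech;
        public double EndSilenceGoodMatch;
        public double EndSilenceAnyMatch;
        public double MaxDuration;
    }

    // Returns true when, under this model, listening would stop.
    public static bool ShouldStop(
        Timeouts t,
        double elapsed,          // time since listening started
        bool speechDetected,     // has any speech been heard so far
        double trailingSilence,  // silence since the last speech
        bool hasMatch,           // recognizer has some hypothesis
        bool goodMatch)          // hypothesis is a confident one
    {
        // The overall cap applies regardless of what was heard.
        if (elapsed >= t.MaxDuration) return true;

        // Nothing spoken yet: only the no-speech timeout applies.
        if (!speechDetected) return elapsed >= t.NoSpeech;

        // After speech, trailing silence ends the utterance.
        if (goodMatch && trailingSilence >= t.EndSilenceGoodMatch) return true;
        if (hasMatch && trailingSilence >= t.EndSilenceAnyMatch) return true;

        return false;
    }
}
```

With hypothetical values such as NoSpeech = 5, both end-silence timeouts = 1, and MaxDuration = 20, the model keeps listening through a half-second pause after speech and stops once the pause reaches one second.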

Goodness of Pronunciation (GoP)

When computeGoP is set to true in PrepareForListeningWithDecodingGraph(), the final result includes phoneme-level pronunciation scores (in the range 0 to 1) in each ASRWord.phones array, which is useful for pronunciation assessment applications. This also requires an ASR bundle that supports goodness of pronunciation scoring.
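Enabling GoP only changes the second argument of the prepare call, as in this sketch. The class and method names are hypothetical, the graph is assumed to exist already, and treating false as failure is an assumption.

```csharp
using UnityEngine;

// Hypothetical helper for pronunciation-assessment setups.
public static class PronunciationExample
{
    public static bool PrepareWithGoP(string graphName)
    {
        KeenASR asr = KeenASR.Instance;
        if (asr == null)
        {
            return false; // SDK not initialized
        }

        // computeGoP defaults to false, so it must be requested explicitly.
        bool ready = asr.PrepareForListeningWithDecodingGraph(graphName, computeGoP: true);
        if (!ready)
        {
            // Assumption: false signals that preparation failed.
            Debug.LogError("Could not prepare " + graphName + " with GoP enabled.");
        }
        return ready;
    }
}
```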

Dashboard Upload

To upload recognition responses to the KeenASR Dashboard for analysis:

// Start the uploader once during setup
KeenASR.StartDataUploader("YOUR_APP_KEY");

// Queue individual responses for upload
void OnResponse(ASRResponse response) {
    response.QueueForUpload();
    response.Dispose();
}

Platform Notes

Feature                 iOS           Android
Initialization          Synchronous   Asynchronous
Speaking task graphs    Supported     Supported
Echo cancellation       Supported     Not yet implemented

Teardown

To fully release the SDK and all resources, call the static Teardown() method, which tears down the recognizer and releases all associated resources. Its reference entry also states that all audio playback should be stopped.

KeenASR.Teardown();

After teardown, KeenASR.Instance returns null. The SDK can be re-initialized by calling KeenASR.Initialize() again.
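A teardown followed by re-initialization might look like the sketch below. The class name is hypothetical, the bundle name is the one used in the Quick Start, and unsubscribing inside the handler is a general C# habit rather than something the plugin requires.

```csharp
using UnityEngine;

// Hypothetical component that restarts the SDK.
public class AsrRestartExample : MonoBehaviour
{
    // Bundle name taken from the Quick Start.
    const string BundleName = "keenA1m-nnet3chain-en-us";

    public void Restart()
    {
        // Stop any audio playback in your own code before this point.
        bool tornDown = KeenASR.Teardown();
        Debug.Log("Teardown returned " + tornDown +
                  ", Instance is null: " + (KeenASR.Instance == null));

        // Re-initialize the same way as on first start.
        KeenASR.onInitializedReceived += OnInit;
        KeenASR.Initialize(BundleName);
    }

    void OnInit(bool success)
    {
        // Avoid stacking duplicate handlers across restarts.
        KeenASR.onInitializedReceived -= OnInit;
        Debug.Log("Re-initialization " + (success ? "succeeded" : "failed"));
    }
}
```

Waiting for the event rather than reading KeenASR.Instance right after Initialize() keeps the code correct on both platforms, since initialization is asynchronous on Android.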