Once the plugin is installed, you can start using it in your app by following the steps below. This guide walks through a minimal integration: initialize the SDK, create a decoding graph, listen for speech, and handle results.
The plugin exports a default object (KeenASR) with all SDK methods, and equivalent named exports. Pick whichever style you prefer; both call into the same underlying Turbo Module.
1. Import the API
import KeenASR, {
  onPartialResult,
  onFinalResponse,
  onAudioInterruptStarted,
  onAudioInterruptEnded,
  VadParameter,
  RecognizerState,
  SpeakingTask,
  WordPronunciation,
  TextAligner,
} from 'keenasr-react-native';
2. Initialize the SDK
Initialize the SDK early in your app lifecycle. All methods are asynchronous and return Promises; await them before calling dependent methods.
async function setupKeenASR() {
  // Optional: set log level before initialize for more detailed startup logs
  // await KeenASR.setLogLevel(LogLevel.Info);
  const ok = await KeenASR.initialize('keenA1m-nnet3chain-en-us');
  if (!ok) {
    console.error('KeenASR initialization failed (already initialized, or an error occurred)');
    return;
  }
}
initialize(bundleName) loads an ASR bundle from app resources (the iOS app bundle or the Android assets/ directory). For bundles downloaded at runtime, use initializeWithPath('/absolute/path/to/bundle') instead.
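Microphone permission on Android has to be granted before the recognizer can listen. The following is a minimal sketch of a permission helper, not part of the plugin: `ensureMicrophonePermission` and the `PermissionRequester` interface are hypothetical names introduced here, and the permission source is passed in as a parameter so the helper stays self-contained. The two string constants are the values React Native documents for `PermissionsAndroid.PERMISSIONS.RECORD_AUDIO` and `PermissionsAndroid.RESULTS.GRANTED`.

```typescript
// Minimal shape of the part of React Native's PermissionsAndroid used here.
interface PermissionRequester {
  request(permission: string): Promise<string>;
}

// Values documented by React Native for PermissionsAndroid.
const RECORD_AUDIO = 'android.permission.RECORD_AUDIO';
const GRANTED = 'granted';

// Hypothetical helper: resolves to true when the microphone may be used.
async function ensureMicrophonePermission(
  platformOS: string,
  permissions: PermissionRequester,
): Promise<boolean> {
  // iOS prompts on first microphone use, so there is nothing to request here.
  if (platformOS !== 'android') {
    return true;
  }
  const status = await permissions.request(RECORD_AUDIO);
  return status === GRANTED;
}
```

In an app this would presumably be called as `ensureMicrophonePermission(Platform.OS, PermissionsAndroid)` with both values imported from `react-native`, and `startListening()` would only be called when it resolves to `true`.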
On Android, request the RECORD_AUDIO permission at runtime (via PermissionsAndroid) before calling startListening(). iOS prompts automatically the first time the microphone is used, using the NSMicrophoneUsageDescription string from Info.plist.
3. Subscribe to Events
Subscribe to recognition events. Each event subscriber returns an unsubscribe function; call it when the screen unmounts to avoid duplicate handlers.
const unsubPartial = onPartialResult((text) => {
  // Update interim UI (e.g. live transcription)
  console.log('Hearing:', text);
});
const unsubFinal = onFinalResponse((response) => {
  console.log('Recognized:', response.asrResult.cleanText);
  // ... use the response ...
  // Always release the response when done to free native memory.
  response.release();
});
const unsubInterruptStarted = onAudioInterruptStarted(() => {
  // e.g. show "interrupted" UI
});
const unsubInterruptEnded = onAudioInterruptEnded(() => {
  // recognizer is back in ReadyToListen; you may call startListening() again
});
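Because every subscriber returns its own unsubscribe function, a screen with several handlers has several functions to remember. One way to keep them together is a small collector like the sketch below. `createSubscriptionBag` is a hypothetical helper, not a plugin API; it only assumes that each unsubscribe function takes no arguments.

```typescript
type Unsubscribe = () => void;

// Hypothetical helper: groups unsubscribe functions so one call removes them all.
function createSubscriptionBag() {
  const pending: Unsubscribe[] = [];
  return {
    // Register an unsubscribe function returned by an event subscriber.
    add(unsubscribe: Unsubscribe): void {
      pending.push(unsubscribe);
    },
    // Safe to call more than once; each unsubscribe function runs at most once.
    removeAll(): void {
      while (pending.length > 0) {
        const unsubscribe = pending.pop();
        if (unsubscribe) {
          unsubscribe();
        }
      }
    },
  };
}

// Possible usage inside a React effect (subscriber names as in the snippet above):
//   const bag = createSubscriptionBag();
//   bag.add(onPartialResult((text) => { /* ... */ }));
//   bag.add(onFinalResponse((response) => { /* ... */ response.release(); }));
//   return () => bag.removeAll();
```

Returning `bag.removeAll` from a `useEffect` cleanup is one way to make sure handlers are removed when the screen unmounts.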
4. Create a Decoding Graph
A decoding graph defines what the recognizer can recognize. Build one from a list of phrases:
const phrases = ['YES', 'NO', 'START', 'STOP', 'HELLO', 'GOODBYE'];
await KeenASR.createDecodingGraphFromPhrases('myGraph', phrases);
The optional options argument lets you tune the graph for a specific task and pass custom pronunciations:
await KeenASR.createDecodingGraphFromPhrases('reading', readingPhrases, {
  task: SpeakingTask.OralReading,
  spokenNoiseProbability: 0.3, // be more lenient with accented speech
});
Decoding graphs are persisted on disk and survive across app launches. Use decodingGraphExists to skip rebuilding when the inputs have not changed:
if (!(await KeenASR.decodingGraphExists('myGraph'))) {
  await KeenASR.createDecodingGraphFromPhrases('myGraph', phrases);
}
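The existence check only tells you that a graph with that name is on disk, not that it was built from the current phrases. One possible way to detect changed inputs is to fold a fingerprint of the phrase list into the graph name, as sketched below. `fingerprint` and `versionedGraphName` are hypothetical helpers; the sketch assumes graph names may contain an underscore and hexadecimal digits, which this guide does not confirm.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units, returned as 8 hex digits.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ('00000000' + (hash >>> 0).toString(16)).slice(-8);
}

// Hypothetical helper: the name changes whenever the phrase list changes.
function versionedGraphName(baseName: string, phrases: readonly string[]): string {
  return baseName + '_' + fingerprint(phrases.join('\n'));
}

// Possible usage with the calls shown above:
//   const name = versionedGraphName('myGraph', phrases);
//   if (!(await KeenASR.decodingGraphExists(name))) {
//     await KeenASR.createDecodingGraphFromPhrases(name, phrases);
//   }
```

With this approach graphs built from older phrase lists stay on disk under their old names, and any options passed to the graph builder would need to be folded into the fingerprint as well.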
5. Prepare the Recognizer
Load the decoding graph into the recognizer. Pass true as the second argument (computeGoP) if you want phoneme-level pronunciation scores in the final response:
await KeenASR.prepareForListening('myGraph', /* computeGoP */ false);
// Configure VAD: how long to wait after the user stops speaking before finalizing
await KeenASR.setVADParameter(VadParameter.TimeoutEndSilenceForGoodMatch, 1.0);
await KeenASR.setVADParameter(VadParameter.TimeoutEndSilenceForAnyMatch, 1.0);
6. Start and Stop Listening
await KeenASR.startListening();
While the recognizer is listening, onPartialResult fires every 100-200ms with interim text. The recognizer automatically stops when a VAD rule triggers (for example, end-of-silence timeout after a good match) and fires onFinalResponse with the final result.
To stop recognition immediately (for example, when the user navigates away from the screen), call stopListening(). This cancels recognition and does not produce a final response:
await KeenASR.stopListening();
If you need a final response, do not call stopListening(). Instead, shorten the VAD timeouts (for example, set both EndSilence thresholds to 0.2, or TimeoutMaxDuration to 0) so the recognizer stops naturally and emits onFinalResponse.
7. Handle the Final Response
onFinalResponse delivers an ASRResponse object with the recognized text, per-word timing, optional phoneme-level GoP scores, and audio quality metrics. The response holds a reference to native memory; you must call release() when done.
const unsubFinal = onFinalResponse((response) => {
  const result = response.asrResult;
  console.log('Recognized:', result.cleanText);
  for (const word of result.words) {
    console.log(`${word.text} start=${word.startTime}s duration=${word.duration}s`);
    if (word.phones) {
      for (const phone of word.phones) {
        console.log(`  ${phone.text} score=${phone.pronunciationScore}`);
      }
    }
  }
  // Audio quality metrics
  const aq = response.audioQualityResult;
  if (aq.snrValue < 10) {
    console.warn('Noisy environment detected');
  }
  // Always release when done
  response.release();
});
The response also exposes:
- response.saveAudioFile(directoryPath): save the recorded audio (WAV) to disk
- response.saveJsonFile(directoryPath): save the response JSON to disk
- response.setCustomJson(jsonString): attach app-level metadata (user, lesson, score, etc.) that is merged into saved/uploaded JSON
- response.queueForUpload(): queue the response for upload to the KeenASR Dashboard
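Since the response must be released even when the handler fails partway through, it can help to centralize the call in a `try`/`finally` wrapper. The sketch below is one way to do that. `withRelease` and the `Releasable` interface are hypothetical names, and the sketch assumes the subscriber accepts a handler that returns a Promise.

```typescript
// Anything that holds native memory and exposes release(), such as the response object.
interface Releasable {
  release(): void;
}

// Hypothetical wrapper: runs the handler, then releases the response
// whether the handler completed, threw, or returned a rejected Promise.
function withRelease<T extends Releasable>(
  handler: (response: T) => void | Promise<void>,
): (response: T) => Promise<void> {
  return async (response: T): Promise<void> => {
    try {
      await handler(response);
    } finally {
      response.release();
    }
  };
}

// Possible usage:
//   const unsubFinal = onFinalResponse(withRelease(async (response) => {
//     console.log('Recognized:', response.asrResult.cleanText);
//   }));
```

A handler wrapped this way should not call `release()` itself, since this guide does not say whether releasing twice is safe.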
8. Switching Decoding Graphs
Create multiple graphs and switch between them when the recognizer is not listening:
await KeenASR.createDecodingGraphFromPhrases('colors', ['RED', 'BLUE', 'GREEN']);
// Switch to the colors graph (recognizer must not be listening)
await KeenASR.prepareForListening('colors');
9. Contextual Decoding Graphs
A contextual graph contains multiple phrase sets (“contexts”) that can be activated at runtime without rebuilding. This is useful for multi-page reading apps or multi-step workflows:
await KeenASR.createContextualDecodingGraphFromPhrases('reading', [
  ['Alice was beginning to get very tired'], // context 0
  ['And what is the use of a book'], // context 1
  ['When suddenly a White Rabbit appeared'], // context 2
]);
// Prepare with a specific context
await KeenASR.prepareForListeningWithContextualDecodingGraph('reading', 0);
// Later, switch contexts (recognizer must not be listening)
await KeenASR.prepareForListeningWithContextualDecodingGraph('reading', 1);
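Switching contexts has two preconditions that are easy to miss in a page-turning UI: the index must refer to an existing context, and the recognizer must not be listening. The sketch below wraps both checks. `switchContext` and the `ContextualRecognizer` interface are hypothetical; the interface lists only the two methods used in this guide, and the listening state value is passed in rather than assumed.

```typescript
// Only the two recognizer methods this sketch needs; S is the state type.
interface ContextualRecognizer<S> {
  getRecognizerState(): Promise<S>;
  prepareForListeningWithContextualDecodingGraph(
    graphName: string,
    contextIndex: number,
  ): Promise<unknown>;
}

// Hypothetical helper: resolves to true if the context was switched,
// false if the recognizer was listening and nothing was changed.
async function switchContext<S>(
  recognizer: ContextualRecognizer<S>,
  listeningState: S,
  graphName: string,
  contextIndex: number,
  contextCount: number,
): Promise<boolean> {
  if (!Number.isInteger(contextIndex) || contextIndex < 0 || contextIndex >= contextCount) {
    throw new RangeError(`context ${contextIndex} is outside 0..${contextCount - 1}`);
  }
  if ((await recognizer.getRecognizerState()) === listeningState) {
    return false;
  }
  await recognizer.prepareForListeningWithContextualDecodingGraph(graphName, contextIndex);
  return true;
}
```

For the three-context graph above, a call might look like `switchContext(KeenASR, RecognizerState.Listening, 'reading', 1, 3)`, assuming the default export matches the interface.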
10. Alternative Pronunciations
For reading assessment, language learning, or made-up words, pass WordPronunciation entries when creating a decoding graph. When a tagged pronunciation is matched, the recognized word appears with its tag suffix (for example, PEAK#WRONG):
const altPronunciations = [
  new WordPronunciation('PEAK', 'P IH0 K', 'WRONG'),
  new WordPronunciation('zorblax', 'Z AO R B L AE K S'),
];
// Optionally validate before passing to the graph builder
for (const wp of altPronunciations) {
  if (!(await wp.isValid())) {
    console.warn('Invalid pronunciation:', wp.word, wp.pronunciation);
  }
}
await KeenASR.createDecodingGraphFromPhrases('reading', readingPhrases, {
  alternativePronunciations: altPronunciations,
});
Phones must come from the ASR bundle’s lang/phones.txt. See Alternative Word Pronunciations in the EdTech use case for guidance on when (and when not) to model mispronunciations.
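The validation loop above only logs invalid entries; they are still passed to the graph builder. If you would rather drop them, a small partitioning helper is one option. The sketch below assumes nothing beyond the `isValid()` method shown above; `partitionByValidity` and the `Validatable` interface are hypothetical names.

```typescript
// Anything with an asynchronous validity check, such as a pronunciation entry.
interface Validatable {
  isValid(): Promise<boolean>;
}

// Hypothetical helper: checks all entries concurrently and splits them
// into valid and invalid lists, preserving the original order.
async function partitionByValidity<T extends Validatable>(
  entries: readonly T[],
): Promise<{ valid: T[]; invalid: T[] }> {
  const flags = await Promise.all(entries.map((entry) => entry.isValid()));
  const valid: T[] = [];
  const invalid: T[] = [];
  entries.forEach((entry, index) => {
    (flags[index] ? valid : invalid).push(entry);
  });
  return { valid, invalid };
}
```

The `valid` list could then be passed as `alternativePronunciations`, while the `invalid` list is logged or reported to whoever maintains the pronunciations.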
11. Text Alignment
For oral reading apps, the plugin exposes TextAligner for comparing recognized text against a reference passage. See Text Alignment for the conceptual overview and the EdTech use case for an applied example.
const aligner = await TextAligner.create(
  'Alice was beginning to get very tired',
  'en-us'
);
// On each partial result
const unsubPartial = onPartialResult(async (text) => {
  const alignment = await aligner.incrementalAlign(text);
  highlightWordsRead(alignment.matchedRefIndices);
});
// On the final response, call incrementalAlign one more time
const unsubFinal = onFinalResponse(async (response) => {
  const final = await aligner.incrementalAlign(response.asrResult.text);
  console.log('Accuracy:', final.matches / final.refLength);
  await aligner.reset(); // ready for another attempt at the same passage
  response.release();
});
// When you are done with this passage, release the aligner
await aligner.close();
Use TextAligner.createFromRecognizer(reference) to source the language code from the currently initialized recognizer’s ASR bundle automatically.
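Partial results arrive frequently and the handlers above are asynchronous, so a new `incrementalAlign` call could in principle start before the previous one has finished. This guide does not say whether overlapping calls are safe. If you want to rule them out, one conservative option is to run the calls through a serial queue, as in the sketch below. `createSerialQueue` is a hypothetical helper, not a plugin API.

```typescript
// Hypothetical helper: returns an enqueue function that runs async tasks
// one at a time, in the order they were enqueued.
function createSerialQueue(): <T>(task: () => Promise<T>) => Promise<T> {
  let tail: Promise<void> = Promise.resolve();
  return function enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = tail.then(() => task());
    // Keep the chain usable even if this task rejects.
    tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  };
}

// Possible usage with the aligner from the snippet above:
//   const enqueue = createSerialQueue();
//   onPartialResult((text) => {
//     enqueue(() => aligner.incrementalAlign(text)).then((alignment) => {
//       highlightWordsRead(alignment.matchedRefIndices);
//     });
//   });
```

The trade-off is latency: if alignment is slower than the rate of partial results, queued calls accumulate, so a real app might prefer to skip stale partials instead.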
12. Dashboard Integration
To upload recognition responses to the KeenASR Dashboard for analysis, start the uploader once after initialization and queue responses in the final-response handler:
// Start once after initialize
await KeenASR.startDataUploader('YOUR_APP_KEY');
onFinalResponse((response) => {
  // ... process result ...
  response.queueForUpload();
  response.release();
});
// Optional: pause/resume or stop the uploader
await KeenASR.pauseUploader();
await KeenASR.resumeUploader();
await KeenASR.stopUploader();
You can also attach app-level metadata to the response JSON before queuing for upload:
onFinalResponse(async (response) => {
  await response.setCustomJson(JSON.stringify({
    userId: 'user-123',
    lessonId: 'lesson-42',
    score: 0.87,
  }));
  await response.queueForUpload();
  response.release();
});
13. Teardown
When the SDK is no longer needed, release native resources. The recognizer must not be listening when teardown() is called; the snippet below covers the typical states:
const state = await KeenASR.getRecognizerState();
if (state === RecognizerState.Listening) {
  await KeenASR.stopListening();
  await new Promise((r) => setTimeout(r, 200)); // let the audio thread settle
} else if (state === RecognizerState.FinalProcessing) {
  await new Promise((r) => setTimeout(r, 200));
}
await KeenASR.teardown();
// Unsubscribe any active listeners
unsubPartial();
unsubFinal();
unsubInterruptStarted();
unsubInterruptEnded();
After teardown you can re-initialize with KeenASR.initialize(...) if needed.
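The teardown snippet waits a fixed 200 ms for the recognizer to settle. An alternative, sketched below, is to poll the recognizer state until it reaches a settled value or a timeout expires. `waitForState` is a hypothetical helper, and the default interval and timeout are arbitrary values chosen for illustration, not recommendations from the SDK.

```typescript
// Hypothetical helper: polls getState until isSettled returns true.
// Resolves to true when settled, false if the timeout expires first.
async function waitForState<S>(
  getState: () => Promise<S>,
  isSettled: (state: S) => boolean,
  intervalMs = 50,
  timeoutMs = 2000,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (isSettled(await getState())) {
      return true;
    }
    if (Date.now() >= deadline) {
      return false;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Possible usage before teardown():
//   const settled = await waitForState(
//     () => KeenASR.getRecognizerState(),
//     (s) => s !== RecognizerState.Listening && s !== RecognizerState.FinalProcessing,
//   );
```

If the helper resolves to `false`, the recognizer is still busy, and the app has to decide whether to retry or skip teardown for now.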
Tips
- VAD tuning: lower both EndSilence VAD thresholds for faster finalization at the cost of potentially cutting off the user. Higher values wait longer for additional speech. See VAD Thresholds in the EdTech use case for guidance.
- Audio interrupts: handle onAudioInterruptStarted/onAudioInterruptEnded so your UI reflects when phone calls, Siri, or backgrounding pause recognition.
- Input level: getInputLevel() returns the current microphone level in dB (or null when not initialized). Useful for driving UI elements that indicate speech activity (for example, a pulsing mic icon or an input meter).
- Metro hot-reload: the native singleton survives a JS-bundle reload. If initialize() fails after a reload, restart the app (rebuild on iOS, adb shell am force-stop <pkg> on Android). Proper recovery is on the roadmap.
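For the input-level tip above, a UI meter usually needs a value between 0 and 1 rather than raw decibels. The sketch below shows one possible mapping. `levelToMeter` is a hypothetical helper, and the -60 dB floor and 0 dB ceiling are assumptions; this guide does not state the range that getInputLevel() reports, so adjust them to what you observe.

```typescript
// Hypothetical helper: maps a dB level (or null) to a 0..1 meter value.
// floorDb and ceilingDb are assumed bounds, not values from the SDK.
function levelToMeter(levelDb: number | null, floorDb = -60, ceilingDb = 0): number {
  if (levelDb === null || Number.isNaN(levelDb)) {
    return 0; // not initialized, or no usable reading
  }
  const fraction = (levelDb - floorDb) / (ceilingDb - floorDb);
  // Clamp so readings outside the assumed range do not overflow the meter.
  return Math.min(1, Math.max(0, fraction));
}

// Possible usage on a timer while listening:
//   const meter = levelToMeter(await KeenASR.getInputLevel());
```

A linear mapping in dB is already logarithmic in amplitude, which tends to look natural for a speech meter.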
Complete Example
The plugin tarball ships with two ready-to-run apps:
- example/: a minimal Start/Stop integration showing the recognition flow end-to-end
- edtech-poc/: an EdTech POC demonstrating pronunciation scoring and oral reading with TextAligner
Run them with scripts/run-example.sh as described in Installation.
