keenasr-web SDK Quick Start

After you have installed the KeenASR SDK for Web library, you can start using it in your web app by following these steps:

1. Import the Library

Import the keenasr-web.js file wherever you want to use the SDK.

import KeenASR from "path-to-sdk/keenasr-web.js";

3. Initialize the SDK

Initialize the SDK using the URL of the ASR Bundle. This process runs asynchronously and may take a few seconds. During this process the ASR Bundle will be downloaded and installed in the local file system.

await KeenASR.initialize({
    asrBundleURL: 'https://FULLPATH/keenAK3m-nnet3chain-en-us.tgz',
    onCoreReady() {
       // wasm is prepared/loaded (we can now call static methods of the SDK),
       // e.g. set the log level to see in more detail what's happening during
       // ASR engine initialization:
       // KeenASR.setLogLevel(KeenASR.LogLevel.DEBUG);
    },
    onASRBundleReady() {
       // ASR Bundle is downloaded and now ASR Engine initialization will start
    },
});
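
If you want to show initialization progress to the user, one option is to wrap the call in a small helper. The following is a minimal sketch, not part of the SDK: initializeWithStatus and the onStatus callback are hypothetical names, sdk is the imported KeenASR object, and the sketch assumes that the promise returned by initialize() settles once initialization has finished.

```javascript
// Sketch: report initialization stages through a caller-supplied callback.
// `sdk` is the imported KeenASR object; `onStatus` is a hypothetical callback
// that receives a short status string.
async function initializeWithStatus(sdk, asrBundleURL, onStatus) {
  onStatus('loading');
  await sdk.initialize({
    asrBundleURL,
    onCoreReady() {
      // wasm is loaded; static SDK methods can be called from here on
      onStatus('core-ready');
    },
    onASRBundleReady() {
      // ASR Bundle is downloaded; ASR engine initialization starts next
      onStatus('bundle-ready');
    },
  });
  onStatus('initialized');
}
```

For example, you could pass a callback that writes the status string into a page element while the bundle is downloading.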

Once the recognizer is initialized, you will need to set up handlers (callback methods) for the final response and partial results.

KeenASR.onPartialResult = (result) => setPartialResultText(result.text); // here we just set partial result text 
KeenASR.onFinalResponse = ({ asrResult }) => {
   // DO SOMETHING WITH THE FINAL RESPONSE
};

At this point you can also further configure the recognizer, for example by setting different VAD parameters.

4. Create the Decoding Graph

The decoding graph combines the language model with all other recognition resources (acoustic models, lexicon) and provides the recognizer with a data structure that simplifies the decoding process. You can build the decoding graph dynamically from within your web app by providing a list of sentences/phrases users are likely to say. In this case, the SDK will first build the n-gram language model and then create the decoding graph. This functionality is provided through the createDecodingGraphFromPhrases method:

const phrases = ["Once upon a time there was an old mother pig who had three little pigs and not enough food to feed them.", "This is just another phrase"]
await KeenASR.createDecodingGraphFromPhrases('readinggraph', phrases);

This method also accepts an optional config parameter, which allows you to further customize the graph.

This example code creates a decoding graph called readinggraph and saves the graph in the local file system. Later on you can refer to this decoding graph by its name. You typically create the decoding graph only once and re-create it only when you know that the data used to build it may have changed. You can use the decodingGraphWithNameExists method to check whether a graph with the given name already exists; this way you don't need to create the decoding graph on every page load.

if (!KeenASR.decodingGraphWithNameExists('readinggraph')) {
  await KeenASR.createDecodingGraphFromPhrases('readinggraph', phrases);
}
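
The check above does not notice when the phrase list itself changes. One possible approach, sketched here under assumptions, is to derive the graph name from a fingerprint of the phrases, so that a changed list leads to a new graph. The helpers fingerprint, graphNameFor, and ensureDecodingGraph are hypothetical and not part of the SDK; sdk is the imported KeenASR object. The sketch also assumes that hexadecimal characters and hyphens are acceptable in graph names, and it does not remove graphs built from older phrase lists.

```javascript
// Sketch: FNV-1a-style 32-bit hash over the code points of the joined phrases.
function fingerprint(phrases) {
  let hash = 0x811c9dc5;
  for (const ch of phrases.join('\n')) {
    hash ^= ch.codePointAt(0);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical naming scheme: base name plus phrase fingerprint.
function graphNameFor(baseName, phrases) {
  return `${baseName}-${fingerprint(phrases)}`;
}

// Creates the graph only if no graph exists for this exact phrase list.
// Returns the graph name to use in later SDK calls.
async function ensureDecodingGraph(sdk, baseName, phrases) {
  const name = graphNameFor(baseName, phrases);
  if (!sdk.decodingGraphWithNameExists(name)) {
    await sdk.createDecodingGraphFromPhrases(name, phrases);
  }
  return name;
}
```

With this scheme, the name returned by ensureDecodingGraph is the one to use when you later refer to the graph.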

Warning: If the phrases used to build a decoding graph contain words that are not in the lexicon (ASRBUNDLE/lang/lexicon.txt), their phonetic transcriptions will be assigned algorithmically. This method is not perfect for English, and wrong phonetic transcriptions will affect recognition performance, so it might have unwanted effects if you are aiming to recognize unusual words. We periodically release updates to the ASR Bundle, which include an updated lexicon. Contact us if you need help.
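
If you have the lexicon's word list available at development time, you could flag words that would receive an algorithmic transcription before building the graph. This is a rough sketch under assumptions: findOutOfLexiconWords is a hypothetical helper, lexiconWords is a list of words you have extracted from the lexicon yourself, and the lowercase, letters-and-apostrophes tokenization used here may differ from how the SDK normalizes text.

```javascript
// Sketch: list words in `phrases` that are absent from `lexiconWords`.
// Comparison is case-insensitive; tokens are runs of letters and apostrophes.
function findOutOfLexiconWords(phrases, lexiconWords) {
  const known = new Set([...lexiconWords].map((w) => w.toLowerCase()));
  const missing = new Set();
  for (const phrase of phrases) {
    const tokens = phrase.toLowerCase().match(/[\p{L}']+/gu) || [];
    for (const token of tokens) {
      if (!known.has(token)) missing.add(token);
    }
  }
  return [...missing];
}
```

Words reported by such a check are candidates to review or to raise with support.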

5. Prepare to Listen

Before starting to listen you will need to tell the SDK which decoding graph to use, by calling the prepareForListeningWithDecodingGraphWithName(graphName) method. For a given decoding graph you do this only once before you start listening.

if (!KeenASR.prepareForListeningWithDecodingGraphWithName('readinggraph', false))
  throw new Error("SDK is not prepared for listening");

6. Start/Stop Listening

To start capturing audio from the device microphone and decoding it, call the SDK’s startListening() method. While you can stop the device listening by explicitly calling the stopListening() method, we highly recommend you rely on Voice Activity Detection and let the recognizer stop automatically when one of the VAD rules is triggered.

While the recognizer is listening, it periodically (every 100-200 ms) calls the onPartialResult event handler if there are partial recognition results and they differ from the most recent partial result.

The recognizer automatically stops listening when one of the Voice Activity Detection (VAD) rules is triggered. You can control VAD configuration parameters through the setVADParameter() method. When the recognizer stops listening due to VAD triggering, it will call the onFinalResponse event handler.

Refer to the KeenASR.VADParameter constants for information on different VAD settings.
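
The listening flow described in this step can be wrapped in a promise so that calling code can await the final response. This is a minimal sketch, not an SDK feature: listenOnce and onPartialText are hypothetical names, sdk is the imported KeenASR object, and the sketch assumes the SDK has already been prepared for listening with a decoding graph.

```javascript
// Sketch: start listening and resolve with the final response once the
// recognizer stops (for example, when a VAD rule is triggered).
function listenOnce(sdk, onPartialText) {
  return new Promise((resolve) => {
    // Forward the text of each partial result to the caller.
    sdk.onPartialResult = (result) => onPartialText(result.text);
    // Resolve with whatever object the SDK passes to the final handler.
    sdk.onFinalResponse = (response) => resolve(response);
    sdk.startListening();
  });
}
```

A caller could then write const { asrResult } = await listenOnce(KeenASR, showText); where showText is your own function for displaying partial text.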

7. Switching Decoding Graphs

If your app needs to support multiple decoding graphs, you can build them dynamically. At any time while the recognizer is not listening, you can call one of the prepareForListening methods to load a different decoding graph.
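
Because a graph only needs to be prepared once before listening, an app that switches graphs may want to remember which one is active. The following sketch shows one way to do that; createGraphSwitcher and switchTo are hypothetical names, sdk is the imported KeenASR object, and the caller is responsible for switching only while the recognizer is not listening.

```javascript
// Sketch: remember the active decoding graph and skip redundant prepare calls.
function createGraphSwitcher(sdk) {
  let activeGraph = null;
  return function switchTo(graphName) {
    if (graphName === activeGraph) {
      return false; // already prepared, nothing to do
    }
    // The second argument mirrors the call shown in step 5; its meaning
    // is not covered in this quick start.
    if (!sdk.prepareForListeningWithDecodingGraphWithName(graphName, false)) {
      throw new Error(`SDK is not prepared for listening with graph "${graphName}"`);
    }
    activeGraph = graphName;
    return true; // a different graph was loaded
  };
}
```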

8. Other

When a web page that uses the KeenASR SDK for Web loads, the browser will ask the user to allow the page to access the microphone. You might want to prime the user and explain in simple language why this is required before you initialize the SDK (the web browser's explanation might be sparse and confusing for the user).
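
One way to prime the user is to defer initialization until they have acknowledged your own explanation. The sketch below is illustrative only: initializeAfterPriming, showExplanation, and initialize are hypothetical names. It assumes showExplanation displays your message and resolves to true when the user agrees to continue, and that initialize wraps your KeenASR.initialize call.

```javascript
// Sketch: show an explanation first, and initialize only if the user agrees.
async function initializeAfterPriming(showExplanation, initialize) {
  const accepted = await showExplanation();
  if (!accepted) {
    return false; // user declined; the SDK is not initialized
  }
  await initialize();
  return true;
}
```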

For information on how to specify what the recognizer is listening for (decoding graph), refer to Decoding Graphs and Acoustic Models.

You can also review the oral reading demo and view its source code.

Warning: The SDK trial version includes all supported functionality, but it runs for only 15 minutes at a time; after 15 minutes, the SDK 'crashes' the app. For commercial licensing inquiries, please get in touch.