Decoding Graphs Document

The KeenASR framework matches audio from the microphone or an audio file to the words specified in the language model. A decoding graph combines the language model with all the other resources (acoustic models, lexicon) in a way that simplifies the decoding process.

The KeenASR framework supports programmatic creation of decoding graphs, either from a set of phrases or words that users are likely to say, or from an ARPA language model file that you built in your development sandbox or obtained elsewhere. In either case, if the number of words (more precisely, n-grams) is large, creating the decoding graph on a mobile device may exhaust memory or take too long, especially on devices with slower CPUs and less than 1 GB of RAM. In such cases we recommend that you create the decoding graph in your development sandbox and then bundle it with your app. (We will soon release a Mac OS X command-line tool that will allow you to build decoding graphs in your sandbox; contact us if you’d like to beta test this tool.)
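As a rough way to gauge how large a language model is before deciding where to build the decoding graph, the sketch below reads the n-gram counts from the \data\ header of an ARPA file. It runs in the development sandbox and is not part of the KeenASR SDK. The function names and the MAX_NGRAMS_ON_DEVICE cutoff are hypothetical and chosen only for illustration; a sensible limit would have to be measured on the devices you target.

```python
def count_arpa_ngrams(lines):
    """Return {order: count} parsed from the data header of an ARPA LM."""
    counts = {}
    in_header = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            in_header = True
            continue
        if not in_header:
            continue
        if line.startswith("ngram "):
            # Header lines look like "ngram 2=3" (order=count).
            order, _, count = line[len("ngram "):].partition("=")
            counts[int(order)] = int(count)
        elif line.startswith("\\"):
            # The first section marker after the header (e.g. \1-grams:) ends it.
            break
    return counts


# Hypothetical cutoff for illustration only; tune it on real target devices.
MAX_NGRAMS_ON_DEVICE = 50_000


def should_prebuild(counts, max_on_device=MAX_NGRAMS_ON_DEVICE):
    """True if the total n-gram count suggests building the graph off-device."""
    return sum(counts.values()) > max_on_device


# Tiny ARPA excerpt: header plus the start of the 1-gram section.
SAMPLE = """\\data\\
ngram 1=4
ngram 2=3

\\1-grams:
"""

if __name__ == "__main__":
    counts = count_arpa_ngrams(SAMPLE.splitlines())
    print(counts, should_prebuild(counts))
```

For a real model, pass an open file handle instead of SAMPLE.splitlines(); only the header is read, so the check stays fast even for large files.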

If your app creates custom decoding graphs, you will need to include the full ASR Bundle with your app (there will be a lang/ subdirectory with several files within the ASR Bundle directory). If you ship your app with prebuilt decoding graphs, you can remove the lang/ subdirectory from the ASR Bundle to make your app smaller.
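If you ship prebuilt decoding graphs, removing lang/ can be scripted as part of your build. The sketch below copies an ASR Bundle directory to a new location while skipping the top-level lang/ subdirectory, leaving the original bundle untouched. The function name and the paths in the usage line are hypothetical; only the lang/ directory name comes from the description above.

```python
import shutil
from pathlib import Path


def copy_bundle_without_lang(src, dst):
    """Copy the bundle at src to dst, omitting the top-level lang/ directory.

    dst must not exist yet (shutil.copytree creates it).
    """
    src = Path(src).resolve()

    def skip_top_level_lang(directory, names):
        # Only drop lang/ directly under the bundle root, not deeper matches.
        if Path(directory).resolve() == src and "lang" in names:
            return ["lang"]
        return []

    shutil.copytree(src, dst, ignore=skip_top_level_lang)
    return Path(dst)
```

A call such as copy_bundle_without_lang("path/to/asr-bundle", "build/asr-bundle-slim") (placeholder paths) would then produce the trimmed copy to add to the app.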

An ASR Bundle contains an acoustic model, a lexicon, and various configuration files. ASR Bundles are specific to both the language and the recognizer type. They are typically trained on a large number of audio files with their corresponding transcripts. If necessary, we can train custom acoustic models (e.g. for children, non-native speakers, or adverse acoustic environments). Deep Neural Network acoustic models are more compute- and memory-intensive than GMM acoustic models, but they provide much better accuracy and robustness.

NOTE: Our proof-of-concept app on GitHub includes both gmm and nnet2 ASR Bundles; your app only needs to include the bundle that matches the recognizer type it uses.

NOTE: We have several ASR Bundles that work better for large vocabulary tasks and generally have a smaller memory footprint and lower CPU utilization. Contact us for details.