com.keenresearch.keenasr.KASRDecodingGraph

public class KASRDecodingGraph extends Object

The KASRDecodingGraph class manages decoding graphs in the filesystem. For more details on the concept of decoding graphs in automated speech recognition, see this link.

For ASR tasks in which the domain and vocabulary are defined ahead of time and do not depend on information available only at runtime, we recommend creating the decoding graph offline and packaging it in the ASR bundle directory.

If user-specific information is necessary to create decoding graphs, you can use various KASRDecodingGraph class methods to create decoding graphs dynamically; these graphs are saved in the filesystem on the device. Typically, you will provide a list of sentences/phrases to the createDecodingGraphFromPhrases(String[], KASRRecognizer, String) method, which will then create a custom decoding graph. Later on, you can refer to the custom decoding graph by its name. Alternatively, instead of a list of sentences/phrases you can provide an ARPA language model (bundled with your app), which will be used to build a custom decoding graph.

If your app needs to support continuous listening with trigger phrase support, you will need to build the decoding graph using the createDecodingGraphFromPhrasesWithTriggerPhrase(String[], String, KASRRecognizer, String) method.

Decoding graphs can only be built dynamically if the lang/ subdirectory in the ASR bundle exists.

Note: When dynamically creating decoding graphs, any words that do not have a phonetic representation in the lexicon (ASRBUNDLE/lang/lexicon.txt) will be assigned one algorithmically. For English, the algorithmic representation is imperfect, so you should manually augment the lexicon text file with pronunciations for as many as possible of the additional words that are likely to be encountered in your app. For example, if your app performs ASR of names, you would augment the lexicon with additional names and their proper pronunciations before releasing your app.
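One way to act on the note above is to list the words in your phrases that are not yet in the lexicon, so you know which pronunciations to add by hand. The following is a minimal sketch, not part of the SDK: the LexiconCoverage class and its missingWords method are hypothetical, and the sketch assumes you have already loaded the lexicon's words into a lower-cased Set (how lexicon.txt is parsed is not covered here).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of the SDK): reports words used in phrases
// that are absent from a caller-supplied set of lower-cased lexicon words.
final class LexiconCoverage {
    private LexiconCoverage() {}

    static List<String> missingWords(String[] phrases, Set<String> lexiconWords) {
        // LinkedHashSet keeps first-seen order and drops duplicates.
        Set<String> missing = new LinkedHashSet<>();
        for (String phrase : phrases) {
            for (String word : phrase.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // empty or whitespace-only phrase
                }
                String key = word.toLowerCase(Locale.ROOT);
                if (!lexiconWords.contains(key)) {
                    missing.add(key);
                }
            }
        }
        return new ArrayList<>(missing);
    }
}
```

Words returned by this helper are the ones that would receive an algorithmic pronunciation, and are therefore candidates for manual lexicon entries.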

In the current version of the framework, creating a decoding graph can take on the order of 10-30 seconds for a medium-size vocabulary task (more than a thousand words). For larger language models we recommend that you create the decoding graph ahead of time and bundle it with your app.
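Because creation can be slow, a common pattern is to build a graph only when it does not already exist, and to do so off the calling thread. The sketch below illustrates that pattern under stated assumptions: GraphSetup is a hypothetical helper, the SDK calls are passed in as suppliers so the snippet compiles without the SDK, and whether the SDK methods may be called from a background thread is an assumption you should verify.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the SDK.
final class GraphSetup {
    private GraphSetup() {}

    // Returns true if the graph already exists or was created successfully.
    // The create supplier is only invoked when exists returns false.
    static boolean ensureGraph(BooleanSupplier exists, BooleanSupplier create) {
        return exists.getAsBoolean() || create.getAsBoolean();
    }

    // Same check, run asynchronously because creation may take many seconds.
    static CompletableFuture<Boolean> ensureGraphAsync(BooleanSupplier exists,
                                                       BooleanSupplier create) {
        return CompletableFuture.supplyAsync(() -> ensureGraph(exists, create));
    }

    // Intended wiring, using the signatures documented on this page:
    //   GraphSetup.ensureGraphAsync(
    //       () -> KASRDecodingGraph.decodingGraphWithNameExists(dgName, recognizer),
    //       () -> KASRDecodingGraph.createDecodingGraphFromPhrases(phrases, recognizer, dgName));
}
```

Passing the SDK calls as suppliers keeps the create-if-missing logic testable without a recognizer instance.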

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static enum

KASRDecodingGraph.KASRSpeakingTask

KASRSpeakingTask defines a type of interaction, which might be useful when building a decoding graph, so that the way the decoding graph is built can be fine-tuned to the task.
Constructor Summary

Constructors

Constructor

Description

KASRDecodingGraph()
Method Summary

Modifier and Type

Method

Description

static boolean

createContextualDecodingGraphFromPhrases(ArrayList<ArrayList<String>> contextualPhrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)

Create a custom decoding graph from per-context lists of sentences/phrases, for a specific task, using the provided alternative pronunciations, and save it in the filesystem for later use.

static boolean

createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, String dgName)

Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use.

static boolean

createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)

Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided alternative pronunciations, and save it in the filesystem for later use.

static boolean

createDecodingGraphFromPhrasesWithTriggerPhrase(String[] phrases, String triggerPhrase, KASRRecognizer recognizer, String dgName)

Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use.

static boolean

createDecodingGraphFromSentences(String[] sentences, KASRRecognizer recognizer, String dgName)

Deprecated.
Use createDecodingGraphFromPhrases(String[], KASRRecognizer, String)

static boolean

createDecodingGraphFromSentencesWithTriggerPhrase(String[] sentences, String triggerPhrase, KASRRecognizer recognizer, String dgName)

Deprecated.
Please see createDecodingGraphFromPhrasesWithTriggerPhrase(String[], String, KASRRecognizer, String) method.

static boolean

decodingGraphExistsAtPath(String dgPath)

Returns true if a valid decoding graph exists at the given absolute filepath.

static boolean

decodingGraphWithNameExists(String dgName, KASRRecognizer recognizer)

Returns true if a valid custom decoding graph with the given name exists in the filesystem.

static Date

getDecodingGraphCreationDate(String dgName, KASRRecognizer recognizer)

Returns date when custom decoding graph was created.

static boolean

isValidPronunciation(String pronunciation, KASRRecognizer recognizer)

Verify if pronunciation specified in the input string is composed of valid phones that are supported for the given recognizer.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- KASRDecodingGraph
  
  public KASRDecodingGraph()
Method Details
- createDecodingGraphFromPhrases
  
  public static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, String dgName)
  
  Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework.
  
  Parameters:
  
  phrases - an array of String objects that specify the phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in phrases should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName) under context.getApplicationInfo().dataDir.
  
  Returns:
  
  True on success, false otherwise.
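Since phrases are expected to be normalized, it can help to flag obviously unnormalized input before building a graph. The following is a rough sketch only: PhraseChecks is a hypothetical helper, and its rule (letters, apostrophes, and spaces only) is an assumed heuristic rather than the SDK's definition of normalized text. It will, for instance, also flag hyphenated words.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-check, not part of the SDK. The rule used here is a
// heuristic: a phrase passes if it contains only letters, apostrophes,
// and spaces, so digits and symbols such as '$' are rejected.
final class PhraseChecks {
    private PhraseChecks() {}

    static boolean looksNormalized(String phrase) {
        if (phrase == null || phrase.trim().isEmpty()) {
            return false;
        }
        for (int i = 0; i < phrase.length(); i++) {
            char c = phrase.charAt(i);
            if (!(Character.isLetter(c) || c == '\'' || c == ' ')) {
                return false;
            }
        }
        return true;
    }

    // Returns the phrases that fail the heuristic, in their original order.
    static List<String> unnormalized(String[] phrases) {
        List<String> flagged = new ArrayList<>();
        for (String phrase : phrases) {
            if (!looksNormalized(phrase)) {
                flagged.add(phrase);
            }
        }
        return flagged;
    }
}
```

A phrase such as "pay $200" is flagged, while "two hundred dollars" passes.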
- createDecodingGraphFromPhrases
  
  public static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
  
  Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided alternative pronunciations, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework.
  
  Parameters:
  
  phrases - an array of String objects that specify the phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in phrases should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  alternativePronunciations - An array of KASRAlternativePronunciation objects specifying alternative pronunciation for the words, and (optional) tags that can be used to identify those pronunciations. If recognized, these words will be reported in partial/final result with #tag tag appended to the word.
  
  task - one of KASRDecodingGraph.KASRSpeakingTask specifying a type of interaction.
  
  spokenNoiseProbability - a value between 0 and 1 that determines how likely <SPOKEN_NOISE> background word will be. Default value is 0.5. Setting this value to 0 means that <SPOKEN_NOISE> will practically not appear in the result regardless of what was said. Setting it to 1.0 means that even slightly mispronounced words might be mapped to <SPOKEN_NOISE>.
  
  dgName - a name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME. NOTE: decoding graph name (dgName input parameter) cannot contain - characters. We recommend it only contains alphanumeric characters.
  
  Returns:
  
  True on success, false otherwise.
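The constraints on dgName and spokenNoiseProbability described above can be checked before calling the SDK. This is a minimal sketch with hypothetical names (GraphParams and its methods are not part of the SDK). It follows the recommendation to use only alphanumeric characters in the name, and, as an assumption of this sketch, maps a NaN probability to the documented default of 0.5.

```java
// Hypothetical parameter checks, not part of the SDK.
final class GraphParams {
    private GraphParams() {}

    // Follows the recommendation above: alphanumeric characters only,
    // which also rules out the forbidden '-' character.
    static boolean isValidGraphName(String dgName) {
        return dgName != null && dgName.matches("[A-Za-z0-9]+");
    }

    // Keeps the value inside the documented [0, 1] range.
    // Assumption of this sketch: NaN falls back to the documented default 0.5.
    static float clampSpokenNoiseProbability(float p) {
        if (Float.isNaN(p)) {
            return 0.5f;
        }
        return Math.max(0f, Math.min(1f, p));
    }
}
```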
- createContextualDecodingGraphFromPhrases
  
  public static boolean createContextualDecodingGraphFromPhrases(ArrayList<ArrayList<String>> contextualPhrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
  
  Create a custom decoding graph from per-context lists of sentences/phrases, for a specific task, using the provided alternative pronunciations, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework.
  
  Parameters:
  
  contextualPhrases - an ArrayList of ArrayList of String objects that specify the per-context phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in phrases should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  alternativePronunciations - an ArrayList of KASRAlternativePronunciation objects specifying alternative pronunciations for the words, and (optional) tags that can be used to identify those pronunciations. If recognized, these words will be reported in the partial/final result with #tag appended to the word.
  
  task - one of KASRDecodingGraph.KASRSpeakingTask specifying a type of interaction.
  
  spokenNoiseProbability - a value between 0 and 1 that determines how likely the <SPOKEN_NOISE> background word will be. Default value is 0.5. Setting this value to 0 means that <SPOKEN_NOISE> will practically not appear in the result regardless of what was said. Setting it to 1.0 means that even slightly mispronounced words might be mapped to <SPOKEN_NOISE>.
  
  dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName) under context.getApplicationInfo().dataDir.
  
  Returns:
  
  True on success, false otherwise.
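The contextualPhrases parameter is a list of phrase lists, one inner list per context. The sketch below shows one way to assemble that structure from plain arrays; ContextualPhrases is a hypothetical helper, and the example contexts are illustrative only. How a context is selected at recognition time is not described on this page and is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical builder, not part of the SDK: turns one String[] per context
// into the ArrayList<ArrayList<String>> shape the method signature expects.
final class ContextualPhrases {
    private ContextualPhrases() {}

    static ArrayList<ArrayList<String>> of(String[]... contexts) {
        ArrayList<ArrayList<String>> result = new ArrayList<>();
        for (String[] context : contexts) {
            // Copy into a mutable ArrayList; Arrays.asList alone is fixed-size.
            result.add(new ArrayList<>(Arrays.asList(context)));
        }
        return result;
    }
}
```

For example, `ContextualPhrases.of(new String[]{"yes", "no"}, new String[]{"one", "two", "three"})` yields two contexts, the first with two phrases and the second with three.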
- createDecodingGraphFromPhrasesWithTriggerPhrase
  
  public static boolean createDecodingGraphFromPhrasesWithTriggerPhrase(String[] phrases, String triggerPhrase, KASRRecognizer recognizer, String dgName)
  
  Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework. When using decoding graphs created with trigger phrase support, upon calling the StartListening method the SDK will listen continuously until it hears the trigger phrase; only then will partial results start occurring.
  
  Parameters:
  
  phrases - an array of String objects that specify the sentences/phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in phrases should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  triggerPhrase - a String representing the trigger phrase used to initiate recognition when using this decoding graph, for example "Hey computer". When using a decoding graph with a trigger phrase, the recognizer will listen continuously until it hears the trigger phrase. No partial callback results will be provided until the trigger phrase is recognized.
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName) under context.getApplicationInfo().dataDir.
  
  Returns:
  
  True on success, false otherwise.
- decodingGraphWithNameExists
  
  public static boolean decodingGraphWithNameExists(String dgName, KASRRecognizer recognizer)
  
  Returns true if a valid custom decoding graph with the given name exists in the filesystem.
  
  Parameters:
  
  dgName - name of the custom decoding graph
  
  recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph.
  
  Returns:
  
  True if decoding graph with such name exists, false otherwise. This method will also check for existence of all the necessary files in the decoding graph directory.
- decodingGraphExistsAtPath
  
  public static boolean decodingGraphExistsAtPath(String dgPath)
  
  Returns true if a valid decoding graph exists at the given absolute filepath.
  
  Parameters:
  
  dgPath - absolute path to the decoding graph directory.
  
  Returns:
  
  True if a decoding graph exists at the given path, false otherwise. This method will also check for existence of all the necessary files in the decoding graph directory.
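For graphs created dynamically, the parameter descriptions on this page say the graph directory is named from dgName and the ASR bundle name. The sketch below composes a candidate path from those pieces. It is an assumption-laden illustration: GraphPaths is hypothetical, and placing the directory directly under dataDir is this sketch's reading of the description, so confirm the actual location on a device before relying on it.

```java
import java.io.File;

// Hypothetical helper, not part of the SDK.
final class GraphPaths {
    private GraphPaths() {}

    // Builds <dataDir>/<dgName>-<asrBundleName>.
    // Assumption: the graph directory sits directly under dataDir.
    static File expectedGraphDir(String dataDir, String dgName, String asrBundleName) {
        return new File(dataDir, dgName + "-" + asrBundleName);
    }
}
```

The absolute path of the returned File could then be passed to decodingGraphExistsAtPath(String) to check whether a valid graph is present there.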
- getDecodingGraphCreationDate
  
  public static Date getDecodingGraphCreationDate(String dgName, KASRRecognizer recognizer)
  
  Returns date when custom decoding graph was created.
  
  Parameters:
  
  dgName - name of the custom decoding graph
  
  recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph (initialized with the same ASR bundle).
  
  Returns:
  
  Date when the decoding graph was created and saved in the filesystem; null if not available.
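The creation date can be used to decide whether a stored graph is stale relative to the data it was built from. This is a minimal sketch under assumptions: GraphFreshness is a hypothetical helper, and phrasesLastChanged is a timestamp your app would have to track itself.

```java
import java.util.Date;

// Hypothetical helper, not part of the SDK.
final class GraphFreshness {
    private GraphFreshness() {}

    // graphCreated is the value returned by getDecodingGraphCreationDate,
    // which may be null when no date is available. phrasesLastChanged is an
    // app-maintained timestamp (an assumption of this sketch).
    static boolean needsRebuild(Date graphCreated, Date phrasesLastChanged) {
        if (graphCreated == null) {
            return true; // no usable graph, or its date is unknown
        }
        return graphCreated.before(phrasesLastChanged);
    }
}
```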
- isValidPronunciation
  
  public static boolean isValidPronunciation(String pronunciation, KASRRecognizer recognizer)
  
  Verify if pronunciation specified in the input string is composed of valid phones that are supported for the given recognizer. Returns true if pronunciation is valid, false otherwise.
  
  Parameters:
  
  pronunciation - string that represents the pronunciation of a word, for example "k ae t"
  
  recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph.
  
  Returns:
  
  True if pronunciation is valid, false otherwise.
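The pronunciation string in the example above is a space-separated sequence of phones. To illustrate that format, the sketch below splits such a string and reports phones missing from a caller-supplied set. PronunciationFormat and the phone set are hypothetical; the authoritative check remains isValidPronunciation, which knows the phones the given recognizer actually supports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the "k ae t" format, not part of the SDK.
final class PronunciationFormat {
    private PronunciationFormat() {}

    // Splits on whitespace and returns phones not found in knownPhones,
    // in the order they appear.
    static List<String> unknownPhones(String pronunciation, Set<String> knownPhones) {
        List<String> unknown = new ArrayList<>();
        for (String phone : pronunciation.trim().split("\\s+")) {
            if (!phone.isEmpty() && !knownPhones.contains(phone)) {
                unknown.add(phone);
            }
        }
        return unknown;
    }
}
```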
- createDecodingGraphFromSentences
  
  @Deprecated public static boolean createDecodingGraphFromSentences(String[] sentences, KASRRecognizer recognizer, String dgName)
  
  Deprecated.
  Use createDecodingGraphFromPhrases(String[], KASRRecognizer, String)
  
  Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework.
  
  Parameters:
  
  sentences - an array of String objects that specify the sentences/phrases the recognizer should listen for. These sentences are used to create an n-gram language model, from which the decoding graph is created. Text in sentences should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName) under context.getApplicationInfo().dataDir.
  
  Returns:
  
  True on success, false otherwise.
- createDecodingGraphFromSentencesWithTriggerPhrase
  
  @Deprecated public static boolean createDecodingGraphFromSentencesWithTriggerPhrase(String[] sentences, String triggerPhrase, KASRRecognizer recognizer, String dgName)
  
  Deprecated.
  Please see createDecodingGraphFromPhrasesWithTriggerPhrase(String[], String, KASRRecognizer, String) method.
  
  Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name by various methods in the framework. When using decoding graphs created with trigger phrase support, upon calling the StartListening method the SDK will listen continuously until it hears the trigger phrase; only then will partial results start occurring.
  
  Parameters:
  
  sentences - an array of String objects that specify the sentences/phrases the recognizer should listen for. These sentences are used to create an n-gram language model, from which the decoding graph is created. Text in sentences should be normalized (e.g. numbers and dates should be represented by words, so 'two hundred dollars', not '$200').
  
  triggerPhrase - a String representing the trigger phrase used to initiate recognition when using this decoding graph, for example "Hey computer". When using a decoding graph with a trigger phrase, the recognizer will listen continuously until it hears the trigger phrase. No partial callback results will be provided until the trigger phrase is recognized.
  
  recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
  
  dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName) under context.getApplicationInfo().dataDir.
  
  Returns:
  
  True on success, false otherwise.
