Class KASRDecodingGraph
For ASR tasks in which the domain and vocabulary are defined ahead of time and do not depend on information available only at runtime, we recommend that you create the decoding graph offline and package it in the ASR bundle directory.
If user-specific information is needed to create decoding graphs, you can use
the KASRDecodingGraph class methods to create them dynamically; the resulting
graphs are saved in the filesystem on the device. Typically, you pass
a list of sentences/phrases to createDecodingGraphFromPhrases(String[], KASRRecognizer, String),
which creates a custom decoding graph. Later on, you can refer
to the custom decoding graph by its name. Alternatively, instead of a list of
sentences/phrases, you can provide an ARPA language model (bundled with your app),
which will be used to build a custom decoding graph.
If your app needs to support continuous listening with trigger phrase support, you will need to
build the decoding graph using the createDecodingGraphFromPhrasesWithTriggerPhrase(String[], String, KASRRecognizer, String)
method. The older createDecodingGraphFromSentences and createDecodingGraphFromSentencesWithTriggerPhrase methods are deprecated.
Decoding graphs can only be built dynamically if the lang/ subdirectory in the ASR bundle exists.
Note: When decoding graphs are created dynamically, any word that has no phonetic representation in the lexicon (ASRBUNDLE/lang/lexicon.txt) is assigned one algorithmically. For English, the algorithmic representation is imperfect, so you should manually augment the lexicon text file with pronunciations for as many as possible of the additional words that are likely to be encountered in your app. For example, if your app recognizes names, augment the lexicon with additional names and their proper pronunciations before releasing your app.
In the current version of the framework, creating a decoding graph can take on the order of 10-30 seconds for a medium-size vocabulary task (more than a thousand words). For larger language models, we recommend that you create the decoding graph ahead of time and bundle it with your app.
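The lexicon note above implies a pre-release check: find which words in your phrases have no lexicon entry, so you know which pronunciations to add by hand. The following is a minimal sketch of that check, not part of the SDK. It assumes each lexicon line starts with the word, followed by whitespace and its phones (a common layout, but not confirmed by this page), and it compares words case-insensitively. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the KeenASR SDK.
final class LexiconCoverage {
    private LexiconCoverage() {}

    // Collects the first whitespace-separated token of every non-empty line.
    // ASSUMPTION: lexicon lines look like "<word> <phone> <phone> ...".
    static Set<String> wordsInLexicon(List<String> lexiconLines) {
        Set<String> words = new HashSet<>();
        for (String line : lexiconLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            words.add(trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT));
        }
        return words;
    }

    // Returns the words used in the phrases that have no lexicon entry,
    // lower-cased and in first-seen order.
    static Set<String> missingWords(String[] phrases, Set<String> lexiconWords) {
        Set<String> missing = new LinkedHashSet<>();
        for (String phrase : phrases) {
            for (String word : phrase.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                String key = word.toLowerCase(Locale.ROOT);
                if (!lexiconWords.contains(key)) {
                    missing.add(key);
                }
            }
        }
        return missing;
    }
}
```

The lexicon lines could be read with java.nio.file.Files.readAllLines from ASRBUNDLE/lang/lexicon.txt. Any word the check reports would otherwise receive an algorithmic pronunciation.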
-
Nested Class Summary
Modifier and Type, Class, Description
static enum KASRDecodingGraph.KASRSpeakingTask
KASRSpeakingTask defines a type of interaction, which can be used when building a decoding graph so that the way the graph is built is fine-tuned to the task.
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
static boolean createContextualDecodingGraphFromPhrases(ArrayList<ArrayList<String>> contextualPhrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided array of word mispronunciations, and save it in the filesystem for later use.
static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, String dgName)
Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use.
static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided array of word mispronunciations, and save it in the filesystem for later use.
static boolean createDecodingGraphFromPhrasesWithTriggerPhrase(String[] phrases, String triggerPhrase, KASRRecognizer recognizer, String dgName)
Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use.
static boolean createDecodingGraphFromSentences(String[] sentences, KASRRecognizer recognizer, String dgName)
Deprecated.
static boolean createDecodingGraphFromSentencesWithTriggerPhrase(String[] sentences, String triggerPhrase, KASRRecognizer recognizer, String dgName)
Deprecated.
static boolean decodingGraphExistsAtPath(String dgPath)
Returns true if a valid decoding graph exists at the given absolute filepath.
static boolean decodingGraphWithNameExists(String dgName, KASRRecognizer recognizer)
Returns true if a valid custom decoding graph with the given name exists in the filesystem.
static Date getDecodingGraphCreationDate(String dgName, KASRRecognizer recognizer)
Returns the date when the custom decoding graph was created.
static boolean isValidPronunciation(String pronunciation, KASRRecognizer recognizer)
Verify that the pronunciation specified in the input string is composed of valid phones that are supported by the given recognizer.
-
Constructor Details
-
KASRDecodingGraph
public KASRDecodingGraph()
-
-
Method Details
-
createDecodingGraphFromPhrases
public static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, String dgName)
Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework.
Parameters:
phrases - an array of String objects that specify the phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in the phrases should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
dgName - the name of the custom decoding graph. All graph resources will be stored under context.getApplicationInfo().dataDir, in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName).
Returns:
true on success, false otherwise.
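The normalization requirement on phrases can be checked before a graph is built. The sketch below is one conservative heuristic, not SDK behavior: it assumes a normalized phrase contains only letters, apostrophes, and whitespace, so anything with digits or symbols such as '$' is flagged for manual rewriting. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the KeenASR SDK.
final class PhraseChecks {
    private PhraseChecks() {}

    // ASSUMPTION: "normalized" means letters, apostrophes, and whitespace only.
    static boolean looksNormalized(String phrase) {
        if (phrase.trim().isEmpty()) {
            return false;
        }
        for (int i = 0; i < phrase.length(); i++) {
            char c = phrase.charAt(i);
            boolean allowed = Character.isLetter(c) || c == '\'' || Character.isWhitespace(c);
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    // Returns the phrases that should be rewritten before building a graph.
    static List<String> findUnnormalized(String[] phrases) {
        List<String> flagged = new ArrayList<>();
        for (String phrase : phrases) {
            if (!looksNormalized(phrase)) {
                flagged.add(phrase);
            }
        }
        return flagged;
    }
}
```

If findUnnormalized returns an empty list, the array can be passed to createDecodingGraphFromPhrases unchanged. The heuristic rejects some legitimate text (hyphenated words, for instance), so treat its output as a prompt for review.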
-
createDecodingGraphFromPhrases
public static boolean createDecodingGraphFromPhrases(String[] phrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided array of word mispronunciations, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework.
Parameters:
phrases - an array of String objects that specify the phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in the phrases should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
alternativePronunciations - an array of KASRAlternativePronunciation objects specifying alternative pronunciations for words, and (optionally) tags that can be used to identify those pronunciations. If recognized, these words will be reported in partial/final results with the tag appended to the word as #tag.
task - one of the KASRDecodingGraph.KASRSpeakingTask values, specifying a type of interaction.
spokenNoiseProbability - a value between 0 and 1 that determines how likely the <SPOKEN_NOISE> background word will be. The default value is 0.5. Setting this value to 0 means that <SPOKEN_NOISE> will practically never appear in the result, regardless of what was said. Setting it to 1.0 means that even slightly mispronounced words might be mapped to <SPOKEN_NOISE>.
dgName - the name of the custom decoding graph. All graph resources will be stored in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME. NOTE: the decoding graph name (dgName input parameter) cannot contain '-' characters. We recommend that it contain only alphanumeric characters.
Returns:
true on success, false otherwise.
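Two of the argument constraints above (no '-' in dgName, and spokenNoiseProbability between 0 and 1) are easy to check before calling the method. The following is a minimal sketch, not part of the SDK. It follows the recommendation literally by accepting ASCII letters and digits only, which is stricter than the stated rule. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of the KeenASR SDK.
final class GraphArgs {
    private GraphArgs() {}

    // Accepts only non-empty names made of ASCII letters and digits.
    // This also rules out '-', which the directory naming scheme
    // DECODING_GRAPH_NAME-ASR_BUNDLE_NAME uses as a separator.
    static boolean isSafeGraphName(String dgName) {
        if (dgName == null || dgName.isEmpty()) {
            return false;
        }
        for (int i = 0; i < dgName.length(); i++) {
            char c = dgName.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (!alphanumeric) {
                return false;
            }
        }
        return true;
    }

    // True for values in the closed range [0, 1]; NaN fails both comparisons.
    static boolean isValidNoiseProbability(float p) {
        return p >= 0.0f && p <= 1.0f;
    }
}
```

Rejecting bad arguments early gives a clearer error than a bare false returned from graph creation.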
-
createContextualDecodingGraphFromPhrases
public static boolean createContextualDecodingGraphFromPhrases(ArrayList<ArrayList<String>> contextualPhrases, KASRRecognizer recognizer, ArrayList<KASRAlternativePronunciation> alternativePronunciations, KASRDecodingGraph.KASRSpeakingTask task, float spokenNoiseProbability, String dgName)
Create a custom decoding graph from an array of sentences/phrases, for a specific task, using the provided array of word mispronunciations, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework.
Parameters:
contextualPhrases - an ArrayList of ArrayList of String objects that specify the per-context phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in the phrases should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
alternativePronunciations - an array of KASRAlternativePronunciation objects specifying alternative pronunciations for words, and (optionally) tags that can be used to identify those pronunciations. If recognized, these words will be reported in partial/final results with the tag appended to the word as #tag.
task - one of the KASRDecodingGraph.KASRSpeakingTask values, specifying a type of interaction.
spokenNoiseProbability - a value between 0 and 1 that determines how likely the <SPOKEN_NOISE> background word will be. The default value is 0.5. Setting this value to 0 means that <SPOKEN_NOISE> will practically never appear in the result, regardless of what was said. Setting it to 1.0 means that even slightly mispronounced words might be mapped to <SPOKEN_NOISE>.
dgName - the name of the custom decoding graph. All graph resources will be stored under context.getApplicationInfo().dataDir, in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName).
Returns:
true on success, false otherwise.
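The contextualPhrases argument is a list of phrase lists, one inner list per context. The sketch below shows one way to assemble that structure from a map of context names to phrases. The map-based layout and the class name are assumptions made for illustration; this page does not say how the SDK uses the order of the contexts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the KeenASR SDK.
final class ContextualPhrases {
    private ContextualPhrases() {}

    // Copies each context's phrases into its own ArrayList.
    // The outer list follows the map's iteration order, so pass a
    // LinkedHashMap if the context order needs to be predictable.
    static ArrayList<ArrayList<String>> fromMap(Map<String, List<String>> phrasesByContext) {
        ArrayList<ArrayList<String>> result = new ArrayList<>();
        for (List<String> phrases : phrasesByContext.values()) {
            result.add(new ArrayList<>(phrases));
        }
        return result;
    }
}
```

The returned value has the ArrayList<ArrayList<String>> type that createContextualDecodingGraphFromPhrases expects as its first argument.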
-
createDecodingGraphFromPhrasesWithTriggerPhrase
public static boolean createDecodingGraphFromPhrasesWithTriggerPhrase(String[] phrases, String triggerPhrase, KASRRecognizer recognizer, String dgName)
Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework. When using decoding graphs created with trigger phrase support, after the StartListening method is called the SDK listens continuously until it hears the trigger phrase; only then do partial results start to arrive.
Parameters:
phrases - an array of String objects that specify the sentences/phrases the recognizer should listen for. These phrases are used to create an n-gram language model, from which the decoding graph is created. Text in the phrases should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
triggerPhrase - a String representing the trigger phrase used to initiate recognition when using this decoding graph, for example "Hey computer". When a decoding graph with a trigger phrase is used, the recognizer listens continuously until it hears the trigger phrase. No partial callback results are provided until the trigger phrase is recognized.
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
dgName - the name of the custom decoding graph. All graph resources will be stored under context.getApplicationInfo().dataDir, in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName).
Returns:
true on success, false otherwise.
-
decodingGraphWithNameExists
Returns true if a valid custom decoding graph with the given name exists in the filesystem.
Parameters:
dgName - name of the custom decoding graph.
recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph.
Returns:
true if a decoding graph with that name exists, false otherwise. This method also checks that all the necessary files exist in the decoding graph directory.
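Because building a graph can take tens of seconds, a common pattern is to build it only when it is not already on disk. The following is a minimal sketch of that pattern, not part of the SDK. To keep it self-contained it takes the existence check and the build step as BooleanSupplier callbacks; the commented lines show how they might be wired to the methods documented on this page. The class and method names are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the KeenASR SDK.
final class GraphCache {
    private GraphCache() {}

    // Returns true if the graph is available after the call: either it
    // already existed, or it did not and the create step reported success.
    // The create step runs only when the existence check returns false.
    static boolean ensureGraph(BooleanSupplier exists, BooleanSupplier create) {
        if (exists.getAsBoolean()) {
            return true;
        }
        return create.getAsBoolean();
    }
}

// Possible wiring, assuming phrases, recognizer and dgName are in scope:
//
// boolean ready = GraphCache.ensureGraph(
//         () -> KASRDecodingGraph.decodingGraphWithNameExists(dgName, recognizer),
//         () -> KASRDecodingGraph.createDecodingGraphFromPhrases(phrases, recognizer, dgName));
```

Since the build step can be slow, it is usually better to call this off the main thread.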
-
decodingGraphExistsAtPath
Returns true if a valid decoding graph exists at the given absolute filepath.
Parameters:
dgPath - absolute path to the decoding graph directory.
Returns:
true if a decoding graph exists at that path, false otherwise. This method also checks that all the necessary files exist in the decoding graph directory.
-
getDecodingGraphCreationDate
Returns the date when the custom decoding graph was created.
Parameters:
dgName - name of the decoding graph.
recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph (initialized with the same ASR bundle).
Returns:
the date when the decoding graph was created and saved in the filesystem, or null if not available.
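The creation date can be used to decide whether a stored graph is out of date relative to the phrase list it was built from. The sketch below is one possible policy, not SDK behavior: rebuild when the creation date is unavailable (null) or is earlier than the time the phrases last changed. Tracking that second timestamp is up to the app. The class and method names are hypothetical.

```java
import java.util.Date;
import java.util.Objects;

// Hypothetical helper, not part of the KeenASR SDK.
final class GraphFreshness {
    private GraphFreshness() {}

    // graphCreated may be null, mirroring the documented "null if not available".
    // phrasesLastChanged is app-maintained and must not be null.
    static boolean needsRebuild(Date graphCreated, Date phrasesLastChanged) {
        Objects.requireNonNull(phrasesLastChanged, "phrasesLastChanged");
        if (graphCreated == null) {
            return true; // no usable graph on record
        }
        return graphCreated.before(phrasesLastChanged);
    }
}
```

The first argument would typically come from KASRDecodingGraph.getDecodingGraphCreationDate(dgName, recognizer).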
-
isValidPronunciation
Verify that the pronunciation specified in the input string is composed of valid phones that are supported by the given recognizer. Returns true if the pronunciation is valid, false otherwise.
Parameters:
pronunciation - string that represents the pronunciation of a word, for example "k ae t".
recognizer - KASRRecognizer object equivalent to the KASRRecognizer object that was used to create the decoding graph.
Returns:
true if the pronunciation is valid, false otherwise.
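When pronunciations come from user input or an external list, it can help to drop the invalid ones before they are used. The following is a minimal sketch, not part of the SDK. It takes the validity check as a Predicate so that it stays self-contained; the commented line shows how it might be bound to isValidPronunciation. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper, not part of the KeenASR SDK.
final class PronunciationFilter {
    private PronunciationFilter() {}

    // Keeps only the word -> pronunciation entries whose pronunciation
    // passes the supplied check, preserving the input iteration order.
    static Map<String, String> keepValid(Map<String, String> candidates,
                                         Predicate<String> isValid) {
        Map<String, String> valid = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : candidates.entrySet()) {
            if (isValid.test(entry.getValue())) {
                valid.put(entry.getKey(), entry.getValue());
            }
        }
        return valid;
    }
}

// Possible wiring, assuming candidates and recognizer are in scope:
//
// Map<String, String> usable = PronunciationFilter.keepValid(
//         candidates, p -> KASRDecodingGraph.isValidPronunciation(p, recognizer));
```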
-
createDecodingGraphFromSentences
@Deprecated public static boolean createDecodingGraphFromSentences(String[] sentences, KASRRecognizer recognizer, String dgName)
Deprecated.
Create a custom decoding graph from an array of sentences/phrases and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework.
Parameters:
sentences - an array of String objects that specify the sentences/phrases the recognizer should listen for. These sentences are used to create an n-gram language model, from which the decoding graph is created. Text in the sentences should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
dgName - the name of the custom decoding graph. All graph resources will be stored under context.getApplicationInfo().dataDir, in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName).
Returns:
true on success, false otherwise.
-
createDecodingGraphFromSentencesWithTriggerPhrase
@Deprecated public static boolean createDecodingGraphFromSentencesWithTriggerPhrase(String[] sentences, String triggerPhrase, KASRRecognizer recognizer, String dgName)
Deprecated.
Create a custom decoding graph from an array of sentences/phrases, using the specified triggerPhrase, and save it in the filesystem for later use. Custom decoding graphs can be referenced by name from various methods in the framework. When using decoding graphs created with trigger phrase support, after the StartListening method is called the SDK listens continuously until it hears the trigger phrase; only then do partial results start to arrive.
Parameters:
sentences - an array of String objects that specify the sentences/phrases the recognizer should listen for. These sentences are used to create an n-gram language model, from which the decoding graph is created. Text in the sentences should be normalized (e.g. numbers and dates should be written as words: 'two hundred dollars', not '$200').
triggerPhrase - a String representing the trigger phrase used to initiate recognition when using this decoding graph, for example "Hey computer". When a decoding graph with a trigger phrase is used, the recognizer listens continuously until it hears the trigger phrase. No partial callback results are provided until the trigger phrase is recognized.
recognizer - the KASRRecognizer object that will be used to perform recognition with this decoding graph. Note that the decoding graph is persisted in the filesystem and can be reused later with a different KASRRecognizer object, as long as that recognizer uses the same ASR bundle as the KASRRecognizer object used to create the decoding graph.
dgName - the name of the custom decoding graph. All graph resources will be stored under context.getApplicationInfo().dataDir, in a directory named DECODING_GRAPH_NAME-ASR_BUNDLE_NAME (that is, dgName + "-" + asrBundleName).
Returns:
true on success, false otherwise.
-
createDecodingGraphFromPhrases(String[], KASRRecognizer, String)