com.keenresearch.keenasr.KASRRecognizer

public class KASRRecognizer extends Object

An instance of the KASRRecognizer class, called recognizer, manages recognizer resources and provides speech recognition capabilities to your application.

You typically initialize the engine at app startup by calling the initWithASRBundleAtPath(String, Context) method, and then use the static sharedInstance() method whenever you need to access the recognizer.

Recognition results are provided via callbacks. To obtain results, one of your classes needs to implement the KASRRecognizerListener interface (overriding the callback methods you need) and be registered via addListener(KASRRecognizerListener).

Note: Only a single instance of the recognizer can exist at any given time.

Nested Class Summary

- static enum KASRRecognizer.KASRRecognizerLogLevel
  These constants indicate the log levels for the framework.
- static enum KASRRecognizer.KASRRecognizerState
  These constants indicate the different states the recognizer can assume.
- static enum KASRRecognizer.KASRVadParameter
  These constants correspond to the Voice Activity Detection parameters that are used for endpointing during recognition.

Method Summary

- boolean activateAudioStack()
  Activates an audio stack that has been previously deactivated.
- void adaptToSpeakerWithName(String speakerName)
  Defines the name that will be used to uniquely identify the speaker adaptation profile.
- void addListener(KASRRecognizerListener listener)
  Adds listener.
- void addTriggerPhraseListener(KASRRecognizerTriggerPhraseListener listener)
  Adds trigger phrase listener.
- boolean deactivateAudioStack()
  Deactivates the audio stack.
- String getAsrBundleName()
  Obtains the name of the ASR Bundle used to initialize the SDK.
- String getAsrBundlePath()
  Obtains the path to the directory where the ASR Bundle used to initialize the SDK is stored.
- String getDecodingGraphName()
  Returns the name of the decoding graph that's used for recognition.
- float getInputLevel()
  The most recent signal input level in dB.
- KASRRecognizer.KASRRecognizerState getRecognizerState()
  Returns the recognizer state, one of the KASRRecognizerState values.
- Boolean getRescore()
  Value of the rescoring flag.
- static boolean initWithASRBundleAtPath(String pathToAsrBundle, android.content.Context context)
  Initializes the ASR engine with the ASR Bundle located at the provided path.
- boolean isEchoCancellationAvailable()
  Returns true if the device natively supports echo cancellation.
- boolean performEchoCancellation(boolean value)
  EXPERIMENTAL. Specifies if echo cancellation should be performed.
- boolean prepareForListeningWithContextualDecodingGraphAtPath(String pathToDecodingGraphDirectory, Integer contextId, boolean computeGoP)
  Prepares for recognition by loading a contextual decoding graph that was bundled with the application.
- boolean prepareForListeningWithContextualDecodingGraphWithName(String dgName, Integer contextId, boolean computeGoP)
  Prepares for recognition by loading a contextual decoding graph that was prepared via the KASRDecodingGraph.createContextualDecodingGraphFromPhrases(ArrayList, KASRRecognizer, ArrayList, KASRDecodingGraph.KASRSpeakingTask, float, String) method.
- boolean prepareForListeningWithDecodingGraphAtPath(String pathToDecodingGraphDirectory, boolean computeGoP)
  Prepares for recognition by loading a decoding graph that's stored in the filesystem (for example, downloaded or copied from the app bundle).
- boolean prepareForListeningWithDecodingGraphWithName(String dgName, boolean computeGoP)
  Prepares for recognition by loading a decoding graph that was prepared via one of the methods that create decoding graphs.
- boolean removeAllSpeakerAdaptationProfiles()
  Removes all adaptation profiles for all speakers.
- void removeListener(KASRRecognizerListener listener)
  Removes listener.
- boolean removeSpeakerAdaptationProfiles(String speakerName)
  Removes all adaptation profiles for the speaker with name speakerName.
- void removeTriggerPhraseListener(KASRRecognizerTriggerPhraseListener listener)
  Removes trigger phrase listener.
- void resetSpeakerAdaptation()
  Resets the speaker adaptation profile in the current recognizer session.
- void saveSpeakerAdaptation()
  Saves the speaker profile (used for adaptation) in the filesystem.
- static void setLogLevel(KASRRecognizer.KASRRecognizerLogLevel logLevel)
  Sets the log level for the framework.
- void setRescore(Boolean value)
  If set to true, the recognizer will perform rescoring for the final result, using the rescoring language model provided in the custom decoding graph that's bundled with the application.
- boolean setVADGating(boolean value)
  VAD (voice activity detection) gating introduces a very lightweight voice activity detection step before speech recognition.
- void setVADParameter(KASRRecognizer.KASRVadParameter parameter, float value)
  Sets any of the KASRRecognizer.KASRVadParameter Voice Activity Detection parameters.
- static KASRRecognizer sharedInstance()
  Returns the singleton instance of KASRRecognizer.
- boolean startListening()
  Starts processing incoming audio.
- boolean stopListening()
  Stops the recognizer from processing incoming audio.
- KASRResult stopListeningAndReturnFinalResult()
  Deprecated.
- static boolean teardown()
  Tears down the current singleton instance of the recognizer and all associated resources.
- static String version()
  Version of the KeenASR framework.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- sharedInstance
  
  public static KASRRecognizer sharedInstance()
  
  Returns:
  
  singleton instance of KASRRecognizer. If you previously initialized the recognizer you can use this method to obtain the instance of that recognizer.
- initWithASRBundleAtPath
  
  public static boolean initWithASRBundleAtPath(String pathToAsrBundle, android.content.Context context)
  
  Initialize ASR engine with the ASR Bundle located at the provided path. SDK initialization needs to occur before any other work can be performed.
  
  Parameters:
  
  pathToAsrBundle - full path to the ASR Bundle
  
  context - application context
  
  Returns:
  
  True if successful, false otherwise.
  
  Note: When initializing the recognizer, make sure that the bundle directory contains all the resources needed for the specific recognizer type. If your app creates decoding graphs dynamically, the ASR Bundle directory needs to contain a lang subdirectory with the relevant resources (lexicon, etc.).
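
As a rough illustration of the note above, the following sketch checks the bundle directory before initialization. It uses only java.io.File. The class name AsrBundleCheck, the method problemWithBundle, and the createsDecodingGraphs flag are hypothetical, and the check covers only the lang subdirectory named in the note, not every resource a given bundle may need.

```java
final class AsrBundleCheck {
    /** Returns null if the directory looks usable, otherwise a short description of the problem. */
    static String problemWithBundle(java.io.File bundleDir, boolean createsDecodingGraphs) {
        if (!bundleDir.isDirectory()) {
            return "ASR Bundle directory not found: " + bundleDir.getPath();
        }
        // Only needed when the app builds decoding graphs on the device.
        if (createsDecodingGraphs && !new java.io.File(bundleDir, "lang").isDirectory()) {
            return "missing lang subdirectory in " + bundleDir.getPath();
        }
        return null;
    }

    public static void main(String[] args) {
        // The default path is a placeholder; pass the real bundle path as the first argument.
        java.io.File bundle = new java.io.File(args.length > 0 ? args[0] : "asr-bundle");
        String problem = problemWithBundle(bundle, true);
        System.out.println(problem == null ? "bundle looks complete" : problem);
    }
}
```

In an app, a null result would typically be followed by KASRRecognizer.initWithASRBundleAtPath(bundle.getAbsolutePath(), context), whose boolean return value should still be checked.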
- teardown
  
  public static boolean teardown()
  
  Tears down the current singleton instance of the recognizer and all associated resources. This method returns false if the recognizer is actively listening.
  
  Returns:
  
  True if successful, false otherwise.
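
Because teardown is refused while the recognizer is listening, an app may want to stop listening and retry. The sketch below shows one way to do that. The Lifecycle interface is a hypothetical stand-in so the code compiles without the SDK (teardown is static on KASRRecognizer, but is an instance method here), and it assumes stopListening has taken effect by the time it returns, which the text above does not state.

```java
final class TeardownSketch {
    /** Hypothetical stand-in for the two recognizer calls used below. */
    interface Lifecycle {
        boolean stopListening();
        boolean teardown();
    }

    /** Tries teardown; if it is refused, stops listening and tries once more. */
    static boolean stopAndTeardown(Lifecycle r) {
        if (r.teardown()) {
            return true;
        }
        r.stopListening();
        return r.teardown();
    }

    /** Fake that refuses teardown while listening, mirroring the documented rule. */
    static final class FakeLifecycle implements Lifecycle {
        boolean listening = true;
        boolean tornDown = false;

        public boolean stopListening() {
            boolean wasListening = listening;
            listening = false;
            return wasListening;
        }

        public boolean teardown() {
            if (listening) {
                return false;
            }
            tornDown = true;
            return true;
        }
    }

    public static void main(String[] args) {
        FakeLifecycle fake = new FakeLifecycle();
        System.out.println("teardown succeeded: " + stopAndTeardown(fake));
    }
}
```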
- activateAudioStack
  
  public boolean activateAudioStack()
  
  Activate audio stack that has been previously deactivated.
  
  Returns:
  
  True if successful, false otherwise
- deactivateAudioStack
  
  public boolean deactivateAudioStack()
  
  Deactivate audio stack. You will typically call this method when there is an audio route change (e.g. a Bluetooth headset has been connected). In such a scenario you will typically stop listening, deactivate the audio stack, activate the audio stack again so that the KeenASR SDK picks up the default route, and then start listening again. This method returns false if the recognizer instance is actively listening or if the audio stack has already been deactivated.
  
  Returns:
  
  True if successful, false otherwise
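
The route-change sequence described above (stop listening, deactivate, activate, start listening) could be sketched as follows. AudioOps is a hypothetical stand-in for the four KASRRecognizer methods so the example compiles without the SDK, and the in-memory fake only imitates the documented return-value rules.

```java
final class AudioRouteChangeSketch {
    /** Hypothetical stand-in for the four KASRRecognizer methods used below. */
    interface AudioOps {
        boolean stopListening();
        boolean deactivateAudioStack();
        boolean activateAudioStack();
        boolean startListening();
    }

    /** Runs the documented sequence and reports the first step that fails, or "ok". */
    static String restartOnNewRoute(AudioOps r) {
        r.stopListening(); // result ignored: the recognizer may already be idle (assumption)
        if (!r.deactivateAudioStack()) return "deactivateAudioStack failed";
        if (!r.activateAudioStack()) return "activateAudioStack failed";
        if (!r.startListening()) return "startListening failed";
        return "ok";
    }

    /** In-memory fake that mimics the documented rules, for demonstration only. */
    static final class FakeOps implements AudioOps {
        boolean listening = true;
        boolean stackActive = true;
        final StringBuilder calls = new StringBuilder();

        public boolean stopListening() {
            calls.append("stop,");
            boolean wasListening = listening;
            listening = false;
            return wasListening;
        }

        public boolean deactivateAudioStack() {
            calls.append("deactivate,");
            if (listening || !stackActive) return false;
            stackActive = false;
            return true;
        }

        public boolean activateAudioStack() {
            calls.append("activate,");
            stackActive = true;
            return true;
        }

        public boolean startListening() {
            calls.append("start,");
            if (!stackActive) return false;
            listening = true;
            return true;
        }
    }

    public static void main(String[] args) {
        FakeOps fake = new FakeOps();
        System.out.println(restartOnNewRoute(fake) + " after calls: " + fake.calls);
    }
}
```

In an app, a method like restartOnNewRoute would be invoked from whatever mechanism reports the route change, with the calls made on KASRRecognizer.sharedInstance().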
- prepareForListeningWithDecodingGraphWithName
  
  public boolean prepareForListeningWithDecodingGraphWithName(String dgName, boolean computeGoP)
  
  Prepare for recognition by loading a decoding graph that was prepared via one of the methods that create decoding graphs. Calls to this method will be ignored if the recognizer is listening.
  After calling this method, recognizer will load the decoding graph into memory and it will be ready to start listening via startListening method.
  
  Parameters:
  
  dgName - name of the custom decoding graph
  
  computeGoP - set to true if you would like Goodness of Pronunciation scores to be computed in the final result. There is additional overhead when computing these scores, and they require additional assets to be present in the ASR Bundle.
  
  Returns:
  
  True if successful, false otherwise.
- prepareForListeningWithDecodingGraphAtPath
  
  public boolean prepareForListeningWithDecodingGraphAtPath(String pathToDecodingGraphDirectory, boolean computeGoP)
  
  Prepare for recognition by loading a decoding graph that's stored in the filesystem (for example, downloaded or copied from the app bundle). You will typically use this approach for large vocabulary tasks, where it would take too long to build the decoding graph on the mobile device. Calls to this method will be ignored if the recognizer is listening.
  After calling this method, recognizer will load the decoding graph into memory and it will be ready to start listening via startListening method.
  
  Parameters:
  
  pathToDecodingGraphDirectory - absolute path to the custom decoding graph directory which was created ahead of time and packaged with the app.
  
  computeGoP - set to true if you would like Goodness of Pronunciation scores to be computed in the final result. There is additional overhead when computing these scores, and they require additional assets to be present in the ASR Bundle.
  
  Returns:
  
  True if successful, false otherwise.
  Note: If the custom decoding graph was built with rescoring capability, all the resources will be loaded regardless of how the rescore parameter is set.
- prepareForListeningWithContextualDecodingGraphWithName
  
  public boolean prepareForListeningWithContextualDecodingGraphWithName(String dgName, Integer contextId, boolean computeGoP)
  
  Prepare for recognition by loading contextual decoding graph that was prepared via KASRDecodingGraph.createContextualDecodingGraphFromPhrases(ArrayList, KASRRecognizer, ArrayList, KASRDecodingGraph.KASRSpeakingTask, float, String) method. Calls to this method will be ignored if the recognizer is listening.
  After calling this method, recognizer will load the decoding graph into memory and it will be ready to start listening via startListening method.
  
  Parameters:
  
  dgName - name of the custom decoding graph
  
  contextId - 0-based index of the context group
  
  computeGoP - set to true if you would like Goodness of Pronunciation scores to be computed in the final result. There is additional overhead when computing these scores, and they require additional assets to be present in the ASR Bundle.
  
  Returns:
  
  True if successful, false otherwise.
- prepareForListeningWithContextualDecodingGraphAtPath
  
  public boolean prepareForListeningWithContextualDecodingGraphAtPath(String pathToDecodingGraphDirectory, Integer contextId, boolean computeGoP)
  
  Prepare for recognition by loading a contextual decoding graph that was bundled with the application. Calls to this method will be ignored if the recognizer is listening.
  After calling this method, recognizer will load the decoding graph into memory and it will be ready to start listening via startListening method.
  
  Parameters:
  
  pathToDecodingGraphDirectory - absolute path to the custom decoding graph directory which was created ahead of time and packaged with the app.
  
  contextId - 0-based index of the context group
  
  computeGoP - true if Goodness of Pronunciation scores at the phoneme level should be computed, false otherwise
  
  Returns:
  
  True if successful, false otherwise.
  Note: If the custom decoding graph was built with rescoring capability, all the resources will be loaded regardless of how the rescore parameter is set.
- setRescore
  
  public void setRescore(Boolean value)
  
  If set to true, recognizer will perform rescoring for the final result, using rescoring language model provided in the custom decoding graph that's bundled with the application.
  Default is true.
  
  Note: If the resources necessary for rescoring are not available in the custom decoding graph directory bundled with the app, and rescore is set to true, rescoring step will be skipped.
  
  Parameters:
  
  value - boolean value that determines if rescoring is to be performed
- getRescore
  
  public Boolean getRescore()
  
  Value of the rescoring flag
  
  Returns:
  
  True if rescoring is enabled, false otherwise
- startListening
  
  public boolean startListening()
  
  Start processing incoming audio.
  
  Returns:
  
  True if successful, false otherwise
  
  After calling this method, the recognizer will listen to and decode audio coming through the microphone, using the decoding graph you specified via one of the prepareForListening methods. For decoding graphs created without trigger phrase support, listening will stop either a) on an explicit call to stopListening, or b) when one of the Voice Activity Detection rules is triggered (for example, max duration without speech, or end-silence). We recommend using VAD-based endpointing instead of explicitly calling stopListening.
  
  If the decoding graph was created with trigger phrase support, the SDK will listen continuously until the trigger phrase is recognized; it will then switch over to the standard mode, with partial results reported via the onPartialResult callback.
  
  When the recognizer stops listening due to VAD triggering, it will call KASRRecognizerListener.onFinalResponse(KASRRecognizer, KASRResponse) callback method.
  
  When the recognizer stops listening due to audio interrupt, *no callback methods* will be triggered until audio interrupt is over.
  
  VAD settings can be modified via setVADParameter(KASRVadParameter, float) method.
  Note: You will need to call one of the prepareForListening methods (for example prepareForListeningWithDecodingGraphWithName(String, boolean) or prepareForListeningWithDecodingGraphAtPath(String, boolean)) before calling this method. You will also need to make sure that the user has granted audio recording permission before calling this method; see android.support.v4.app.ActivityCompat#requestPermissions for details.
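
The ordering in the note above (permission granted, decoding graph prepared, then startListening) could be sketched like this. ListeningOps is a hypothetical stand-in for the two KASRRecognizer methods, the hasRecordPermission flag is assumed to come from the app's own runtime permission check, and the graph name "demo" is a placeholder.

```java
final class PrepareAndStartSketch {
    /** Hypothetical stand-in for the two KASRRecognizer methods used below. */
    interface ListeningOps {
        boolean prepareForListeningWithDecodingGraphWithName(String dgName, boolean computeGoP);
        boolean startListening();
    }

    /** Loads the named decoding graph (without GoP scores) and starts listening. */
    static boolean prepareAndStart(ListeningOps r, String dgName, boolean hasRecordPermission) {
        if (!hasRecordPermission) {
            return false; // audio recording permission is required before startListening
        }
        if (!r.prepareForListeningWithDecodingGraphWithName(dgName, false)) {
            return false; // prepare step reported failure
        }
        return r.startListening();
    }

    /** Fake that knows a single graph name, for demonstration only. */
    static final class FakeOps implements ListeningOps {
        String loadedGraph = null;
        boolean listening = false;

        public boolean prepareForListeningWithDecodingGraphWithName(String dgName, boolean computeGoP) {
            if (listening || !"demo".equals(dgName)) return false;
            loadedGraph = dgName;
            return true;
        }

        public boolean startListening() {
            if (loadedGraph == null) return false;
            listening = true;
            return true;
        }
    }

    public static void main(String[] args) {
        FakeOps fake = new FakeOps();
        System.out.println("started: " + prepareAndStart(fake, "demo", true));
    }
}
```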
- stopListening
  
  public boolean stopListening()
  
  Stop the recognizer from processing incoming audio.
  
  Returns:
  
  True if successful, false otherwise.
  
  Note: Calling this method will not trigger the KASRRecognizerListener.onFinalResponse callback.
- stopListeningAndReturnFinalResult
  
  @Deprecated public KASRResult stopListeningAndReturnFinalResult()
  
  Deprecated.
  Use KASRRecognizer.KASRVadParameter settings to have the recognizer stop listening, and obtain the KASRResponse via callback.
  
  Stop the recognizer from processing incoming audio and return the final result.
  
  Returns:
  
  Final result of the recognition.
  
  If your application is using Voice Activity Detection parameters, it is possible that this method doesn't return a result if one of the Voice Activity Detection thresholds triggers first. In that case, the KASRRecognizerListener.onFinalResponse callback will be invoked instead.
  
  Note: This method runs synchronously. For large decoding graphs there may be noticeable delay (few hundred ms) on lower-end devices.
- getRecognizerState
  
  public KASRRecognizer.KASRRecognizerState getRecognizerState()
  
  Returns recognizer state, one of KASRRecognizerState values
  
  Returns:
  
  state of the recognizer, one of KASRRecognizerState values
- getDecodingGraphName
  
  public String getDecodingGraphName()
  
  Returns name of the decoding graph that's used for recognition.
  
  Returns:
  
  name of the decoding graph that's used for recognition, null if the recognizer is not prepared for listening.
- addListener
  
  public void addListener(KASRRecognizerListener listener)
  
  Adds listener.
  
  Parameters:
  
  listener - listener that should be added
- removeListener
  
  public void removeListener(KASRRecognizerListener listener)
  
  Removes listener.
  
  Parameters:
  
  listener - listener to be removed
- addTriggerPhraseListener
  
  public void addTriggerPhraseListener(KASRRecognizerTriggerPhraseListener listener)
  
  Adds trigger phrase listener.
  
  Parameters:
  
  listener - listener that should be added
- removeTriggerPhraseListener
  
  public void removeTriggerPhraseListener(KASRRecognizerTriggerPhraseListener listener)
  
  Removes trigger phrase listener.
  
  Parameters:
  
  listener - listener to be removed
- getAsrBundlePath
  
  public String getAsrBundlePath()
  
  Obtains a path to the directory where ASR Bundle used to initialize the SDK is stored
  
  Returns:
  
  String containing a full path to the ASR Bundle directory
- getAsrBundleName
  
  public String getAsrBundleName()
  
  Obtains a name of the ASR Bundle used to initialize the SDK.
  
  Returns:
  
  String containing a name of the ASR Bundle
- adaptToSpeakerWithName
  
  public void adaptToSpeakerWithName(String speakerName)
  
  Defines the name that will be used to uniquely identify the speaker adaptation profile. When the recognizer starts to listen, it will try to find a matching speaker profile in the filesystem (profiles are matched based on speaker name, ASR Bundle, and audio route). When the saveSpeakerAdaptation method is called, it uses the name to uniquely identify the profile file that will be saved in the filesystem.
  
  Parameters:
  
  speakerName - (pseudo)name of the speaker for which adaptation is to be performed. Default value is 'default'.
  The name used here does not have to correspond to the real name of the user (thus we call it a pseudo name). The exact value does not matter as long as you can match the value to the specific user in your app. For example, you could use 'user1', 'user2', etc.
  
  Note: If you cannot match names to your users, it's recommended to not use this method, and to not save adaptation profiles between sessions. Adaptation will still be performed throughout the session, but each new session (activity after initialization of recognizer) will start from the baseline models.
  
  In-memory speaker adaptation profile can always be reset by calling resetSpeakerAdaptation method.
  
  If this method is called while recognizer is listening, it will only affect subsequent calls to startListening methods.
- resetSpeakerAdaptation
  
  public void resetSpeakerAdaptation()
  
  Resets the speaker adaptation profile in the current recognizer session. Calling this method will also reset the speakerName to 'default'. If a speaker adaptation profile exists in the filesystem for the 'default' speaker, it will be used. If not, the initial models from the ASR Bundle will be the baseline.
  
  You would typically use this method if a new activity starts in your app that may involve a new speaker. For example, a practice view is started and there is a good chance a different user may be using the app. If speaker (pseudo)identities are known, you don't need to call this method; you can just switch speakers by calling adaptToSpeakerWithName with the appropriate speakerName.
  
  The tradeoffs when using this method are:
  
  - The downside of resetting the user profile for an existing user is that ASR performance will be reset to the baseline (no adaptation), which may slightly degrade performance in the first few interactions.
  - The downside of NOT resetting the user profile for a new user is that, depending on the characteristics of the new user's voice, ASR performance may initially be degraded slightly (compared to the baseline case of no adaptation).
  
  Calls to this method will be ignored if the recognizer is in the LISTENING state. If you are resetting the adaptation profile and you know the user's (pseudo)identity, you may want to call the saveSpeakerAdaptation method prior to calling this method, so that on subsequent user switches adaptation profiles can be reloaded and recognition starts with the speaker profile trained on audio from previous sessions.
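
The speaker-switching advice above could be sketched as follows: save the current profile when the current speaker is known, then either adapt to the next known speaker or reset to 'default'. AdaptationOps is a hypothetical stand-in for the three KASRRecognizer methods, and the recording fake exists only to show the call order. The sketch assumes it runs while the recognizer is not listening.

```java
final class SpeakerSwitchSketch {
    /** Hypothetical stand-in for the three KASRRecognizer methods used below. */
    interface AdaptationOps {
        void saveSpeakerAdaptation();
        void adaptToSpeakerWithName(String speakerName);
        void resetSpeakerAdaptation();
    }

    /**
     * Switches speakers between sessions.
     * nextSpeakerName is null when the next user's (pseudo)identity is unknown.
     */
    static void switchSpeaker(AdaptationOps r, boolean currentSpeakerKnown, String nextSpeakerName) {
        if (currentSpeakerKnown) {
            r.saveSpeakerAdaptation(); // keep what was learned for the outgoing speaker
        }
        if (nextSpeakerName != null) {
            r.adaptToSpeakerWithName(nextSpeakerName);
        } else {
            r.resetSpeakerAdaptation(); // fall back to the 'default' speaker
        }
    }

    /** Fake that records the order of calls, for demonstration only. */
    static final class RecordingOps implements AdaptationOps {
        final StringBuilder calls = new StringBuilder();

        public void saveSpeakerAdaptation() {
            calls.append("save;");
        }

        public void adaptToSpeakerWithName(String speakerName) {
            calls.append("adapt:").append(speakerName).append(';');
        }

        public void resetSpeakerAdaptation() {
            calls.append("reset;");
        }
    }

    public static void main(String[] args) {
        RecordingOps ops = new RecordingOps();
        switchSpeaker(ops, true, "user2");
        System.out.println(ops.calls);
    }
}
```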
- saveSpeakerAdaptation
  
  public void saveSpeakerAdaptation()
  
  Saves the speaker profile (used for adaptation) to the filesystem, in the files/KeenASR-speaker-profiles/ directory. The profile filename is composed of the speakerName, asrBundle, and audioRoute.
- removeAllSpeakerAdaptationProfiles
  
  public boolean removeAllSpeakerAdaptationProfiles()
  
  Remove all adaptation profiles for all speakers.
  
  Returns:
  
  true if successfully removed all the profiles for all the speakers
- removeSpeakerAdaptationProfiles
  
  public boolean removeSpeakerAdaptationProfiles(String speakerName)
  
  Removes all adaptation profiles for the speaker with name speakerName.
  
  Parameters:
  
  speakerName - name of the speaker whose profiles should be removed
  
  Returns:
  
  true if successfully removed, false otherwise
- setLogLevel
  
  public static void setLogLevel(KASRRecognizer.KASRRecognizerLogLevel logLevel)
  
  Set log level for the framework.
  
  Parameters:
  
  logLevel - one of the KASRRecognizer.KASRRecognizerLogLevel values
  The default log level is warning.
- setVADGating
  
  public boolean setVADGating(boolean value)
  
  VAD (voice activity detection) gating introduces a very lightweight voice activity detection step before speech recognition. If VAD gating is turned on, then after you call the startListening method recognition will start only once voice has been detected by the VAD gating module. You would typically use this only in always-on listening scenarios where you want to preserve battery life.
  
  Parameters:
  
  value - true if VAD Gating should be turned on, false otherwise.
  
  Returns:
  
  true if successfully set, false otherwise
- setVADParameter
  
  public void setVADParameter(KASRRecognizer.KASRVadParameter parameter, float value)
  
  Set any of KASRRecognizer.KASRVadParameter Voice Activity Detection parameters. These parameters can be set at any time and they will go into effect immediately.
  
  Parameters:
  
  parameter - one of KASRVadParameter
  
  value - duration in seconds for the parameter
  Note: Setting VAD rules in the config file within the ASR bundle will NOT have any effect. Values for these parameters are set to their defaults upon initialization of KASRRecognizer. They can only be changed programmatically, using this method.
- getInputLevel
  
  public float getInputLevel()
  
  The most recent signal input level in dB
  
  Returns:
  
  signal input level in dB
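
For apps that show a level meter, a dB reading can be converted with standard decibel arithmetic (amplitude ratio = 10^(dB/20)). The sketch below is independent of the SDK. It assumes readings at or below 0 dB and a display floor chosen by the app (here -60 dB); the reference level of getInputLevel is not stated above, so both are assumptions.

```java
final class InputLevelMeterSketch {
    /** Converts a decibel value to a linear amplitude ratio: 10^(dB/20). */
    static double dbToAmplitudeRatio(double db) {
        return Math.pow(10.0, db / 20.0);
    }

    /**
     * Maps a dB reading onto the range 0..1 for a level meter.
     * floorDb must be negative; readings are clamped to [floorDb, 0].
     */
    static double meterFraction(double db, double floorDb) {
        double clamped = Math.max(floorDb, Math.min(0.0, db));
        return (clamped - floorDb) / -floorDb;
    }

    public static void main(String[] args) {
        double reading = -30.0; // placeholder for a value obtained from getInputLevel()
        System.out.println("amplitude ratio: " + dbToAmplitudeRatio(reading));
        System.out.println("meter fraction: " + meterFraction(reading, -60.0));
    }
}
```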
- isEchoCancellationAvailable
  
  public boolean isEchoCancellationAvailable()
  
  Returns:
  
  True if the device natively supports echo cancellation, false otherwise.
- performEchoCancellation
  
  public boolean performEchoCancellation(boolean value)
  
  EXPERIMENTAL Specifies if echo cancellation should be performed. If value is set to true and the device supports echo cancellation, then audio played by the application will be removed from the audio captured via the microphone.
  
  Parameters:
  
  value - set to true to turn on echo cancellation processing, false to turn it off. Default is false.
  
  Returns:
  
  true if value was successfully set, false otherwise. If the device does not support echo cancellation and you pass true to this method, it will return false.
  
  WARNING: Calls to this method while the recognizer is listening will be ignored and the method will return false.
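
A guarded way to turn on echo cancellation, following the rules above, might look like this. EchoOps is a hypothetical stand-in for the two KASRRecognizer methods, and FakeDevice only imitates the documented return values (false when unsupported or while listening).

```java
final class EchoCancellationSketch {
    /** Hypothetical stand-in for the two KASRRecognizer methods used below. */
    interface EchoOps {
        boolean isEchoCancellationAvailable();
        boolean performEchoCancellation(boolean value);
    }

    /** Turns echo cancellation on only when the device reports support for it. */
    static boolean enableIfAvailable(EchoOps r) {
        if (!r.isEchoCancellationAvailable()) {
            return false;
        }
        return r.performEchoCancellation(true);
    }

    /** Fake device that mimics the documented return values, for demonstration only. */
    static final class FakeDevice implements EchoOps {
        final boolean supported;
        final boolean listening;
        boolean enabled = false;

        FakeDevice(boolean supported, boolean listening) {
            this.supported = supported;
            this.listening = listening;
        }

        public boolean isEchoCancellationAvailable() {
            return supported;
        }

        public boolean performEchoCancellation(boolean value) {
            if (listening) return false;          // ignored while listening
            if (value && !supported) return false; // unsupported device
            enabled = value;
            return true;
        }
    }

    public static void main(String[] args) {
        FakeDevice device = new FakeDevice(true, false);
        System.out.println("echo cancellation enabled: " + enableIfAvailable(device));
    }
}
```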
- version
  
  public static String version()
  
  Version of the KeenASR framework.
  
  Returns:
  
  string containing SDK version
