keenasr-web - v2.1.1

Classes

The KeenASR class is a high-level JavaScript module that provides core ASR functionality.

LogLevel defines logging levels used by the SDK.

These constants indicate the different states the recognizer can assume.

SpeakingTask defines the type of speaking task that will be handled. It is primarily used to tell the methods that create decoding graphs what type of task they need to handle, so that appropriate customization can be done when creating the language model and decoding graph.

VADParameter

VADParameter constants correspond to the different Voice Activity Detection parameters that are used for endpointing during recognition. You can change the values of these parameters using the setVADParameters method.

WordPronunciation

WordPronunciation is a class that defines a mapping between a word and its phonetic pronunciation. A phonetic pronunciation is a space-separated string of phonemes that defines how the word is pronounced. The names of the phonemes are language-specific and defined in the ASR Bundle. For some languages, where this mapping is not deterministic, the ASR Bundle will contain a large lookup table in the lang/lexicon.txt file.
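
As a small illustration of the pronunciation format, the sketch below splits a space-separated pronunciation string into individual phoneme names. The helper name splitPhonemes is hypothetical and not part of the SDK; the sample string is the one from the WordPronunciation example on this page.

```javascript
// Hypothetical helper (not part of keenasr-web): split a pronunciation
// string into its phoneme names, tolerating extra whitespace.
function splitPhonemes(pronunciation) {
  const trimmed = pronunciation.trim();
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}

console.log(splitPhonemes("B IY1 K")); // [ 'B', 'IY1', 'K' ]
```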

You can use WordPronunciation to provide alternatives to existing pronunciations, or to define pronunciations for words that are not in the ASR Bundle lexicon. This can be useful for modeling: a) made-up words, b) common mispronunciations. The latter is helpful in use cases involving language learning and reading instruction. In those scenarios you can also provide an arbitrary tag value when constructing a WordPronunciation. If the tagged pronunciation is the most likely hypothesis, the tag will be appended to the recognized word in the result. For example, given the word “PEAK” and the tag “WRONG”, the word in the result will be “PEAK#WRONG”.

Example: a mispronunciation of PEAK (with an initial B phoneme) that will be tagged as WRONG if it is recognized:

let wordPronunciation = new WordPronunciation("PEAK", "B IY1 K", "WRONG");
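
On the receiving side, an application may want to separate the tag from the recognized word. The following is a minimal sketch that assumes the word text arrives as a plain string in the "WORD#TAG" form described above and that "#" does not otherwise occur in words. The helper splitTaggedWord is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper (not part of keenasr-web): separate a recognized word
// from an optional tag, assuming "#" is the separator as in "PEAK#WRONG".
function splitTaggedWord(text) {
  const i = text.indexOf("#");
  if (i === -1) return { word: text, tag: null };
  return { word: text.slice(0, i), tag: text.slice(i + 1) };
}

console.log(splitTaggedWord("PEAK#WRONG")); // { word: 'PEAK', tag: 'WRONG' }
console.log(splitTaggedWord("PEAK"));       // { word: 'PEAK', tag: null }
```

A reading-instruction app could, for instance, count words whose tag is "WRONG" to flag likely mispronunciations.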

Interfaces

ASRPhone: ASRPhone represents a phone (a speech segment that possesses distinct physical or perceptual properties and serves as the basic unit of phonetic speech analysis). The ASR Bundle used to initialize the recognizer will contain a list of phones that were used to train the models. In the context of the KeenASR SDK, each recognized word is represented by an ASRWord object, which will contain an array of ASRPhone objects that correspond to the most likely phonetic transcription of the given word.
ASRResponse: ASRResponse contains various metadata related to a single interaction with the speech recognition system, from the call to startListening until the recognizer stops listening. It is provided to the application via the onFinalResponse callback method.
ASRResult: ASRResult represents results of the recognition.
ASRWord: ASRWord provides word text, timing information, and the confidence of the word.
AudioQualityResult: The AudioQualityResult interface contains various metrics for audio quality estimation, returned as part of the response, including Signal to Noise Ratio (SNR) and various signal level metrics.
