Class KASRTextAligner

java.lang.Object
com.keenresearch.keenasr.KASRTextAligner
All Implemented Interfaces:
AutoCloseable

public class KASRTextAligner extends Object implements AutoCloseable
KASRTextAligner aligns hypothesis text (typically a speech recognition result) against a reference text using dynamic-programming edit distance. Typical use is oral-reading scoring: a user reads a known passage, the ASR produces a hypothesis, and the aligner reports which reference words were read correctly, which were skipped, and how far the reader got. The same primitive supports word-error-rate reporting for batch ASR evaluation.

The reference text is normalized and tokenized once at construction and cached. Subsequent calls to align(java.lang.String) and incrementalAlign(java.lang.String) normalize their recognized inputs using the same rules, so reference and recognized sides share a canonical form.

The language code passed at construction must match the ASR bundle that produced (or will produce) the recognized text. Normalization differs subtly between languages (accent handling, & expansion, etc.), and a mismatch silently degrades accuracy.
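
The edit-distance alignment described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class name EditDistanceSketch, the unit costs, the lowercase/whitespace tokenization, and the OK/SUB/DEL/INS labels are all assumptions made for illustration. The real normalization rules and cost model live in the native aligner and KASRAlignmentConfig.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not part of the KeenASR SDK. */
final class EditDistanceSketch {

    /** Stand-in normalization (assumption): lowercase and split on whitespace. */
    static String[] tokenize(String text) {
        String t = text == null ? "" : text.trim().toLowerCase();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    /**
     * Aligns recognized text against the reference with unit costs.
     * Returns one label per alignment step: OK (match), SUB (substitution),
     * DEL (reference word skipped), INS (extra recognized word).
     */
    static List<String> align(String reference, String recognized) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(recognized);

        // d[i][j] = edit distance between the first i reference tokens
        // and the first j recognized tokens.
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) d[i][0] = i;
        for (int j = 0; j <= hyp.length; j++) d[0][j] = j;
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int sub = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                d[i][j] = Math.min(sub, Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }

        // Walk back from the bottom-right cell to recover the operations.
        List<String> ops = new ArrayList<>();
        int i = ref.length;
        int j = hyp.length;
        while (i > 0 || j > 0) {
            boolean same = i > 0 && j > 0 && ref[i - 1].equals(hyp[j - 1]);
            if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + (same ? 0 : 1)) {
                ops.add(same ? "OK" : "SUB");
                i--;
                j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                ops.add("DEL");
                i--;
            } else {
                ops.add("INS");
                j--;
            }
        }
        Collections.reverse(ops);
        return ops;
    }

    /** Word error rate: (substitutions + deletions + insertions) / reference words. */
    static double wordErrorRate(List<String> ops) {
        int errors = 0;
        int refWords = 0;
        for (String op : ops) {
            if (!op.equals("OK")) errors++;
            if (!op.equals("INS")) refWords++;
        }
        if (refWords == 0) throw new IllegalArgumentException("reference has no words");
        return (double) errors / refWords;
    }
}
```

For the reference "the quick brown fox" and the recognized text "the brown fox", this sketch yields [OK, DEL, OK, OK] (the word "quick" was skipped) and a word error rate of 0.25.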

Stateless vs. incremental

  • align(java.lang.String) recomputes alignment from scratch each call. Safe to call concurrently from multiple threads.
  • incrementalAlign(java.lang.String) caches DP rows from the previous call and reuses them up to the longest common prefix of consecutive recognized inputs, amortizing cost across a stream of partial ASR results. Not thread-safe. Call reset() between utterances.
Implements AutoCloseable. Callers should close the aligner when done to release the underlying native object promptly; the finalizer exists only as a safety net and should not be relied on for timely cleanup.
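
The prefix-reuse idea behind incrementalAlign can be sketched as follows. This is an assumption-laden illustration, not the library's code: the class IncrementalDistanceSketch, its distance and reset methods, the lastRowsComputed field, unit costs, and whitespace tokenization are all hypothetical. It keeps one DP row per recognized token and recomputes only the rows after the longest common prefix with the previous input.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the KeenASR SDK. */
final class IncrementalDistanceSketch {
    private final String[] ref;
    // rows.get(k) is the DP row after consuming k recognized tokens.
    private final List<int[]> rows = new ArrayList<>();
    private String[] prevHyp = new String[0];
    /** Number of DP rows computed by the most recent distance() call. */
    int lastRowsComputed;

    IncrementalDistanceSketch(String reference) {
        this.ref = tokenize(reference);
        reset();
    }

    /** Stand-in normalization (assumption): lowercase and split on whitespace. */
    static String[] tokenize(String text) {
        String t = text == null ? "" : text.trim().toLowerCase();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    /** Drops cached rows, keeping only the base row for an empty hypothesis. */
    void reset() {
        rows.clear();
        int[] base = new int[ref.length + 1];
        for (int i = 0; i <= ref.length; i++) base[i] = i;
        rows.add(base);
        prevHyp = new String[0];
    }

    /** Returns the unit-cost edit distance, reusing rows for the shared prefix. */
    int distance(String recognized) {
        String[] hyp = tokenize(recognized);

        // Longest common prefix, in tokens, with the previous recognized input.
        int lcp = 0;
        while (lcp < hyp.length && lcp < prevHyp.length && hyp[lcp].equals(prevHyp[lcp])) {
            lcp++;
        }

        // Rows past the shared prefix are stale; discard them.
        while (rows.size() > lcp + 1) {
            rows.remove(rows.size() - 1);
        }

        lastRowsComputed = hyp.length - lcp;
        for (int j = lcp; j < hyp.length; j++) {
            int[] prev = rows.get(j);
            int[] cur = new int[ref.length + 1];
            cur[0] = j + 1;
            for (int i = 1; i <= ref.length; i++) {
                int sub = prev[i - 1] + (ref[i - 1].equals(hyp[j]) ? 0 : 1);
                cur[i] = Math.min(sub, Math.min(prev[i] + 1, cur[i - 1] + 1));
            }
            rows.add(cur);
        }
        prevHyp = hyp;
        return rows.get(hyp.length)[ref.length];
    }
}
```

With the reference "the quick brown fox jumps", calling distance("the quick") computes two rows, and a following distance("the quick brown") computes only one, since the first two tokens are shared. The mutable cache is also why a design like this cannot be called from several threads at once, and why a reset between utterances is needed.
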
  • Constructor Details

    • KASRTextAligner

      public KASRTextAligner(String reference, String langCode)
      Constructs an aligner bound to the given reference text and language code. The reference is normalized and tokenized at construction; the cached token vector is available via getReferenceTokens().
      Parameters:
      reference - Reference text to align against. Must not be null.
      langCode - ASR bundle language code, e.g. "en-us", "fr-fr", "de-de", "es-es". Must not be null.
      Throws:
      IllegalArgumentException - if either argument is null.
      IllegalStateException - if the underlying native aligner could not be constructed.
    • KASRTextAligner

      public KASRTextAligner(String reference, KASRRecognizer recognizer)
      Convenience constructor that sources the language code from a recognizer instance. Equivalent to new KASRTextAligner(reference, recognizer.getASRBundleLang()).
      Parameters:
      reference - Reference text to align against. Must not be null.
      recognizer - Initialized recognizer whose ASR bundle defines the language code. Must not be null.
  • Method Details

    • getLangCode

      public final String getLangCode()
      Returns:
      Language code this aligner was constructed with.
    • getReferenceTokens

      public final String[] getReferenceTokens()
      Returns:
      Normalized reference tokens, in order. Indices in alignment results refer to positions in this array.
    • align

      public final KASRAlignmentResult align(String recognized)
      Aligns recognized text against the reference using default configuration. Stateless — safe to call concurrently with itself.
      Parameters:
      recognized - Recognized text from the ASR engine. null is treated as an empty string.
      Returns:
      Alignment result. Never null.
      Throws:
      IllegalStateException - if this aligner has been closed.
    • align

      public final KASRAlignmentResult align(String recognized, KASRAlignmentConfig config)
      Aligns recognized text against the reference with custom configuration. Stateless — safe to call concurrently with itself.
      Parameters:
      recognized - Recognized text from the ASR engine. null is treated as an empty string.
      config - Alignment configuration. Pass null for defaults.
      Returns:
      Alignment result. Never null.
      Throws:
      IllegalStateException - if this aligner has been closed.
    • incrementalAlign

      public final KASRAlignmentResult incrementalAlign(String recognized)
      Aligns recognized text incrementally, reusing DP state from the previous call. Intended for use in partial-result callbacks where the recognized text grows or changes slightly between calls. Call reset() when starting a new utterance to clear the cache. Not thread-safe. Do not call concurrently from multiple threads.
      Parameters:
      recognized - Recognized text from the ASR engine. null is treated as an empty string.
      Returns:
      Alignment result. Never null.
      Throws:
      IllegalStateException - if this aligner has been closed.
    • incrementalAlign

      public final KASRAlignmentResult incrementalAlign(String recognized, KASRAlignmentConfig config)
      Incremental align with custom configuration. Changing cost values between successive calls invalidates the cache and forces a full recompute on the next call.
      Parameters:
      recognized - Recognized text from the ASR engine. null is treated as an empty string.
      config - Alignment configuration. Pass null for defaults.
      Returns:
      Alignment result. Never null.
      Throws:
      IllegalStateException - if this aligner has been closed.
    • reset

      public final void reset()
      Drops cached DP state. Call between utterances when using incrementalAlign(java.lang.String).
      Throws:
      IllegalStateException - if this aligner has been closed.
    • close

      public void close()
      Releases the native resources associated with this aligner. After this method is called, any method that accesses native data throws IllegalStateException.

      Idempotent — calling it multiple times has no effect. Also called by the finalizer as a safety net, but callers should not rely on finalization for timely resource cleanup.

      Specified by:
      close in interface AutoCloseable
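
The close contract described above (idempotent close, IllegalStateException on use after close) pairs naturally with try-with-resources. The sketch below uses a hypothetical stand-in class, ClosableHandleSketch, rather than the aligner itself so that it runs without the SDK. Its use method and releaseCount field are illustrative names, and it is single-threaded by assumption.

```java
/** Hypothetical stand-in for a native-backed object; not part of the KeenASR SDK. */
final class ClosableHandleSketch implements AutoCloseable {
    private boolean closed;
    /** How many times the simulated native object was actually released. */
    int releaseCount;

    /** Any operation that touches native data checks the closed flag first. */
    String use() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        return "ok";
    }

    /** Idempotent: only the first call releases anything. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        releaseCount++;
    }

    public static void main(String[] args) {
        ClosableHandleSketch handle = new ClosableHandleSketch();
        // try-with-resources calls close() when the block exits, even on exceptions.
        try (ClosableHandleSketch h = handle) {
            h.use();
        }
        handle.close(); // second close has no effect
        try {
            handle.use();
        } catch (IllegalStateException expected) {
            System.out.println("use after close rejected");
        }
    }
}
```

The same try-with-resources shape applies to any AutoCloseable, including an aligner constructed with one of the documented constructors.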