Class KASRAlignmentConfig

java.lang.Object
com.keenresearch.keenasr.KASRAlignmentConfig

public class KASRAlignmentConfig extends Object
Per-call configuration for a text-alignment operation. With the defaults, alignment is classic word-level Levenshtein edit distance with unit costs, and noise-token filtering is enabled. These defaults suit most oral-reading and word error rate (WER) use cases; to override a setting, instantiate this class and assign the relevant field. Fields are public for convenience: this is a configuration data object, not an invariant-bearing class.
  • Field Details

    • insertCost

      public float insertCost
      Cost of inserting a recognized token (one not present in the reference). Default: 1.0.
    • deleteCost

      public float deleteCost
      Cost of deleting a reference token (one not present in the recognized text). Default: 1.0.
    • substituteCost

      public float substituteCost
      Cost of substituting one token for another. When substituteCost > insertCost + deleteCost, a deletion followed by an insertion is always cheaper than a substitution, so the aligner degenerates to LCS-style behavior (no substitutions emitted). Default: 1.0.
    • detectRepetitions

      public boolean detectRepetitions
      When true, the result populates KASRAlignmentResult.getRepetitionRefIndices() for words the reader appears to have stuttered or repeated. Default: false.
    • filterNoiseTokens

      public boolean filterNoiseTokens
      When true, recognized tokens whose text begins with '<' (e.g. <SPOKEN_NOISE>, <UNK>) are dropped before alignment. Leave this enabled when the recognized text comes from an ASR result. Default: true.
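To make the cost fields and the noise filter concrete, here is a minimal sketch of a textbook weighted word edit distance. It is not the KeenASR implementation: the class name WeightedAlignSketch and its methods are hypothetical, tokens are compared with exact, case-sensitive equality (an assumption), and only the total cost is computed, not the alignment itself.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a textbook weighted edit distance, not the SDK's aligner.
class WeightedAlignSketch {

    // Drops tokens that begin with '<', mirroring the documented filterNoiseTokens rule.
    static List<String> dropNoise(List<String> recognized) {
        List<String> kept = new ArrayList<>();
        for (String token : recognized) {
            if (!token.startsWith("<")) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Minimum total cost of editing the reference tokens into the recognized tokens.
    static float distance(List<String> ref, List<String> hyp,
                          float insertCost, float deleteCost, float substituteCost) {
        float[][] d = new float[ref.size() + 1][hyp.size() + 1];
        // First column: every reference token deleted. First row: every recognized token inserted.
        for (int i = 1; i <= ref.size(); i++) d[i][0] = i * deleteCost;
        for (int j = 1; j <= hyp.size(); j++) d[0][j] = j * insertCost;
        for (int i = 1; i <= ref.size(); i++) {
            for (int j = 1; j <= hyp.size(); j++) {
                boolean same = ref.get(i - 1).equals(hyp.get(j - 1));
                float diag = d[i - 1][j - 1] + (same ? 0f : substituteCost); // match or substitution
                float del = d[i - 1][j] + deleteCost;                        // reference token missing
                float ins = d[i][j - 1] + insertCost;                        // extra recognized token
                d[i][j] = Math.min(diag, Math.min(del, ins));
            }
        }
        return d[ref.size()][hyp.size()];
    }

    public static void main(String[] args) {
        List<String> ref = List.of("the", "cat", "sat");
        List<String> hyp = dropNoise(List.of("the", "<SPOKEN_NOISE>", "hat", "sat"));
        System.out.println(distance(ref, hyp, 1f, 1f, 1f));   // unit costs: one substitution
        System.out.println(distance(ref, hyp, 1f, 1f, 2.5f)); // costly substitution: delete + insert
    }
}
```

With unit costs the sketch reports a total cost of 1.0 (a single substitution of "cat" by "hat"). Raising the substitution cost to 2.5, above insertCost + deleteCost, yields 2.0 because a deletion plus an insertion becomes the cheaper path, which is the LCS-style behavior described for substituteCost.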
  • Constructor Details

    • KASRAlignmentConfig

      public KASRAlignmentConfig()
      Constructs a configuration object with default values.
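
The override pattern described above might look like the following sketch. So that it compiles without the SDK on the classpath, it uses a hypothetical stand-in class, AlignmentConfigStandIn, that mirrors the documented fields and defaults; real code would instantiate com.keenresearch.keenasr.KASRAlignmentConfig instead. How the configuration is passed to the alignment call is not covered on this page, so that step is omitted.

```java
// Hypothetical stand-in mirroring the documented public fields and their defaults.
// In an application, use com.keenresearch.keenasr.KASRAlignmentConfig instead.
class AlignmentConfigStandIn {
    public float insertCost = 1.0f;
    public float deleteCost = 1.0f;
    public float substituteCost = 1.0f;
    public boolean detectRepetitions = false;
    public boolean filterNoiseTokens = true;
}

class ConfigUsageSketch {

    // Starts from the defaults and enables repetition detection for an oral-reading task.
    static AlignmentConfigStandIn oralReadingConfig() {
        AlignmentConfigStandIn config = new AlignmentConfigStandIn();
        config.detectRepetitions = true;
        return config;
    }

    // Starts from the defaults and makes substitutions cost more than a delete plus an insert.
    static AlignmentConfigStandIn lcsStyleConfig() {
        AlignmentConfigStandIn config = new AlignmentConfigStandIn();
        config.substituteCost = config.insertCost + config.deleteCost + 0.5f;
        return config;
    }

    public static void main(String[] args) {
        AlignmentConfigStandIn reading = oralReadingConfig();
        AlignmentConfigStandIn lcs = lcsStyleConfig();
        System.out.println("detectRepetitions=" + reading.detectRepetitions);
        System.out.println("substituteCost=" + lcs.substituteCost);
    }
}
```

Because the object is per-call, creating a fresh instance for each distinct scenario keeps one caller's overrides from leaking into another's.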