
Language model settings

SPRAAK supports both Finite State Grammars (FSGs) and N-gram language models. Each type has its own set of settings in the recognition process. Below is an explanation of the settings as I used them:

The parameters can be set with the following command on the interactive SPRAAK command line (spr_cwr_main): search lmi cost_A cost_C

FSGs

For FSGs I used the following settings in recognition experiments:

  • fail_cost (set inside the text file defining the FSG): the threshold at which the recognition of a word fails and results in <PARTIAL> tags. Adjusting this value up or down (it can also be negative) helps remove the <PARTIAL> tags from the recognition results.

N-gram

For N-gram language models (e.g. bigram or trigram) I used the following settings in recognition experiments:

  • cost_C (in the .ini config file for the recognition experiment): word startup cost, i.e. a cost that is added whenever a new word is started.
  • cost_A (in the .ini config file for the recognition experiment): scaling factor of the language model (LM) relative to the acoustic model (AM).

See also http://www.spraak.org/documentation/doxygen/doc/html/spr__lm__scale.html for detailed information on cost_C and cost_A.
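
To illustrate how cost_A and cost_C interact, here is a minimal sketch in Python. It assumes the combination commonly used in speech decoders: each word's LM log-probability is scaled by cost_A, and cost_C is added once per word. The function name, the argument names and the sign convention are assumptions made for illustration only, not SPRAAK's actual implementation; the page linked above is the authoritative description.

```python
def combined_log_score(am_logprob, lm_logprobs, cost_a, cost_c):
    """Hypothetical score of one hypothesis in the log domain.

    am_logprob  : total acoustic model log-probability of the hypothesis
    lm_logprobs : one LM log-probability per word in the hypothesis
    cost_a      : scaling factor of the LM relative to the AM (assumed usage)
    cost_c      : term added every time a new word is started (assumed sign)
    """
    # Each word contributes its scaled LM score plus the per-word cost.
    lm_term = sum(cost_a * lp + cost_c for lp in lm_logprobs)
    return am_logprob + lm_term


# Two hypotheses with the same acoustic and total LM score,
# but a different number of words.
two_words = combined_log_score(-100.0, [-2.0, -3.0], cost_a=2.0, cost_c=-1.0)
three_words = combined_log_score(-100.0, [-2.0, -2.0, -1.0], cost_a=2.0, cost_c=-1.0)
print(two_words, three_words)
```

Under these assumptions, a more negative cost_c penalises hypotheses with many short words, while a larger cost_a makes the decoder rely more on the language model than on the acoustics.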

language_model_settings.1430381529.txt.gz · Last modified: 2015/04/30 10:12 by mganzeboom