====== Preprocessing ======

SPRAAK comes with a range of filters and other signal processing algorithms that can be used to preprocess audio before it is sent to a recogniser. These algorithms are used, for example, to extract the standard MFCC features on which recognition takes place. See [[http://www.spraak.org/documentation/doxygen/doc/html/spr__um__feat.html]] for an example preprocessing and feature extraction script.

===== Tips & Tricks =====

==== Preprocessing and existing acoustic models ====

When using existing acoustic models, it is probably unwise to change the settings in the preprocessing script. This was tried with the models received from KU Leuven, which were trained on the CGN corpus and retrained on elderly speech from the JASMIN corpus. The preprocessing setting 'flength' (the audio frame length) was changed from 0.025s to 0.032s, which resulted in better freephone recognition rates. However, in word recognition experiments with a bigram language model, the results contained tags (explained [[http://www.spraak.org/documentation/doxygen/doc/html/spr__ling__transcriptions.html|here]]). Returning flength to 0.025s resolved this issue.
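
To make the size of this change concrete, the following minimal sketch computes how many samples each frame covers for the two flength values. It is plain Python and not part of SPRAAK: the function name ''frame_layout'', the 16 kHz sample rate and the 10 ms frame shift are assumptions chosen for illustration, so check them against the actual preprocessing script.

```python
def frame_layout(flength_s, sample_rate_hz=16000, fshift_s=0.010, duration_s=1.0):
    """Return (samples per frame, number of full frames) for a signal.

    sample_rate_hz and fshift_s are illustrative assumptions, not values
    taken from the SPRAAK preprocessing script.
    """
    frame_len = round(flength_s * sample_rate_hz)    # samples covered by one frame
    frame_shift = round(fshift_s * sample_rate_hz)   # samples between frame starts
    total = round(duration_s * sample_rate_hz)       # samples in the whole signal
    if total < frame_len:
        return frame_len, 0                          # signal too short for one frame
    n_frames = 1 + (total - frame_len) // frame_shift
    return frame_len, n_frames


for flength in (0.025, 0.032):
    samples, frames = frame_layout(flength)
    print(f"flength={flength:.3f}s: {samples} samples per frame, "
          f"{frames} full frames in 1 s of audio")
```

Under these assumptions a frame grows from 400 to 512 samples, so every feature vector is computed over a window that is 28% longer than before. This suggests the features may no longer match those the acoustic models were trained on.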