====== Kaldi ASR Toolkit ======

This page collects information about the Kaldi ASR Toolkit, such as the URLs and paths where it can be found. Kaldi is a more recent ASR toolkit than SPRAAK. Like SPRAAK, it contains functionality to train different types of GMM-HMM acoustic models, but it also supports various types of Deep Neural Networks (DNNs), the current standard in ASR. This page provides links to Kaldi's own documentation pages and tips & tricks on how to use Kaldi in certain contexts. Feel free to add experiences that you think are useful to others (i.e. so that nobody has to 'reinvent the wheel').

===== Recommendation (DEPRECATED!): use your own LaMachine on Ponyland =====

It is recommended to use your own LaMachine to run Kaldi on Ponyland instead of the shared one (however, LaMachine is now deprecated!). The instructions to install and prepare your own LaMachine are here: [[https://www.cls.ru.nl/clst-asr/doku.php?id=kaldi_and_kald_nl_on_ponyland#step_1prerequisites|Your own Kaldi-CLAM-LaMachine]] (just Step 1: Prerequisites). Please note that you should replace 'cristian' with another name in every step.

Once you have your own LaMachine-CLAM-Kaldi, you can use these commands to activate your environment:

  $ ssh thunderlane
  $ lamachine-lacristianmachine-activate
  (lacristianmachine)$ cd `echo $KALDI_ROOT/egs`

Also, //thunderlane// and //rarity// are the best servers for working with Kaldi.

===== Details =====

**Name:** Kaldi ASR\\
**Type:** open source software\\
**Developer info page:** [[http://www.kaldi-asr.org]]\\
**Documentation page:** [[http://www.kaldi-asr.org/doc/]]\\
**Compile and installation instructions:** [[http://kaldi-asr.org/doc/install.html]]\\
**Location of Kaldi sources:** [[https://github.com/kaldi-asr/kaldi]]\\
**Location of linux x86_64 compiled binaries (usable on Ponyland):** applejack.science.ru.nl:/vol/custom/opt/lamachine2/opt/kaldi. This is a central location for the Kaldi binaries and (for now) seems to always hold the most recent version.
These are compiled with NVIDIA CUDA 9.1.85 GPU support. In other words, [[https://developer.nvidia.com/cuda-downloads|NVIDIA CUDA 9.1]] should be installed on the target machine (it is already installed on Ponyland).\\
Note: to access the applejack location above, a Ponyland SSH account is required, which can be requested from [[http://organisatiegids.ru.nl/tabProfiel.aspx?isEngels=true&RBSID=394272&frm=organisatiegids&doHide=False/|Wessel Stoop]].

===== AlexASR: a Kaldi-based incremental online decoder =====

AlexASR is an incremental online decoder based on Kaldi. It can be used if you would like to use ASR in a time-sensitive context. It decodes the speech immediately as it comes in and only requires some finalization after the last audio packet has been received. For example, we used this decoder in the game developed in the CHASING project. To reduce the time players had to wait for ASR results, AlexASR was used to decode speech while it was being recorded from the player.\\
**Location of sources and info:** [[https://github.com/UFAL-DSG/alex-asr]]\\
See also [[alex_asr|Alex ASR]].

===== Useful tutorial links =====

There are some useful tutorials for getting to know the Kaldi toolkit and how to operate it for various tasks.
Below is a list sorted by topic.

**Training acoustic models**
  * A Kaldi introduction by former CLST employee Emre Yilmaz: [[kaldi_intro|Developing ASR Systems Using the KALDI toolkit]]
  * Kaldi's own relatively detailed tutorial providing an overview of the toolkit: [[http://kaldi-asr.org/doc/tutorial.html]]
  * Kaldi for Dummies, a tutorial for absolute beginners on creating a simple ASR system: [[http://kaldi-asr.org/doc/kaldi_for_dummies.html]]
  * Eleanor Chodroff's tutorial: [[http://pages.jh.edu/~echodro1/tutorial/kaldi/kaldi-intro.html]]
  * Josh Meyer's tutorial on training nnet2 models (very useful for getting to know the basics of the existing training scripts in Kaldi): [[http://jrmeyer.github.io/kaldi/2016/12/15/DNN-AM-Kaldi.html]]
  * Johns Hopkins tutorial: [[https://www.dropbox.com/s/az474t7trsqkpri/2016-05-SLTU-Workshop.pdf?dl=0]] or {{:wiki:2016-05-sltu-workshop.pdf|}}
  * The official Kaldi lectures: http://www.danielpovey.com/kaldi-lectures.html

**More practical tutorials**:
  * https://github.com/SethiPawandeep/kaldi-for-dummies (with raw audio data)
  * https://github.com/keighrim/kaldi-yesno-tutorial
  * You might find this repo useful for the keighrim tutorial: https://github.com/jmolina116/kaldi-yesno-tutorial (with raw audio data)
  * https://github.com/jmolina116/kaldi-digits (with raw audio data)
  * http://www.eleanorchodroff.com/tutorial/kaldi/training-acoustic-models.html
  * https://www.oxinabox.net/Kaldi-Notes/tidigits/
  * https://www.oxinabox.net/Kaldi-Notes/tidigits/eval
  * https://www.oxinabox.net/Kaldi-Notes/tidigits/lang_prep
  * http://jrmeyer.github.io/asr/2019/08/17/Kaldi-troubleshooting.html
  * http://codingandlearning.blogspot.com/search/label/KWS14

**Forced alignment using existing acoustic models**
  * Eleanor Chodroff's page: [[https://www.eleanorchodroff.com/tutorial/kaldi/forced-alignment.html]]