<div align="center">
<img src="https://raw.githubusercontent.com/k2-fsa/icefall/master/docs/source/_static/logo.png" width=168>
</div>

## Installation

`icefall` depends on [k2][k2] for FSA operations and [lhotse][lhotse] for
data preparation, so you have to install these dependencies first.
Please refer to <https://icefall.readthedocs.io/en/latest/installation/index.html>
for installation.
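
The linked documentation is authoritative; the following is only a rough
sketch of the usual steps (adapted from an earlier version of this README,
so adjust the paths to your setup). `icefall` itself is a set of Python
scripts, so after installing [k2][k2] and [lhotse][lhotse] you only need to
clone it, install its Python requirements, and put it on your `PYTHONPATH`:

```bash
# Clone icefall and install its Python dependencies.
cd $HOME/open-source
git clone https://github.com/k2-fsa/icefall
cd icefall
pip install -r requirements.txt

# Make icefall importable from anywhere.
export PYTHONPATH=$HOME/open-source/icefall:$PYTHONPATH

# Verify the setup; this should print the path to the icefall package.
python3 -c "import icefall; print(icefall.__file__)"
```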

## Recipes

Please refer to <https://icefall.readthedocs.io/en/latest/recipes/index.html>
for more information.

We provide two recipes at present:

  - [yesno][yesno]
  - [LibriSpeech][librispeech]

### yesno

This is the simplest ASR recipe in `icefall` and can be run on CPU.
Training takes less than 30 seconds and gives you the following WER:

```
[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
```
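
In this output, the bracketed numbers break the errors down: 1 word error
(0 insertions, 1 deletion, 0 substitutions) out of 240 reference words, and
WER = (insertions + deletions + substitutions) / reference words. You can
check the arithmetic with a one-liner:

```bash
# (0 ins + 1 del + 0 sub) / 240 reference words = 0.42% WER
python3 -c "print(f'{(0 + 1 + 0) / 240:.2%}')"
```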

We also provide a Colab notebook for this recipe:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
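
If you prefer to run the recipe locally instead of in Colab, the steps look
roughly like this. This is only a sketch: the script and directory names
below (`prepare.sh`, `tdnn/train.py`, `tdnn/decode.py`) are assumptions
based on the usual icefall recipe layout, so check `egs/yesno/ASR` for the
actual entry points.

```bash
# Run everything from the recipe directory.
cd egs/yesno/ASR

# Download the yesno data and prepare the features and lang directory.
./prepare.sh

# Train the model; on CPU this takes less than half a minute.
./tdnn/train.py

# Decode the test set and print the WER.
./tdnn/decode.py
```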

### LibriSpeech

We provide two models for this recipe: a [conformer CTC model][LibriSpeech_conformer_ctc]
and a [TDNN LSTM CTC model][LibriSpeech_tdnn_lstm_ctc].

#### Conformer CTC Model

The best WER we currently have is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 2.57%      | 5.94%      |

We provide a Colab notebook to run a pre-trained conformer CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing)

See [egs/librispeech/ASR/conformer_ctc/README.md](egs/librispeech/ASR/conformer_ctc/README.md)
for how to use the pre-trained models.

#### TDNN LSTM CTC Model

The WER for this model is:

|     | test-clean | test-other |
|-----|------------|------------|
| WER | 6.59%      | 17.69%     |

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kNmDXNMwREi0rZGAOIAOJo93REBuOTcd?usp=sharing)

[LibriSpeech_tdnn_lstm_ctc]: egs/librispeech/ASR/tdnn_lstm_ctc
[LibriSpeech_conformer_ctc]: egs/librispeech/ASR/conformer_ctc
[yesno]: egs/yesno/ASR
[librispeech]: egs/librispeech/ASR
[k2]: https://github.com/k2-fsa/k2
[lhotse]: https://github.com/lhotse-speech/lhotse