Commit 96e7f5c

Release v0.1 (#26)
1 parent f4223ee commit 96e7f5c

2 files changed: +39 −119 lines

2 files changed

+39
-119
lines changed

README.md

+37 −56

````diff
@@ -1,80 +1,61 @@
-
-# Table of Contents
-
-- [Installation](#installation)
-  * [Install k2](#install-k2)
-  * [Install lhotse](#install-lhotse)
-  * [Install icefall](#install-icefall)
-- [Run recipes](#run-recipes)
+<div align="center">
+<img src="https://raw.githubusercontent.com/k2-fsa/icefall/master/docs/source/_static/logo.png" width=168>
+</div>
 
 ## Installation
 
-`icefall` depends on [k2][k2] for FSA operations and [lhotse][lhotse] for
-data preparations. To use `icefall`, you have to install its dependencies first.
-The following subsections describe how to setup the environment.
-
-CAUTION: There are various ways to setup the environment. What we describe
-here is just one alternative.
+Please refer to <https://icefall.readthedocs.io/en/latest/installation/index.html>
+for installation.
 
-### Install k2
+## Recipes
 
-Please refer to [k2's installation documentation][k2-install] to install k2.
-If you have any issues about installing k2, please open an issue at
-<https://github.com/k2-fsa/k2/issues>.
+Please refer to <https://icefall.readthedocs.io/en/latest/recipes/index.html>
+for more information.
 
-### Install lhotse
+We provide two recipes at present:
 
-Please refer to [lhotse's installation documentation][lhotse-install] to install
-lhotse.
+- [yesno][yesno]
+- [LibriSpeech][librispeech]
 
-### Install icefall
+### yesno
 
-`icefall` is a set of Python scripts. What you need to do is just to set
-the environment variable `PYTHONPATH`:
+This is the simplest ASR recipe in `icefall` and can be run on CPU.
+Training takes less than 30 seconds and gives you the following WER:
 
-```bash
-cd $HOME/open-source
-git clone https://github.com/k2-fsa/icefall
-cd icefall
-pip install -r requirements.txt
-export PYTHONPATH=$HOME/open-source/icefall:$PYTHONPATH
 ```
-
-To verify `icefall` was installed successfully, you can run:
-
-```bash
-python3 -c "import icefall; print(icefall.__file__)"
+[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
 ```
+We do provide a Colab notebook for this recipe.
 
-It should print the path to `icefall`.
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
 
-## Recipes
 
-At present, two recipes are provided:
+### LibriSpeech
 
-- [LibriSpeech][LibriSpeech]
-- [yesno][yesno] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tIjjzaJc3IvGyKiMCDWO-TSnBgkcuN3B?usp=sharing)
+We provide two models for this recipe: [conformer CTC model][LibriSpeech_conformer_ctc]
+and [TDNN LSTM CTC model][LibriSpeech_tdnn_lstm_ctc].
 
-### Yesno
+#### Conformer CTC Model
 
-For the yesno recipe, training with 50 epochs takes less than 2 minutes using **CPU**.
+The best WER we currently have is:
 
-The WER is
+||test-clean|test-other|
+|--|--|--|
+|WER| 2.57% | 5.94% |
 
-```
-[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
-```
+We provide a Colab notebook to run a pre-trained conformer CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing)
+
+#### TDNN LSTM CTC Model
 
-## Use Pre-trained models
+The WER for this model is:
 
-See [egs/librispeech/ASR/conformer_ctc/README.md](egs/librispeech/ASR/conformer_ctc/README.md)
-for how to use pre-trained models.
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing)
+||test-clean|test-other|
+|--|--|--|
+|WER| 6.59% | 17.69% |
 
+We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kNmDXNMwREi0rZGAOIAOJo93REBuOTcd?usp=sharing)
 
-[yesno]: egs/yesno/ASR/README.md
-[LibriSpeech]: egs/librispeech/ASR/README.md
-[k2-install]: https://k2.readthedocs.io/en/latest/installation/index.html#
-[k2]: https://github.com/k2-fsa/k2
-[lhotse]: https://github.com/lhotse-speech/lhotse
-[lhotse-install]: https://lhotse.readthedocs.io/en/latest/getting-started.html#installation
+[LibriSpeech_tdnn_lstm_ctc]: egs/librispeech/ASR/tdnn_lstm_ctc
+[LibriSpeech_conformer_ctc]: egs/librispeech/ASR/conformer_ctc
+[yesno]: egs/yesno/ASR
+[librispeech]: egs/librispeech/ASR
````
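
Both recipes report results as word error rate (WER); the yesno line `[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]` means one deletion among 240 reference words. As a rough sketch of how such a figure is computed (illustrative only, not icefall's actual scoring code), WER is the word-level edit distance between reference and hypothesis divided by the number of reference words:

```python
# Minimal WER sketch: word-level edit distance / reference length.
# Illustrative only -- not icefall's scoring implementation.
def wer(ref: str, hyp: str) -> float:
    r, h = ref.split(), hyp.split()
    # dp[i][j] = minimum edits to turn r[:i] into h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / len(r)

print(f"{wer('yes no yes', 'yes yes') * 100:.2f}%")  # one deletion in 3 words: 33.33%
```

Backtracking through the same table recovers the per-operation counts, which is where breakdowns like `0 ins, 1 del, 0 sub` come from.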

egs/librispeech/ASR/README.md

+2 −63

````diff
@@ -1,64 +1,3 @@
 
-## Data preparation
-
-If you want to use `./prepare.sh` to download everything for you,
-you can just run
-
-```
-./prepare.sh
-```
-
-If you have pre-downloaded the LibriSpeech dataset, please
-read `./prepare.sh` and modify it to point to the location
-of your dataset so that it won't re-download it. After modification,
-please run
-
-```
-./prepare.sh
-```
-
-The script `./prepare.sh` prepares features, lexicon, LMs, etc.
-All generated files are saved in the folder `./data`.
-
-**HINT:** `./prepare.sh` supports options `--stage` and `--stop-stage`.
-
-## TDNN-LSTM CTC training
-
-The folder `tdnn_lstm_ctc` contains scripts for CTC training
-with TDNN-LSTM models.
-
-Pre-configured parameters for training and decoding are set in the function
-`get_params()` within `tdnn_lstm_ctc/train.py`
-and `tdnn_lstm_ctc/decode.py`.
-
-Parameters that can be passed from the command-line can be found by
-
-```
-./tdnn_lstm_ctc/train.py --help
-./tdnn_lstm_ctc/decode.py --help
-```
-
-If you have 4 GPUs on a machine and want to use GPU 0, 2, 3 for
-multi-GPU training, you can run
-
-```
-export CUDA_VISIBLE_DEVICES="0,2,3"
-./tdnn_lstm_ctc/train.py \
-  --master-port 12345 \
-  --world-size 3
-```
-
-If you want to decode by averaging checkpoints `epoch-8.pt`,
-`epoch-9.pt` and `epoch-10.pt`, you can run
-
-```
-./tdnn_lstm_ctc/decode.py \
-  --epoch 10 \
-  --avg 3
-```
-
-## Conformer CTC training
-
-The folder `conformer_ctc` contains scripts for CTC training
-with conformer models. The steps of running the training and
-decoding are similar to `tdnn_lstm_ctc`.
+Please refer to <https://icefall.readthedocs.io/en/latest/recipes/librispeech.html>
+for how to run models in this recipe.
````
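
The removed decoding example (`--epoch 10 --avg 3`) refers to checkpoint averaging: the parameters stored in `epoch-8.pt`, `epoch-9.pt`, and `epoch-10.pt` are averaged element-wise before decoding. A minimal sketch of the idea, using plain dicts of floats in place of PyTorch state dicts (the function below is illustrative, not icefall's actual checkpoint-loading helper):

```python
# Sketch of checkpoint averaging: element-wise mean over identically-keyed
# parameter dicts. Plain floats stand in for tensors; the real recipe loads
# PyTorch state dicts from epoch-*.pt files.
def average_checkpoints(state_dicts):
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n for k in state_dicts[0]}

# Stand-ins for epoch-8.pt, epoch-9.pt, epoch-10.pt:
ckpts = [
    {"w": 1.0, "b": 0.0},
    {"w": 2.0, "b": 0.3},
    {"w": 3.0, "b": 0.6},
]
averaged = average_checkpoints(ckpts)
print(averaged["w"])  # 2.0
```

Averaging the last few checkpoints smooths out epoch-to-epoch noise in the weights, which is why decoding with `--avg 3` can score better than decoding with the final checkpoint alone.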
