SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

zhhao1/SpecAugment

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le
Interspeech 2019

About

This repository contains an implementation of the augmentation methodology proposed in the above paper.

Base Input

SpecAugmented Output (Policy = 'LB')
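The masking part of the method can be illustrated with a minimal NumPy sketch. It is not this repository's code: the function name `mask_spectrogram`, the `POLICIES` dictionary, and the choice to fill masks with zeros are assumptions made for illustration. The parameter values are those given in Table 1 of the paper, where LB and LD stand for "LibriSpeech basic" and "LibriSpeech double", and SM and SS for "Switchboard mild" and "Switchboard strong". The time-warping step, which the paper applies before masking, is left out here.

```python
import numpy as np

# Policy parameters as given in Table 1 of the SpecAugment paper.
# W: time-warp parameter (not used in this sketch)
# F: maximum frequency-mask width, m_F: number of frequency masks
# T: maximum time-mask width,      m_T: number of time masks
# p: upper bound on a time mask's width, as a fraction of the number of frames
POLICIES = {
    "LB": dict(W=80, F=27, m_F=1, T=100, p=1.0, m_T=1),
    "LD": dict(W=80, F=27, m_F=2, T=100, p=1.0, m_T=2),
    "SM": dict(W=40, F=15, m_F=2, T=70, p=0.2, m_T=2),
    "SS": dict(W=40, F=27, m_F=2, T=70, p=0.2, m_T=2),
}


def mask_spectrogram(spec, policy="LB", rng=None):
    """Apply frequency and time masking to a (freq_bins, frames) array.

    Hypothetical helper: masked regions are set to zero, and the input
    array is left unmodified.
    """
    rng = np.random.default_rng() if rng is None else rng
    params = POLICIES[policy]
    out = np.array(spec, dtype=float, copy=True)
    num_bins, num_frames = out.shape

    # Frequency masking: zero out f consecutive bins starting at f0.
    max_f = min(params["F"], num_bins)
    for _ in range(params["m_F"]):
        f = int(rng.integers(0, max_f + 1))               # f in [0, F]
        f0 = int(rng.integers(0, max(1, num_bins - f)))   # f0 in [0, bins - f)
        out[f0:f0 + f, :] = 0.0

    # Time masking: zero out t consecutive frames starting at t0,
    # with t capped at p * num_frames.
    max_t = min(params["T"], int(params["p"] * num_frames))
    for _ in range(params["m_T"]):
        t = int(rng.integers(0, max_t + 1))                 # t in [0, max_t]
        t0 = int(rng.integers(0, max(1, num_frames - t)))   # t0 in [0, frames - t)
        out[:, t0:t0 + t] = 0.0

    return out
```

Called as `mask_spectrogram(log_mel, policy="LB")` on a log-mel spectrogram, this blanks at most one horizontal band and one vertical band, which is the kind of pattern the 'LB' output above is meant to show.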

Requirements:

  1. python3
  2. librosa
  3. libsndfile
  4. audioread
  5. ffmpeg
  6. numpy
  7. tensorflow
  8. tensorflow_addons

Usage:

main.py [--dir][--policy]

--dir | path/to/dataset | default='./LibriSpeech/'
--policy | augmentation policy to use from {'LB', 'LD', 'SS', 'SM'} | default='LD'

OR

Refer to demo/demo.ipynb for a Jupyter notebook demo.
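The two command-line options listed above could be declared with `argparse` roughly as follows. This is a sketch based only on the option names, choices, and defaults documented here; the actual `main.py` may parse its arguments differently, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented --dir and --policy options."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Apply SpecAugment to a dataset (illustrative sketch).",
    )
    parser.add_argument(
        "--dir",
        default="./LibriSpeech/",
        help="path/to/dataset",
    )
    parser.add_argument(
        "--policy",
        default="LD",
        choices=["LB", "LD", "SS", "SM"],
        help="augmentation policy to use",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"dataset dir: {args.dir}, policy: {args.policy}")
```

With a parser like this, `python main.py --policy LB` would use the default dataset path and the 'LB' policy, and an unknown policy name would be rejected before any audio is processed.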

References:

  1. @article{Park_2019, title={SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition}, url={http://dx.doi.org/10.21437/Interspeech.2019-2680}, DOI={10.21437/interspeech.2019-2680}, journal={Interspeech 2019}, publisher={ISCA}, author={Park, Daniel S. and Chan, William and Zhang, Yu and Chiu, Chung-Cheng and Zoph, Barret and Cubuk, Ekin D. and Le, Quoc V.}, year={2019}, month={Sep} }
