This repository demonstrates the application of latent space simulators in PyTorch to two exemplar biomolecules: alanine dipeptide (ADP) and the WLALL pentapeptide.
The Latent Space Simulators (LSS) framework provides a workflow for generating simulation data by splitting the task into three relatively simple supervised learning problems:
- Starting from the simulation data used for training, use State-free (non-)reversible VAMPnets (SNRVs) to learn a low-dimensional encoding of the system kinetics. The typical procedure is to featurize each simulation snapshot with a roto-translationally invariant descriptor (such as pairwise distances). The SNRV then maps these high-dimensional features into a low-dimensional embedding that captures the slow (maximally autocorrelated) kinetics of your system.
- Given the SNRV embedding, we then train a Mixture Density Network (MDN) propagator that learns to propagate the trajectory from one timestep to the next within the low-dimensional kinetic subspace. Using this MDN we can generate synthetic trajectories in the space of our low-dimensional SNRV coordinates at a fraction of the cost of propagating dynamics in the full 3N-dimensional configurational space.
- In the final step, we train a generative model, either a Generative Adversarial Network (GAN) or a Denoising Diffusion Probabilistic Model (DDPM), to produce realistic coordinates associated with any given low-dimensional SNRV embedding. Using this trained model we can decode the synthetic trajectory generated by the MDN in the low-dimensional SNRV space into the associated full-dimensional coordinate space, then render these structures to produce a synthetic molecular trajectory.
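
The first two steps can be illustrated with a minimal PyTorch sketch. It is not the API of the SNRV or MDN packages used in the notebooks: `pairwise_distances`, `ToyMDNPropagator`, and every hyperparameter below are hypothetical names chosen for illustration. The sketch shows a roto-translationally invariant featurization, then a small mixture density network that is trained by negative log-likelihood and sampled repeatedly to roll out a synthetic latent trajectory.

```python
import torch
import torch.nn as nn


def pairwise_distances(coords: torch.Tensor) -> torch.Tensor:
    """Map (n_frames, n_atoms, 3) coordinates to the unique pairwise
    distances, shape (n_frames, n_atoms * (n_atoms - 1) // 2)."""
    n_atoms = coords.shape[1]
    i, j = torch.triu_indices(n_atoms, n_atoms, offset=1)
    return (coords[:, i] - coords[:, j]).norm(dim=-1)


class ToyMDNPropagator(nn.Module):
    """Hypothetical MDN: Gaussian mixture over z(t + tau) given z(t)."""

    def __init__(self, dim: int, n_components: int = 4, hidden: int = 32):
        super().__init__()
        self.dim, self.k = dim, n_components
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh())
        self.logits = nn.Linear(hidden, n_components)
        self.means = nn.Linear(hidden, n_components * dim)
        self.log_sigmas = nn.Linear(hidden, n_components * dim)

    def forward(self, z):
        h = self.body(z)
        means = self.means(h).view(-1, self.k, self.dim)
        # exp keeps the standard deviations strictly positive
        sigmas = self.log_sigmas(h).view(-1, self.k, self.dim).exp()
        return self.logits(h), means, sigmas

    def nll(self, z_t, z_next):
        """Negative log-likelihood of z_next under the predicted mixture."""
        logits, means, sigmas = self(z_t)
        comp = torch.distributions.Normal(means, sigmas)
        log_prob = comp.log_prob(z_next.unsqueeze(1)).sum(-1)  # (batch, k)
        log_mix = torch.log_softmax(logits, dim=-1) + log_prob
        return -torch.logsumexp(log_mix, dim=-1).mean()

    @torch.no_grad()
    def rollout(self, z0, n_steps):
        """Sample a latent trajectory of shape (batch, n_steps + 1, dim)."""
        z, traj = z0, [z0]
        rows = torch.arange(z0.shape[0])
        for _ in range(n_steps):
            logits, means, sigmas = self(z)
            # pick one mixture component per sample, then draw from it
            k = torch.distributions.Categorical(logits=logits).sample()
            z = torch.normal(means[rows, k], sigmas[rows, k])
            traj.append(z)
        return torch.stack(traj, dim=1)


# Toy usage on random data (stand-ins for real trajectories and embeddings).
coords = torch.randn(5, 4, 3)            # 5 frames, 4 atoms
features = pairwise_distances(coords)    # shape (5, 6)
propagator = ToyMDNPropagator(dim=2)
loss = propagator.nll(torch.randn(8, 2), torch.randn(8, 2))
latent_traj = propagator.rollout(torch.zeros(3, 2), n_steps=10)  # (3, 11, 2)
```

In the full workflow the latent coordinates would come from a trained SNRV rather than random tensors, and the sampled latent trajectory would be passed to the GAN or DDPM decoder to recover molecular coordinates.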
Jupyter notebooks demonstrating the LSS workflow can be found in ADP_backbone_LSS.ipynb and WLALL_backbone_LSS.ipynb.
To run the examples, you will need to clone and install in your environment the following repositories, which implement the three separate supervised learning components:
Some additional dependencies for running and visualizing the examples:
Tutorials for alanine dipeptide and BBA proteins, along with Google Colab notebooks, are linked below:
If you use this code in your work, please cite:
Sidky, Hythem, Wei Chen, and Andrew L. Ferguson. "Molecular latent space simulators." Chemical Science 11.35 (2020): 9459-9467. DOI: 10.1039/D0SC03635H
@article{sidky2020molecular,
title={Molecular latent space simulators},
author={Sidky, Hythem and Chen, Wei and Ferguson, Andrew L},
journal={Chemical Science},
volume={11},
number={35},
pages={9459--9467},
year={2020},
publisher={Royal Society of Chemistry}
}
Chen, Wei, Hythem Sidky, and Andrew L. Ferguson. "Nonlinear discovery of slow molecular modes using state-free reversible VAMPnets." The Journal of Chemical Physics 150.21 (2019): 214114. DOI: 10.1063/1.5092521
@article{chen2019nonlinear,
title={Nonlinear discovery of slow molecular modes using state-free reversible VAMPnets},
author={Chen, Wei and Sidky, Hythem and Ferguson, Andrew L},
journal={The Journal of Chemical Physics},
volume={150},
number={21},
pages={214114},
year={2019},
publisher={AIP Publishing LLC}
}