This repository is the official implementation of Pinal: Toward De Novo Protein Design from Natural Language
Quickly try our online server (16B) here
If you have any questions about the paper or the code, feel free to raise an issue!
Create and activate a new conda environment with Python 3.8.
conda create -n pinal python=3.8 --yes
conda activate pinal
pip install -r requirements.txt
We provide a script to download the pre-trained model weights, as shown below. Please download all files and put them in the weights directory, e.g., weights/Pinal/...
huggingface-cli download westlake-repl/Pinal \
--repo-type model \
--local-dir weights/
The weights directory contains 3 models:
Name | Size |
---|---|
SaProt-T | 760M |
T2struc-1.2B | 1.2B |
T2struc-15B | 15B |
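After downloading, it may help to confirm that all three models from the table are present. The sketch below is not part of the repository: `missing_models` is a hypothetical helper, and it assumes each model is stored in a file or folder whose name starts with the model name somewhere under the weights directory. Adjust the matching rule if the actual layout differs.

```python
from pathlib import Path
from typing import List

# Model names taken from the table above.
EXPECTED_MODELS = ["SaProt-T", "T2struc-1.2B", "T2struc-15B"]


def missing_models(weights_dir: str, expected: List[str] = EXPECTED_MODELS) -> List[str]:
    """Return the expected model names that were not found under weights_dir.

    Assumption: a model counts as present if any file or folder below
    weights_dir has a name starting with the model name.
    """
    root = Path(weights_dir)
    # Collect the names of every file and folder anywhere under the root.
    found = {p.name for p in root.rglob("*")} if root.is_dir() else set()
    return [name for name in expected
            if not any(entry.startswith(name) for entry in found)]
```

Calling `missing_models("weights")` returns an empty list when every expected name is matched, and otherwise lists the models that still need to be downloaded.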
Design proteins from natural language instructions with only 3 lines of code!
from utils.design_utils import load_pinal, PinalDesign
load_pinal()
res = PinalDesign(desc="Actin.", num=10)
# res is a list of designed proteins, sorted by the probability per token.
The above code generates 10 de novo designed proteins from the input description "Actin.", using T2struc-1.2B and SaProt-T. To run inference with T2struc-15B, set the environment variable T2struc_NAME before calling load_pinal(), as shown below.
import os
os.environ["T2struc_NAME"] = "T2struc-15B"
Warning: Inference with T2struc-15B requires at least 40GB of GPU memory.
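One way to respect this memory limit is to pick the model from the available GPU memory before calling load_pinal(). The sketch below is an illustration only: `configure_t2struc` is a hypothetical helper, and the caller is assumed to supply the GPU memory in GB. When memory is below the threshold, the variable is removed so the default model is used.

```python
import os

# Threshold taken from the warning above.
MIN_GPU_GB_FOR_15B = 40


def configure_t2struc(gpu_mem_gb: float) -> bool:
    """Select T2struc-15B only when enough GPU memory is available.

    Returns True if T2struc-15B was selected. Otherwise the environment
    variable is unset, so load_pinal() falls back to its default model.
    """
    if gpu_mem_gb >= MIN_GPU_GB_FOR_15B:
        os.environ["T2struc_NAME"] = "T2struc-15B"
        return True
    # Remove any earlier setting rather than guessing another valid value.
    os.environ.pop("T2struc_NAME", None)
    return False
```

Call `configure_t2struc(...)` before `load_pinal()`, since the variable must be set before the models are loaded.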
For textual alignment, we recommend using ProTrek to calculate the sequence-text similarity score.
For foldability, we recommend using pLDDT and PAE, as output by the AlphaFold series or ESMFold.
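Once pLDDT and PAE have been computed with a structure predictor, designs can be filtered on both scores. The sketch below assumes the scores have already been collected into dictionaries with hypothetical keys `id`, `plddt` and `pae`. The thresholds (pLDDT of at least 70, PAE of at most 10) are illustrative defaults, not values recommended by the authors.

```python
from typing import Any, Dict, List


def filter_foldable(
    scores: List[Dict[str, Any]],
    min_plddt: float = 70.0,  # illustrative threshold, an assumption
    max_pae: float = 10.0,    # illustrative threshold, an assumption
) -> List[Dict[str, Any]]:
    """Keep designs that pass both thresholds, highest pLDDT first."""
    kept = [s for s in scores
            if s["plddt"] >= min_plddt and s["pae"] <= max_pae]
    return sorted(kept, key=lambda s: s["plddt"], reverse=True)


# Made-up example scores, for illustration only.
example = [
    {"id": "design_1", "plddt": 85.0, "pae": 5.0},
    {"id": "design_2", "plddt": 60.0, "pae": 4.0},
]
passing = filter_foldable(example)
```

Here only `design_1` passes, because `design_2` falls below the pLDDT threshold.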
- ProTrek and its online server
- Evola and its online server
- SaprotHub and its online server