CRISPR-Cas generation #59
Unanswered
alexandre239
asked this question in
Q&A
Replies: 1 comment
-
Hi Alexandre, thanks for your question. As you said, the Evo 1 model was fine-tuned on Cas sequences using those special tokens. We have not performed Cas fine-tuning with Evo 2, nor have we specifically looked at Cas generation. Your idea of prompting the pretrained Evo 2 model with the phylogenetic tag and the upstream CRISPR-Cas locus makes sense to me as a way to try generating Cas! You may also be interested in this related preprint, which uses Evo 1 to generate other systems without fine-tuning.
-
Hi!
Thank you for introducing Evo 2. It's truly fascinating how you managed to scale your model up to genome-scale generation!
My issue is really just a question about generating CRISPR-Cas loci. In the previous version, Evo 1, special tokens were assigned to three classes of Cas (Cas9, Cas12 and Cas13). Is this feature maintained in some form in Evo 2? I ask because I wonder whether Evo 2 would generate better or more diverse results, given that it was trained on a larger prokaryotic dataset.
If the generation mode is not the same, what would you advise for generating a specific subtype of CRISPR-Cas locus? Would it be to prompt the model with the corresponding species special token/phylogenetic tag (taking an example from the paper: |D__BACTERIA;P__PSEUDOMONADOTA;C__GAMMAPROTEOBACTERIA;O__ENTEROBACTERALES;F__ENTEROBACTERIACEAE;G__ESCHERICHIA;S__ESCHERICHIA|) together with an upstream context sequence?
Thanks a lot in advance!
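To make the prompting idea concrete, here is a minimal Python sketch of how such a prompt string might be assembled. It only builds the text. The helper names (`build_phylo_tag`, `build_prompt`) are hypothetical, the upstream sequence is a made-up placeholder rather than a real CRISPR-Cas locus, and placing the tag directly before the DNA context is my assumption, not something confirmed for Evo 2.

```python
# Rank prefixes in the order they appear in the tag quoted from the paper.
RANK_PREFIXES = ("D", "P", "C", "O", "F", "G", "S")
VALID_BASES = set("ACGT")


def build_phylo_tag(lineage):
    """Format seven rank names as an upper-case |D__...;...;S__...| tag."""
    if len(lineage) != len(RANK_PREFIXES):
        raise ValueError(f"expected {len(RANK_PREFIXES)} ranks, got {len(lineage)}")
    fields = [f"{prefix}__{name.upper()}" for prefix, name in zip(RANK_PREFIXES, lineage)]
    return "|" + ";".join(fields) + "|"


def build_prompt(lineage, upstream_dna):
    """Concatenate the tag and an upstream DNA context (assumed ordering)."""
    seq = upstream_dna.upper()
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(f"non-ACGT characters in context: {sorted(unexpected)}")
    return build_phylo_tag(lineage) + seq


# Lineage copied from the example tag in the question.
lineage = [
    "Bacteria", "Pseudomonadota", "Gammaproteobacteria", "Enterobacterales",
    "Enterobacteriaceae", "Escherichia", "Escherichia",
]
# Placeholder only: replace with the real sequence upstream of the locus of interest.
upstream = "ATGACCGTTAGCAAGCTT"

prompt = build_prompt(lineage, upstream)
print(prompt)
```

The resulting string would then be passed to whatever generation call the Evo 2 package exposes. It is probably worth comparing outputs with and without the tag, since its effect on Cas generation has not been tested.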