Gene Annotation Pipeline

Steps

  1. Organizing inputs: Naming genomes and RNAseq reads
  2. Mapping RNAseq reads to the genome (STAR)
  3. Calculating genome quality metrics (Assemblathon, BUSCO) and separating alt-haplotypes in the genome, if any
  4. Remapping RNAseq reads to primary scaffolds/contigs only (STAR)
  5. Running transcript assembly (Cufflinks, Strawberry, StringTie, CLASS2, and Trinity) and identifying splice junctions (Portcullis)
  6. Running ab initio gene prediction (BRAKER)
  7. Mapping Trinity transcripts to the genome to generate GFF3 (GMAP)
  8. Picking transcripts for evidence-based predictions (Mikado)
  9. Combining annotations: Mikado and homology-based predictions with ab initio predictions (GeMoMa)
  10. Identifying primary transcripts (TRaCE)
  11. Finalizing GFF3 files (custom) and calculating annotation metrics (AGAT, BUSCO)
  12. Performing repeat annotation (EDTA)
  13. Functional annotation (TBD)
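
As a rough illustration of the two STAR mapping steps (steps 2 and 4), the sketch below builds the index and alignment command lines in Python without running them. The README does not state which STAR options the pipeline actually uses, so the option set here is a minimal, commonly used subset, and the file names, output prefix, and thread count are hypothetical placeholders. For step 4, the same functions would be pointed at a FASTA containing only the primary scaffolds/contigs.

```python
import shlex


def star_index_cmd(genome_fasta, index_dir, threads=8):
    """Build a STAR genome-index command (assumed options, not the pipeline's own)."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", str(index_dir),
        "--genomeFastaFiles", str(genome_fasta),
        "--runThreadN", str(threads),
    ]


def star_align_cmd(index_dir, read1, read2, out_prefix, threads=8):
    """Build a STAR alignment command for single-end (read2=None) or paired-end reads."""
    cmd = [
        "STAR",
        "--runMode", "alignReads",
        "--genomeDir", str(index_dir),
        "--readFilesIn", str(read1),
    ]
    if read2 is not None:
        cmd.append(str(read2))
    cmd += [
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", str(out_prefix),
        "--runThreadN", str(threads),
    ]
    # Gzipped FASTQ files need a decompression command.
    if str(read1).endswith(".gz"):
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


if __name__ == "__main__":
    # Hypothetical inputs; swap in the primary-only FASTA for the remapping step.
    print(shlex.join(star_index_cmd("genome.fa", "star_index")))
    print(shlex.join(star_align_cmd(
        "star_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "mapped/sample_"
    )))
```

Returning the command as a list makes it easy to inspect or log before handing it to `subprocess.run`, and avoids shell quoting problems with unusual file names.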