Commit
Showing 33 changed files with 88 additions and 42 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
Program,Author,Year,Contig assembly method,Scaffolding method,Use cases,Link,Paper,Used | ||
ALLPATHS-LG,Gnerre,2011,de Bruijn graph and sequence graphs,Read pairs,Large genomes with both short and long reads,http://software.broadinstitute.org/allpaths-lg/blog/?page_id=12,https://doi.org/10.1073/pnas.1017351108,N | ||
hifiasm,Cheng,2020,Error corrected overlap-layout to preserve haplotypes,NA,PacBio Hifi reads,https://github.com/chhylp123/hifiasm,https://arxiv.org/abs/2008.01237,N | ||
SPAdes,Bankevich,2012,de Bruijn graph,Read pairs and small gap repeat resolution; can also use long reads or previously inferred contigs,Good for small genomes or targeted sequencing (e.g. exomes),https://github.com/ablab/spades,https://doi.org/10.1089/cmb.2012.0021,Y | ||
Discovar de novo,Weisenfeld,2014,de Bruijn graph and lines,NA,Single 2x250bp library,https://software.broadinstitute.org/software/discovar/blog/,https://dx.doi.org/10.1038%2Fng.3121,N | ||
Supernova,Weisenfeld,2017,de Bruijn graph and lines,Read pairs and linked read barcodes,Large genomes with linked reads for phased genome assemblies,https://github.com/10XGenomics/supernova,https://doi.org/10.1101/gr.235812.118,N | ||
Canu,Koren,2017,Overlap-layout-consensus,NA,Long reads,https://github.com/marbl/canu,https://dx.doi.org/10.1101/gr.215087.116,N | ||
HiCanu,Nurk,2020,Overlap-layout-consensus,NA,PacBio Hifi reads,,https://doi.org/10.1101/gr.263566.120,N | ||
Flye,Kolmogorov,2019,Repeat graph,NA,Long reads,https://github.com/fenderglass/Flye/,https://doi.org/10.1038/s41587-019-0072-8,Y | ||
platanus,Kajitani,2014,de Bruijn graph,Read pairs,Short reads for genomes with high heterozygosity,http://platanus.bio.titech.ac.jp/,https://dx.doi.org/10.1101%2Fgr.170720.113,N | ||
opera-lg,Gao,2016,NA,Paired reads and long reads,Scaffolding of repeat-rich genomes,https://sourceforge.net/p/operasf/wiki/The%20OPERA%20wiki/,https://doi.org/10.1186/s13059-016-0951-y,N | ||
agouti,Zhang,2016,NA,RNA-seq reads,Scaffolding of large genomes,https://github.com/svm-zhang/AGOUTI,https://doi.org/10.1186/s13742-016-0136-3,N | ||
ABySS,Jackman,2017,de Bruijn graph; all possible k-mers,"Mate pairs, linked reads, or long reads",Short read libraries for genomes up to 100Mb; Transcriptomes with Trans-ABySS,https://github.com/bcgsc/abyss,https://doi.org/10.1101/gr.214346.116,N | ||
Velvet,Zerbino and Birney,2008,de Bruijn graph,Read pairs,Short read assembly,https://github.com/dzerbino/velvet,https://doi.org/10.1101/gr.074492.107,N | ||
SOAPdenovo2,Luo,2012,de Bruijn graph,Read pairs,Short read assembly,https://github.com/aquaskyline/SOAPdenovo2,https://doi.org/10.1186/2047-217X-1-18,N | ||
MaSuRCA,Zimin,2013,Overlap-layout-consensus on unique super-reads,"Mate pairs, linked reads, or long reads",Mixed short read libraries of large genomes,https://github.com/alekseyzimin/masurca,https://doi.org/10.1093/bioinformatics/btt476,N | ||
CABOG,NA,NA,NA,NA,NA,http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page,NA,N | ||
Falcon,Chin,2016,String graph,NA,PacBio long reads for diploid genome assembly,https://github.com/PacificBiosciences/pb-assembly,https://dx.doi.org/10.1038%2Fnmeth.4035,N | ||
Miniasm,Li,2016,Overlap-layout,NA,Long reads,https://github.com/lh3/miniasm,https://doi.org/10.1093/bioinformatics/btw152,N | ||
HINGE,Kamath,2017,Overlap-layout-consensus with hinging,NA,Long reads,https://hingeassembler.github.io/,https://doi.org/10.1101/gr.216465.116,N | ||
Abruijn,Lin,2016,A-Bruijn graph,NA,Long reads,https://github.com/bioreps/ABruijn,https://doi.org/10.1073/pnas.1604560113,N | ||
MEGAHIT,Li,2015,de Bruijn graph,,Short read assembly of metagenomes,https://github.com/voutcn/megahit,https://doi.org/10.1093/bioinformatics/btv033,N | ||
Peregrine,Chin and Khalak,2020,Overlap-layout-consensus with shimmer indexing,NA,Fast long reads assembly,https://github.com/cschin/Peregrine,https://doi.org/10.1101/705616,N | ||
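Several assemblers in the table above list "de Bruijn graph" as their contig assembly method. As a rough illustration only (not how any listed program is implemented), the sketch below builds a tiny de Bruijn graph in which each k-mer from a read becomes an edge between its (k-1)-mer prefix and suffix. The function name `de_bruijn_graph`, the toy reads, and the choice of k are all hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Return a dict mapping each (k-1)-mer to the set of (k-1)-mers that follow it.

    Every k-mer found in a read contributes one edge: prefix -> suffix.
    Real assemblers also handle reverse complements, sequencing errors,
    and k-mer coverage, all of which this sketch ignores.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two overlapping toy reads (hypothetical data).
toy_reads = ["ACGT", "CGTA"]
toy_graph = de_bruijn_graph(toy_reads, k=3)
for node, successors in sorted(toy_graph.items()):
    print(node, "->", sorted(successors))
```

Walking the edges of this toy graph from AC through CG and GT to TA spells out ACGTA, the sequence implied by the two overlapping reads.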
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
Format,Use,Link,Specs | ||
FASTA,Stores sequence data.,https://en.wikipedia.org/wiki/FASTA_format,NA | ||
FASTQ,Stores sequence data and quality scores.,https://en.wikipedia.org/wiki/FASTQ_format,https://doi.org/10.1093/nar/gkp1137 | ||
SAM,Sequence Alignment Map format. Stores information about reads mapped to a reference genome.,https://en.wikipedia.org/wiki/SAM_(file_format),https://samtools.github.io/hts-specs/SAMv1.pdf | ||
BAM,Binary Alignment Map format. The compressed binary version of SAM format.,https://en.wikipedia.org/wiki/SAM_(file_format),https://samtools.github.io/hts-specs/SAMv1.pdf | ||
CRAM,Another compressed format to store read mapping information.,https://en.wikipedia.org/wiki/CRAM_(file_format),https://samtools.github.io/hts-specs/CRAMv3.pdf | ||
VCF,Variant Call Format. Used to store information about variants inferred for a given sample(s).,https://en.wikipedia.org/wiki/Variant_Call_Format,https://samtools.github.io/hts-specs/VCFv4.2.pdf | ||
BCF,Binary variant Call Format. The binary compressed version of a VCF.,https://en.wikipedia.org/wiki/Variant_Call_Format,https://samtools.github.io/hts-specs/VCFv4.2.pdf | ||
BED,Stores coordinates of regions of interest,https://en.wikipedia.org/wiki/BED_(file_format),https://bedtools.readthedocs.io/en/latest/content/general-usage.html |
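To make the FASTQ entry above concrete, here is a minimal sketch of reading FASTQ records and decoding their quality strings. It assumes the simple four-lines-per-record layout and a Phred+33 ASCII offset; files with wrapped sequence lines or a different offset would need more care. The helper name `read_fastq` and the example record are hypothetical.

```python
import io


def read_fastq(handle, offset=33):
    """Yield (name, sequence, quality_scores) from a four-line-per-record FASTQ stream.

    Assumes one sequence line and one quality line per record, and that
    quality characters encode Phred scores as ASCII code minus `offset`.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"Malformed FASTQ record: {header!r}")
        yield header[1:], sequence, [ord(char) - offset for char in quality]


# A single hypothetical record held in memory instead of a file on disk.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
print(records)  # [('read1', 'ACGT', [40, 40, 20, 0])]
```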
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
Program,Author,Year,Use cases,Link,Paper,Used | ||
BWA,Li and Durbin,2009,Short read alignment,http://bio-bwa.sourceforge.net/,https://doi.org/10.1093/bioinformatics/btp324,Y | ||
TopHat2,Kim,2013,Mapping RNA-seq reads,https://ccb.jhu.edu/software/tophat/index.shtml,https://doi.org/10.1186/gb-2013-14-4-r36,N | ||
Minimap2,Li,2018,Long read mapping and whole genome alignment,https://github.com/lh3/minimap2,https://doi.org/10.1093/bioinformatics/bty191,Y | ||
bbmap,Bushnell,2014,Short and long read mapper with many extras,https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbmap-guide/,https://www.osti.gov/biblio/1241166,N | ||
Bowtie2,Langmead and Salzberg,2012,Short read alignment,http://bowtie-bio.sourceforge.net/bowtie2/index.shtml,https://dx.doi.org/10.1038%2Fnmeth.1923,N | ||
SOAP2,Li,2009,Short read alignment,https://sourceforge.net/projects/soapdenovo2/,https://doi.org/10.1093/bioinformatics/btp336,N |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
Program,Author,Year,Use cases,Link,Paper,Used | ||
bedtools,Quinlan and Hall,2010,Perform operations on sets of genomic coordinates.,https://bedtools.readthedocs.io/en/latest/,https://doi.org/10.1093/bioinformatics/btq033,Y | ||
bcftools,NA,NA,Perform operations on VCF and BCF formatted files.,http://samtools.github.io/bcftools/,NA,Y | ||
samtools,Li,2009,Perform operations on SAM/BAM/CRAM formatted files.,http://www.htslib.org/download/,https://doi.org/10.1093/bioinformatics/btp352,Y | ||
Picard tools,Broad Institute,2019,Performs many operations on SAM/BAM/CRAM and VCF files.,https://github.com/broadinstitute/picard,http://broadinstitute.github.io/picard/,Y | ||
mosdepth,Pedersen and Quinlan,2018,Calculates read depth from mapped reads.,https://github.com/brentp/mosdepth,https://doi.org/10.1093/bioinformatics/btx699,N | ||
pseudo-it,Sarver,2017,Iterative read mapping for pseudo-reference assembly.,https://github.com/goodest-goodlab/pseudo-it,https://doi.org/10.1093/gbe/evx034,Y | ||
Referee,Thomas and Hahn,2018,Assign per-base quality scores to genome assemblies.,https://gwct.github.io/referee/,https://doi.org/10.1093/gbe/evz088,N |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
Term,Definition | ||
Sequence libraries,A sample of DNA that has been processed to be sequenced. | ||
Reads,Fragmented and overlapping pieces of a DNA strand that are sequenced. | ||
Phred quality scores,"A scaled probability that a given inference (usually a called base) is incorrect. The probability of error, P(e), scaled as -10 log10 P(e)." | ||
Short reads,"Reads from first and second generation sequencing such as Sanger, Illumina, IonTorrent, etc. Short reads can range from 30-1000bp long." | ||
Read pair,"Many short read sequencing technologies sequence from both ends of a DNA fragment, resulting in a pair of sequenced reads that come from said fragment." | ||
Adapter,A short piece of DNA that is ligated to the short fragment to be sequenced. The adapter allows the fragment to be affixed to a physical medium (such as a flow cell) to facilitate amplification and sequencing. | ||
Insert size,The size of the DNA fragment between the adapter sequences. | ||
Mate pairs,Long-insert paired end reads prepared by circularizing longer DNA fragments. | ||
Jumping libraries,Junction-fragment libraries. Mate pair libraries. | ||
Long reads,Reads from single-molecule sequencing technology such as PacBio SMRT and Oxford Nanopore. Long reads can range from 1000-100000+bp long. | ||
"Genome assembly, Assembly, de novo Assembly","1. The process by which small overlapping parts of the genome are reconstructed into longer contiguous sequences; 2. A sequence that has undergone the assembly process." | ||
Contigs,Assembled reads. Contig assembly is usually done with a graph-based representation (e.g. de Bruijn graphs) of overlapping sequence reads. | ||
Scaffolds,Contigs that have been joined together to form longer sequences. Scaffolding is usually done using read pair information or long reads. | ||
Reference genome,An already assembled genome to which you can compare newly sequenced reads or genomes. | ||
Read mapping,The process of aligning reads from a newly sequenced genome to a reference genome. | ||
Mapping quality,The probability (usually Phred-scaled) that a given read has mapped incorrectly. | ||
Reference-guided assembly,"1. The process of using read mapping to reconstruct the genome from a set of reads; 2. A sequence that has undergone the reference-guided assembly process." | ||
Reference bias,The phenomenon of a set of mapped reads appearing to resemble (through lower divergence) the reference genome more closely than they actually do because reads containing the most variation were not mapped. | ||
Iterative mapping,"The process of mapping reads to a reference genome, generating a reference-guided assembly, and then repeating the process, this time mapping to the new reference-guided assembly. Done to reduce reference bias." | ||
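The Phred scaling used in the glossary above for base quality and mapping quality can be sketched as a pair of conversions, assuming a base-10 logarithm (Q = -10 log10 P(e)). The function names `phred_from_error` and `error_from_phred` are illustrative and do not come from any tool listed in these tables.

```python
import math


def phred_from_error(p_error):
    """Convert an error probability in (0, 1] to a Phred-scaled score."""
    if not 0 < p_error <= 1:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10 * math.log10(p_error)


def error_from_phred(score):
    """Convert a Phred-scaled score back to an error probability."""
    return 10 ** (-score / 10)


# An error probability of 1 in 1000 corresponds to a score of about 30,
# and a score of 20 corresponds to an error probability of about 1 in 100.
q_example = phred_from_error(0.001)
p_example = error_from_phred(20)
print(round(q_example, 2), round(p_example, 4))
```

Each increase of 10 in the score therefore corresponds to a tenfold drop in the probability of error.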
File renamed without changes (25 files).
This file was deleted.
This file was deleted.