Downloads all RefSeq mitogenome records available for a given taxon
usage: mitodownloader.py [-h] [-f] TAXON_NAME
positional arguments:
TAXON_NAME Taxon name
optional arguments:
-h, --help show this help message and exit
-f, --fasta Downloads records in fasta format (default: genbank)
Gets contig information from a multifasta file. Must be used with exactly one of three options (-c, -a or -r):
usage: python3 extract_large_contigs.py [-h] [-c | -a | -r ] infile
-c, --count Get a list of all contigs and their size
-a , --acc Get a single contig by ID (please provide description line without '>')
-r , --range Get the sequences of all contigs whose length falls within a min-max range. Please provide the lower and upper limits in the form '12000-18000'
-h, --help show help message and exit
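The length filter behind the -r option can be sketched as follows. This is only an illustration, not the code of extract_large_contigs.py: the function names (read_fasta, parse_range, contigs_in_range) are hypothetical, and treating both limits as inclusive is an assumption.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Store the description line without the leading '>'.
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def parse_range(text):
    """Turn a string such as '12000-18000' into the tuple (12000, 18000)."""
    low, high = (int(part) for part in text.split("-"))
    if low > high:
        raise ValueError("lower limit must not exceed upper limit")
    return low, high


def contigs_in_range(lines, length_range):
    """Return the contigs whose length lies within the range.

    Assumption: both limits are inclusive.
    """
    low, high = parse_range(length_range)
    return [(h, s) for h, s in read_fasta(lines) if low <= len(s) <= high]
```

Reading from a file would look like `contigs_in_range(open("contigs.fa"), "12000-18000")`, where contigs.fa is a placeholder file name.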
Converts a single GenBank file to FASTA and prints the result to the screen.
usage: python3 gb_to_fa.py sequence.gb
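For readers curious about what such a conversion involves, here is a minimal standard-library sketch. It is not the implementation of gb_to_fa.py. The function name genbank_to_fasta, the use of the VERSION line as the FASTA identifier, the upper-casing of the sequence and the 70-column wrapping are all assumptions made for illustration.

```python
def genbank_to_fasta(text, width=70):
    """Convert one GenBank flat-file record (given as a string) to FASTA text."""
    name = "unknown"
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("VERSION"):
            # Assumption: use the accession.version as the FASTA identifier.
            fields = line.split()
            if len(fields) > 1:
                name = fields[1]
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            break  # end of the record
        elif in_origin:
            # Sequence lines look like '        1 gatcacaggt ctatcaccct';
            # drop the leading position number and keep the sequence blocks.
            seq_parts.extend(p for p in line.split() if not p.isdigit())
    sequence = "".join(seq_parts).upper()
    wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + name] + wrapped) + "\n"
```

Printing the result of `genbank_to_fasta(open("sequence.gb").read())` would send the FASTA text to the screen.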
Aligns a multifasta file using Clustal Omega and converts the alignment into a relaxed phylip alignment (sequence identifiers may be longer than 10 characters) with no line wrapping. At the moment, clustalo-1.2.4-Ubuntu-x86_64 must be on $PATH for the script to work.
The phylip alignment output can be used for the generation of phylogenetic/phylogenomic trees using PartitionFinder2.
This script only works with sequences that are less than 1 Gbp in size.
usage: python3 generate_phylip_from_mutlifasta.py [-h] [-t] multifasta.fa
optional arguments:
-h, --help show this help message and exit
-t , --type Type of data: {Protein, RNA or DNA(default)}
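The conversion step (aligned FASTA to relaxed, unwrapped phylip) can be sketched as below. This is a simplified illustration that skips the Clustal Omega call and assumes the alignment already exists as FASTA text. The function name is hypothetical, and using the first word of each header as the sequence name is an assumption.

```python
def aligned_fasta_to_relaxed_phylip(fasta_text):
    """Convert aligned FASTA text to relaxed phylip, one unwrapped line per sequence."""
    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: keep only the first word, since phylip names
            # must not contain whitespace.
            name = (line[1:].split() or ["unnamed"])[0]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned (lengths differ)")
    # Relaxed phylip: names of any length, padded to a common width.
    width = max(len(n) for n, _ in records)
    out = [f" {len(records)} {lengths.pop()}"]
    out += [f"{n.ljust(width)}  {seq}" for n, seq in records]
    return "\n".join(out) + "\n"
```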
Converts .gff files generated by MitosWebServer to a modified .gff that can be exported to the Artemis Annotation tool.
usage: python3 mitos_to_artemis.py filename.gff
Removes the "score" values present in annotations produced by MitosWebServer. These values must be removed from seqin files before mitochondrial sequences can be submitted to GenBank.
usage: python3 remove_score_seqin annotated_sequence.seqin
Extracts reads (in fastq format) from a sam file.
usage: python3 sam_to_fastq.py [-h] [-P] file.sam
optional arguments:
-h, --help show this help message and exit
-P, --paired Generates two paired-end data files (unpaired reads included)
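As a rough sketch of the idea, reads can be pulled out of SAM text with the standard library alone. This is not the code of sam_to_fastq.py. The function names, the choice to skip secondary and supplementary alignments, the /1 and /2 name suffixes, and the placeholder quality string for missing qualities are all assumptions made for this example.

```python
# Translation table used to complement bases on reverse-strand reads.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq_records(sam_lines):
    """Yield (name, sequence, quality) tuples from an iterable of SAM lines."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment line
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & 0x900:
            continue  # assumption: skip secondary (0x100) and supplementary (0x800)
        if seq == "*":
            continue  # no sequence stored
        if qual == "*":
            qual = "!" * len(seq)  # assumption: placeholder for missing qualities
        if flag & 0x10:
            # Read was stored reverse-complemented; restore the original orientation.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        if flag & 0x40:
            name += "/1"  # first read of a pair
        elif flag & 0x80:
            name += "/2"  # second read of a pair
        yield name, seq, qual


def format_fastq(records):
    """Render (name, sequence, quality) tuples as FASTQ text."""
    return "".join(f"@{n}\n{s}\n+\n{q}\n" for n, s, q in records)
```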
Downloads a list of datasets in SRA file format.
The sra_download.py script reads a text file (the list of SRA datasets) that should contain two tab-separated columns, accession number and species name, as shown below:
ERR1306022 Species1
ERR7295165 Species2
ERR1306034 Species3
SRR4409513 Species4
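Reading a list in this format might look like the sketch below. It only illustrates the expected two-column, tab-separated layout and is not taken from sra_download.py; the function name parse_dataset_list is hypothetical.

```python
def parse_dataset_list(lines):
    """Return (accession, species) tuples from tab-separated lines."""
    datasets = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {number}: expected 2 tab-separated columns, got {len(fields)}"
            )
        accession, species = (field.strip() for field in fields)
        datasets.append((accession, species))
    return datasets
```

A line that uses spaces instead of a tab is rejected with a ValueError, which makes a wrongly formatted list easy to spot before any download starts.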
At the moment, the wget Python package is required. Please install it before running the script:
pip install wget
Script usage:
python3 sra_download.py dataset_list.txt
Splits a multi-record GenBank file into individual records, generating one GenBank file (name_of_species.gb) for each.
Script usage:
python3 split_multigenbank.py multirecord_file.gb
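The splitting step can be sketched with the standard library, relying on the fact that every GenBank record ends with a line starting with "//". This is an illustration only, not the code of split_multigenbank.py. Both function names are hypothetical, and deriving the file name from the ORGANISM line with spaces replaced by underscores is an assumption.

```python
def split_genbank_records(text):
    """Split multi-record GenBank text into a list of single-record strings."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.startswith("//"):
            # '//' terminates a GenBank record.
            records.append("".join(current))
            current = []
    if "".join(current).strip():
        # Keep a trailing record that lacks its terminator.
        records.append("".join(current))
    return records


def species_file_name(record):
    """Build an output file name from the ORGANISM line of one record.

    Assumption: spaces in the organism name become underscores.
    """
    for line in record.splitlines():
        stripped = line.strip()
        if stripped.startswith("ORGANISM"):
            organism = stripped[len("ORGANISM"):].strip()
            return organism.replace(" ", "_") + ".gb"
    return "unknown.gb"
```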