Sequence_Trimming

Basic Usage

The Sequence_Trimming handler uses Trimmomatic to remove adapter sequences and perform quality-based trimming on a set of FASTQ files. It accepts both uncompressed and gzipped FASTQ files as input.

To run Sequence_Trimming, all common and handler-specific variables must be defined within the configuration file. Once the variables have been defined, Sequence_Trimming can be submitted to a job scheduler with the following command (assuming that you are in the directory containing RNApipeline):

./main.sh Sequence_Trimming proj.conf

Handler-Specific Variables

The following is a list of the handler-specific variables that must be defined within the configuration file. All common variables must be defined as well.

| Variable | Function |
| --- | --- |
| `ST_QSUB` | QSub settings for batch submission |
| `FORWARD_NAMING` | Shared suffix for forward reads. Example: If your files are named `sample1_R1.fastq` and `sample2_R1.fastq`, then `FORWARD_NAMING=_R1.fastq` |
| `REVERSE_NAMING` | Shared suffix for reverse reads. Example: If your files are named `sample1_R2.fastq` and `sample2_R2.fastq`, then `REVERSE_NAMING=_R2.fastq` |
| `ADAPTERS` | A plain text or FASTA file with the adapter sequences. These sequences will depend on the technology and platform used for sequencing, but most common adapters for various platforms can be found online |
| `PHRED64` | Use the phred64 scale instead of the phred33 quality scale |
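
As a rough illustration, the handler-specific part of a configuration file might look like the fragment below. This is a sketch that assumes the configuration file is read as shell variable assignments. Every value is a placeholder: the resource string for `ST_QSUB`, the adapter path, and the `true`/`false` convention for `PHRED64` are assumptions, so check them against the template configuration file that ships with the pipeline.

```shell
# Hypothetical values for illustration only; adjust to your scheduler and data.
ST_QSUB="mem=16gb,nodes=1:ppn=4,walltime=12:00:00"  # assumed PBS-style resource string
FORWARD_NAMING=_R1.fastq        # suffix shared by all forward-read files
REVERSE_NAMING=_R2.fastq        # suffix shared by all reverse-read files
ADAPTERS=/path/to/adapters.fa   # placeholder path to the adapter file
PHRED64=false                   # assumed boolean convention; false = phred33
```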

Note: If you have single-end samples, leave `FORWARD_NAMING` and `REVERSE_NAMING` set to values that do not match any of your sample file names. If none of your samples match the forward or reverse naming suffixes, Sequence_Trimming will automatically assume that the samples are single-end.
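
The suffix rule in the note can be sketched as follows. This is not the handler's own code; `detect_layout` is a hypothetical helper that only illustrates the idea that a set of files is treated as single-end when no file name ends in either suffix.

```shell
#!/usr/bin/env bash
# Sketch only: detect_layout is a hypothetical helper, not part of the pipeline.
FORWARD_NAMING=_R1.fastq
REVERSE_NAMING=_R2.fastq

# Print "paired" if any file name ends in the forward or reverse suffix,
# otherwise print "single".
detect_layout() {
    local f
    for f in "$@"; do
        case "$f" in
            *"$FORWARD_NAMING"|*"$REVERSE_NAMING")
                echo paired
                return 0
                ;;
        esac
    done
    echo single
}

detect_layout sample1_R1.fastq sample1_R2.fastq   # prints: paired
detect_layout sample3.fastq sample4.fastq         # prints: single
```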

Output

Sequence_Trimming will output trimmed, gzipped FASTQ files for each sample. For paired-end data, the output file names end in `_forward_paired.fastq.gz` for forward reads and `_reverse_paired.fastq.gz` for reverse reads; for single-end data, they end in `_trimmed.fastq.gz`.
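
To make the naming scheme concrete, the sketch below prints the output file names one might expect for a sample. `expected_outputs` is a hypothetical helper, and it assumes that each output name is the sample name followed directly by the suffix; only the suffixes themselves come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: expected_outputs is a hypothetical helper. It assumes the
# output name is "<sample name><suffix>", which the pipeline may not follow.
expected_outputs() {
    local sample="$1" layout="$2"
    if [ "$layout" = paired ]; then
        printf '%s\n' "${sample}_forward_paired.fastq.gz" \
                      "${sample}_reverse_paired.fastq.gz"
    else
        printf '%s\n' "${sample}_trimmed.fastq.gz"
    fi
}

expected_outputs sample1 paired
expected_outputs sample3 single
```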

In addition, a list of all trimmed FASTQ files will be generated for use with other handlers. The full path to this list is `${OUT_DIR}/Sequence_Trimming/${PROJECT}_trimmed.txt`.
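
Before passing this list to another handler, it can be worth confirming that every file it names exists. The sketch below assumes the list holds one file path per line; `check_trimmed_list` is a hypothetical helper, and the default values for `OUT_DIR` and `PROJECT` are placeholders for whatever your configuration file defines.

```shell
#!/usr/bin/env bash
# Placeholder defaults; in practice these come from the configuration file.
OUT_DIR=${OUT_DIR:-/path/to/output}
PROJECT=${PROJECT:-my_project}
TRIMMED_LIST="${OUT_DIR}/Sequence_Trimming/${PROJECT}_trimmed.txt"

# Hypothetical helper: print the number of listed files that do not exist,
# reporting each missing path on stderr. Assumes one path per line.
check_trimmed_list() {
    local list="$1" missing=0 path
    while IFS= read -r path || [ -n "$path" ]; do
        [ -z "$path" ] && continue          # skip blank lines
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2
            missing=$((missing + 1))
        fi
    done < "$list"
    echo "$missing"
}
```

After the handler has finished, running `check_trimmed_list "$TRIMMED_LIST"` prints `0` if every listed file is present.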
