Quality Control with FastQC, Fastp and MultiQC
The following scripts run quality checks with FastQC and MultiQC. Fastp is covered in "How to perform quality control and adapter trimming of your reads with cutadapt and fastp". Fastp performs quality trimming and also generates reports, which can be used as input to MultiQC.
FastQC
FastQC takes FASTQ files as input and generates a quality report for each one. Daughter and parent scripts are provided below.
Daughter script
fastqc.sh
# SCRIPT FOR PERFORMING FASTQC
# NOTE: Run this script from the directory where the "log" directory is located,
# Example: /mnt/proj/ibd/ds-06_cd-fecal/common
#
# PURPOSE:
# This script performs FastQC.
#
# PARAMETERS:
# 1: input_dir - Directory where FASTQ files are located.
# 2: output_dir - Directory where FastQC reports will be located
# SAMPLE USAGE:
# In a parent script: src/fastqc.sh <input_dir> <output_dir>
#
# IMPORTANT:
# - Run from a parent script.
# Check that the correct number of arguments is provided
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <input_dir> <output_dir>"
    exit 1
fi
# Input parameters
input_dir="$1"
output_dir="$2"
echo "Running FastQC on reads..."
mkdir -p "$output_dir"
fastqc -o "$output_dir" "$input_dir/"*.fq.gz
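If the input directory contains no files matching *.fq.gz, the glob in the last line is passed to FastQC unexpanded. A minimal sketch of a guard is shown here; `collect_fastq` and `count_fastq` are hypothetical helper names, not part of fastqc.sh.

```shell
#!/bin/bash
# Sketch only: collect_fastq and count_fastq are hypothetical helpers.

# Print one matching FASTQ path per line.
collect_fastq() {
    local input_dir="$1"
    local f
    for f in "$input_dir"/*.fq.gz; do
        # When nothing matches, the pattern stays literal, so check existence.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Print the number of matching FASTQ files.
count_fastq() {
    collect_fastq "$1" | wc -l | tr -d ' '
}

# Possible use in fastqc.sh, before calling fastqc (left commented out):
# if [ "$(count_fastq "$input_dir")" -eq 0 ]; then
#     echo "No .fq.gz files found in $input_dir" >&2
#     exit 1
# fi
```

With this check the job would stop with a clear message instead of an error from FastQC about a file named "*.fq.gz".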
Parent script
fastqc_00.sh
#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=fastqc_00
# %j will be replaced with the job ID
#SBATCH --output=log/fastqc_00%j.log

# Parameters
input=fq_renamed
output=fastqc

src/fastqc.sh "$input" "$output"
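The parent script requests 10 CPUs, but fastqc.sh does not pass a thread count to FastQC, so the extra CPUs would presumably stay idle. A possible way to use them, assuming the job runs under Slurm and FastQC's `-t` (threads) option is used; `pick_threads` is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch only: pick_threads is a hypothetical helper.
# Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is set;
# fall back to 1 thread when the variable is absent (e.g. outside Slurm).
pick_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Possible use in fastqc.sh (left commented out; needs fastqc on PATH):
# fastqc -t "$(pick_threads)" -o "$output_dir" "$input_dir/"*.fq.gz
```

FastQC processes one file per thread, so this only helps when there are several input files. Memory use also grows with the thread count, so the --mem request may need adjusting.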
MultiQC
MultiQC combines multiple FastQC reports into a single report so that you can view them side by side. MultiQC also accepts Fastp reports as input and shows them in a separate section.
Daughter script
multiqc.sh
# SCRIPT FOR PERFORMING MultiQC
# NOTE: Run this script from the directory where the "log" directory is located,
# Example: /mnt/proj/ibd/ds-06_cd-fecal/common
#
# PURPOSE:
# This script performs MultiQC.
#
# PARAMETERS:
# 1: input_dir - Directory where FastQC reports are located.
# 2: output_dir - Directory where MultiQC report will be located
# SAMPLE USAGE:
# In a parent script: src/multiqc.sh <input_dir> <output_dir>
#
# IMPORTANT:
# - Run from a parent script.
# Check that the correct number of arguments is provided
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <input_dir> <output_dir>"
    exit 1
fi
# Input parameters
input_dir="$1"
output_dir="$2"
echo "Running MultiQC..."
mkdir -p "$output_dir"
# MultiQC scans the directory for FastQC reports; it does not read FASTQ files
multiqc -o "$output_dir" "$input_dir"
Parent script
multiqc_00.sh
#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=multiqc_00
# %j will be replaced with the job ID
#SBATCH --output=log/multiqc_00%j.log

# Parameters
input=fastqc
output=multiqc

src/multiqc.sh "$input" "$output"
You can also run both steps in a single job with one parent script:
Combined parent script
fastqc_multiqc_00.sh
#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=fastqc_multiqc_00
# %j will be replaced with the job ID
#SBATCH --output=log/fastqc_multiqc_00%j.log

# Parameters
input_fqc=fq
output_fqc=fastqc
input_mqc=fastqc
output_mqc=multiqc

src/fastqc.sh "$input_fqc" "$output_fqc"
src/multiqc.sh "$input_mqc" "$output_mqc"
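The combined script runs the two daughter scripts one after the other, and as written MultiQC starts even if FastQC failed. A minimal sketch of stopping at the first failure is shown here; `run_in_order` is a hypothetical helper, not part of the scripts above.

```shell
#!/bin/bash
# Sketch only: run_in_order is a hypothetical helper.
# It runs each command line in turn and stops at the first one
# that exits with a non-zero status.
run_in_order() {
    local cmd
    for cmd in "$@"; do
        if ! bash -c "$cmd"; then
            echo "Failed: $cmd" >&2
            return 1
        fi
    done
    return 0
}

# Possible use in the combined parent script (left commented out):
# run_in_order "src/fastqc.sh $input_fqc $output_fqc" \
#              "src/multiqc.sh $input_mqc $output_mqc" || exit 1
```

A simpler alternative with the same effect is to join the two calls with `&&`, or to put `set -e` near the top of the parent script.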
You can also use the same scripts on trimmed files. Add "_trimmed" to the input and output directory names, as shown below, so that the results are saved in separate folders (fastqc_trimmed, multiqc_trimmed).
# Input parameters
input=fq_trimmed
output=fastqc_trimmed
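One way to avoid editing every directory name by hand is to derive them from a single suffix. This is only a sketch: `qc_dirs` is a hypothetical helper, and the base names fq, fastqc and multiqc are assumed from the examples on this page.

```shell
#!/bin/bash
# Sketch only: qc_dirs is a hypothetical helper.
# Prints the reads, FastQC and MultiQC directory names for a given suffix.
# Pass "_trimmed" for the trimmed run, or nothing for the untrimmed run.
qc_dirs() {
    local suffix="${1:-}"
    echo "fq${suffix} fastqc${suffix} multiqc${suffix}"
}

# Possible use in a parent script (left commented out):
# set -- $(qc_dirs _trimmed)
# input=$1; output_fqc=$2; output_mqc=$3
```

Calling `qc_dirs _trimmed` prints `fq_trimmed fastqc_trimmed multiqc_trimmed`, so the trimmed and untrimmed reports cannot end up in the same folder by accident.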
Fastp
To include Fastp reports in the MultiQC report, pass the "fq_trimmed" directory to MultiQC as an additional input.
In daughter script:
# Rest of the code
# Input parameters
input_dir=fastqc_trimmed
input_fastp=fq_trimmed
output_dir=multiqc_trimmed
# Rest of the code
In parent script:
# Rest of the code
src/multiqc.sh "$input_dir" "$input_fastp" "$output_dir"
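The multiqc.sh script shown earlier accepts exactly two arguments, so a three-argument call needs the argument check and the multiqc command adjusted. A minimal sketch of that change is shown here as a function; `run_multiqc` and the `MULTIQC_BIN` override are hypothetical names, and it assumes the Fastp reports were written into the directory given as the second argument.

```shell
#!/bin/bash
# Sketch only: run_multiqc and MULTIQC_BIN are hypothetical names.
# Arguments: FastQC report directory, Fastp report directory, output directory.
run_multiqc() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: run_multiqc <fastqc_dir> <fastp_dir> <output_dir>" >&2
        return 1
    fi
    local input_dir="$1"
    local input_fastp="$2"
    local output_dir="$3"

    mkdir -p "$output_dir"
    # MultiQC accepts several directories to scan in one call.
    # MULTIQC_BIN allows a dry run (e.g. MULTIQC_BIN=echo); default is multiqc.
    "${MULTIQC_BIN:-multiqc}" -o "$output_dir" "$input_dir" "$input_fastp"
}

# Possible use (left commented out; needs multiqc on PATH):
# run_multiqc fastqc_trimmed fq_trimmed multiqc_trimmed
```

Whether the Fastp section appears depends on MultiQC recognising the Fastp report files in that directory, so it is worth checking the first combined report.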
