Quality Control with FastQC, Fastp and MultiQC

The following scripts perform quality checks with FastQC and MultiQC. Fastp is covered in “How to perform quality control and adapter trimming of your reads with cutadapt and fastp”. Fastp performs quality trimming and also generates reports, which can be used as input to MultiQC.

FastQC

FastQC takes FASTQ files as input and generates a quality report for each file. Daughter and parent scripts are provided below.

Daughter script

fastqc.sh

# SCRIPT FOR PERFORMING FASTQC
# NOTE: Run this script from the directory where the "log" directory is located,
#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
#
# PURPOSE:
#   This script performs FastQC.
#
# PARAMETERS:
#   1: input_dir - Directory where FASTQ files are located.
#   2: output_dir - Directory where FastQC reports will be located
# SAMPLE USAGE:
#   In a parent script: src/fastqc.sh <input_dir> <output_dir>
#
# IMPORTANT:
#   - Run from a parent script.

# Check if correct number of arguments are provided
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <input_dir> <output_dir>"
    exit 1
fi

# Input parameters
input_dir="$1"
output_dir="$2"

echo "Running FastQC on reads..."
mkdir -p "$output_dir"
fastqc -o "$output_dir" "$input_dir/"*.fq.gz

Parent script

fastqc_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=fastqc_00
#SBATCH --output=log/fastqc_00%j.log  # %j will be replaced with the job ID

#parameters

input=fq_renamed
output=fastqc

src/fastqc.sh $input $output
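
The daughter script hands the `*.fq.gz` glob straight to FastQC, so an empty or misspelled input directory only shows up as a FastQC error. FastQC also runs on a single thread by default, so the 10 CPUs requested above would sit idle unless FastQC is told to use them. The following is a minimal sketch of one way to handle both, assuming the same `.fq.gz` naming. `count_fastq` and `run_fastqc` are hypothetical helper names that are not part of the original scripts. `--threads` is FastQC's option for processing several files in parallel, and `SLURM_CPUS_PER_TASK` is set by SLURM when `--cpus-per-task` is given.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original fastqc.sh.

# Print how many *.fq.gz files sit directly inside the given directory.
count_fastq() {
    n=0
    for f in "$1"/*.fq.gz; do
        # An unmatched glob stays literal, so confirm the file really exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Run FastQC with one thread per allocated CPU (1 when run outside SLURM).
run_fastqc() {
    input_dir="$1"
    output_dir="$2"
    if [ "$(count_fastq "$input_dir")" -eq 0 ]; then
        echo "No *.fq.gz files found in $input_dir" >&2
        return 1
    fi
    mkdir -p "$output_dir"
    fastqc --threads "${SLURM_CPUS_PER_TASK:-1}" -o "$output_dir" "$input_dir"/*.fq.gz
}
```

In a parent script this could be called as `run_fastqc "$input" "$output"` in place of `src/fastqc.sh $input $output`.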

MultiQC

MultiQC aggregates multiple FastQC reports into a single report so that you can view them side by side. MultiQC also accepts Fastp reports as input and shows them in a separate section.

Daughter script

multiqc.sh

# SCRIPT FOR PERFORMING MultiQC
# NOTE: Run this script from the directory where the "log" directory is located,
#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
#
# PURPOSE:
#   This script performs MultiQC.
#
# PARAMETERS:
#   1: input_dir - Directory where FastQC reports are located.
#   2: output_dir - Directory where MultiQC report will be located
# SAMPLE USAGE:
#   In a parent script: src/multiqc.sh <input_dir> <output_dir>
#
# IMPORTANT:
#   - Run from a parent script.

# Check if correct number of arguments are provided
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <input_dir> <output_dir>"
    exit 1
fi

# Input parameters
input_dir="$1"
output_dir="$2"

echo "Running MultiQC..."
mkdir -p "$output_dir"
multiqc -o "$output_dir" "$input_dir"

Parent script

multiqc_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=multiqc_00
#SBATCH --output=log/multiqc_00%j.log  # %j will be replaced with the job ID

#parameters

input=fastqc
output=multiqc

src/multiqc.sh $input $output
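
MultiQC only summarizes the reports it finds, so a sample whose FastQC run failed is silently missing from the combined report. The following is a minimal sketch of a check that could run before MultiQC, assuming the `.fq.gz` naming used above and FastQC's default output name of `<file name without .fq.gz>_fastqc.zip`. `missing_fastqc_reports` is a hypothetical helper name that is not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original multiqc.sh.

# Print every FASTQ file in fq_dir that has no matching FastQC zip in report_dir.
missing_fastqc_reports() {
    fq_dir="$1"
    report_dir="$2"
    for fq in "$fq_dir"/*.fq.gz; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$fq" ] || continue
        sample=$(basename "$fq" .fq.gz)     # sample_1.fq.gz -> sample_1
        if [ ! -f "$report_dir/${sample}_fastqc.zip" ]; then
            echo "$fq"
        fi
    done
}
```

For example, `missing_fastqc_reports fq_renamed fastqc` prints nothing when every input file has a report, and lists the affected FASTQ files otherwise.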

You can also run both steps in one job, one after the other, with a single parent script:

Combined parent script

qc_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=fastqc_multiqc_00
#SBATCH --output=log/fastqc_multiqc_00%j.log  # %j will be replaced with the job ID

#parameters

input_fqc=fq
output_fqc=fastqc
input_mqc=fastqc
output_mqc=multiqc

src/fastqc.sh $input_fqc $output_fqc
src/multiqc.sh $input_mqc $output_mqc
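
As written, the combined script starts MultiQC even if the FastQC step exits with an error. The following is a minimal sketch of one way to stop at the first failed step. `run_step` and `qc_pipeline` are hypothetical names that are not part of the original scripts, and the `src/` paths are the ones used above.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original qc_00.sh.

# Run one command, report it, and return non-zero if it fails.
run_step() {
    echo "Starting: $*"
    if ! "$@"; then
        echo "Step failed: $*" >&2
        return 1
    fi
}

# Run FastQC, then MultiQC only if FastQC succeeded.
# Arguments: FASTQ directory, FastQC report directory, MultiQC report directory.
qc_pipeline() {
    run_step src/fastqc.sh "$1" "$2" && run_step src/multiqc.sh "$2" "$3"
}
```

With the parameters above, the call would be `qc_pipeline "$input_fqc" "$output_fqc" "$output_mqc"`. Because the MultiQC input is the FastQC output, three directories are enough.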

You can also use the same scripts on trimmed files. Just add “_trimmed” to the input and output names, as shown below, so that the results are saved in separate folders (fastqc_trimmed, multiqc_trimmed).

fastqc_post_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=fastqc_post_00
#SBATCH --output=log/fastqc_post_00%j.log  # %j will be replaced with the job ID

#parameters

input=fq_trimmed
output=fastqc_trimmed

src/fastqc.sh $input $output

multiqc_post_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=multiqc_post_00
#SBATCH --output=log/multiqc_post_00%j.log  # %j will be replaced with the job ID

#parameters

input=fastqc_trimmed
output=multiqc_trimmed

src/multiqc.sh $input $output

Fastp

If you want to add Fastp reports to MultiQC, pass the “fq_trimmed” directory as an additional input to MultiQC using multiqc_fastp.sh.

Daughter script

multiqc_fastp.sh

# SCRIPT FOR PERFORMING MultiQC ON FastQC AND Fastp REPORTS
# NOTE: Run this script from the directory where the "log" directory is located,
#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
#
# PURPOSE:
#   This script performs MultiQC on FastQC and Fastp reports.
#
# PARAMETERS:
#   1: input_dir - Directory where FastQC reports are located.
#   2: input_fastp - Directory where Fastp reports are located.
#   3: output_dir - Directory where MultiQC report will be located
# SAMPLE USAGE:
#   In a parent script: src/multiqc_fastp.sh <input_dir> <input_fastp> <output_dir>
#
# IMPORTANT:
#   - Run from a parent script.

# Check if correct number of arguments are provided
if [ "$#" -ne 3 ]; then
    echo "Usage: $0 <input_dir> <input_fastp> <output_dir>"
    exit 1
fi

# Input parameters
input_dir="$1"
input_fastp="$2"
output_dir="$3"

echo "Running MultiQC..."
mkdir -p "$output_dir"
multiqc -o "$output_dir" "$input_dir" "$input_fastp"

Parent script

multiqc_fastp_00.sh

#!/bin/bash
#SBATCH --mem=10gb
#SBATCH --cpus-per-task=10
#SBATCH --job-name=multiqc_fastp_00
#SBATCH --output=log/multiqc_fastp_00%j.log  # %j will be replaced with the job ID

#parameters

input=fastqc_trimmed
input_fastp=fq_trimmed
output=multiqc_trimmed

src/multiqc_fastp.sh $input $input_fastp $output
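
For MultiQC to find Fastp reports in “fq_trimmed”, Fastp has to write one JSON report per sample into that directory. Without `--json` and `--html`, Fastp writes `fastp.json` and `fastp.html` to the working directory, so each sample would overwrite the previous one. The following is a minimal sketch of a per-sample Fastp call. It assumes paired reads named `<sample>_1.fq.gz` and `<sample>_2.fq.gz`, which is a hypothetical naming scheme, and it uses the hypothetical helper names `sample_name` and `trim_pair`. Naming the report `<sample>.fastp.json` is a common convention that MultiQC should recognize, but check this against your MultiQC version.

```shell
#!/bin/bash
# Hypothetical helpers; the _1/_2 read-pair naming is an assumption.

# Derive the sample name from a forward-read file: fq/sampleA_1.fq.gz -> sampleA
sample_name() {
    basename "$1" _1.fq.gz
}

# Trim one read pair with fastp and write its reports next to the trimmed reads.
trim_pair() {
    in_dir="$1"
    out_dir="$2"
    sample="$3"
    mkdir -p "$out_dir"
    fastp \
        -i "$in_dir/${sample}_1.fq.gz" -I "$in_dir/${sample}_2.fq.gz" \
        -o "$out_dir/${sample}_1.fq.gz" -O "$out_dir/${sample}_2.fq.gz" \
        --json "$out_dir/${sample}.fastp.json" \
        --html "$out_dir/${sample}.fastp.html"
}
```

A parent script could then loop over the forward reads, for example `for r1 in fq/*_1.fq.gz; do trim_pair fq fq_trimmed "$(sample_name "$r1")"; done`.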