scripts:qc

Differences

This shows you the differences between two versions of the page.

scripts:qc [2025/03/11 11:41] – created 37.26.174.181
scripts:qc [2025/03/21 12:43] (current) 37.26.174.181
Line 1: Line 1:
-====== Perform quality check of your FASTQ files ======
 +==== Quality Control with FastQC, Fastp and MultiQC ====
  
-===== QC with FASTQC & MULTIQC =====
 +The following scripts run quality checks with FastQC and MultiQC. Fastp is covered in [[scripts:adapter_and_quality_trimming|How to perform quality control and adapter trimming of your reads with cutadapt and fastp]]. Fastp performs quality trimming and also generates reports, which can be used as input to MultiQC.
  
-==== Daughter script ====
-==== Parent script ====
 +=== FastQC ===
  
 +FastQC takes FASTQ files as input and generates one report per file. The daughter and parent scripts are provided below.
 +
 +== Daughter script ==
 +fastqc.sh
 <code> <code>
 +# SCRIPT FOR PERFORMING FASTQC
 +# NOTE: Run this script from the directory where the "log" directory is located,
 +#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
 +#
 +# PURPOSE:
 +#   This script performs FastQC.
 +#
 +# PARAMETERS:
 +#   1: input_dir - Directory where FASTQ files are located.
 +#   2: output_dir - Directory where FastQC reports will be located
 +# SAMPLE USAGE:
 +#   In a parent script: src/fastqc.sh <input_dir> <output_dir>
 +#
 +# IMPORTANT:
 +#   - Run from a parent script.
 +
 +# Check if correct number of arguments are provided
 +if [ "$#" -ne 2 ]; then
 +    echo "Usage: $0 <input_dir> <output_dir>"
 +    exit 1
 +fi
 +
 +# Input parameters
 +input_dir="$1"
 +output_dir="$2"
 +
 +echo "Running FastQC on reads..."
 +mkdir -p "$output_dir"
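 +# Use the CPUs allocated by Slurm (falls back to 1 thread outside Slurm)
 +fastqc -t "${SLURM_CPUS_PER_TASK:-1}" -o "$output_dir" "$input_dir/"*.fq.gz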
 </code> </code>
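If the input directory contains no files ending in .fq.gz, bash by default passes the unmatched pattern to FastQC as a literal file name. The following is a minimal sketch of a guard that could run before the fastqc call. The function name count_fastq is hypothetical and not part of the original scripts.

```shell
# Hypothetical helper (not part of the original scripts):
# print how many *.fq.gz files a directory contains.
count_fastq() {
    # Replace the function's arguments with the list of matching files.
    set -- "$1"/*.fq.gz
    # If nothing matched, the pattern is left as a literal string
    # that does not exist as a file, so treat it as zero matches.
    [ -e "$1" ] || set --
    echo "$#"
}

# Example use inside fastqc.sh, before calling fastqc:
#   if [ "$(count_fastq "$input_dir")" -eq 0 ]; then
#       echo "No .fq.gz files found in $input_dir" >&2
#       exit 1
#   fi
```

With this check the job stops with a clear message instead of a FastQC error about a file named "*.fq.gz".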
  
-===== QC with FASTP & MULTIQC ====
-==== Daughter script ====
-==== Parent script ====
 +== Parent script ==
 +fastqc_00.sh
 +<code>
 +#!/bin/bash 
 +#SBATCH --mem=10gb 
 +#SBATCH --cpus-per-task=10 
 +#SBATCH --job-name=fastqc_00 
 +#SBATCH --output=log/fastqc_00%j.log  # %j will be replaced with the job ID
  
 +#parameters
 +
 +input=fq_renamed
 +output=fastqc
 +
 +src/fastqc.sh $input $output
 +</code>
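The parent script writes its log to log/ and calls the daughter script from src/, so it has to be submitted from the directory that contains both. As far as we know, Slurm does not create a missing directory in the --output path, so it helps to check before submitting. A minimal sketch, assuming that layout; the function name check_project_dir is hypothetical:

```shell
# Hypothetical helper (not part of the original scripts):
# verify that the current directory contains the "log" and "src"
# directories that the parent scripts rely on.
check_project_dir() {
    missing=0
    for d in log src; do
        if [ ! -d "$d" ]; then
            echo "Missing directory: $d" >&2
            missing=1
        fi
    done
    # Return 0 when both directories exist, 1 otherwise.
    return "$missing"
}

# Example use before submitting (assumes the parent script is in
# the current directory):
#   check_project_dir && sbatch fastqc_00.sh
```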
 +=== MultiQC ===
 +
 +MultiQC combines multiple FastQC reports into a single report so that you can view them side by side. MultiQC also accepts Fastp reports as input and shows them in a separate section.
 +
 +== Daughter script ==
 +multiqc.sh
 +<code>
 +# SCRIPT FOR PERFORMING MultiQC
 +# NOTE: Run this script from the directory where the "log" directory is located,
 +#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
 +#
 +# PURPOSE:
 +#   This script performs MultiQC.
 +#
 +# PARAMETERS:
 +#   1: input_dir - Directory where FastQC reports are located.
 +#   2: output_dir - Directory where MultiQC report will be located
 +# SAMPLE USAGE:
 +#   In a parent script: src/multiqc.sh <input_dir> <output_dir>
 +#
 +# IMPORTANT:
 +#   - Run from a parent script.
 +
 +# Check if correct number of arguments are provided
 +if [ "$#" -ne 2 ]; then
 +    echo "Usage: $0 <input_dir> <output_dir>"
 +    exit 1
 +fi
 +
 +# Input parameters
 +input_dir="$1"
 +output_dir="$2"
 +
 +echo "Running MultiQC..."
 +mkdir -p "$output_dir"
 +multiqc -o "$output_dir" "$input_dir"
 +</code>
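MultiQC only summarizes the reports it finds, so a sample whose FastQC run failed is silently missing from the combined report. The sketch below lists FASTQ files that have no FastQC report yet. It assumes that FastQC names the report for sample.fq.gz as sample_fastqc.zip. The function name missing_reports is hypothetical.

```shell
# Hypothetical helper (not part of the original scripts):
# print the base name of every *.fq.gz file in <fq_dir> that has no
# matching <name>_fastqc.zip in <report_dir>.
missing_reports() {
    fq_dir="$1"
    report_dir="$2"
    for fq in "$fq_dir"/*.fq.gz; do
        # Skip the literal pattern when no file matches.
        [ -e "$fq" ] || continue
        name=$(basename "$fq" .fq.gz)
        [ -e "$report_dir/${name}_fastqc.zip" ] || echo "$name"
    done
}

# Example use before running MultiQC:
#   missing_reports fq_renamed fastqc
```

An empty output means every input file has a report.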
 +
 +== Parent script ==
 +multiqc_00.sh
 +<code>
 +#!/bin/bash
 +#SBATCH --mem=10gb
 +#SBATCH --cpus-per-task=10
 +#SBATCH --job-name=multiqc_00
 +#SBATCH --output=log/multiqc_00%j.log  # %j will be replaced with the job ID
 +
 +#parameters
 +
 +input=fastqc
 +output=multiqc
 +
 +src/multiqc.sh $input $output
 +</code>
 +
 +You can also run both steps in a single job with one parent script, which runs FastQC first and then MultiQC:
 +== Combined parent script ==
 +qc_00.sh
 +<code>
 +#!/bin/bash
 +#SBATCH --mem=10gb
 +#SBATCH --cpus-per-task=10
 +#SBATCH --job-name=fastqc_multiqc_00
 +#SBATCH --output=log/fastqc_multiqc_00%j.log  # %j will be replaced with the job ID
 +
 +#parameters
 +
 +input_fqc=fq
 +output_fqc=fastqc
 +input_mqc=fastqc
 +output_mqc=multiqc
 +
 +src/fastqc.sh $input_fqc $output_fqc
 +src/multiqc.sh $input_mqc $output_mqc
 +</code>
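In the combined script MultiQC runs even if the FastQC step fails. A minimal sketch of chaining the two steps so that MultiQC starts only after FastQC succeeded; the wrapper name qc_pipeline is hypothetical, and the script paths are passed as arguments:

```shell
# Hypothetical wrapper (not part of the original scripts).
# Arguments: <fastqc_script> <multiqc_script> <fq_dir> <fastqc_dir> <multiqc_dir>
qc_pipeline() {
    if [ "$#" -ne 5 ]; then
        echo "Usage: qc_pipeline <fastqc_script> <multiqc_script> <fq_dir> <fastqc_dir> <multiqc_dir>" >&2
        return 2
    fi
    # "&&" runs the MultiQC step only if the FastQC step exited with status 0.
    "$1" "$3" "$4" && "$2" "$4" "$5"
}

# Example call in a parent script:
#   qc_pipeline src/fastqc.sh src/multiqc.sh fq fastqc multiqc
```

The FastQC output directory is reused as the MultiQC input, which matches the values of output_fqc and input_mqc in qc_00.sh.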
 +
 +You can also use the same scripts on trimmed files. Add "_trimmed" to the input and output directory names, as shown below, so that the reports are saved in separate folders (fastqc_trimmed, multiqc_trimmed).
 +
 +fastqc_post_00.sh
 +<code>
 +#!/bin/bash
 +#SBATCH --mem=10gb
 +#SBATCH --cpus-per-task=10
 +#SBATCH --job-name=fastqc_post_00
 +#SBATCH --output=log/fastqc_post_00%j.log  # %j will be replaced with the job ID
 +
 +#parameters
 +
 +input=fq_trimmed
 +output=fastqc_trimmed
 +
 +src/fastqc.sh $input $output
 +</code>
 +
 +multiqc_post_00.sh
 +<code>
 +#!/bin/bash
 +#SBATCH --mem=10gb
 +#SBATCH --cpus-per-task=10
 +#SBATCH --job-name=multiqc_post_00
 +#SBATCH --output=log/multiqc_post_00%j.log  # %j will be replaced with the job ID
 +
 +#parameters
 +
 +input=fastqc_trimmed
 +output=multiqc_trimmed
 +
 +src/multiqc.sh $input $output
 +</code>
 +
 +=== Fastp ===
 +If you want to add Fastp reports to MultiQC, pass the "fq_trimmed" directory as an additional input to MultiQC (multiqc_fastp.sh).
 +
 +== Daughter script ==
 +multiqc_fastp.sh:
 +<code>
 +# SCRIPT FOR PERFORMING MultiQC ON FastQC AND Fastp REPORTS
 +# NOTE: Run this script from the directory where the "log" directory is located,
 +#       Example: /mnt/proj/ibd/ds-06_cd-fecal/common
 +#
 +# PURPOSE:
 +#   This script performs MultiQC on FastQC and Fastp reports together.
 +#
 +# PARAMETERS:
 +#   1: input_dir - Directory where FastQC reports are located.
 +#   2: input_fastp - Directory where Fastp reports are located.
 +#   3: output_dir - Directory where MultiQC report will be located
 +# SAMPLE USAGE:
 +#   In a parent script: src/multiqc_fastp.sh <input_dir> <input_fastp> <output_dir>
 +#
 +# IMPORTANT:
 +#   - Run from a parent script.
 +
 +# Check if correct number of arguments are provided
 +if [ "$#" -ne 3 ]; then
 +    echo "Usage: $0 <input_dir> <input_fastp> <output_dir>"
 +    exit 1
 +fi
 +
 +# Input parameters
 +input_dir="$1"
 +input_fastp="$2"
 +output_dir="$3"
 +
 +echo "Running MultiQC..."
 +mkdir -p "$output_dir"
 +multiqc -o "$output_dir" "$input_dir" "$input_fastp"
 +</code>
 +
 +== Parent script ==
 +multiqc_fastp_00.sh
 +<code>
 +#!/bin/bash
 +#SBATCH --mem=10gb
 +#SBATCH --cpus-per-task=10
 +#SBATCH --job-name=multiqc_fastp_00
 +#SBATCH --output=log/multiqc_fastp_00%j.log  # %j will be replaced with the job ID
 +
 +#parameters
 +
 +input=fastqc_trimmed
 +input_fastp=fq_trimmed
 +output=multiqc_trimmed
 +
 +src/multiqc_fastp.sh "$input" "$input_fastp" "$output"
 +</code>
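MultiQC can only add a Fastp section if it finds Fastp reports in the directory you pass. The sketch below lists candidate report files. It assumes that the Fastp step wrote its JSON reports (fastp option --json) into fq_trimmed with names ending in fastp.json, which is the file name pattern that MultiQC's fastp module is expected to look for. Check this against the MultiQC documentation for your version. The function name list_fastp_reports is hypothetical.

```shell
# Hypothetical helper (not part of the original scripts):
# list all files below <dir> whose names end in "fastp.json".
list_fastp_reports() {
    # The search is recursive, so reports in subdirectories are listed too.
    find "$1" -type f -name '*fastp.json' | sort
}

# Example use before submitting multiqc_fastp_00.sh:
#   list_fastp_reports fq_trimmed
```

If the list is empty, the MultiQC report will most likely contain only the FastQC section.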
scripts/qc.1741693260.txt.gz · Last modified: by 37.26.174.181
