MicroHapulator Report

Report generated at {{date}},
using MicroHapulator version {{mhpl8rversion}}.

Table of Contents

  1. Read QA/QC
  2. Read Merging
  3. Read Mapping
  4. Haplotype Calling
  5. Genotype Calling

All statistics presented in this report are aggregated in a single table available at analysis/summary.tsv in the working directory. Full-resolution graphics for each figure are also available in each analysis/{samplename}/ subdirectory.

Read QA/QC

QC reports for the input reads are generated using FastQC. Links to reports for each sample are provided below.

NOTE: FastQC was designed for QC of whole-genome shotgun NGS reads prior to genome assembly. A QC warning or failure for some modules (such as per-base sequence content or sequence duplication levels) may or may not be a concern with microhaplotype (MH) reads. Interpret results with care!
Sample | R1 Report | R2 Report
{% for sample in samples %}
{{sample}} | Click here to open in a new tab | Click here to open in a new tab
{% endfor %}

The following histograms show the distribution of R1 and R2 read lengths for each sample.

{% for r1plot, r2plot in zip(plots["r1readlen"], plots["r2readlen"]) %} {% endfor %}

Read Merging

Paired-end reads are merged using FLASh.

Sample | Total Reads | Merged Reads | Merge Rate
{% for i, row in summary.iterrows() %}
{{row.Sample}} | {{ "{:,}".format(row.TotalReads) }} | {{ "{:,}".format(row.Merged) }} | {{ "{:.2f}".format(row.MergeRate * 100) }}%
{% endfor %}
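
The merge table values can be reproduced outside the template. The sketch below assumes that the merge rate is simply merged reads divided by total reads (the report does not state the formula), and the function name `merge_rate_row` is hypothetical. It reuses the same format strings as the table above.

```python
def merge_rate_row(sample, total_reads, merged_reads):
    """Format one merge-table row with the same format strings as the report."""
    # Assumption: merge rate = merged / total; guard against empty samples.
    rate = merged_reads / total_reads if total_reads else 0.0
    return (
        sample,
        "{:,}".format(total_reads),       # thousands separators, e.g. 250,000
        "{:,}".format(merged_reads),
        "{:.2f}%".format(rate * 100),     # percentage with two decimals
    )


# Made-up counts, for illustration only.
print(merge_rate_row("sample1", 250000, 231250))
```

With these illustrative counts the final column reads 92.50%.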

The following histograms show the distribution of merged read lengths for each sample.

{% for plot in plots["mergedreadlen"] %} {% endfor %}

Read Mapping

Merged reads are aligned to marker reference sequences using BWA MEM, then formatted and sorted using SAMtools. The reads are also aligned to the full human reference genome (entire chromosomes) to help distinguish off-target sequences from contaminant sequences: reads that align to the full reference but not to the marker sequences represent off-target sequences, while reads that align to neither are likely contaminants.

Sample | Merged Reads | Mapped | Mapping Rate | Mapped (Chrom) | Mapping Rate (Chrom)
{% for i, row in summary.iterrows() %}
{{row.Sample}} | {{ "{:,}".format(row.Merged) }} | {{ "{:,}".format(row.Mapped) }} | {{ "{:.2f}".format(row.MappingRate * 100) }}% | {{ "{:,}".format(row.MappedFullRefr) }} | {{ "{:.2f}".format(row.MappingRateFullRefr * 100) }}%
{% endfor %}
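
The off-target versus contaminant distinction described above can be approximated from the three counts in this table. The following is a rough sketch rather than MicroHapulator's own calculation: it assumes that every read mapping to a marker also maps to the full reference, so the counts can be subtracted. The function name `classify_merged_reads` is hypothetical.

```python
def classify_merged_reads(merged, mapped_markers, mapped_full_refr):
    """Split merged reads into on-target, off-target, and likely contaminant counts."""
    # Assumption: marker-mapped reads are a subset of full-reference-mapped reads.
    on_target = mapped_markers
    # Reads that hit the chromosomes but not the markers: off-target sequence.
    off_target = max(mapped_full_refr - mapped_markers, 0)
    # Reads that hit neither reference: likely contaminants.
    contaminant = max(merged - mapped_full_refr, 0)
    return dict(on_target=on_target, off_target=off_target, contaminant=contaminant)


# Made-up counts, for illustration only.
print(classify_merged_reads(merged=100000, mapped_markers=90000, mapped_full_refr=97000))
```

The `max(..., 0)` guards keep the estimate non-negative if the subset assumption does not hold exactly for a given sample.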

The following histograms show the interlocus balance for each sample.

{% for plot in plots["locbalance"] %} {% endfor %}

Haplotype Calling

Haplotypes are called empirically on a per-read basis using mhpl8r type. Reads that span all SNPs of interest in the corresponding marker are examined; all other reads are discarded. The haplotype tallies represent a typing result for each sample.
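
The per-read logic can be illustrated with a simplified sketch. It is not the mhpl8r type implementation: it assumes ungapped alignments (no indels or clipping), 0-based marker coordinates, and a comma-separated haplotype string. The names `call_haplotype` and `tally_haplotypes` are hypothetical.

```python
from collections import Counter


def call_haplotype(read_start, read_seq, snp_offsets):
    """Return the alleles at each SNP, or None if the read does not span all SNPs."""
    read_end = read_start + len(read_seq)  # exclusive end coordinate
    if min(snp_offsets) < read_start or max(snp_offsets) >= read_end:
        return None
    # Assumption: ungapped alignment, so marker offset maps directly to read index.
    return ",".join(read_seq[pos - read_start] for pos in snp_offsets)


def tally_haplotypes(reads, snp_offsets):
    """Count haplotypes over (start, sequence) pairs; also count discarded reads."""
    tally = Counter()
    discarded = 0
    for read_start, read_seq in reads:
        haplotype = call_haplotype(read_start, read_seq, snp_offsets)
        if haplotype is None:
            discarded += 1
        else:
            tally[haplotype] += 1
    return tally, discarded


# Made-up reads against a marker with SNPs at offsets 2, 5, and 9.
reads = [
    (0, "ACGTACGTACGT"),
    (1, "CGTACGTACG"),
    (0, "ACTTAGGTATGT"),
    (3, "TACGTAC"),  # starts after the first SNP, so it is discarded
]
tally, discarded = tally_haplotypes(reads, [2, 5, 9])
print(dict(tally), discarded)
```

In this toy input, three reads span all SNPs and contribute to the tally, while the fourth is discarded.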

Sample | Mapped Reads | Typed Reads | Typing Success Rate
{% for i, row in summary.iterrows() %}
{{row.Sample}} | {{ "{:,}".format(row.Mapped) }} | {{ "{:,}".format(row.Typed) }} | {{ "{:.2f}".format(row.TypingRate * 100) }}%
{% endfor %}

Genotype Calling

Fixed detection thresholds and dynamic analytical thresholds are applied to the typing result using mhpl8r filter to discriminate between true and erroneous haplotypes and to predict each sample's genotype. MicroHapulator applied a static filter of ≥{{static}} reads as a detection threshold and a dynamic filter of ≥{{"{:.1f}".format(dynamic*100)}}% of total reads as an analytical threshold.
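
How the two thresholds combine can be sketched for a single marker. This is an illustration under stated assumptions, not the mhpl8r filter implementation: it assumes the static filter is applied first and that the dynamic percentage is taken over the reads remaining at that marker. The function name `filter_haplotypes` and the threshold values in the example are hypothetical.

```python
def filter_haplotypes(tally, static, dynamic):
    """Return the haplotypes that pass both thresholds for one marker."""
    # Step 1: static detection threshold on raw read counts.
    detected = dict((hap, count) for hap, count in tally.items() if count >= static)
    # Step 2: dynamic analytical threshold as a fraction of total reads.
    # Assumption: "total reads" means reads that survived the static filter.
    total = sum(detected.values())
    cutoff = dynamic * total
    return sorted(hap for hap, count in detected.items() if count >= cutoff)


# Made-up haplotype tallies and thresholds, for illustration only.
tally = {"A,C": 480, "G,T": 510, "A,T": 12, "G,C": 3}
print(filter_haplotypes(tally, static=5, dynamic=0.02))
```

Here "G,C" fails the static threshold and "A,T" fails the dynamic one, leaving two haplotypes as the predicted genotype for the marker.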

The following figures show the heterozygote balance for each sample.

{% for plot in plots["hetbalance"] %} {% endfor %}