QC Report

	general
Report generated at	2021-10-10 15:11:58
Title	test
Description	test
Pipeline version	v1.2.0
Pipeline type	ChIP-nexus
Genome	mm10
Aligner	bowtie
Sequencing endedness	{'rep1': {'paired_end': False}, 'rep2': {'paired_end': False}, 'rep3': {'paired_end': False}, 'rep4': {'paired_end': False}}
Peak caller	macs2

Alignment quality metrics

SAMstat (raw unfiltered BAM)

	rep1	rep2	rep3	rep4
Total Reads	6121	6252	5120	5120
Total Reads (QC-failed)	0	0	0	0
Duplicate Reads	0	0	0	0
Duplicate Reads (QC-failed)	0	0	0	0
Mapped Reads	6121	6252	5120	5120
Mapped Reads (QC-failed)	0	0	0	0
% Mapped Reads	100.0	100.0	100.0	100.0
Paired Reads	0	0	0	0
Paired Reads (QC-failed)	0	0	0	0
Read1	0	0	0	0
Read1 (QC-failed)	0	0	0	0
Read2	0	0	0	0
Read2 (QC-failed)	0	0	0	0
Properly Paired Reads	0	0	0	0
Properly Paired Reads (QC-failed)	0	0	0	0
% Properly Paired Reads	0.0	0.0	0.0	0.0
With itself	0	0	0	0
With itself (QC-failed)	0	0	0	0
Singletons	0	0	0	0
Singletons (QC-failed)	0	0	0	0
% Singleton	0.0	0.0	0.0	0.0
Diff. Chroms	0	0	0	0
Diff. Chroms (QC-failed)	0	0	0	0

SAMstat (filtered/deduped BAM)

	rep1	rep2	rep3	rep4
Total Reads	6121	6252	5120	5120
Total Reads (QC-failed)	0	0	0	0
Duplicate Reads	0	0	0	0
Duplicate Reads (QC-failed)	0	0	0	0
Mapped Reads	6121	6252	5120	5120
Mapped Reads (QC-failed)	0	0	0	0
% Mapped Reads	100.0	100.0	100.0	100.0
Paired Reads	0	0	0	0
Paired Reads (QC-failed)	0	0	0	0
Read1	0	0	0	0
Read1 (QC-failed)	0	0	0	0
Read2	0	0	0	0
Read2 (QC-failed)	0	0	0	0
Properly Paired Reads	0	0	0	0
Properly Paired Reads (QC-failed)	0	0	0	0
% Properly Paired Reads	0.0	0.0	0.0	0.0
With itself	0	0	0	0
With itself (QC-failed)	0	0	0	0
Singletons	0	0	0	0
Singletons (QC-failed)	0	0	0	0
% Singleton	0.0	0.0	0.0	0.0
Diff. Chroms	0	0	0	0
Diff. Chroms (QC-failed)	0	0	0	0

Filtered and duplicates removed

Sequence quality metrics (filtered/deduped BAM)

Open chromatin assays are known to have significant GC bias. Please take this into consideration as necessary.

Annotated genomic region enrichment

	rep1	rep2	rep3	rep4
Fraction of Reads in universal DHS regions	0.3953754499910584	0.39015402729508286	0.36604640910378444	0.3621994176247297
Fraction of Reads in blacklist regions	0.013023859049992979	0.012111632101337479	0.011719082548878598	0.011810939271363409
Fraction of Reads in promoter regions	0.03946232674842887	0.03706631116828838	0.031438719833529666	0.029769289082795688
Fraction of Reads in enhancer regions	0.3434627437541839	0.34148802053399174	0.3234009519174915	0.32102992030878097

Signal-to-noise can be assessed by checking whether reads fall into known open regions (such as DHS regions). A high fraction of reads should fall into the universal (across cell types) DHS set, and only a small fraction into the blacklist regions. A high fraction (though not all) should fall into the promoter regions, and likewise into the enhancer regions. The promoter regions should not take up all reads, because open chromatin assays are known to have a bias for promoters.
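As a rough illustration of what a "fraction of reads in regions" value means, the following Python sketch counts the reads that overlap an annotated region set. It is not the pipeline's implementation: the function name, the half-open (chrom, start, end) tuples, and the rule that an overlap of at least 1 bp counts as a hit are all assumptions made for this example.

```python
def fraction_in_regions(reads, regions):
    """Fraction of reads overlapping any region (hypothetical helper).

    reads, regions: lists of (chrom, start, end) half-open intervals.
    Assumption: an overlap of at least 1 bp counts as "in the region".
    """
    # Group regions by chromosome so each read is only compared
    # against regions on its own chromosome.
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(reads) if reads else 0.0


# Toy data, not taken from this report.
toy_reads = [("chr1", 100, 150), ("chr1", 500, 550),
             ("chr2", 10, 60), ("chr1", 190, 240)]
toy_regions = [("chr1", 120, 200), ("chr2", 1000, 2000)]
toy_fraction = fraction_in_regions(toy_reads, toy_regions)
```

The nested scan is quadratic and only suitable for toy inputs; real data would normally go through an interval index or a dedicated tool.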

Library complexity quality metrics

Library complexity (filtered non-mito BAM)

	rep1	rep2	rep3	rep4
Total Fragments	34725435	58584133	52569083	54544580
Distinct Fragments	31040339	50096056	48383395	51715108
Positions with Two Reads	None	None	None	None
NRF = Distinct/Total	0.8938790543588583	0.8551130388837538	0.9203773822723901	0.9481255149457563
PBC1 = OneRead/Distinct	None	None	None	None
PBC2 = OneRead/TwoRead	None	None	None	None

Mitochondrial reads are filtered out by default. The non-redundant fraction (NRF) is the fraction of non-redundant mapped reads in a dataset; it is the ratio between the number of positions in the genome that uniquely mapped reads map to and the total number of uniquely mappable reads. The NRF should be > 0.8. The PBC1 is the ratio of genomic locations with EXACTLY one read pair over the genomic locations with AT LEAST one read pair. PBC1 is the primary measure, and the PBC1 should be close to 1. Provisionally 0-0.5 is severe bottlenecking, 0.5-0.8 is moderate bottlenecking, 0.8-0.9 is mild bottlenecking, and 0.9-1.0 is no bottlenecking. The PBC2 is the ratio of genomic locations with EXACTLY one read pair over the genomic locations with EXACTLY two read pairs. The PBC2 should be significantly greater than 1.
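The definitions above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's code: the function names and the representation of a fragment as a hashable position key are assumptions. The ranges in the text share their boundary values, so assigning a boundary value (for example exactly 0.5) to the higher band is also an assumption.

```python
from collections import Counter


def library_complexity(positions):
    """Compute NRF, PBC1 and PBC2 from fragment position keys (hypothetical helper).

    positions: iterable of hashable keys, e.g. (chrom, start, strand).
    """
    counts = Counter(positions)
    total = sum(counts.values())                            # Total Fragments
    distinct = len(counts)                                  # Distinct Fragments
    one_read = sum(1 for c in counts.values() if c == 1)    # positions with exactly one read
    two_read = sum(1 for c in counts.values() if c == 2)    # positions with exactly two reads
    return {
        "NRF": distinct / total if total else None,
        "PBC1": one_read / distinct if distinct else None,
        "PBC2": one_read / two_read if two_read else None,
    }


def bottleneck_level(pbc1):
    """Map PBC1 to the provisional bottlenecking bands quoted in the text."""
    if pbc1 < 0.5:
        return "severe"
    if pbc1 < 0.8:
        return "moderate"
    if pbc1 < 0.9:
        return "mild"
    return "none"


# Toy data: two positions seen twice, three positions seen once.
toy_positions = [("chr1", 10, "+"), ("chr1", 10, "+"), ("chr1", 20, "+"),
                 ("chr1", 30, "-"), ("chr2", 5, "+"), ("chr2", 5, "+"),
                 ("chr2", 9, "-")]
toy_stats = library_complexity(toy_positions)

# NRF for rep1, recomputed from the Total and Distinct Fragments in the table.
rep1_nrf = 31040339 / 34725435
```

For rep1, dividing Distinct Fragments by Total Fragments reproduces the reported NRF of about 0.894. PBC1 and PBC2 cannot be recomputed from this report because the one-read and two-read position counts are listed as None.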

NRF (non-redundant fraction)
PBC1 (PCR bottleneck coefficient 1)
PBC2 (PCR bottleneck coefficient 2)

Replication quality metrics

IDR (Irreproducible Discovery Rate) plots

Reproducibility QC and peak detection statistics

	overlap	idr
Nt	38817	14553
N1	24995	8675
N2	28519	10484
N3	15682	5964
N4	18134	7007
Np	50028	19996
N optimal	50028	19996
N conservative	38817	14553
Optimal Set	pooled-pr1_vs_pooled-pr2	pooled-pr1_vs_pooled-pr2
Conservative Set	rep1_vs_rep2	rep1_vs_rep2
Rescue Ratio	1.2888167555452508	1.3740122311550882
Self Consistency Ratio	1.8185818135441907	1.7578806170355465
Reproducibility Test	pass	pass

Reproducibility QC

N1: Replicate 1 self-consistent peaks (comparing two pseudoreplicates generated by subsampling Rep1 reads)
N2: Replicate 2 self-consistent peaks (comparing two pseudoreplicates generated by subsampling Rep2 reads)
Ni: Replicate i self-consistent peaks (comparing two pseudoreplicates generated by subsampling Rep i reads)
Nt: True Replicate consistent peaks (comparing true replicates Rep1 vs Rep2)
Np: Pooled-pseudoreplicate consistent peaks (comparing two pseudoreplicates generated by subsampling pooled reads from Rep1 and Rep2)
Self-consistency Ratio: max(N1, N2, ...) / min(N1, N2, ...)
Rescue Ratio: max(Np, Nt) / min(Np, Nt)
Reproducibility Test: If Self-consistency Ratio > 2 AND Rescue Ratio > 2, then 'Fail' else 'Pass'
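The ratios in the table can be recomputed from the peak counts with a short Python sketch. The function name and argument layout are hypothetical. Note that the reported Self Consistency Ratio matches the maximum over the minimum of all four Ni values (for overlap peaks, N2 / N3), not only N1 and N2; the sketch follows that observation.

```python
def reproducibility(n_self, n_true, n_pooled):
    """Return (self-consistency ratio, rescue ratio, verdict).

    n_self:   list of self-consistent peak counts N1, N2, ...
    n_true:   Nt, true-replicate consistent peak count
    n_pooled: Np, pooled-pseudoreplicate consistent peak count
    """
    self_consistency = max(n_self) / min(n_self)
    rescue = max(n_pooled, n_true) / min(n_pooled, n_true)
    # Rule as stated in the report: fail only if both ratios exceed 2.
    verdict = "fail" if self_consistency > 2 and rescue > 2 else "pass"
    return self_consistency, rescue, verdict


# Peak counts copied from the "overlap" and "idr" columns of the table.
overlap = reproducibility([24995, 28519, 15682, 18134], n_true=38817, n_pooled=50028)
idr = reproducibility([8675, 10484, 5964, 7007], n_true=14553, n_pooled=19996)

print("overlap:", overlap)
print("idr:    ", idr)
```

Both columns come out below 2 for each ratio, which agrees with the "pass" entries in the table.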

Number of raw peaks

	rep1	rep2	rep3	rep4
Number of peaks	66734	85934	52357	46276

Top 500000 raw peaks from macs2 with p-val threshold 0.01

Peak calling statistics

Peak region size

	rep1	rep2	rep3	rep4	idr_opt	overlap_opt
Min size	150.0	150.0	150.0	150.0	150.0	150.0
25 percentile	165.0	168.0	162.0	170.0	497.0	340.0
50 percentile (median)	226.0	228.0	215.0	239.0	645.0	472.0
75 percentile	353.0	356.0	358.0	404.25	817.0	658.0
Max size	1671.0	1965.0	1844.0	1914.0	3054.0	3054.0
Mean	287.83428237480143	290.980403565527	293.00003819928565	318.01577491572306	671.5454090818164	523.8125249860078

Enrichment / Signal-to-noise ratio

TSS enrichment (filtered/deduped BAM)

	rep1	rep2	rep3	rep4
TSS enrichment	2.4693492364184	2.3935170292925054	1.9718156583172377	1.7914598385681382

Open chromatin assays should show enrichment in open chromatin sites, such as TSSs. An average TSS enrichment in human (hg19) is above 6. A strong TSS enrichment is above 10. For other references, please see https://www.encodeproject.org/atac-seq/

Jensen-Shannon distance (filtered/deduped BAM)

	rep1	rep2	rep3	rep4
AUC	0.2778209429037775	0.2938677418915542	0.3087353698678042	0.3117975392544026
Synthetic AUC	0.49062393685507943	0.49262729307984315	0.4924153732714727	0.49268081897897936
X-intercept	0.1299918418926809	0.11342768477713171	0.11306776827775956	0.11143214774172391
Synthetic X-intercept	9.750639907934148e-97	3.0864936975188777e-157	1.668137745349747e-148	1.4477899607291867e-159
Elbow Point	0.5727751161730479	0.5567188412288349	0.5266078269841397	0.5214210303209655
Synthetic Elbow Point	0.5044049947936692	0.5079564278300622	0.511707901107329	0.5001046618614695
Synthetic JS Distance	0.26004390053718796	0.24932254867891696	0.22395752220233578	0.2220904626495864

Peak enrichment

Fraction of reads in peaks (FRiP)

FRiP for macs2 raw peaks

	rep1	rep2	rep3	rep4	rep1-pr1	rep2-pr1	rep3-pr1	rep4-pr1	rep1-pr2	rep2-pr2	rep3-pr2	rep4-pr2	pooled	pooled-pr1	pooled-pr2
Fraction of Reads in Peaks	0.06286306989108592	0.06616856225168703	0.039651992176241456	0.04048923188945095	0.06480947051482039	0.05500544793386529	0.033291958257746106	0.038309810742346316	0.06490812052368759	0.05458170200065251	0.03375017469836862	0.03822206075640411	0.0735420338305926	0.060837465631619514	0.06096926278480056

FRiP for overlap peaks

	rep1_vs_rep2	rep1_vs_rep3	rep1_vs_rep4	rep2_vs_rep3	rep2_vs_rep4	rep3_vs_rep4	rep1-pr1_vs_rep1-pr2	rep2-pr1_vs_rep2-pr2	rep3-pr1_vs_rep3-pr2	rep4-pr1_vs_rep4-pr2	pooled-pr1_vs_pooled-pr2
Fraction of Reads in Peaks	0.05051037135243125	0.0446432121478061	0.04426287149178079	0.04616410576731199	0.04571359650612102	0.04222318154200081	0.043439892843953797	0.04442746949979456	0.026115240569621046	0.029892347899573176	0.05630649567281463

FRiP for IDR peaks

	rep1_vs_rep2	rep1_vs_rep3	rep1_vs_rep4	rep2_vs_rep3	rep2_vs_rep4	rep3_vs_rep4	rep1-pr1_vs_rep1-pr2	rep2-pr1_vs_rep2-pr2	rep3-pr1_vs_rep3-pr2	rep4-pr1_vs_rep4-pr2	pooled-pr1_vs_pooled-pr2
Fraction of Reads in Peaks	0.033090867521552055	0.02895221095884083	0.029107203183351586	0.030447480374337175	0.030706950269588806	0.02839607082737454	0.027804754323076174	0.028720284886299234	0.01719170388931988	0.02034654940680004	0.03841457178958988

For macs2 raw peaks:

repX: Peaks from true replicate X
repX-prY: Peaks from the Yth pseudoreplicate of replicate X
pooled: Peaks from pooled true replicates (pool of rep1, rep2, ...)
pooled-pr1: Peaks from the 1st pooled pseudoreplicate (pool of rep1-pr1, rep2-pr1, ...)
pooled-pr2: Peaks from the 2nd pooled pseudoreplicate (pool of rep1-pr2, rep2-pr2, ...)

For overlap/IDR peaks:

repX_vs_repY: Comparing the peak sets from true replicates X and Y
repX-pr1_vs_repX-pr2: Comparing the peak sets from the two pseudoreplicates of replicate X
pooled-pr1_vs_pooled-pr2: Comparing the peak sets from the 1st and 2nd pooled pseudoreplicates