QC Report

	general
Report generated at	2021-11-16 23:01:52
Title	ENCSR274HQD-1
Description	No description
Pipeline version	v1.10.0
Pipeline type	atac
Genome	hg38
Aligner	bowtie2
Sequencing endedness	{'rep1': {'paired_end': True}}
Peak caller	macs2

Alignment quality metrics

Fragment length statistics (filtered/deduped BAM)

	rep1
Fraction of reads in NFR	0.5269639200101196
Fraction of reads in NFR (QC pass)	True
Fraction of reads in NFR (QC reason)	OK
NFR / mono-nuc reads	1.8535732640575608
NFR / mono-nuc reads (QC pass)	False
NFR / mono-nuc reads (QC reason)	out of range [2.5, inf]
Presence of NFR peak	True
Presence of Mono-Nuc peak	True
Presence of Di-Nuc peak	True

Open chromatin assays show distinct fragment length enrichments because cut sites occur only in open chromatin, not within nucleosomes. As a result, the fragment length distribution shows peaks at n-nucleosomal lengths (e.g., mono-nucleosomal, di-nucleosomal). Good libraries show these peaks and characteristic ratios between them.

NFR: Nucleosome free region
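
The "QC pass" and "QC reason" rows above follow from comparing a metric against an allowed range. The sketch below reproduces that logic for the NFR / mono-nuc ratio, using the value and the range [2.5, inf] reported for rep1. The function name check_range and the choice of a closed interval are assumptions made for illustration, not the pipeline's own code.

```python
import math


def check_range(value, low, high):
    """Return (passed, reason) for a metric tested against [low, high].

    Assumption: both bounds are inclusive; the report does not say.
    """
    if low <= value <= high:
        return True, "OK"
    return False, f"out of range [{low}, {high}]"


# Value and range copied from the rep1 fragment length table.
passed, reason = check_range(1.8535732640575608, 2.5, math.inf)
print(passed, reason)  # False out of range [2.5, inf]
```

A ratio below 2.5 means the library has fewer nucleosome-free fragments relative to mono-nucleosomal fragments than the check expects.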

Sequence quality metrics (filtered/deduped BAM)

Open chromatin assays are known to have significant GC bias. Please take this into consideration as necessary.

Annotated genomic region enrichment

	rep1
Fraction of Reads in universal DHS regions	0.339396533048957
Fraction of Reads in blacklist regions	0.008079798447170279
Fraction of Reads in promoter regions	0.14959440904450713
Fraction of Reads in enhancer regions	0.2759019758842537

Signal to noise can be assessed by checking whether reads fall into known open regions (such as DHS regions). A high fraction of reads should fall into the universal (across cell type) DHS set, and only a small fraction should fall into the blacklist regions. A high fraction (though not all) should fall into the promoter regions, and likewise for the enhancer regions. Promoter regions should not account for all reads, because open chromatin assays are known to be biased toward promoters.
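
As an illustration of how a "fraction of reads in regions" value can be computed, the sketch below counts reads that overlap at least one annotated interval. It is a toy under stated assumptions: coordinates are half-open, a single overlapping base is enough to count a read, and the function name fraction_in_regions and the example intervals are hypothetical. The pipeline's actual overlap rule is not stated in this report.

```python
from collections import defaultdict


def fraction_in_regions(reads, regions):
    """Fraction of reads overlapping at least one region.

    reads, regions: iterables of (chrom, start, end), half-open coordinates.
    Assumption: overlap of one base or more counts as a hit.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    reads = list(reads)
    if not reads:
        return 0.0

    hits = 0
    for chrom, start, end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits / len(reads)


# Hypothetical toy data, not taken from the report.
toy_regions = [("chr1", 100, 200), ("chr2", 50, 80)]
toy_reads = [
    ("chr1", 150, 190),  # inside a chr1 region
    ("chr1", 300, 340),  # no overlap
    ("chr2", 70, 110),   # partial overlap with the chr2 region
    ("chr3", 10, 50),    # chromosome without regions
]
print(fraction_in_regions(toy_reads, toy_regions))  # 0.5
```

The linear scan per read is fine for a toy example; real read counts would call for sorted intervals or an interval tree.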

Replication quality metrics

Number of raw peaks

	rep1
Number of peaks	217779

Top 300000 raw peaks from macs2 with p-val threshold 0.01

Peak calling statistics

Peak region size

	rep1
Min size	150.0
25 percentile	196.0
50 percentile (median)	318.0
75 percentile	641.0
Max size	2382.0
Mean	472.15733381088165
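
The rows above are a five-number summary plus the mean of peak widths. A minimal sketch of that summary follows, assuming peaks are given as (start, end) pairs and that quartiles use inclusive linear interpolation. The report does not state which percentile method the pipeline uses, and the function name and the toy peaks are hypothetical.

```python
import statistics


def peak_size_summary(peaks):
    """Summarize peak widths for an iterable of (start, end) pairs."""
    widths = sorted(end - start for start, end in peaks)
    # Assumption: inclusive quartiles; other interpolation rules give
    # slightly different 25th and 75th percentiles.
    q25, q50, q75 = statistics.quantiles(widths, n=4, method="inclusive")
    return {
        "min": widths[0],
        "25 percentile": q25,
        "50 percentile (median)": q50,
        "75 percentile": q75,
        "max": widths[-1],
        "mean": statistics.mean(widths),
    }


# Hypothetical toy peaks with widths 150, 200, 300, 650 and 2000.
toy_peaks = [(0, 150), (1000, 1200), (5000, 5300), (9000, 9650), (20000, 22000)]
summary = peak_size_summary(toy_peaks)
for name, value in summary.items():
    print(f"{name}\t{value}")
```

As in the rep1 table, a mean well above the median indicates a right-skewed width distribution with a few very wide peaks.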

Enrichment / Signal-to-noise ratio

Jensen-Shannon distance (filtered/deduped BAM)

	rep1
AUC	0.22611902190935435
Synthetic AUC	0.493734591418901
X-intercept	0.13484123271074003
Synthetic X-intercept	2.711128502768229e-218
Elbow Point	0.6875477341522614
Synthetic Elbow Point	0.5083180640080235
Synthetic JS Distance	0.3735787236899883
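
For readers unfamiliar with the metric, the sketch below computes a Jensen-Shannon distance between two discrete distributions. It only illustrates the definition: how the pipeline bins coverage and builds the synthetic reference distribution is not described in this report, and the base-2 logarithm (which bounds the distance to [0, 1]) is an assumption. The function name js_distance is hypothetical.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    p, q: equal-length sequences of non-negative weights; each is
    normalized to sum to 1. Assumption: log base 2.
    """
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; zero-weight terms contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # The distance is the square root of the JS divergence.
    return math.sqrt((kl(p, m) + kl(q, m)) / 2)


print(js_distance([1, 1, 1, 1], [1, 1, 1, 1]))  # identical distributions
print(js_distance([1, 0], [0, 1]))              # non-overlapping distributions
```

Identical distributions give a distance of 0 and non-overlapping ones give 1, so under this definition a larger value indicates a coverage profile further from the reference.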

Peak enrichment

Fraction of reads in peaks (FRiP)

FRiP for macs2 raw peaks

	rep1
Fraction of Reads in Peaks	0.2355531236956968

For macs2 raw peaks:

repX: Peaks from true replicate X
repX-prY: Peaks from the Yth pseudoreplicate of replicate X
pooled: Peaks from pooled true replicates (pool of rep1, rep2, ...)
pooled-pr1: Peaks from the 1st pooled pseudoreplicate (pool of rep1-pr1, rep2-pr1, ...)
pooled-pr2: Peaks from the 2nd pooled pseudoreplicate (pool of rep1-pr2, rep2-pr2, ...)

For overlap/IDR peaks:

repX_vs_repY: Comparison of the peak sets from true replicates X and Y
repX-pr1_vs_repX-pr2: Comparison of the peak sets from the two pseudoreplicates of replicate X
pooled-pr1_vs_pooled-pr2: Comparison of the peak sets from the 1st and 2nd pooled pseudoreplicates