|
1.4.3
|
|
1.
|
Signal Intensity Filter – trims reads that lose signal ‘crispness’ near the 3’ end. Some flows have an ambiguous signal intensity (0.5-0.7 on a 0-1.0 scale), possibly because overall signal droop and/or CAFIE error accumulation lowers the signal-to-noise ratio (as shown in Figure 5). The 3’ end of a read is trimmed until fewer than 3% of the remaining flows have an ambiguous signal intensity for incorporation (numTrimmedTooShortQuality metric).
|
|
2.
|
Primer Filter – trims the end of a read when it matches a 454 Sequencing System Adaptor sequence (numTrimmedTooShortPrimer metric).
|
|
3.
|
Valley Filter/TrimBack Valley Filter – filters or trims reads with many off-peak signal intensities. A Valley flow is a flow with an intermediate signal intensity, i.e., a signal intensity that falls in the valley between the peaks for 0-mer and 1-mer, 1-mer and 2-mer, or 2-mer and 3-mer incorporations. The signal distribution of all reads of the Run is used to define the peaks of the homopolymer incorporations relative to the valleys. One of the following three valley filters is applied when doValleyFilter = ‘true’.
|
|
•
|
Counting Valley Filter [read rejecting] is applied when both vfScanAllFlows and doValleyFilterTrimBack are ‘false’. The Counting Valley Filter evaluates each flow up to a limit specified by vfLastFlowToTest (default = 320 flows). For each read, the number of borderline valley flows is compared to a threshold specified by vfBadFlowThreshold (default = 4 “bad” flows with intermediate signal intensity). Reads that exceed the threshold are discarded (numTrimmedTooShortQuality metric). The Counting Valley Filter has been largely superseded by the Scoring and TrimBack Valley Filters.
|
|
•
|
Scoring Valley Filter [read rejecting] is applied when vfScanAllFlows is enabled, and is the default for amplicon processing. The Scoring Valley Filter calculates a valley score for each flow up to a limit specified by vfScanLimit (default = 700 flows for amplicons). For each read, the sum of the flow valley scores is scaled by a factor (vfTrimBackScaleFactor) and compared to a threshold ratio calculated from vfBadFlowThreshold and vfLastFlowToTest (default = 4 bad flows per 320 flows). Reads with a scaled, summed score that exceeds the threshold are discarded.
|
|
•
|
TrimBack Valley Filter [read trimming] is applied when doValleyFilterTrimBack is ‘true’, and is the default for shotgun processing. The TrimBack Valley Filter uses the same mechanism as the Scoring Valley Filter, but instead of discarding the read it scans backwards from the last flow of the read and trims flows until the scaled, summed valley score for the read no longer exceeds the Scoring Valley Filter threshold. This trimming retains the higher quality portion of a read (numTrimmedTooShortQuality metric).
|
|
XLR70 Kit: 2.0
|
||||
|
4.
|
Basecall Quality Score Trimming Filter – Technically, this filter is executed as part of basecalling, but functionally it belongs with the other quality filters. The quality score trimming filter trims back from the 3’ end of reads based on estimated quality scores (not the final quality scores) derived from an internal calibrated signal histogram. The error rate in a sliding window (default size of 40 bp) is calculated from the estimated quality scores and multiplied by an empirical scaling factor (default of 1.1). The window is slid leftwards until the estimated error rate in the window is less than 1.0% (by default). If the resulting read is shorter than 40 bp (default), the read is discarded (numTrimmedTooShortQuality metric).
|
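
The 3’ trimming rule of the Signal Intensity Filter (item 1) can be illustrated with a minimal Python sketch. It is not the software's implementation. It assumes the per-flow signals are already expressed on the 0-1.0 scale mentioned above, and it reads "trimmed such that <3% of the remaining flows are ambiguous" as "keep the longest prefix that satisfies the limit". The function name and its parameters are hypothetical.

```python
def trim_ambiguous_tail(signals, low=0.5, high=0.7, max_fraction=0.03):
    """Return how many leading flows to keep (hypothetical helper).

    Assumption: `signals` are per-flow intensities on a 0-1.0 scale, and a
    flow is "ambiguous" when low <= signal <= high. The longest prefix whose
    ambiguous fraction is below `max_fraction` is kept.
    """
    ambiguous_so_far = 0
    best_length = 0
    for flows_seen, signal in enumerate(signals, start=1):
        if low <= signal <= high:
            ambiguous_so_far += 1
        # Remember the longest prefix that still satisfies the limit.
        if ambiguous_so_far < max_fraction * flows_seen:
            best_length = flows_seen
    return best_length


# 100 clean flows followed by 10 ambiguous flows at the 3' end.
read = [0.1] * 100 + [0.6] * 10
kept = trim_ambiguous_tail(read)
```

A read for which the returned length falls below the minimum read length would then be counted under numTrimmedTooShortQuality; that length check is left out of the sketch.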
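
The Scoring and TrimBack Valley Filters (item 3) share one comparison: a scaled, summed valley score against a threshold ratio. The Python sketch below illustrates that comparison under stated assumptions. The per-flow valley scores are taken as given input, because how they are computed is not described here. The scaled sum is assumed to be divided by the number of flows before it is compared with vfBadFlowThreshold / vfLastFlowToTest. The default scale factor of 1.0 is a placeholder, not the software's default. Function and argument names are hypothetical.

```python
def exceeds_valley_threshold(total_score, n_flows, scale_factor,
                             bad_flow_threshold=4, last_flow_to_test=320):
    """True if the scaled score per flow exceeds the threshold ratio.

    Assumption: the threshold ratio is bad_flow_threshold / last_flow_to_test
    (4 bad flows per 320 flows by default) and is compared per flow.
    """
    threshold_ratio = bad_flow_threshold / last_flow_to_test
    return scale_factor * total_score / n_flows > threshold_ratio


def trimback_valley(valley_scores, scale_factor=1.0):
    """Return the number of flows kept after trimming from the last flow.

    Scoring Valley Filter behaviour would be to discard the read whenever
    exceeds_valley_threshold(...) is True for the whole read; the TrimBack
    variant instead drops flows from the end until the test passes.
    """
    keep = len(valley_scores)
    total = sum(valley_scores)
    while keep > 0 and exceeds_valley_threshold(total, keep, scale_factor):
        keep -= 1
        total -= valley_scores[keep]  # remove the last remaining flow
    return keep


# 320 clean flows followed by 8 flows that each score 1.0.
scores = [0.0] * 320 + [1.0] * 8
kept_unscaled = trimback_valley(scores, scale_factor=1.0)
kept_scaled = trimback_valley(scores, scale_factor=2.0)
```

With a larger scale factor the same read is trimmed further back, which is why the scale factor acts as a stringency control in this sketch.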
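
The sliding-window logic of the Basecall Quality Score Trimming Filter (item 4) can be sketched as follows. This is an illustration only. It assumes the estimated quality scores are Phred-scaled, so that a score Q corresponds to an error probability of 10^(-Q/10), and it assumes the read is cut at the right-hand edge of the first window that passes. The function name and its parameters are hypothetical; the defaults mirror the values quoted above (40 bp window, scaling factor 1.1, 1.0% error rate, 40 bp minimum length).

```python
def quality_trim_length(quals, window=40, scale=1.1,
                        max_error=0.01, min_length=40):
    """Return the trimmed read length, or 0 if the read is discarded.

    Assumption: `quals` are Phred-scaled estimated quality scores, one per
    base, ordered 5' to 3'.
    """
    errors = [10 ** (-q / 10) for q in quals]
    end = len(quals)
    # Slide the window leftwards from the 3' end, one base at a time.
    while end >= window:
        mean_error = sum(errors[end - window:end]) / window
        if scale * mean_error < max_error:
            break  # first window that meets the error-rate limit
        end -= 1
    else:
        return 0  # no full window passed (or the read is shorter than one)
    return end if end >= min_length else 0


# 100 bases at Q30 followed by 20 bases at Q10 at the 3' end.
read_quals = [30] * 100 + [10] * 20
trimmed_length = quality_trim_length(read_quals)
```

In this example the window tolerates at most three Q10 bases before the scaled error rate reaches 1.0%, so three low-quality bases survive at the 3’ end of the trimmed read.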