4.8.3 Paired End Library Span Estimation
Estimates of the distance spanned by Paired End reads in a library are made when at least 8 consistent mate pairs are found that align to the same contig or scaffold. Both halves of a Paired End read must align to the same contig with the expected directionality (the 3’ ends of the read halves point toward each other, after reverse-complementation of the left half). Statistics for the distance between mated pairs are kept for each library. As additional scaffolds are formed, statistics for additional Paired End reads become available and the library span is re-estimated. Paired End reads whose halves lie too far from the mean of the distribution, and those whose halves do not have the expected relative orientation, are excluded from the span distance calculation. The estimate is less robust when little Paired End information is available for a library, or when very few contigs are significantly longer than the actual library span; in the latter case, the estimated span may be significantly lower than the actual span.
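The filtering and estimation steps described above can be illustrated with a minimal Python sketch. It is not the assembler's actual implementation. The names MatePair and estimate_span are hypothetical, the distance is assumed to be the absolute difference between two recorded positions on the contig, and the cutoff of 2 standard deviations for "too far from the mean" is an assumption, since the text does not state the actual exclusion rule. Only the minimum of 8 consistent mate pairs comes from the description above.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Optional, Tuple

MIN_PAIRS = 8          # from the text: at least 8 consistent mate pairs
MAX_DEVIATIONS = 2.0   # ASSUMPTION: outlier cutoff; the real rule is not documented here


@dataclass
class MatePair:
    """Hypothetical record for one aligned Paired End read."""
    left_contig: str    # contig (or scaffold) the left half aligns to
    right_contig: str   # contig (or scaffold) the right half aligns to
    left_pos: int       # assumed: outer coordinate of the left half
    right_pos: int      # assumed: outer coordinate of the right half
    inward: bool        # True if the 3' ends of the halves point toward each other


def estimate_span(
    pairs: List[MatePair],
    min_pairs: int = MIN_PAIRS,
    max_dev: float = MAX_DEVIATIONS,
) -> Optional[Tuple[float, float]]:
    """Return (mean span, standard deviation), or None if too few pairs qualify."""
    # Keep only pairs whose halves share a contig and have the expected orientation.
    distances = [
        abs(p.right_pos - p.left_pos)
        for p in pairs
        if p.left_contig == p.right_contig and p.inward
    ]
    if len(distances) < min_pairs:
        return None

    # Exclude pairs whose distance lies too far from the mean (assumed cutoff).
    center = mean(distances)
    spread = stdev(distances)
    kept = [d for d in distances if abs(d - center) <= max_dev * spread]
    if len(kept) < min_pairs:
        return None

    return float(mean(kept)), float(stdev(kept))
```

Re-estimation as additional scaffolds are formed would, in this sketch, amount to calling estimate_span again with the enlarged list of mate pairs for the library. Note that the sketch can only measure pairs whose halves both fall on one contig, which illustrates why the estimate may be biased low when few contigs are longer than the actual library span.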