Although this analysis is for ATAC-seq data, many of the steps (especially the first section) are the same for other types of DNA sequencing experiments.
We'll be doing the analysis in Bash, which is the standard language for UNIX command-line scripting.
We start with raw .fastq.gz files, which are provided by the sequencing instrument. For each DNA molecule (read) that was sequenced, they provide the nucleotide sequence and a quality score for each base call in that sequence.
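To make the quality information concrete, here is a minimal sketch of how a single quality character maps to a numeric score. It assumes the Phred+33 encoding (score = ASCII code minus 33), which is an assumption on our part since the encoding is not stated here; the helper name `phred_score` is made up for this illustration.

```bash
# Decode one FASTQ quality character into a Phred score.
# Assumption: Phred+33 encoding (score = ASCII code - 33).
phred_score() {
    # printf '%d' "'X" prints the ASCII code of the character X
    ascii=$(printf '%d' "'$1")
    echo $(( ascii - 33 ))
}

phred_score 'E'   # a confident base call
phred_score '#'   # a very uncertain base call
```

Under this assumption `E` decodes to 36 and `#` to 2. A score Q corresponds to an estimated error probability of 10^(-Q/10), so higher is better.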
### Set up variables storing the location of our data
### The proper way to load your variables is to define them in your ~/.bashrc file (which is a file, not a command), but this is very slow in iPython
export SUNETID="$(whoami)"
export WORK_DIR="/srv/scratch/training_camp/tc2016/${SUNETID}"
export DATA_DIR="${WORK_DIR}/data"
export FASTQ_DIR="${DATA_DIR}/fastq/"
export SRC_DIR="${WORK_DIR}/src/training_camp/src/"
export ANALYSIS_DIR="${WORK_DIR}/analysis/"
export YEAST_DIR="/srv/scratch/training_camp/saccer3/seq"
export YEAST_INDEX="/srv/scratch/training_camp/saccer3/bowtie2_index/saccer3"
export YEAST_CHR="/srv/scratch/training_camp/saccer3/sacCer3.chrom.sizes"
export TMP="${WORK_DIR}/tmp"
export TEMP=$TMP
export TMPDIR=$TMP
Now, let's check exactly which fastqs we have (recall that the ls command lists the contents of a directory):
ls $FASTQ_DIR
Ct_1_S22_R1.fastq.gz    DMSO_1_S6_R1.fastq.gz   Kz_300_S4_R1.fastq.gz
Ct_1_S22_R2.fastq.gz    DMSO_1_S6_R2.fastq.gz   Kz_300_S4_R2.fastq.gz
Ct_2_S23_R1.fastq.gz    DMSO_2_S12_R1.fastq.gz  Kz_800_S10_R1.fastq.gz
Ct_2_S23_R2.fastq.gz    DMSO_2_S12_R2.fastq.gz  Kz_800_S10_R2.fastq.gz
Ct_300_S3_R1.fastq.gz   DMSO_2_S32_R1.fastq.gz  Mz_1_S19_R1.fastq.gz
Ct_300_S3_R2.fastq.gz   DMSO_2_S32_R2.fastq.gz  Mz_1_S19_R2.fastq.gz
Ct_3_S24_R1.fastq.gz    It_1_S25_R1.fastq.gz    Mz_2_S20_R1.fastq.gz
Ct_3_S24_R2.fastq.gz    It_1_S25_R2.fastq.gz    Mz_2_S20_R2.fastq.gz
Ct_800_S9_R1.fastq.gz   It_2_S26_R1.fastq.gz    Mz_300_S2_R1.fastq.gz
Ct_800_S9_R2.fastq.gz   It_2_S26_R2.fastq.gz    Mz_300_S2_R2.fastq.gz
Cz_1_S16_R1.fastq.gz    It_300_S5_R1.fastq.gz   Mz_3_S21_R1.fastq.gz
Cz_1_S16_R2.fastq.gz    It_300_S5_R2.fastq.gz   Mz_3_S21_R2.fastq.gz
Cz_2_S17_R1.fastq.gz    It_3_S27_R1.fastq.gz    Mz_800_S8_R1.fastq.gz
Cz_2_S17_R2.fastq.gz    It_3_S27_R2.fastq.gz    Mz_800_S8_R2.fastq.gz
Cz_300_S1_R1.fastq.gz   It_800_S11_R1.fastq.gz  U_1_S28_R1.fastq.gz
Cz_300_S1_R2.fastq.gz   It_800_S11_R2.fastq.gz  U_1_S28_R2.fastq.gz
Cz_3_S18_R1.fastq.gz    Kt_1_S13_R1.fastq.gz    U_2_S29_R1.fastq.gz
Cz_3_S18_R2.fastq.gz    Kt_1_S13_R2.fastq.gz    U_2_S29_R2.fastq.gz
Cz_800_S7_R1.fastq.gz   Kt_2_S14_R1.fastq.gz    U_3_S30_R1.fastq.gz
Cz_800_S7_R2.fastq.gz   Kt_2_S14_R2.fastq.gz    U_3_S30_R2.fastq.gz
DMSO_1_S31_R1.fastq.gz  Kt_3_S15_R1.fastq.gz
DMSO_1_S31_R2.fastq.gz  Kt_3_S15_R2.fastq.gz
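Since the data are paired-end, every `_R1` file in the listing should have a matching `_R2` file. The sketch below shows one way to check this. It runs on a throwaway toy directory so it is self-contained; `DEMO_DIR` and the deliberately incomplete file set are illustrative stand-ins for `$FASTQ_DIR`, not part of the original analysis.

```bash
# Self-contained demo: build a toy directory, then check R1/R2 pairing.
# DEMO_DIR stands in for $FASTQ_DIR; the files are empty placeholders.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/Ct_1_S22_R1.fastq.gz" "$DEMO_DIR/Ct_1_S22_R2.fastq.gz" \
      "$DEMO_DIR/U_1_S28_R1.fastq.gz"   # deliberately missing its R2 mate

missing=0
for r1 in "$DEMO_DIR"/*_R1.fastq.gz; do
    # Derive the expected mate name by swapping the _R1 suffix for _R2
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
    if [ ! -e "$r2" ]; then
        echo "missing mate for $(basename "$r1")"
        missing=$(( missing + 1 ))
    fi
done
echo "$missing unpaired R1 file(s)"
```

To run the same check on the real data, point the loop at `"${FASTQ_DIR}"*_R1.fastq.gz` instead of the toy directory.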
As a sanity check, we can also look at the size and last-modified time of some of the fastqs by adding -lrth to the ls command:
ls -lrth $FASTQ_DIR | head
total 0
lrwxrwxrwx 1 user23 user23 43 Sep 13 05:43 U_2_S29_R1.fastq.gz -> ../../../../data/fastqs/U_2_S29_R1.fastq.gz
lrwxrwxrwx 1 user23 user23 43 Sep 13 05:43 U_1_S28_R2.fastq.gz -> ../../../../data/fastqs/U_1_S28_R2.fastq.gz
lrwxrwxrwx 1 user23 user23 43 Sep 13 05:43 U_1_S28_R1.fastq.gz -> ../../../../data/fastqs/U_1_S28_R1.fastq.gz
lrwxrwxrwx 1 user23 user23 45 Sep 13 05:43 Mz_800_S8_R2.fastq.gz -> ../../../../data/fastqs/Mz_800_S8_R2.fastq.gz
lrwxrwxrwx 1 user23 user23 45 Sep 13 05:43 Mz_800_S8_R1.fastq.gz -> ../../../../data/fastqs/Mz_800_S8_R1.fastq.gz
lrwxrwxrwx 1 user23 user23 44 Sep 13 05:43 Mz_3_S21_R2.fastq.gz -> ../../../../data/fastqs/Mz_3_S21_R2.fastq.gz
lrwxrwxrwx 1 user23 user23 44 Sep 13 05:43 Mz_3_S21_R1.fastq.gz -> ../../../../data/fastqs/Mz_3_S21_R1.fastq.gz
lrwxrwxrwx 1 user23 user23 45 Sep 13 05:43 Mz_300_S2_R2.fastq.gz -> ../../../../data/fastqs/Mz_300_S2_R2.fastq.gz
lrwxrwxrwx 1 user23 user23 45 Sep 13 05:43 Mz_300_S2_R1.fastq.gz -> ../../../../data/fastqs/Mz_300_S2_R1.fastq.gz
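Note that the entries in this listing are symbolic links, so the size column (43 to 45 bytes) is the length of the link target path rather than the size of the fastq itself. A small sketch of how to see the real sizes, using a toy file and link (`DEMO_DIR` and the file names are made up for the illustration):

```bash
# Toy demo: a 1000-byte file and a symlink pointing at it.
DEMO_DIR=$(mktemp -d)
head -c 1000 /dev/zero > "$DEMO_DIR/real.fastq.gz"
ln -s real.fastq.gz "$DEMO_DIR/link.fastq.gz"

# Without -L: reports the link itself (size = length of the target path)
ls -l "$DEMO_DIR/link.fastq.gz"
# With -L: follows the link and reports the file it points to
ls -lL "$DEMO_DIR/link.fastq.gz"

# wc -c reads through the link, so it also gives the real size
real_size=$(wc -c < "$DEMO_DIR/link.fastq.gz" | tr -d ' ')
echo "$real_size"
```

Applied to the data here, that would be `ls -lLrth $FASTQ_DIR | head`.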
Let's also inspect the format of one of the fastqs. Notice that each read takes up 4 lines:
zcat $(ls $FASTQ_DIR* | head -n 1) | head -n 8
@NS500418:473:H55FVAFXX:1:11101:6407:1042 1:N:0:AGGCAGAA+NCTACTCT
ATACCNAGGGGCGCAATGTGCGTTCAAAGATTCGATGATTCACGGAATTCTGCAACTGTCTCTTATACACATCTCC
+
AAAAA#EAEEE<EEEEEEAA<EEEEEEEEEEEAEEEEEEEA/<66EEAEEEEAEE<E<EEEE/EEEE<AAEEEEAA
@NS500418:473:H55FVAFXX:1:11101:7864:1042 1:N:0:AGGCAGAA+NCTACTCT
GTGTTNGTTCATCTAGACAGCCGGACGGTGGCCATGGAAGTCGGAATCCGCTAAGGAGTGTGTAACAACTCACCGG
+
AAAAA#EEEEEEEEE/EEEEEEE<EEEAEEEAEEEEEEEEEEAEE/EEAE<EEE/EEEAEE/EEAEEA/E<AEEEA

gzip: stdout: Broken pipe
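Because each read occupies exactly four lines, the number of reads in a fastq is its line count divided by four, and the sequence is always the second line of each record. A minimal sketch on a toy two-read file (the reads and the `DEMO_DIR` path are invented for the illustration; `gzip -dc` is used as a portable equivalent of `zcat`):

```bash
# Build a tiny gzipped FASTQ with two reads (toy data, not from the experiment).
DEMO_DIR=$(mktemp -d)
printf '%s\n' \
    '@read1' 'ACGTN' '+' 'AAAA#' \
    '@read2' 'GGCCTTAA' '+' 'EEEEEEEE' | gzip -c > "$DEMO_DIR/toy.fastq.gz"

# Number of reads = number of lines / 4
num_lines=$(gzip -dc "$DEMO_DIR/toy.fastq.gz" | wc -l)
num_reads=$(( num_lines / 4 ))
echo "reads: $num_reads"

# The sequence is the 2nd line of every 4-line record
read_lengths=$(gzip -dc "$DEMO_DIR/toy.fastq.gz" | awk 'NR % 4 == 2 { print length($0) }')
echo "$read_lengths"
```

The same two pipelines work on any of the real files by replacing the toy path with one from `$FASTQ_DIR`.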
In many kinds of DNA and RNA sequencing experiments, the sequencer sometimes reads through the targeted insert and into the sequencing adapter or PCR primer sequence at the end of the fragment. This happens whenever the insert is shorter than the read length (as in some of our ATAC-seq reads): once the insert runs out, the sequencer keeps going and reads the adapter.
We need to remove such adapter sequences because they won't align to the genome.
In ATAC-seq (the data we're analyzing), the fragment length follows a periodic distribution. Some reads have very short inserts (only a few base pairs), while other reads have inserts that are much longer (hundreds of base pairs, far longer than the 76 bp reads we're using to read them).
We know ahead of time that the first part of the adapter sequence is CTGTCTCTTATA, since our sequencing libraries were prepared with a Nextera sample prep kit.
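To illustrate adapter read-through, the sketch below simulates a read from a fragment whose insert is shorter than the read. Everything except the adapter prefix is hypothetical: the 40 bp insert is a made-up sequence, `READ_LEN` is an assumed read length, and the poly-G tail is only a placeholder for whatever follows the adapter prefix on a real molecule.

```bash
ADAPTER="CTGTCTCTTATA"   # start of the Nextera adapter, from the text
READ_LEN=76              # assumed read length for this illustration

# Toy 40 bp insert (made-up sequence), shorter than the read length
INSERT="ACGTTGCAAGCTTAGGCTAACGGATCCTTAGCAGTCAGTA"
# What follows the insert on the molecule: the adapter prefix plus filler
# (the poly-G tail is a placeholder, not the real downstream sequence)
DOWNSTREAM="${ADAPTER}GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG"

# The sequencer reads READ_LEN bases regardless of where the insert ends
READ=$(echo "${INSERT}${DOWNSTREAM}" | cut -c1-"$READ_LEN")
echo "$READ"

# 1-based position where the adapter starts = insert length + 1
ADAPTER_POS=$(echo "$READ" | awk -v a="$ADAPTER" '{ print index($0, a) }')
echo "adapter starts at base $ADAPTER_POS"
```

Here the adapter begins at base 41, immediately after the 40 bp insert. Trimming removes everything from that position onward, leaving only the genomic part of the read.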
# Let's sanity check our adapter sequence by seeing
# how many times it occurs in the first 100000 reads.
ADAPTER="CTGTCTCTTATA"
NUM_LINES=400000 # 4 * num_reads, since each fastq entry is 4 lines
zcat $(ls $FASTQ_DIR*R1* | head -n 1) | head -n $NUM_LINES | grep $ADAPTER | wc -l
24856

gzip: stdout: Broken pipe
# Let's also check how often a permutation (rearrangement)
# of the adapter sequence occurs:
NOT_ADAPTER="CGTTCTTCTATA" # A permutation of the adapter sequence
zcat $(ls $FASTQ_DIR*R1* | head -n 1) | head -n $NUM_LINES | grep $NOT_ADAPTER | wc -l
0

gzip: stdout: Broken pipe
Notice that the correct adapter sequence occurs in 24,856 of the first 100,000 reads, while a permutation of the adapter sequence does not occur at all. This is an important validation that we have the right sequence.
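The control only makes sense if the second sequence really is a rearrangement of the first, that is, it has the same base composition in a different order. A small sketch of how to verify that (the helper name `sorted_letters` is ours, not part of the original notebook):

```bash
ADAPTER="CTGTCTCTTATA"
NOT_ADAPTER="CGTTCTTCTATA"

# Sort the letters of a sequence so that permutations compare equal
sorted_letters() {
    printf '%s\n' "$1" | fold -w 1 | sort | tr -d '\n'
}

# A permutation has identical sorted letters but is not the same string
if [ "$(sorted_letters "$ADAPTER")" = "$(sorted_letters "$NOT_ADAPTER")" ] \
   && [ "$ADAPTER" != "$NOT_ADAPTER" ]; then
    IS_PERMUTATION=yes
else
    IS_PERMUTATION=no
fi
echo "$IS_PERMUTATION"
```

Both sequences contain 2 A, 3 C, 1 G and 6 T, so the difference in counts is due to the order of the bases and not to base composition.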
Now, we'll trim the paired-end reads using a tool called cutadapt:
#create a directory to store the trimmed data
export TRIMMED_DIR="$ANALYSIS_DIR/trimmed/"
[[ ! -d $TRIMMED_DIR ]] && mkdir -p "$TRIMMED_DIR"
for R1_fastq in "${FASTQ_DIR}"*_R1.fastq.gz; do
    # Get the read 2 fastq file from the filename of read 1
    # (anchor the pattern to the file suffix so nothing else in the path is changed)
    R2_fastq=$(echo "$R1_fastq" | sed -e 's/_R1\.fastq\.gz$/_R2.fastq.gz/')
    # Generate names for the trimmed fastq files
    trimmed_R1_fastq=$TRIMMED_DIR$(basename "$R1_fastq" | sed -e 's/\.fastq\.gz$/.trimmed.fastq.gz/')
    trimmed_R2_fastq=$TRIMMED_DIR$(basename "$R2_fastq" | sed -e 's/\.fastq\.gz$/.trimmed.fastq.gz/')
    # Print the command first so the log records exactly what was run
    echo cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA \
        -o "${trimmed_R1_fastq}" \
        -p "${trimmed_R2_fastq}" \
        "$R1_fastq" \
        "$R2_fastq"
    # Trim the adapter from read 1 (-a) and read 2 (-A)
    cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA \
        -o "${trimmed_R1_fastq}" \
        -p "${trimmed_R2_fastq}" \
        "$R1_fastq" \
        "$R2_fastq"
done
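The `-e 0.20` option in the loop above sets the maximum error rate for an adapter match. As we understand it, cutadapt allows a number of errors equal to the error rate times the length of the match, rounded down. The sketch below reproduces that arithmetic; it is our own illustration, not cutadapt's code, and the helper name `allowed_errors` is made up.

```bash
# Allowed errors = floor(error_rate * matched_length); here error_rate = 0.20.
# Integer arithmetic: 0.20 * len == len * 20 / 100, and shell division truncates.
allowed_errors() {
    echo $(( $1 * 20 / 100 ))
}

for len in 3 4 5 9 10 12; do
    echo "match length $len: up to $(allowed_errors "$len") error(s)"
done
```

This gives 0 errors for matches of up to 4 bp, 1 error for 5 to 9 bp, and 2 errors for 10 to 12 bp, which agrees with the "No. of allowed errors" line in the cutadapt report.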
cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_1_S22_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_1_S22_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_1_S22_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_1_S22_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_1_S22_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_1_S22_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_1_S22_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_1_S22_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 733.62 s (187 us/read; 0.32 M reads/minute).

=== Summary ===

Total read pairs processed:          3,929,668
  Read 1 with adapter:               1,440,399 (36.7%)
  Read 2 with adapter:               1,431,456 (36.4%)
Pairs that were too short:               3,958 (0.1%)
Pairs written (passing filters):     3,925,710 (99.9%)

Total basepairs processed:   597,309,536 bp
  Read 1:   298,654,768 bp
  Read 2:   298,654,768 bp
Total written (filtered):    535,954,991 bp (89.7%)
  Read 1:   267,912,314 bp
  Read 2:   268,042,677 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1440399 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.9% C: 38.4% G: 24.9% T: 22.9% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 60048 61401.1 0 60048 4 62875 15350.3 0 37454 25421 5 66744 3837.6 1 28100 38644 6 45078 959.4 1 28383 16695 7 32197 239.8 1 27942 4255 8 34230 60.0 1 29341 4448 441 9 32762 15.0 1 29858 702 2202 10 31349 3.7 2 29316 581 1452 11 31110 0.9 2 29459 567 1084 12 30841 0.2 2 30008 422 411 13 31540 0.2 2 30735 380 425 14 32351 0.2 2 31526 420 405 15 33230 0.2 2 32513 392 325 16 32113 0.2 2 31336 338 439 17 33034 0.2 2 32284 357 393 18 33171 0.2 2 32425 376 370 19 34858 0.2 2 33960 369 529 20 36150 0.2 2 35350 377 423 21 35973 0.2 2 35141 402 430 22 34587 0.2 2 33850 332 405 23 33083 0.2 2 32264 349 470 24 36252 0.2 2 35529 354 369 25 40487 0.2 2 39792 416 279 26 40627 0.2 2 39839 410 378 27 36178 0.2 2 35512 340 326 28 34699 0.2 2 33999 345 355 29 37697 0.2 2 36990 338 369 30 39488 0.2 2 38789 391 308 31 39966 0.2 2 39281 392 293 32 39153 0.2 2 38450 368 335 33 36583 0.2 2 35760 301 522 34 34380 0.2 2 33681 273 426 35 42471 0.2 2 41812 373 286 36 45430 0.2 2 44768 344 318 37 32201 0.2 2 31592 302 307 38 25370 0.2 2 24887 204 279 39 19155 0.2 2 18551 154 450 40 15009 0.2 2 14520 163 326 41 8088 0.2 2 7663 60 365 42 4682 0.2 2 3932 65 685 43 2796 0.2 2 2327 57 412 44 1886 0.2 2 1479 42 365 45 2024 0.2 2 1699 33 292 46 5100 0.2 2 4637 50 413 47 3859 0.2 2 3475 44 340 48 2076 0.2 2 1776 56 244 49 1942 0.2 2 1470 39 433 50 1340 0.2 2 1036 31 273 51 898 0.2 2 165 30 703 52 440 0.2 2 74 34 332 53 973 0.2 2 67 41 865 54 543 0.2 2 83 28 432 55 452 0.2 2 104 25 323 56 927 0.2 2 184 25 718 57 517 0.2 2 134 16 367 58 502 0.2 2 77 21 404 59 548 0.2 2 40 19 489 60 459 0.2 2 26 8 425 61 565 0.2 2 53 22 490 62 330 0.2 2 25 11 294 63 805 0.2 2 5 16 784 64 521 0.2 2 7 10 504 65 391 0.2 2 5 19 367 66 387 0.2 2 4 4 379 67 302 0.2 2 5 28 269 68 1524 0.2 2 0 55 1469 69 350 0.2 2 1 18 
331 70 553 0.2 2 0 6 547 71 165 0.2 2 0 2 163 72 216 0.2 2 0 1 215 73 627 0.2 2 0 18 609 74 585 0.2 2 0 11 574 75 276 0.2 2 0 15 261 76 280 0.2 2 0 3 277 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1431456 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.3% G: 25.3% T: 23.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 59997 61401.1 0 59997 4 61914 15350.3 0 36805 25109 5 65122 3837.6 1 26581 38541 6 44272 959.4 1 27306 16966 7 31903 239.8 1 26720 5183 8 33870 60.0 1 28852 4541 477 9 32485 15.0 1 29411 907 2167 10 31131 3.7 2 28322 1246 1563 11 30987 0.9 2 28879 846 1262 12 30664 0.2 2 29447 695 522 13 31297 0.2 2 30180 634 483 14 32153 0.2 2 31041 631 481 15 33049 0.2 2 32046 595 408 16 31859 0.2 2 30838 601 420 17 32913 0.2 2 31803 631 479 18 32986 0.2 2 31936 602 448 19 34730 0.2 2 33481 597 652 20 35933 0.2 2 34824 633 476 21 35777 0.2 2 34615 640 522 22 34453 0.2 2 33364 578 511 23 32897 0.2 2 31836 526 535 24 36075 0.2 2 35065 579 431 25 40356 0.2 2 39228 719 409 26 40472 0.2 2 39333 655 484 27 36030 0.2 2 35076 549 405 28 34647 0.2 2 33661 510 476 29 37591 0.2 2 36539 574 478 30 39356 0.2 2 38328 601 427 31 39833 0.2 2 38823 617 393 32 39016 0.2 2 38027 591 398 33 36497 0.2 2 35329 548 620 34 34303 0.2 2 33316 485 502 35 42311 0.2 2 41347 590 374 36 45323 0.2 2 44237 608 478 37 32107 0.2 2 31260 450 397 38 25294 0.2 2 24591 388 315 39 19059 0.2 2 18346 266 447 40 15014 0.2 2 14393 219 402 41 8101 0.2 2 7581 118 402 42 4703 0.2 2 3884 98 721 43 2790 0.2 2 2306 67 417 44 1872 0.2 2 1472 40 360 45 2009 0.2 2 1676 37 296 46 5070 0.2 2 4583 74 413 47 3842 0.2 2 3434 48 360 48 2059 0.2 2 1746 73 240 49 1990 0.2 2 1459 42 489 50 1356 0.2 2 1014 48 294 51 847 0.2 2 151 32 664 52 409 0.2 2 76 14 319 53 876 0.2 2 64 55 757 54 544 0.2 2 79 15 450 55 412 0.2 2 94 20 298 56 922 0.2 2 171 31 720 57 506 0.2 2 124 13 
369 58 553 0.2 2 72 26 455 59 612 0.2 2 32 36 544 60 445 0.2 2 19 13 413 61 574 0.2 2 45 17 512 62 320 0.2 2 17 15 288 63 812 0.2 2 4 19 789 64 505 0.2 2 5 9 491 65 390 0.2 2 4 18 368 66 400 0.2 2 2 10 388 67 280 0.2 2 5 26 249 68 1512 0.2 2 0 66 1446 69 339 0.2 2 1 20 318 70 595 0.2 2 0 9 586 71 160 0.2 2 0 3 157 72 165 0.2 2 0 7 158 73 667 0.2 2 0 27 640 74 584 0.2 2 0 7 577 75 262 0.2 2 0 18 244 76 297 0.2 2 0 5 292 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_2_S23_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_2_S23_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_2_S23_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_2_S23_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_2_S23_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_2_S23_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_2_S23_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_2_S23_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 533.00 s (176 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 3,028,087 Read 1 with adapter: 1,233,731 (40.7%) Read 2 with adapter: 1,226,357 (40.5%) Pairs that were too short: 2,777 (0.1%) Pairs written (passing filters): 3,025,310 (99.9%) Total basepairs processed: 460,269,224 bp Read 1: 230,134,612 bp Read 2: 230,134,612 bp Total written (filtered): 406,641,404 bp (88.3%) Read 1: 203,266,521 bp Read 2: 203,374,883 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1233731 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.8% C: 37.8% G: 24.7% T: 23.6% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 44983 47313.9 0 44983 4 47855 11828.5 0 29513 18342 5 50904 2957.1 1 22883 28021 6 35926 739.3 1 23480 12446 7 26276 184.8 1 23126 3150 8 28315 46.2 1 24262 3718 335 9 27232 11.6 1 25115 550 1567 10 25801 2.9 2 24294 434 1073 11 25613 0.7 2 24341 447 825 12 25295 0.2 2 24641 334 320 13 26149 0.2 2 25547 317 285 14 27174 0.2 2 26531 338 305 15 28334 0.2 2 27777 319 238 16 28287 0.2 2 27611 351 325 17 28763 0.2 2 28159 314 290 18 28708 0.2 2 28091 324 293 19 30524 0.2 2 29733 345 446 20 31602 0.2 2 31021 321 260 21 31048 0.2 2 30390 336 322 22 30376 0.2 2 29734 330 312 23 28883 0.2 2 28218 308 357 24 32716 0.2 2 32091 318 307 25 38225 0.2 2 37625 376 224 26 38226 0.2 2 37601 370 255 27 32869 0.2 2 32314 284 271 28 31710 0.2 2 31146 285 279 29 34280 0.2 2 33678 307 295 30 36535 0.2 2 35977 302 256 31 36426 0.2 2 35891 311 224 32 35547 0.2 2 34934 365 248 33 32430 0.2 2 31757 314 359 34 30675 0.2 2 30112 257 306 35 39883 0.2 2 39354 324 205 36 41537 0.2 2 40930 343 264 37 27899 0.2 2 27384 259 256 38 20592 0.2 2 20203 186 203 39 15419 0.2 2 14974 148 297 40 11813 0.2 2 11449 107 257 41 6591 0.2 2 6264 75 252 42 3611 0.2 2 3065 54 492 43 2364 0.2 2 2051 32 281 44 1677 0.2 2 1396 19 262 45 1908 0.2 2 1669 37 202 46 5163 0.2 2 4831 60 272 47 3560 0.2 2 3264 31 265 48 1861 0.2 2 1665 45 151 49 1582 0.2 2 1230 24 328 50 1038 0.2 2 846 22 170 51 575 0.2 2 109 15 451 52 301 0.2 2 66 9 226 53 613 0.2 2 63 33 517 54 352 0.2 2 69 12 271 55 325 0.2 2 56 16 253 56 541 0.2 2 112 11 418 57 366 0.2 2 97 6 263 58 379 0.2 2 42 8 329 59 416 0.2 2 28 22 366 60 309 0.2 2 22 6 281 61 338 0.2 2 24 11 303 62 217 0.2 2 22 7 188 63 506 0.2 2 11 13 482 64 387 0.2 2 7 11 369 65 252 0.2 2 0 18 234 66 305 0.2 2 2 3 300 67 192 0.2 2 4 11 177 68 1047 0.2 2 0 34 1013 69 302 0.2 2 0 11 291 70 
378 0.2 2 0 4 374 71 98 0.2 2 0 0 98 72 134 0.2 2 0 1 133 73 447 0.2 2 0 10 437 74 399 0.2 2 0 9 390 75 184 0.2 2 0 14 170 76 183 0.2 2 0 4 179 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1226357 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.2% C: 36.9% G: 25.0% T: 23.9% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 44408 47313.9 0 44408 4 47241 11828.5 0 29174 18067 5 49928 2957.1 1 21742 28186 6 35187 739.3 1 22731 12456 7 26026 184.8 1 22126 3900 8 28018 46.2 1 23892 3755 371 9 27078 11.6 1 24774 729 1575 10 25673 2.9 2 23734 766 1173 11 25441 0.7 2 23876 675 890 12 25126 0.2 2 24249 497 380 13 25983 0.2 2 25197 477 309 14 26993 0.2 2 26165 503 325 15 28219 0.2 2 27415 509 295 16 28131 0.2 2 27281 486 364 17 28613 0.2 2 27775 467 371 18 28537 0.2 2 27757 468 312 19 30381 0.2 2 29393 521 467 20 31463 0.2 2 30649 504 310 21 30868 0.2 2 30051 458 359 22 30233 0.2 2 29364 518 351 23 28785 0.2 2 27897 454 434 24 32551 0.2 2 31714 486 351 25 38097 0.2 2 37250 533 314 26 38088 0.2 2 37231 527 330 27 32769 0.2 2 32006 455 308 28 31601 0.2 2 30855 419 327 29 34159 0.2 2 33362 449 348 30 36412 0.2 2 35612 504 296 31 36342 0.2 2 35583 451 308 32 35401 0.2 2 34609 499 293 33 32366 0.2 2 31435 477 454 34 30580 0.2 2 29830 427 323 35 39759 0.2 2 38927 555 277 36 41413 0.2 2 40547 533 333 37 27826 0.2 2 27145 382 299 38 20536 0.2 2 19992 279 265 39 15397 0.2 2 14835 211 351 40 11799 0.2 2 11312 184 303 41 6606 0.2 2 6221 83 302 42 3618 0.2 2 3013 67 538 43 2396 0.2 2 2042 43 311 44 1671 0.2 2 1376 40 255 45 1930 0.2 2 1663 22 245 46 5146 0.2 2 4798 81 267 47 3521 0.2 2 3233 44 244 48 1877 0.2 2 1636 65 176 49 1588 0.2 2 1207 39 342 50 1046 0.2 2 840 28 178 51 534 0.2 2 103 15 416 52 307 0.2 2 63 13 231 53 595 0.2 2 58 34 503 54 365 0.2 2 65 10 290 55 284 0.2 2 51 19 214 56 551 0.2 2 102 17 432 57 377 0.2 2 93 10 274 58 380 0.2 
2 37 16 327 59 390 0.2 2 26 22 342 60 265 0.2 2 20 7 238 61 378 0.2 2 18 17 343 62 236 0.2 2 19 9 208 63 500 0.2 2 9 13 478 64 365 0.2 2 6 10 349 65 253 0.2 2 0 19 234 66 315 0.2 2 2 4 309 67 185 0.2 2 4 17 164 68 1106 0.2 2 0 51 1055 69 243 0.2 2 0 20 223 70 369 0.2 2 0 5 364 71 101 0.2 2 0 0 101 72 132 0.2 2 0 3 129 73 461 0.2 2 0 12 449 74 435 0.2 2 0 4 431 75 215 0.2 2 0 18 197 76 188 0.2 2 0 5 183 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_300_S3_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_300_S3_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_300_S3_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_300_S3_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_300_S3_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_300_S3_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_300_S3_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_300_S3_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 949.85 s (180 us/read; 0.33 M reads/minute). === Summary === Total read pairs processed: 5,271,099 Read 1 with adapter: 1,921,344 (36.5%) Read 2 with adapter: 1,911,038 (36.3%) Pairs that were too short: 5,445 (0.1%) Pairs written (passing filters): 5,265,654 (99.9%) Total basepairs processed: 801,207,048 bp Read 1: 400,603,524 bp Read 2: 400,603,524 bp Total written (filtered): 716,533,393 bp (89.4%) Read 1: 358,180,437 bp Read 2: 358,352,956 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1921344 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.3% G: 24.3% T: 24.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 75301 82360.9 0 75301 4 80025 20590.2 0 46890 33135 5 85265 5147.6 1 34215 51050 6 57407 1286.9 1 35173 22234 7 39865 321.7 1 34394 5471 8 43117 80.4 1 35914 6597 606 9 40483 20.1 1 36636 996 2851 10 38684 5.0 2 35969 682 2033 11 39251 1.3 2 36940 781 1530 12 38616 0.3 2 37571 560 485 13 39370 0.3 2 38238 543 589 14 40790 0.3 2 39779 532 479 15 41676 0.3 2 40698 504 474 16 40273 0.3 2 39221 502 550 17 42581 0.3 2 41556 516 509 18 42031 0.3 2 40986 519 526 19 44960 0.3 2 43686 492 782 20 45537 0.3 2 44550 512 475 21 45454 0.3 2 44349 605 500 22 44943 0.3 2 43860 517 566 23 43194 0.3 2 42152 447 595 24 48695 0.3 2 47656 522 517 25 55982 0.3 2 54978 575 429 26 54642 0.3 2 53585 589 468 27 48188 0.3 2 47264 508 416 28 47243 0.3 2 46230 504 509 29 51220 0.3 2 50181 540 499 30 54571 0.3 2 53551 577 443 31 53394 0.3 2 52421 534 439 32 53168 0.3 2 52163 557 448 33 50372 0.3 2 49197 478 697 34 49725 0.3 2 48678 500 547 35 65234 0.3 2 64253 616 365 36 69134 0.3 2 67977 683 474 37 48468 0.3 2 47528 498 442 38 38438 0.3 2 37702 367 369 39 29226 0.3 2 28364 263 599 40 22050 0.3 2 21395 197 458 41 11944 0.3 2 11336 123 485 42 6593 0.3 2 5570 86 937 43 4048 0.3 2 3515 42 491 44 2828 0.3 2 2307 34 487 45 3310 0.3 2 2919 49 342 46 8838 0.3 2 8174 91 573 47 6358 0.3 2 5769 92 497 48 3886 0.3 2 3490 68 328 49 3287 0.3 2 2663 45 579 50 2239 0.3 2 1846 33 360 51 1169 0.3 2 176 31 962 52 551 0.3 2 79 18 454 53 1233 0.3 2 51 34 1148 54 768 0.3 2 75 19 674 55 485 0.3 2 73 16 396 56 1272 0.3 2 142 29 1101 57 683 0.3 2 153 14 516 58 734 0.3 2 50 21 663 59 718 0.3 2 41 22 655 60 621 0.3 2 21 10 590 61 790 0.3 2 22 17 751 62 409 0.3 2 13 7 389 63 1169 0.3 2 7 19 1143 64 756 0.3 2 2 13 741 65 574 0.3 2 2 25 547 66 571 0.3 2 1 8 562 67 368 0.3 2 7 34 327 68 2295 0.3 2 0 87 2208 69 562 
0.3 2 0 37 525 70 821 0.3 2 0 8 813 71 229 0.3 2 0 1 228 72 279 0.3 2 0 7 272 73 840 0.3 2 0 34 806 74 745 0.3 2 0 7 738 75 444 0.3 2 0 33 411 76 354 0.3 2 1 8 345 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1911038 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.7% C: 35.9% G: 24.8% T: 24.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 75031 82360.9 0 75031 4 78717 20590.2 0 46238 32479 5 83544 5147.6 1 32965 50579 6 56731 1286.9 1 34428 22303 7 39528 321.7 1 33282 6246 8 42373 80.4 1 35550 6206 617 9 40423 20.1 1 36395 983 3045 10 38513 5.0 2 35420 996 2097 11 39024 1.3 2 36566 917 1541 12 38463 0.3 2 37221 671 571 13 39150 0.3 2 37931 630 589 14 40658 0.3 2 39391 687 580 15 41456 0.3 2 40315 639 502 16 40115 0.3 2 38891 623 601 17 42372 0.3 2 41166 647 559 18 41849 0.3 2 40631 632 586 19 44716 0.3 2 43290 650 776 20 45418 0.3 2 44171 682 565 21 45325 0.3 2 44070 681 574 22 44766 0.3 2 43514 610 642 23 43050 0.3 2 41830 591 629 24 48538 0.3 2 47323 647 568 25 55825 0.3 2 54556 760 509 26 54481 0.3 2 53232 718 531 27 48056 0.3 2 46908 648 500 28 47103 0.3 2 45976 586 541 29 51081 0.3 2 49912 622 547 30 54424 0.3 2 53206 726 492 31 53235 0.3 2 52049 679 507 32 52993 0.3 2 51778 704 511 33 50273 0.3 2 48802 676 795 34 49649 0.3 2 48352 649 648 35 65053 0.3 2 63826 788 439 36 68980 0.3 2 67590 806 584 37 48382 0.3 2 47231 640 511 38 38383 0.3 2 37460 456 467 39 29147 0.3 2 28152 359 636 40 22009 0.3 2 21252 267 490 41 11943 0.3 2 11231 172 540 42 6549 0.3 2 5535 90 924 43 4099 0.3 2 3476 62 561 44 2803 0.3 2 2297 51 455 45 3360 0.3 2 2902 60 398 46 8764 0.3 2 8127 122 515 47 6273 0.3 2 5742 80 451 48 3888 0.3 2 3457 92 339 49 3289 0.3 2 2643 56 590 50 2233 0.3 2 1834 37 362 51 1080 0.3 2 171 27 882 52 536 0.3 2 74 20 442 53 1185 0.3 2 49 42 1094 54 720 0.3 2 69 16 635 55 477 0.3 2 82 13 382 56 1249 0.3 2 136 36 1077 57 
647 0.3 2 148 10 489 58 687 0.3 2 48 23 616 59 635 0.3 2 38 27 570 60 618 0.3 2 16 18 584 61 793 0.3 2 18 23 752 62 444 0.3 2 11 13 420 63 1175 0.3 2 5 25 1145 64 681 0.3 2 1 12 668 65 505 0.3 2 0 23 482 66 525 0.3 2 1 11 513 67 409 0.3 2 6 29 374 68 2202 0.3 2 0 61 2141 69 531 0.3 2 0 29 502 70 921 0.3 2 0 13 908 71 197 0.3 2 0 1 196 72 271 0.3 2 0 2 269 73 851 0.3 2 0 45 806 74 828 0.3 2 0 5 823 75 439 0.3 2 0 34 405 76 397 0.3 2 1 5 391 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_3_S24_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_3_S24_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_3_S24_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_3_S24_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_3_S24_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_3_S24_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_3_S24_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_3_S24_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 484.94 s (183 us/read; 0.33 M reads/minute). === Summary === Total read pairs processed: 2,648,492 Read 1 with adapter: 875,320 (33.0%) Read 2 with adapter: 869,705 (32.8%) Pairs that were too short: 2,586 (0.1%) Pairs written (passing filters): 2,645,906 (99.9%) Total basepairs processed: 402,570,784 bp Read 1: 201,285,392 bp Read 2: 201,285,392 bp Total written (filtered): 366,393,470 bp (91.0%) Read 1: 183,169,459 bp Read 2: 183,224,011 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 875320 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.9% C: 38.0% G: 25.1% T: 23.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 40306 41382.7 0 40306 4 43071 10345.7 0 24437 18634 5 45564 2586.4 1 18133 27431 6 29882 646.6 1 18072 11810 7 20753 161.7 1 17829 2924 8 21901 40.4 1 18068 3543 290 9 20357 10.1 1 18321 454 1582 10 19336 2.5 2 17992 387 957 11 19380 0.6 2 18252 336 792 12 18753 0.2 2 18249 258 246 13 19445 0.2 2 18925 254 266 14 19517 0.2 2 19027 238 252 15 20112 0.2 2 19649 252 211 16 19773 0.2 2 19288 211 274 17 19908 0.2 2 19414 230 264 18 20117 0.2 2 19640 227 250 19 21030 0.2 2 20464 218 348 20 21553 0.2 2 21047 245 261 21 21370 0.2 2 20861 251 258 22 20711 0.2 2 20184 225 302 23 19974 0.2 2 19455 196 323 24 21704 0.2 2 21223 203 278 25 24505 0.2 2 24097 216 192 26 24575 0.2 2 24130 210 235 27 21427 0.2 2 21012 200 215 28 20510 0.2 2 20089 174 247 29 21445 0.2 2 21011 179 255 30 22978 0.2 2 22587 167 224 31 23330 0.2 2 22939 216 175 32 22871 0.2 2 22460 221 190 33 21019 0.2 2 20464 182 373 34 19972 0.2 2 19533 148 291 35 24949 0.2 2 24578 188 183 36 26251 0.2 2 25829 207 215 37 18187 0.2 2 17815 168 204 38 13719 0.2 2 13406 110 203 39 10446 0.2 2 10059 108 279 40 8083 0.2 2 7797 67 219 41 4491 0.2 2 4191 58 242 42 2589 0.2 2 2049 45 495 43 1531 0.2 2 1237 20 274 44 971 0.2 2 736 21 214 45 1058 0.2 2 834 17 207 46 2605 0.2 2 2323 31 251 47 1884 0.2 2 1633 24 227 48 1005 0.2 2 805 35 165 49 952 0.2 2 611 17 324 50 666 0.2 2 489 12 165 51 517 0.2 2 65 19 433 52 288 0.2 2 47 15 226 53 574 0.2 2 42 33 499 54 318 0.2 2 45 10 263 55 262 0.2 2 60 7 195 56 482 0.2 2 80 23 379 57 323 0.2 2 70 10 243 58 364 0.2 2 30 15 319 59 390 0.2 2 20 12 358 60 300 0.2 2 9 9 282 61 302 0.2 2 30 8 264 62 222 0.2 2 20 5 197 63 468 0.2 2 4 16 448 64 352 0.2 2 2 5 345 65 251 0.2 2 2 21 228 66 251 0.2 2 1 7 243 67 190 0.2 2 2 13 175 68 1001 0.2 2 1 44 956 69 268 0.2 2 0 19 249 70 273 0.2 2 0 4 
269 71 93 0.2 2 0 0 93 72 122 0.2 2 0 6 116 73 409 0.2 2 0 18 391 74 397 0.2 2 0 11 386 75 190 0.2 2 0 17 173 76 177 0.2 2 0 5 172 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 869705 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.4% C: 37.3% G: 25.3% T: 23.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 39911 41382.7 0 39911 4 41929 10345.7 0 24239 17690 5 44550 2586.4 1 17354 27196 6 29312 646.6 1 17555 11757 7 20514 161.7 1 17260 3254 8 21588 40.4 1 17853 3384 351 9 20224 10.1 1 18109 535 1580 10 19295 2.5 2 17630 549 1116 11 19276 0.6 2 17931 515 830 12 18710 0.2 2 17999 356 355 13 19345 0.2 2 18700 368 277 14 19462 0.2 2 18846 327 289 15 20026 0.2 2 19417 346 263 16 19693 0.2 2 19071 319 303 17 19827 0.2 2 19221 322 284 18 20043 0.2 2 19420 307 316 19 20962 0.2 2 20269 312 381 20 21444 0.2 2 20851 334 259 21 21273 0.2 2 20629 351 293 22 20599 0.2 2 19971 334 294 23 19926 0.2 2 19274 296 356 24 21601 0.2 2 20989 347 265 25 24448 0.2 2 23848 357 243 26 24487 0.2 2 23897 308 282 27 21356 0.2 2 20848 288 220 28 20433 0.2 2 19879 281 273 29 21379 0.2 2 20849 280 250 30 22916 0.2 2 22363 314 239 31 23302 0.2 2 22779 286 237 32 22822 0.2 2 22212 348 262 33 20975 0.2 2 20281 268 426 34 19917 0.2 2 19369 226 322 35 24921 0.2 2 24338 343 240 36 26179 0.2 2 25557 341 281 37 18136 0.2 2 17662 242 232 38 13679 0.2 2 13276 203 200 39 10475 0.2 2 9980 152 343 40 8104 0.2 2 7725 113 266 41 4475 0.2 2 4154 69 252 42 2565 0.2 2 2025 62 478 43 1553 0.2 2 1216 34 303 44 991 0.2 2 729 21 241 45 1024 0.2 2 829 24 171 46 2612 0.2 2 2297 52 263 47 1913 0.2 2 1619 25 269 48 1006 0.2 2 798 46 162 49 960 0.2 2 607 26 327 50 692 0.2 2 485 18 189 51 575 0.2 2 71 22 482 52 295 0.2 2 47 14 234 53 528 0.2 2 39 38 451 54 324 0.2 2 43 19 262 55 309 0.2 2 56 14 239 56 452 0.2 2 77 20 355 57 329 0.2 2 66 12 251 58 345 0.2 2 27 25 293 59 401 0.2 2 
16 16 369 60 263 0.2 2 7 9 247 61 280 0.2 2 25 10 245 62 235 0.2 2 18 7 210 63 502 0.2 2 4 14 484 64 356 0.2 2 0 13 343 65 237 0.2 2 2 11 224 66 301 0.2 2 1 7 293 67 202 0.2 2 1 16 185 68 973 0.2 2 0 43 930 69 249 0.2 2 0 24 225 70 307 0.2 2 0 6 301 71 120 0.2 2 0 3 117 72 114 0.2 2 0 0 114 73 423 0.2 2 0 20 403 74 389 0.2 2 0 7 382 75 188 0.2 2 0 14 174 76 178 0.2 2 0 3 175 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_800_S9_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_800_S9_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_800_S9_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_800_S9_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_800_S9_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Ct_800_S9_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_800_S9_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Ct_800_S9_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 676.99 s (184 us/read; 0.33 M reads/minute). === Summary === Total read pairs processed: 3,676,166 Read 1 with adapter: 1,333,350 (36.3%) Read 2 with adapter: 1,326,286 (36.1%) Pairs that were too short: 3,868 (0.1%) Pairs written (passing filters): 3,672,298 (99.9%) Total basepairs processed: 558,777,232 bp Read 1: 279,388,616 bp Read 2: 279,388,616 bp Total written (filtered): 499,010,029 bp (89.3%) Read 1: 249,445,043 bp Read 2: 249,564,986 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1333350 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.2% C: 37.3% G: 24.3% T: 24.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 51504 57440.1 0 51504 4 54364 14360.0 0 31336 23028 5 58203 3590.0 1 23181 35022 6 38503 897.5 1 22975 15528 7 26441 224.4 1 22465 3976 8 27253 56.1 1 23440 3355 458 9 27314 14.0 1 24542 686 2086 10 25965 3.5 2 24217 449 1299 11 26702 0.9 2 25113 480 1109 12 26122 0.2 2 25388 326 408 13 27043 0.2 2 26313 319 411 14 27669 0.2 2 26936 347 386 15 28381 0.2 2 27723 331 327 16 27208 0.2 2 26468 339 401 17 28262 0.2 2 27605 285 372 18 28199 0.2 2 27530 281 388 19 30214 0.2 2 29359 307 548 20 31247 0.2 2 30556 310 381 21 31500 0.2 2 30764 362 374 22 31026 0.2 2 30350 307 369 23 29618 0.2 2 28883 282 453 24 33877 0.2 2 33139 339 399 25 39604 0.2 2 38923 363 318 26 38408 0.2 2 37689 352 367 27 32640 0.2 2 32080 307 253 28 31534 0.2 2 30883 277 374 29 34771 0.2 2 34073 316 382 30 37541 0.2 2 36872 309 360 31 37474 0.2 2 36822 344 308 32 37240 0.2 2 36536 373 331 33 34871 0.2 2 34087 271 513 34 35330 0.2 2 34586 296 448 35 49405 0.2 2 48700 412 293 36 52654 0.2 2 51826 440 388 37 34841 0.2 2 34244 284 313 38 26942 0.2 2 26439 205 298 39 21398 0.2 2 20732 189 477 40 16376 0.2 2 15904 127 345 41 9055 0.2 2 8609 80 366 42 4955 0.2 2 4164 70 721 43 3054 0.2 2 2615 37 402 44 2148 0.2 2 1733 30 385 45 2613 0.2 2 2287 39 287 46 7640 0.2 2 7189 77 374 47 5305 0.2 2 4944 60 301 48 2755 0.2 2 2459 44 252 49 2458 0.2 2 1921 29 508 50 1655 0.2 2 1375 20 260 51 866 0.2 2 145 22 699 52 387 0.2 2 63 12 312 53 890 0.2 2 37 51 802 54 469 0.2 2 48 11 410 55 389 0.2 2 46 18 325 56 872 0.2 2 99 28 745 57 588 0.2 2 177 10 401 58 523 0.2 2 62 19 442 59 583 0.2 2 18 17 548 60 458 0.2 2 16 9 433 61 530 0.2 2 24 16 490 62 345 0.2 2 10 8 327 63 761 0.2 2 4 11 746 64 550 0.2 2 0 15 535 65 402 0.2 2 0 25 377 66 465 0.2 2 0 4 461 67 271 0.2 2 1 21 249 68 1682 0.2 2 0 76 1606 69 408 0.2 2 1 22 
385
70 529 0.2 2 0 6 523
71 179 0.2 2 0 3 176
72 197 0.2 2 0 2 195
73 561 0.2 2 0 15 546
74 623 0.2 2 0 10 613
75 277 0.2 2 0 23 254
76 263 0.2 2 3 2 258

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1326286 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.7%
  C: 35.8%
  G: 24.9%
  T: 24.6%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 52084 57440.1 0 52084
4 53384 14360.0 0 30668 22716
5 56578 3590.0 1 22077 34501
6 37778 897.5 1 22360 15418
7 26140 224.4 1 21605 4535
8 26988 56.1 1 23168 3350 470
9 27145 14.0 1 24310 756 2079
10 25939 3.5 2 23678 746 1515
11 26532 0.9 2 24848 567 1117
12 25989 0.2 2 25055 458 476
13 26912 0.2 2 25973 459 480
14 27518 0.2 2 26593 474 451
15 28273 0.2 2 27438 430 405
16 27131 0.2 2 26156 478 497
17 28115 0.2 2 27246 423 446
18 28079 0.2 2 27207 425 447
19 30042 0.2 2 29075 430 537
20 31133 0.2 2 30241 463 429
21 31378 0.2 2 30469 462 447
22 30962 0.2 2 30078 411 473
23 29549 0.2 2 28613 402 534
24 33769 0.2 2 32853 448 468
25 39467 0.2 2 38602 511 354
26 38293 0.2 2 37321 543 429
27 32566 0.2 2 31775 436 355
28 31448 0.2 2 30589 401 458
29 34633 0.2 2 33795 462 376
30 37427 0.2 2 36569 443 415
31 37420 0.2 2 36536 452 432
32 37124 0.2 2 36217 510 397
33 34848 0.2 2 33816 412 620
34 35226 0.2 2 34303 434 489
35 49288 0.2 2 48281 605 402
36 52490 0.2 2 51412 649 429
37 34736 0.2 2 33967 399 370
38 26851 0.2 2 26200 316 335
39 21289 0.2 2 20570 241 478
40 16359 0.2 2 15749 214 396
41 9046 0.2 2 8533 126 387
42 5009 0.2 2 4118 99 792
43 3092 0.2 2 2586 48 458
44 2134 0.2 2 1709 37 388
45 2666 0.2 2 2267 39 360
46 7644 0.2 2 7143 93 408
47 5317 0.2 2 4913 71 333
48 2764 0.2 2 2431 61 272
49 2403 0.2 2 1893 40 470
50 1664 0.2 2 1353 37 274
51 793 0.2 2 141 23 629
52 408 0.2 2 61 11 336
53 809 0.2 2 43 45 721
54 462 0.2 2 45 18 399
55 376 0.2 2 45 16 315
56 771 0.2 2 97 23 651
57 556 0.2 2 165 15
376
58 504 0.2 2 59 22 423
59 530 0.2 2 16 26 488
60 440 0.2 2 13 13 414
61 548 0.2 2 16 14 518
62 326 0.2 2 7 11 308
63 732 0.2 2 3 13 716
64 563 0.2 2 1 7 555
65 367 0.2 2 0 22 345
66 489 0.2 2 0 8 481
67 311 0.2 2 1 25 285
68 1613 0.2 2 1 63 1549
69 417 0.2 2 1 27 389
70 547 0.2 2 0 9 538
71 150 0.2 2 0 0 150
72 187 0.2 2 0 9 178
73 604 0.2 2 0 18 586
74 583 0.2 2 0 12 571
75 303 0.2 2 0 24 279
76 275 0.2 2 3 4 268

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_1_S16_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_1_S16_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_1_S16_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_1_S16_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_1_S16_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_1_S16_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_1_S16_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_1_S16_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 455.11 s (172 us/read; 0.35 M reads/minute).

=== Summary ===
Total read pairs processed: 2,639,705
  Read 1 with adapter: 906,543 (34.3%)
  Read 2 with adapter: 902,814 (34.2%)
Pairs that were too short: 2,510 (0.1%)
Pairs written (passing filters): 2,637,195 (99.9%)
Total basepairs processed: 401,235,160 bp
  Read 1: 200,617,580 bp
  Read 2: 200,617,580 bp
Total written (filtered): 363,682,498 bp (90.6%)
  Read 1: 181,815,218 bp
  Read 2: 181,867,280 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 906543 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.2%
  C: 37.8%
  G: 25.0%
  T: 22.9%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 43250 41245.4 0 43250
4 46310 10311.3 0 27211 19099
5 47390 2577.8 1 20188 27202
6 31669 644.5 1 20068 11601
7 22425 161.1 1 19266 3159
8 23754 40.3 1 19793 3686 275
9 21662 10.1 1 19633 505 1524
10 20605 2.5 2 19247 357 1001
11 20430 0.6 2 19289 345 796
12 19526 0.2 2 19010 261 255
13 20308 0.2 2 19783 252 273
14 20255 0.2 2 19821 203 231
15 21505 0.2 2 21034 258 213
16 20602 0.2 2 20114 234 254
17 20625 0.2 2 20109 232 284
18 20083 0.2 2 19674 212 197
19 20630 0.2 2 20102 191 337
20 21493 0.2 2 21067 196 230
21 20783 0.2 2 20265 228 290
22 19854 0.2 2 19394 169 291
23 18960 0.2 2 18510 173 277
24 21265 0.2 2 20876 190 199
25 24783 0.2 2 24400 211 172
26 24140 0.2 2 23696 214 230
27 20513 0.2 2 20103 201 209
28 19580 0.2 2 19193 173 214
29 20326 0.2 2 19900 189 237
30 21455 0.2 2 21023 206 226
31 21400 0.2 2 21003 205 192
32 20706 0.2 2 20320 178 208
33 19526 0.2 2 19017 170 339
34 19595 0.2 2 19106 147 342
35 27082 0.2 2 26700 236 146
36 29657 0.2 2 29214 234 209
37 20532 0.2 2 20123 168 241
38 16015 0.2 2 15692 145 178
39 13634 0.2 2 13219 109 306
40 11178 0.2 2 10860 82 236
41 6331 0.2 2 6024 59 248
42 3537 0.2 2 2927 56 554
43 2062 0.2 2 1761 33 268
44 1224 0.2 2 994 18 212
45 1249 0.2 2 1055 29 165
46 3649 0.2 2 3316 39 294
47 2592 0.2 2 2300 31 261
48 1442 0.2 2 1252 37 153
49 1481 0.2 2 1095 24 362
50 930 0.2 2 789 14 127
51 529 0.2 2 79 9 441
52 281 0.2 2 37 14 230
53 552 0.2 2 25 37 490
54 291 0.2 2 22 10 259
55 262 0.2 2 25 10 227
56 416 0.2 2 57 10 349
57 342 0.2 2 57 12 273
58 311 0.2 2 19 7 285
59 434 0.2 2 12 17 405
60 263 0.2 2 7 4 252
61 354 0.2 2 17 14 323
62 223 0.2 2 8 5 210
63 435 0.2 2 7 15 413
64 345 0.2 2 4 11 330
65 236 0.2 2 1 20 215
66 295 0.2 2 0 8 287
67 204 0.2 2 1 22 181
68 923 0.2 2 0 52 871
69 251 0.2 2 0 22 229
70 289 0.2
2 0 3 286
71 97 0.2 2 0 0 97
72 101 0.2 2 0 2 99
73 380 0.2 2 0 17 363
74 408 0.2 2 0 6 402
75 152 0.2 2 0 5 147
76 166 0.2 2 0 3 163

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 902814 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.5%
  C: 37.1%
  G: 25.2%
  T: 23.2%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 43102 41245.4 0 43102
4 45560 10311.3 0 26775 18785
5 47076 2577.8 1 19578 27498
6 31457 644.5 1 19661 11796
7 22127 161.1 1 18656 3471
8 23597 40.3 1 19605 3674 318
9 21500 10.1 1 19441 547 1512
10 20575 2.5 2 18946 538 1091
11 20340 0.6 2 19079 430 831
12 19475 0.2 2 18849 315 311
13 20193 0.2 2 19582 294 317
14 20175 0.2 2 19640 272 263
15 21394 0.2 2 20864 302 228
16 20534 0.2 2 19911 306 317
17 20504 0.2 2 19944 284 276
18 20015 0.2 2 19506 265 244
19 20529 0.2 2 19916 275 338
20 21473 0.2 2 20910 285 278
21 20665 0.2 2 20126 272 267
22 19806 0.2 2 19232 244 330
23 18888 0.2 2 18384 217 287
24 21238 0.2 2 20715 246 277
25 24717 0.2 2 24184 336 197
26 24121 0.2 2 23515 324 282
27 20460 0.2 2 19947 290 223
28 19540 0.2 2 19070 231 239
29 20260 0.2 2 19763 243 254
30 21384 0.2 2 20924 238 222
31 21400 0.2 2 20906 236 258
32 20654 0.2 2 20185 257 212
33 19476 0.2 2 18853 220 403
34 19552 0.2 2 18954 226 372
35 27065 0.2 2 26547 276 242
36 29595 0.2 2 29059 287 249
37 20458 0.2 2 20008 230 220
38 15991 0.2 2 15601 196 194
39 13608 0.2 2 13130 148 330
40 11189 0.2 2 10762 151 276
41 6361 0.2 2 6011 64 286
42 3492 0.2 2 2904 52 536
43 2088 0.2 2 1754 39 295
44 1271 0.2 2 984 23 264
45 1281 0.2 2 1044 30 207
46 3617 0.2 2 3310 37 270
47 2576 0.2 2 2277 31 268
48 1486 0.2 2 1241 65 180
49 1449 0.2 2 1085 34 330
50 989 0.2 2 788 16 185
51 513 0.2 2 74 16 423
52 259 0.2 2 39 12 208
53 574 0.2 2 24 45 505
54 295 0.2 2 24 11 260
55 245 0.2 2 21 12 212
56 410 0.2 2 50 12 348
57 304 0.2 2 55 5 244
58 337 0.2 2 14 14 309
59
426 0.2 2 12 21 393
60 252 0.2 2 8 4 240
61 268 0.2 2 14 8 246
62 206 0.2 2 6 6 194
63 429 0.2 2 7 11 411
64 345 0.2 2 2 5 338
65 240 0.2 2 1 17 222
66 276 0.2 2 0 3 273
67 219 0.2 2 1 36 182
68 962 0.2 2 0 71 891
69 264 0.2 2 0 43 221
70 274 0.2 2 0 1 273
71 110 0.2 2 0 0 110
72 96 0.2 2 0 2 94
73 446 0.2 2 0 23 423
74 395 0.2 2 0 9 386
75 168 0.2 2 0 11 157
76 198 0.2 2 0 3 195

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_2_S17_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_2_S17_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_2_S17_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_2_S17_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_2_S17_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_2_S17_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_2_S17_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_2_S17_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 930.21 s (182 us/read; 0.33 M reads/minute).

=== Summary ===
Total read pairs processed: 5,115,781
  Read 1 with adapter: 1,383,851 (27.1%)
  Read 2 with adapter: 1,378,107 (26.9%)
Pairs that were too short: 5,556 (0.1%)
Pairs written (passing filters): 5,110,225 (99.9%)
Total basepairs processed: 777,598,712 bp
  Read 1: 388,799,356 bp
  Read 2: 388,799,356 bp
Total written (filtered): 724,625,587 bp (93.2%)
  Read 1: 362,285,202 bp
  Read 2: 362,340,385 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1383851 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.8%
  C: 37.0%
  G: 24.7%
  T: 23.4%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 80813 79934.1 0 80813
4 86055 19983.5 0 46787 39268
5 90258 4995.9 1 32710 57548
6 56657 1249.0 1 32361 24296
7 37438 312.2 1 31379 6059
8 40191 78.1 1 32610 6966 615
9 36058 19.5 1 31905 940 3213
10 33905 4.9 2 31078 652 2175
11 33529 1.2 2 31282 551 1696
12 31526 0.3 2 30586 409 531
13 32257 0.3 2 31354 387 516
14 32040 0.3 2 31189 338 513
15 32596 0.3 2 31796 368 432
16 32383 0.3 2 31506 312 565
17 33332 0.3 2 32428 346 558
18 32173 0.3 2 31381 325 467
19 33041 0.3 2 31993 329 719
20 33498 0.3 2 32625 338 535
21 32062 0.3 2 31127 385 550
22 30303 0.3 2 29440 284 579
23 28977 0.3 2 28103 294 580
24 31073 0.3 2 30237 319 517
25 34688 0.3 2 33970 322 396
26 34446 0.3 2 33622 355 469
27 30798 0.3 2 30068 287 443
28 29536 0.3 2 28760 270 506
29 30216 0.3 2 29440 292 484
30 30703 0.3 2 30001 278 424
31 30085 0.3 2 29360 290 435
32 28741 0.3 2 28000 320 421
33 26194 0.3 2 25178 242 774
34 24659 0.3 2 23807 213 639
35 30006 0.3 2 29387 274 345
36 31621 0.3 2 30909 279 433
37 23274 0.3 2 22619 220 435
38 18841 0.3 2 18307 169 365
39 14476 0.3 2 13687 130 659
40 10937 0.3 2 10268 120 549
41 5999 0.3 2 5462 62 475
42 3842 0.3 2 2720 78 1044
43 2211 0.3 2 1577 63 571
44 1629 0.3 2 1033 60 536
45 1463 0.3 2 1037 43 383
46 3497 0.3 2 2879 41 577
47 2698 0.3 2 2205 38 455
48 1770 0.3 2 1354 71 345
49 1801 0.3 2 1115 30 656
50 1276 0.3 2 895 42 339
51 1080 0.3 2 88 34 958
52 479 0.3 2 55 9 415
53 1190 0.3 2 35 59 1096
54 662 0.3 2 38 18 606
55 542 0.3 2 62 24 456
56 976 0.3 2 86 27 863
57 621 0.3 2 64 20 537
58 680 0.3 2 24 21 635
59 816 0.3 2 13 29 774
60 537 0.3 2 7 7 523
61 662 0.3 2 27 20 615
62 461 0.3 2 26 11 424
63 982 0.3 2 6 22 954
64 740 0.3 2 1 18 721
65 550 0.3 2 2 23 525
66 612 0.3 2 0 8 604
67 435 0.3 2 7 38 390
68 2056 0.3 2 1 79 1976
69 560 0.3 2 0 34
526
70 666 0.3 2 0 13 653
71 248 0.3 2 0 1 247
72 238 0.3 2 0 5 233
73 904 0.3 2 0 42 862
74 926 0.3 2 0 15 911
75 319 0.3 2 0 12 307
76 337 0.3 2 0 3 334

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1378107 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 15.1%
  C: 36.2%
  G: 25.0%
  T: 23.6%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 80404 79934.1 0 80404
4 85061 19983.5 0 46021 39040
5 89207 4995.9 1 31722 57485
6 56039 1249.0 1 31563 24476
7 37122 312.2 1 30400 6722
8 39549 78.1 1 32214 6653 682
9 36035 19.5 1 31637 1051 3347
10 33856 4.9 2 30709 799 2348
11 33428 1.2 2 30923 719 1786
12 31443 0.3 2 30307 490 646
13 32168 0.3 2 31099 466 603
14 31892 0.3 2 30932 457 503
15 32460 0.3 2 31512 463 485
16 32305 0.3 2 31278 431 596
17 33190 0.3 2 32140 451 599
18 32051 0.3 2 31115 451 485
19 32941 0.3 2 31754 414 773
20 33350 0.3 2 32315 505 530
21 31992 0.3 2 30852 499 641
22 30220 0.3 2 29223 410 587
23 28938 0.3 2 27920 389 629
24 30978 0.3 2 30045 402 531
25 34637 0.3 2 33738 419 480
26 34385 0.3 2 33421 447 517
27 30699 0.3 2 29875 387 437
28 29445 0.3 2 28546 362 537
29 30187 0.3 2 29258 388 541
30 30680 0.3 2 29762 393 525
31 30045 0.3 2 29153 419 473
32 28739 0.3 2 27818 443 478
33 26171 0.3 2 25042 317 812
34 24671 0.3 2 23669 300 702
35 29975 0.3 2 29224 331 420
36 31610 0.3 2 30710 392 508
37 23227 0.3 2 22481 290 456
38 18805 0.3 2 18180 223 402
39 14471 0.3 2 13591 200 680
40 10896 0.3 2 10189 147 560
41 6170 0.3 2 5445 84 641
42 3878 0.3 2 2707 84 1087
43 2266 0.3 2 1567 76 623
44 1591 0.3 2 1035 42 514
45 1474 0.3 2 1041 43 390
46 3482 0.3 2 2852 65 565
47 2721 0.3 2 2197 41 483
48 1784 0.3 2 1336 72 376
49 1834 0.3 2 1090 54 690
50 1332 0.3 2 888 43 401
51 1059 0.3 2 90 24 945
52 540 0.3 2 51 36 453
53 1184 0.3 2 35 58 1091
54 646 0.3 2 34 23 589
55 520 0.3 2 54 24 442
56 999 0.3 2 71 38 890
57 615 0.3 2 56 15
544
58 664 0.3 2 22 29 613
59 832 0.3 2 13 46 773
60 543 0.3 2 7 7 529
61 698 0.3 2 23 22 653
62 455 0.3 2 22 14 419
63 984 0.3 2 3 16 965
64 708 0.3 2 1 16 691
65 550 0.3 2 2 19 529
66 551 0.3 2 0 6 545
67 445 0.3 2 5 47 393
68 1999 0.3 2 1 87 1911
69 563 0.3 2 0 54 509
70 665 0.3 2 0 9 656
71 250 0.3 2 0 1 249
72 264 0.3 2 0 5 259
73 911 0.3 2 0 35 876
74 882 0.3 2 0 11 871
75 398 0.3 2 0 14 384
76 378 0.3 2 0 9 369

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_300_S1_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_300_S1_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_300_S1_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_300_S1_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_300_S1_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_300_S1_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_300_S1_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_300_S1_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 960.71 s (187 us/read; 0.32 M reads/minute).

=== Summary ===
Total read pairs processed: 5,143,802
  Read 1 with adapter: 1,948,373 (37.9%)
  Read 2 with adapter: 1,937,390 (37.7%)
Pairs that were too short: 5,379 (0.1%)
Pairs written (passing filters): 5,138,423 (99.9%)
Total basepairs processed: 781,857,904 bp
  Read 1: 390,928,952 bp
  Read 2: 390,928,952 bp
Total written (filtered): 694,856,152 bp (88.9%)
  Read 1: 347,366,805 bp
  Read 2: 347,489,347 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1948373 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.4%
  C: 37.1%
  G: 24.4%
  T: 24.0%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 74173 80371.9 0 74173
4 77314 20093.0 0 46007 31307
5 83035 5023.2 1 33788 49247
6 56136 1255.8 1 34461 21675
7 38943 314.0 1 33166 5777
8 41351 78.5 1 34686 6078 587
9 40312 19.6 1 36318 1070 2924
10 38312 4.9 2 35457 868 1987
11 38950 1.2 2 36261 1103 1586
12 38387 0.3 2 37083 763 541
13 39750 0.3 2 38374 818 558
14 42055 0.3 2 40748 760 547
15 42861 0.3 2 41591 782 488
16 40610 0.3 2 39253 745 612
17 41247 0.3 2 40036 689 522
18 42038 0.3 2 40739 723 576
19 44742 0.3 2 43201 779 762
20 46361 0.3 2 45058 764 539
21 45502 0.3 2 44170 740 592
22 44983 0.3 2 43752 681 550
23 44284 0.3 2 42981 690 613
24 51635 0.3 2 50336 752 547
25 59775 0.3 2 58426 893 456
26 56541 0.3 2 55188 830 523
27 48037 0.3 2 46913 655 469
28 46730 0.3 2 45512 708 510
29 51337 0.3 2 50062 752 523
30 55404 0.3 2 54218 698 488
31 54832 0.3 2 53608 739 485
32 53892 0.3 2 52725 728 439
33 51394 0.3 2 49945 690 759
34 52783 0.3 2 51376 710 697
35 73283 0.3 2 71879 926 478
36 75409 0.3 2 73879 1007 523
37 50323 0.3 2 49225 675 423
38 37875 0.3 2 36958 479 438
39 29295 0.3 2 28251 407 637
40 22294 0.3 2 21490 293 511
41 12101 0.3 2 11445 173 483
42 6720 0.3 2 5606 97 1017
43 3988 0.3 2 3411 73 504
44 2903 0.3 2 2390 35 478
45 3788 0.3 2 3348 65 375
46 11114 0.3 2 10403 158 553
47 7538 0.3 2 6997 84 457
48 3901 0.3 2 3489 90 322
49 3353 0.3 2 2641 49 663
50 2142 0.3 2 1768 30 344
51 1145 0.3 2 123 17 1005
52 498 0.3 2 66 6 426
53 1296 0.3 2 56 46 1194
54 656 0.3 2 42 17 597
55 465 0.3 2 54 14 397
56 1171 0.3 2 111 17 1043
57 610 0.3 2 137 9 464
58 694 0.3 2 47 14 633
59 727 0.3 2 21 23 683
60 555 0.3 2 5 5 545
61 732 0.3 2 13 13 706
62 417 0.3 2 5 8 404
63 1090 0.3 2 6 23 1061
64 716 0.3 2 3 17 696
65 541 0.3 2 0 27 514
66 544 0.3 2 1 8 535
67 350 0.3 2 1 26 323
68 2197 0.3 2 0 58 2139
69
584 0.3 2 0 37 547
70 827 0.3 2 0 8 819
71 227 0.3 2 0 2 225
72 257 0.3 2 0 10 247
73 840 0.3 2 0 29 811
74 808 0.3 2 0 9 799
75 368 0.3 2 0 18 350
76 325 0.3 2 5 2 318

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1937390 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.8%
  C: 36.2%
  G: 24.7%
  T: 24.4%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 69636 80371.9 0 69636
4 76732 20093.0 0 45660 31072
5 81156 5023.2 1 31102 50054
6 55842 1255.8 1 33757 22085
7 38606 314.0 1 30811 7795
8 40895 78.5 1 34468 5616 811
9 40426 19.6 1 36194 1145 3087
10 38332 4.9 2 35455 773 2104
11 38969 1.2 2 35760 1456 1753
12 38294 0.3 2 37031 680 583
13 39614 0.3 2 38377 680 557
14 41909 0.3 2 40620 733 556
15 42677 0.3 2 41530 699 448
16 40483 0.3 2 39190 687 606
17 41136 0.3 2 39936 678 522
18 41942 0.3 2 40709 682 551
19 44604 0.3 2 43199 714 691
20 46247 0.3 2 44979 770 498
21 45401 0.3 2 44116 719 566
22 44876 0.3 2 43591 699 586
23 44237 0.3 2 42941 620 676
24 51564 0.3 2 50250 744 570
25 59673 0.3 2 58405 814 454
26 56339 0.3 2 55069 808 462
27 47917 0.3 2 46855 620 442
28 46706 0.3 2 45474 654 578
29 51238 0.3 2 50074 643 521
30 55318 0.3 2 54157 673 488
31 54754 0.3 2 53523 747 484
32 53827 0.3 2 52692 675 460
33 51353 0.3 2 49887 657 809
34 52654 0.3 2 51324 691 639
35 73133 0.3 2 71826 870 437
36 75260 0.3 2 73846 902 512
37 50278 0.3 2 49174 604 500
38 37822 0.3 2 36953 440 429
39 29267 0.3 2 28273 334 660
40 22280 0.3 2 21448 273 559
41 12083 0.3 2 11427 158 498
42 6793 0.3 2 5593 118 1082
43 4043 0.3 2 3408 69 566
44 2922 0.3 2 2389 35 498
45 3763 0.3 2 3354 63 346
46 11065 0.3 2 10382 140 543
47 7533 0.3 2 6967 94 472
48 3887 0.3 2 3477 77 333
49 3283 0.3 2 2638 38 607
50 2111 0.3 2 1775 24 312
51 1068 0.3 2 122 24 922
52 510 0.3 2 64 12 434
53 1236 0.3 2 52 71 1113
54 674 0.3 2 43 17 614
55 513 0.3 2 56 11 446
56 1160 0.3 2 113
22 1025
57 611 0.3 2 133 7 471
58 651 0.3 2 44 14 593
59 669 0.3 2 21 24 624
60 541 0.3 2 4 9 528
61 736 0.3 2 9 10 717
62 420 0.3 2 4 8 408
63 1046 0.3 2 0 23 1023
64 689 0.3 2 2 13 674
65 519 0.3 2 0 23 496
66 551 0.3 2 1 6 544
67 352 0.3 2 1 28 323
68 2130 0.3 2 0 80 2050
69 580 0.3 2 0 34 546
70 839 0.3 2 0 8 831
71 229 0.3 2 0 4 225
72 286 0.3 2 0 2 284
73 886 0.3 2 0 19 867
74 852 0.3 2 0 14 838
75 415 0.3 2 0 23 392
76 347 0.3 2 6 8 333

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_3_S18_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_3_S18_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_3_S18_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_3_S18_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_3_S18_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_3_S18_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_3_S18_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_3_S18_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 497.95 s (178 us/read; 0.34 M reads/minute).

=== Summary ===
Total read pairs processed: 2,802,596
  Read 1 with adapter: 1,021,983 (36.5%)
  Read 2 with adapter: 1,015,822 (36.2%)
Pairs that were too short: 2,608 (0.1%)
Pairs written (passing filters): 2,799,988 (99.9%)
Total basepairs processed: 425,994,592 bp
  Read 1: 212,997,296 bp
  Read 2: 212,997,296 bp
Total written (filtered): 384,200,096 bp (90.2%)
  Read 1: 192,056,852 bp
  Read 2: 192,143,244 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1021983 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.3%
  C: 37.2%
  G: 24.5%
  T: 24.0%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 42313 43790.6 0 42313
4 44986 10947.6 0 26976 18010
5 48778 2736.9 1 21197 27581
6 33536 684.2 1 21657 11879
7 24205 171.1 1 21225 2980
8 25440 42.8 1 21941 3161 338
9 24765 10.7 1 22673 486 1606
10 23648 2.7 2 22102 393 1153
11 23344 0.7 2 22094 424 826
12 22783 0.2 2 22224 295 264
13 23113 0.2 2 22588 241 284
14 23644 0.2 2 23062 287 295
15 24195 0.2 2 23714 227 254
16 24891 0.2 2 24303 276 312
17 25846 0.2 2 25280 262 304
18 25887 0.2 2 25336 260 291
19 27220 0.2 2 26556 274 390
20 27557 0.2 2 27030 267 260
21 26914 0.2 2 26366 257 291
22 26446 0.2 2 25896 239 311
23 24989 0.2 2 24439 217 333
24 27198 0.2 2 26677 229 292
25 31149 0.2 2 30657 268 224
26 31631 0.2 2 31106 272 253
27 28186 0.2 2 27709 246 231
28 27177 0.2 2 26669 233 275
29 28475 0.2 2 27977 235 263
30 30010 0.2 2 29496 266 248
31 28729 0.2 2 28252 270 207
32 27058 0.2 2 26591 234 233
33 24107 0.2 2 23543 197 367
34 21126 0.2 2 20681 181 264
35 25561 0.2 2 25136 200 225
36 25606 0.2 2 25144 221 241
37 17552 0.2 2 17165 155 232
38 12544 0.2 2 12259 95 190
39 8815 0.2 2 8450 79 286
40 6721 0.2 2 6395 63 263
41 3710 0.2 2 3425 44 241
42 2200 0.2 2 1737 40 423
43 1475 0.2 2 1178 20 277
44 1199 0.2 2 935 19 245
45 1220 0.2 2 974 26 220
46 2804 0.2 2 2508 32 264
47 1873 0.2 2 1616 20 237
48 1058 0.2 2 854 23 181
49 935 0.2 2 637 11 287
50 583 0.2 2 399 12 172
51 503 0.2 2 62 14 427
52 260 0.2 2 32 10 218
53 543 0.2 2 28 23 492
54 314 0.2 2 38 5 271
55 259 0.2 2 25 9 225
56 490 0.2 2 41 8 441
57 309 0.2 2 42 9 258
58 340 0.2 2 22 4 314
59 372 0.2 2 19 9 344
60 260 0.2 2 10 8 242
61 349 0.2 2 8 5 336
62 225 0.2 2 6 3 216
63 467 0.2 2 2 6 459
64 327 0.2 2 1 7 319
65 253 0.2 2 4 9 240
66 299 0.2 2 1 3 295
67 219 0.2 2 0 15 204
68 975 0.2 2 1 37 937
69 259 0.2 2 0 21 238
70 355 0.2 2 0 4 351
71 116
0.2 2 0 2 114
72 142 0.2 2 0 1 141
73 421 0.2 2 0 14 407
74 330 0.2 2 0 6 324
75 192 0.2 2 0 5 187
76 202 0.2 2 0 6 196

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1015822 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.6%
  C: 36.4%
  G: 24.8%
  T: 24.2%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 41830 43790.6 0 41830
4 44364 10947.6 0 26711 17653
5 47766 2736.9 1 20193 27573
6 33148 684.2 1 21051 12097
7 23956 171.1 1 20289 3667
8 25132 42.8 1 21681 3058 393
9 24668 10.7 1 22363 662 1643
10 23514 2.7 2 21782 504 1228
11 23143 0.7 2 21781 525 837
12 22659 0.2 2 21905 418 336
13 23010 0.2 2 22296 372 342
14 23474 0.2 2 22811 349 314
15 24062 0.2 2 23388 404 270
16 24756 0.2 2 24045 357 354
17 25754 0.2 2 25021 364 369
18 25761 0.2 2 25070 365 326
19 27105 0.2 2 26301 368 436
20 27425 0.2 2 26740 399 286
21 26831 0.2 2 26057 421 353
22 26337 0.2 2 25613 372 352
23 24893 0.2 2 24164 357 372
24 27075 0.2 2 26437 338 300
25 31028 0.2 2 30387 394 247
26 31528 0.2 2 30825 413 290
27 28127 0.2 2 27454 364 309
28 27085 0.2 2 26416 343 326
29 28369 0.2 2 27750 335 284
30 29893 0.2 2 29267 346 280
31 28677 0.2 2 28017 382 278
32 27014 0.2 2 26389 338 287
33 24030 0.2 2 23298 336 396
34 21087 0.2 2 20514 263 310
35 25453 0.2 2 24891 331 231
36 25557 0.2 2 24958 311 288
37 17538 0.2 2 17017 252 269
38 12543 0.2 2 12151 153 239
39 8773 0.2 2 8361 119 293
40 6714 0.2 2 6339 102 273
41 3735 0.2 2 3399 55 281
42 2212 0.2 2 1722 44 446
43 1487 0.2 2 1166 35 286
44 1150 0.2 2 920 23 207
45 1226 0.2 2 959 23 244
46 2802 0.2 2 2489 43 270
47 1865 0.2 2 1595 30 240
48 1073 0.2 2 843 47 183
49 975 0.2 2 631 21 323
50 625 0.2 2 402 15 208
51 471 0.2 2 65 17 389
52 245 0.2 2 37 10 198
53 464 0.2 2 29 19 416
54 330 0.2 2 35 17 278
55 243 0.2 2 24 8 211
56 465 0.2 2 38 15 412
57 284 0.2 2 36 9 239
58 313 0.2 2 20 8 285
59 352 0.2 2 13 13 326
60
311 0.2 2 9 4 298
61 308 0.2 2 4 8 296
62 238 0.2 2 5 11 222
63 450 0.2 2 1 7 442
64 322 0.2 2 1 7 314
65 271 0.2 2 2 12 257
66 276 0.2 2 1 6 269
67 223 0.2 2 0 15 208
68 1000 0.2 2 0 37 963
69 239 0.2 2 0 14 225
70 348 0.2 2 0 6 342
71 118 0.2 2 0 2 116
72 131 0.2 2 0 1 130
73 461 0.2 2 0 19 442
74 349 0.2 2 0 3 346
75 189 0.2 2 0 5 184
76 192 0.2 2 0 6 186

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_800_S7_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_800_S7_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_800_S7_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_800_S7_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_800_S7_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Cz_800_S7_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_800_S7_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Cz_800_S7_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 613.36 s (178 us/read; 0.34 M reads/minute).

=== Summary ===
Total read pairs processed: 3,437,528
  Read 1 with adapter: 1,239,506 (36.1%)
  Read 2 with adapter: 1,233,092 (35.9%)
Pairs that were too short: 3,758 (0.1%)
Pairs written (passing filters): 3,433,770 (99.9%)
Total basepairs processed: 522,504,256 bp
  Read 1: 261,252,128 bp
  Read 2: 261,252,128 bp
Total written (filtered): 467,013,125 bp (89.4%)
  Read 1: 233,457,084 bp
  Read 2: 233,556,041 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1239506 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.4%
  C: 37.1%
  G: 24.6%
  T: 23.9%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 49580 53711.4 0 49580
4 51645 13427.8 0 29625 22020
5 55208 3357.0 1 21816 33392
6 36203 839.2 1 21499 14704
7 24421 209.8 1 20756 3665
8 25833 52.5 1 22030 3370 433
9 25314 13.1 1 22732 612 1970
10 24268 3.3 2 22560 403 1305
11 24415 0.8 2 22879 415 1121
12 24062 0.2 2 23432 269 361
13 24643 0.2 2 23971 290 382
14 25805 0.2 2 25220 257 328
15 26638 0.2 2 26107 271 260
16 25307 0.2 2 24681 252 374
17 25833 0.2 2 25201 253 379
18 25806 0.2 2 25158 277 371
19 27682 0.2 2 26957 275 450
20 29115 0.2 2 28479 276 360
21 28474 0.2 2 27738 344 392
22 28423 0.2 2 27789 257 377
23 27477 0.2 2 26818 268 391
24 31243 0.2 2 30610 292 341
25 36125 0.2 2 35510 308 307
26 34739 0.2 2 34155 275 309
27 30203 0.2 2 29648 267 288
28 29192 0.2 2 28654 231 307
29 31774 0.2 2 31176 289 309
30 34150 0.2 2 33559 306 285
31 33915 0.2 2 33297 296 322
32 34511 0.2 2 33901 320 290
33 32424 0.2 2 31663 247 514
34 32977 0.2 2 32318 242 417
35 45898 0.2 2 45230 365 303
36 48485 0.2 2 47787 355 343
37 33518 0.2 2 32916 279 323
38 26250 0.2 2 25727 224 299
39 20353 0.2 2 19776 160 417
40 15927 0.2 2 15450 131 346
41 8748 0.2 2 8314 63 371
42 4934 0.2 2 4115 69 750
43 2940 0.2 2 2551 38 351
44 1973 0.2 2 1581 19 373
45 2220 0.2 2 1922 21 277
46 6663 0.2 2 6231 60 372
47 4686 0.2 2 4295 44 347
48 2578 0.2 2 2355 28 195
49 2364 0.2 2 1919 29 416
50 1529 0.2 2 1285 14 230
51 814 0.2 2 112 12 690
52 329 0.2 2 57 13 259
53 837 0.2 2 34 29 774
54 458 0.2 2 31 11 416
55 343 0.2 2 45 15 283
56 775 0.2 2 84 8 683
57 495 0.2 2 124 5 366
58 483 0.2 2 29 18 436
59 507 0.2 2 19 17 471
60 396 0.2 2 5 4 387
61 513 0.2 2 12 12 489
62 269 0.2 2 8 4 257
63 735 0.2 2 3 17 715
64 492 0.2 2 1 13 478
65 386 0.2 2 1 14 371
66 398 0.2 2 1 8 389
67 293 0.2 2 0 32 261
68 1513 0.2 2 0 55 1458
69 371 0.2 2 0 13 358
70
608 0.2 2 0 5 603
71 168 0.2 2 0 2 166
72 197 0.2 2 0 1 196
73 587 0.2 2 0 12 575
74 571 0.2 2 0 3 568
75 249 0.2 2 0 9 240
76 248 0.2 2 14 5 229

=== Second read: Adapter 2 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1233092 times.
No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2
Bases preceding removed adapters:
  A: 14.8%
  C: 36.0%
  G: 24.8%
  T: 24.4%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
3 48993 53711.4 0 48993
4 51061 13427.8 0 29286 21775
5 54058 3357.0 1 20890 33168
6 35806 839.2 1 21015 14791
7 24150 209.8 1 20020 4130
8 25514 52.5 1 21778 3312 424
9 25239 13.1 1 22478 726 2035
10 24146 3.3 2 22192 623 1331
11 24314 0.8 2 22551 586 1177
12 23922 0.2 2 23122 393 407
13 24514 0.2 2 23730 382 402
14 25667 0.2 2 24904 412 351
15 26540 0.2 2 25814 395 331
16 25191 0.2 2 24425 368 398
17 25710 0.2 2 24944 362 404
18 25671 0.2 2 24907 373 391
19 27586 0.2 2 26682 380 524
20 29019 0.2 2 28182 416 421
21 28312 0.2 2 27465 440 407
22 28373 0.2 2 27486 417 470
23 27392 0.2 2 26591 338 463
24 31108 0.2 2 30327 396 385
25 35996 0.2 2 35184 465 347
26 34676 0.2 2 33820 467 389
27 30132 0.2 2 29412 389 331
28 29171 0.2 2 28401 360 410
29 31777 0.2 2 30993 372 412
30 34087 0.2 2 33287 438 362
31 33828 0.2 2 33088 394 346
32 34454 0.2 2 33622 465 367
33 32375 0.2 2 31415 400 560
34 32889 0.2 2 32067 373 449
35 45769 0.2 2 44890 515 364
36 48373 0.2 2 47390 562 421
37 33451 0.2 2 32668 395 388
38 26167 0.2 2 25575 273 319
39 20312 0.2 2 19595 248 469
40 15878 0.2 2 15325 206 347
41 8731 0.2 2 8238 108 385
42 4909 0.2 2 4090 63 756
43 2965 0.2 2 2527 43 395
44 1939 0.2 2 1569 24 346
45 2244 0.2 2 1901 43 300
46 6648 0.2 2 6195 86 367
47 4651 0.2 2 4256 65 330
48 2628 0.2 2 2345 45 238
49 2374 0.2 2 1898 40 436
50 1524 0.2 2 1277 28 219
51 738 0.2 2 106 10 622
52 377 0.2 2 56 12 309
53 806 0.2 2 33 34 739
54 484 0.2 2 31 17 436
55 365 0.2 2 42 17 306
56 772 0.2 2 86 14 672
57 461 0.2 2 119 11 331
58 465
0.2 2 31 17 417
59 529 0.2 2 18 19 492
60 382 0.2 2 4 3 375
61 478 0.2 2 10 12 456
62 296 0.2 2 5 5 286
63 723 0.2 2 4 17 702
64 499 0.2 2 1 14 484
65 354 0.2 2 1 16 337
66 405 0.2 2 0 6 399
67 251 0.2 2 0 22 229
68 1526 0.2 2 0 45 1481
69 355 0.2 2 0 17 338
70 522 0.2 2 0 4 518
71 149 0.2 2 0 0 149
72 207 0.2 2 0 3 204
73 594 0.2 2 0 16 578
74 570 0.2 2 0 8 562
75 293 0.2 2 0 18 275
76 257 0.2 2 14 3 240

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S31_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S31_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S31_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S31_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S31_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S31_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S31_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S31_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 932.27 s (177 us/read; 0.34 M reads/minute).

=== Summary ===
Total read pairs processed: 5,276,594
  Read 1 with adapter: 1,948,307 (36.9%)
  Read 2 with adapter: 1,940,411 (36.8%)
Pairs that were too short: 5,268 (0.1%)
Pairs written (passing filters): 5,271,326 (99.9%)
Total basepairs processed: 802,042,288 bp
  Read 1: 401,021,144 bp
  Read 2: 401,021,144 bp
Total written (filtered): 719,914,465 bp (89.8%)
  Read 1: 359,882,902 bp
  Read 2: 360,031,563 bp

=== First read: Adapter 1 ===
Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1948307 times.
No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.0% C: 37.8% G: 24.9% T: 23.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 80276 82446.8 0 80276 4 85350 20611.7 0 50985 34365 5 90919 5152.9 1 39159 51760 6 62109 1288.2 1 39460 22649 7 44237 322.1 1 38546 5691 8 47229 80.5 1 39685 6996 548 9 44444 20.1 1 40628 988 2828 10 42057 5.0 2 39318 794 1945 11 42119 1.3 2 39780 790 1549 12 41593 0.3 2 40518 550 525 13 42695 0.3 2 41598 596 501 14 44422 0.3 2 43418 523 481 15 46201 0.3 2 45209 553 439 16 45136 0.3 2 44048 543 545 17 46540 0.3 2 45469 537 534 18 45664 0.3 2 44625 572 467 19 48447 0.3 2 47120 571 756 20 48806 0.3 2 47741 582 483 21 48602 0.3 2 47435 592 575 22 47412 0.3 2 46326 554 532 23 45251 0.3 2 44137 518 596 24 50503 0.3 2 49429 545 529 25 56810 0.3 2 55772 654 384 26 56867 0.3 2 55784 612 471 27 50704 0.3 2 49743 542 419 28 48452 0.3 2 47445 512 495 29 52191 0.3 2 51155 541 495 30 54412 0.3 2 53406 564 442 31 53764 0.3 2 52764 587 413 32 52712 0.3 2 51660 647 405 33 49155 0.3 2 47955 514 686 34 45990 0.3 2 44949 455 586 35 57416 0.3 2 56472 581 363 36 58925 0.3 2 57885 614 426 37 42180 0.3 2 41263 505 412 38 32360 0.3 2 31633 355 372 39 23287 0.3 2 22442 264 581 40 16872 0.3 2 16265 173 434 41 8818 0.3 2 8305 86 427 42 4984 0.3 2 4003 99 882 43 3252 0.3 2 2661 63 528 44 2188 0.3 2 1726 35 427 45 2424 0.3 2 2073 25 326 46 6312 0.3 2 5691 75 546 47 4432 0.3 2 3926 39 467 48 2566 0.3 2 2161 65 340 49 2294 0.3 2 1673 40 581 50 1437 0.3 2 1101 28 308 51 1045 0.3 2 91 13 941 52 457 0.3 2 44 12 401 53 1147 0.3 2 29 61 1057 54 613 0.3 2 38 16 559 55 490 0.3 2 52 11 427 56 958 0.3 2 69 18 871 57 592 0.3 2 87 15 490 58 661 0.3 2 32 28 601 59 798 0.3 2 14 32 752 60 549 0.3 2 12 9 528 61 643 0.3 2 14 18 611 62 408 0.3 2 10 10 388 63 998 0.3 2 1 28 969 64 711 0.3 2 2 15 694 65 518 0.3 2 0 39 479 66 532 0.3 2 1 10 521 67 362 0.3 2 3 36 323 68 1985 0.3 2 0 108 1877 69 533 0.3 2 0 
31 502 70 739 0.3 2 0 10 729 71 176 0.3 2 0 3 173 72 247 0.3 2 0 1 246 73 836 0.3 2 0 41 795 74 767 0.3 2 0 10 757 75 417 0.3 2 0 36 381 76 309 0.3 2 0 9 300 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1940411 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.4% C: 36.9% G: 25.1% T: 23.6% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 79856 82446.8 0 79856 4 84629 20611.7 0 50681 33948 5 90012 5152.9 1 37835 52177 6 61915 1288.2 1 38585 23330 7 43910 322.1 1 37380 6530 8 46659 80.5 1 39278 6747 634 9 44444 20.1 1 40326 1157 2961 10 41828 5.0 2 38654 1128 2046 11 41964 1.3 2 39280 1038 1646 12 41450 0.3 2 40044 762 644 13 42544 0.3 2 41227 722 595 14 44239 0.3 2 42963 712 564 15 45980 0.3 2 44744 736 500 16 44986 0.3 2 43622 721 643 17 46352 0.3 2 45041 706 605 18 45545 0.3 2 44255 681 609 19 48234 0.3 2 46643 783 808 20 48656 0.3 2 47336 752 568 21 48414 0.3 2 47028 791 595 22 47226 0.3 2 45941 709 576 23 45108 0.3 2 43847 592 669 24 50355 0.3 2 49003 718 634 25 56625 0.3 2 55336 792 497 26 56736 0.3 2 55341 850 545 27 50528 0.3 2 49348 686 494 28 48364 0.3 2 47030 711 623 29 52084 0.3 2 50802 696 586 30 54235 0.3 2 52954 767 514 31 53648 0.3 2 52365 776 507 32 52568 0.3 2 51324 778 466 33 49050 0.3 2 47558 703 789 34 45900 0.3 2 44560 663 677 35 57263 0.3 2 56057 755 451 36 58841 0.3 2 57469 805 567 37 42016 0.3 2 40957 634 425 38 32231 0.3 2 31337 478 416 39 23180 0.3 2 22248 361 571 40 16902 0.3 2 16113 270 519 41 8912 0.3 2 8234 130 548 42 5095 0.3 2 3980 108 1007 43 3251 0.3 2 2638 69 544 44 2237 0.3 2 1714 46 477 45 2459 0.3 2 2057 48 354 46 6260 0.3 2 5655 105 500 47 4432 0.3 2 3894 69 469 48 2582 0.3 2 2146 84 352 49 2363 0.3 2 1655 60 648 50 1428 0.3 2 1090 31 307 51 984 0.3 2 99 21 864 52 448 0.3 2 46 14 388 53 1123 0.3 2 25 66 1032 54 608 0.3 2 33 19 556 55 462 0.3 2 45 17 400 56 911 0.3 2 63 21 827 57 543 0.3 
2 84 11 448 58 648 0.3 2 28 26 594 59 752 0.3 2 14 26 712 60 480 0.3 2 11 8 461 61 635 0.3 2 10 19 606 62 365 0.3 2 4 10 351 63 911 0.3 2 0 25 886 64 668 0.3 2 2 16 650 65 449 0.3 2 0 15 434 66 505 0.3 2 1 13 491 67 384 0.3 2 2 42 340 68 1898 0.3 2 0 98 1800 69 530 0.3 2 0 46 484 70 691 0.3 2 0 11 680 71 228 0.3 2 0 2 226 72 233 0.3 2 0 4 229 73 856 0.3 2 0 36 820 74 828 0.3 2 0 7 821 75 418 0.3 2 0 26 392 76 357 0.3 2 0 8 349 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S6_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S6_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S6_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S6_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S6_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_1_S6_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S6_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_1_S6_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 901.61 s (183 us/read; 0.33 M reads/minute). === Summary === Total read pairs processed: 4,938,374 Read 1 with adapter: 1,890,185 (38.3%) Read 2 with adapter: 1,876,554 (38.0%) Pairs that were too short: 5,214 (0.1%) Pairs written (passing filters): 4,933,160 (99.9%) Total basepairs processed: 750,632,848 bp Read 1: 375,316,424 bp Read 2: 375,316,424 bp Total written (filtered): 666,453,921 bp (88.8%) Read 1: 333,149,336 bp Read 2: 333,304,585 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1890185 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.5% C: 36.7% G: 24.2% T: 24.6% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 70159 77162.1 0 70159 4 73576 19290.5 0 44139 29437 5 79165 4822.6 1 33422 45743 6 53558 1205.7 1 33475 20083 7 39189 301.4 1 33320 5869 8 40329 75.4 1 35065 4643 621 9 39776 18.8 1 36178 1015 2583 10 38837 4.7 2 36073 781 1983 11 38799 1.2 2 36283 914 1602 12 38050 0.3 2 36826 646 578 13 38364 0.3 2 37105 633 626 14 39875 0.3 2 38693 654 528 15 41153 0.3 2 39987 658 508 16 40230 0.3 2 38983 616 631 17 41542 0.3 2 40395 571 576 18 41595 0.3 2 40455 598 542 19 44121 0.3 2 42770 594 757 20 46346 0.3 2 45098 686 562 21 45920 0.3 2 44678 636 606 22 44770 0.3 2 43612 608 550 23 42908 0.3 2 41740 520 648 24 48660 0.3 2 47468 621 571 25 56574 0.3 2 55395 712 467 26 55749 0.3 2 54508 716 525 27 48101 0.3 2 47085 548 468 28 46668 0.3 2 45572 550 546 29 51192 0.3 2 50048 593 551 30 54981 0.3 2 53813 667 501 31 54154 0.3 2 53054 626 474 32 52209 0.3 2 51113 594 502 33 48241 0.3 2 46909 551 781 34 48070 0.3 2 46987 507 576 35 65951 0.3 2 64817 738 396 36 70858 0.3 2 69536 784 538 37 48391 0.3 2 47394 528 469 38 37189 0.3 2 36326 409 454 39 29818 0.3 2 28827 332 659 40 22617 0.3 2 21890 265 462 41 12459 0.3 2 11792 149 518 42 6461 0.3 2 5469 103 889 43 3971 0.3 2 3391 64 516 44 2765 0.3 2 2199 51 515 45 3319 0.3 2 2873 58 388 46 9272 0.3 2 8635 110 527 47 6617 0.3 2 6066 87 464 48 3764 0.3 2 3325 73 366 49 3395 0.3 2 2727 51 617 50 2297 0.3 2 1899 36 362 51 1077 0.3 2 152 32 893 52 583 0.3 2 77 14 492 53 1133 0.3 2 40 48 1045 54 660 0.3 2 62 21 577 55 528 0.3 2 83 21 424 56 1133 0.3 2 144 23 966 57 664 0.3 2 156 18 490 58 654 0.3 2 61 29 564 59 689 0.3 2 23 21 645 60 563 0.3 2 10 13 540 61 669 0.3 2 13 22 634 62 394 0.3 2 7 8 379 63 943 0.3 2 6 29 908 64 723 0.3 2 2 17 704 65 575 0.3 2 1 35 539 66 627 0.3 2 3 13 611 67 428 0.3 2 2 30 396 68 2139 0.3 2 0 79 2060 69 514 
0.3 2 0 17 497 70 724 0.3 2 0 15 709 71 219 0.3 2 0 3 216 72 261 0.3 2 0 4 257 73 857 0.3 2 0 34 823 74 687 0.3 2 0 11 676 75 387 0.3 2 0 25 362 76 349 0.3 2 4 6 339 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1876554 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.9% C: 35.6% G: 24.4% T: 25.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 66221 77162.1 0 66221 4 72340 19290.5 0 43574 28766 5 77060 4822.6 1 30435 46625 6 52675 1205.7 1 32570 20105 7 38519 301.4 1 30423 8096 8 40121 75.4 1 34723 4577 821 9 39840 18.8 1 35842 1204 2794 10 38671 4.7 2 35823 751 2097 11 38533 1.2 2 35656 1281 1596 12 37906 0.3 2 36499 738 669 13 38179 0.3 2 36756 771 652 14 39687 0.3 2 38431 705 551 15 40982 0.3 2 39700 760 522 16 40085 0.3 2 38697 745 643 17 41335 0.3 2 40073 705 557 18 41521 0.3 2 40183 689 649 19 43970 0.3 2 42449 757 764 20 46163 0.3 2 44775 819 569 21 45723 0.3 2 44304 794 625 22 44636 0.3 2 43262 741 633 23 42750 0.3 2 41382 675 693 24 48498 0.3 2 47100 763 635 25 56372 0.3 2 55029 853 490 26 55547 0.3 2 54143 843 561 27 48032 0.3 2 46758 725 549 28 46527 0.3 2 45223 705 599 29 51027 0.3 2 49757 732 538 30 54810 0.3 2 53445 836 529 31 54048 0.3 2 52724 774 550 32 52066 0.3 2 50763 768 535 33 48086 0.3 2 46655 671 760 34 47922 0.3 2 46697 652 573 35 65852 0.3 2 64438 896 518 36 70709 0.3 2 69123 998 588 37 48282 0.3 2 47077 689 516 38 37110 0.3 2 36144 479 487 39 29698 0.3 2 28658 391 649 40 22618 0.3 2 21764 308 546 41 12406 0.3 2 11722 168 516 42 6491 0.3 2 5464 103 924 43 3956 0.3 2 3374 75 507 44 2734 0.3 2 2201 55 478 45 3350 0.3 2 2871 55 424 46 9294 0.3 2 8583 135 576 47 6667 0.3 2 6029 102 536 48 3759 0.3 2 3302 94 363 49 3362 0.3 2 2697 58 607 50 2295 0.3 2 1880 45 370 51 1014 0.3 2 147 23 844 52 544 0.3 2 80 13 451 53 1076 0.3 2 48 40 988 54 636 0.3 2 61 21 554 55 530 0.3 2 77 22 431 56 1098 0.3 2 141 26 
931 57 731 0.3 2 157 13 561 58 645 0.3 2 60 27 558 59 619 0.3 2 20 21 578 60 560 0.3 2 7 10 543 61 696 0.3 2 12 14 670 62 428 0.3 2 5 7 416 63 933 0.3 2 4 17 912 64 697 0.3 2 1 15 681 65 496 0.3 2 1 23 472 66 645 0.3 2 3 9 633 67 388 0.3 2 0 30 358 68 2155 0.3 2 0 80 2075 69 513 0.3 2 0 40 473 70 799 0.3 2 0 10 789 71 237 0.3 2 0 0 237 72 310 0.3 2 0 5 305 73 855 0.3 2 0 36 819 74 759 0.3 2 0 11 748 75 377 0.3 2 0 30 347 76 378 0.3 2 4 5 369 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S12_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S12_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S12_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S12_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S12_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S12_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S12_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S12_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 671.30 s (179 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 3,751,289 Read 1 with adapter: 1,309,625 (34.9%) Read 2 with adapter: 1,301,429 (34.7%) Pairs that were too short: 4,083 (0.1%) Pairs written (passing filters): 3,747,206 (99.9%) Total basepairs processed: 570,195,928 bp Read 1: 285,097,964 bp Read 2: 285,097,964 bp Total written (filtered): 512,933,036 bp (90.0%) Read 1: 256,406,643 bp Read 2: 256,526,393 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1309625 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.4% C: 37.0% G: 24.4% T: 24.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 54026 58613.9 0 54026 4 57031 14653.5 0 32732 24299 5 60411 3663.4 1 23833 36578 6 40174 915.8 1 23841 16333 7 27094 229.0 1 23103 3991 8 28242 57.2 1 24046 3788 408 9 27919 14.3 1 25108 665 2146 10 26172 3.6 2 24272 467 1433 11 26725 0.9 2 25174 431 1120 12 26938 0.2 2 26206 339 393 13 27130 0.2 2 26381 313 436 14 28404 0.2 2 27703 322 379 15 28797 0.2 2 28154 303 340 16 27401 0.2 2 26645 302 454 17 28800 0.2 2 28145 268 387 18 28222 0.2 2 27498 298 426 19 30259 0.2 2 29436 287 536 20 31484 0.2 2 30800 308 376 21 30930 0.2 2 30220 312 398 22 31048 0.2 2 30343 295 410 23 29628 0.2 2 28873 309 446 24 33558 0.2 2 32847 302 409 25 38011 0.2 2 37379 334 298 26 37024 0.2 2 36286 355 383 27 31813 0.2 2 31287 239 287 28 31010 0.2 2 30350 298 362 29 33633 0.2 2 32959 292 382 30 36311 0.2 2 35690 327 294 31 36243 0.2 2 35590 328 325 32 35977 0.2 2 35311 326 340 33 33957 0.2 2 33102 276 579 34 33744 0.2 2 33060 281 403 35 44693 0.2 2 44052 365 276 36 46535 0.2 2 45789 361 385 37 32452 0.2 2 31886 259 307 38 25681 0.2 2 25174 214 293 39 19427 0.2 2 18774 159 494 40 14212 0.2 2 13740 138 334 41 7937 0.2 2 7484 80 373 42 4369 0.2 2 3553 65 751 43 2444 0.2 2 2043 45 356 44 1790 0.2 2 1410 20 360 45 2097 0.2 2 1762 32 303 46 5681 0.2 2 5265 54 362 47 4027 0.2 2 3636 49 342 48 2373 0.2 2 2064 39 270 49 2030 0.2 2 1539 23 468 50 1338 0.2 2 1083 20 235 51 843 0.2 2 92 14 737 52 398 0.2 2 57 13 328 53 962 0.2 2 38 34 890 54 524 0.2 2 43 11 470 55 410 0.2 2 56 21 333 56 868 0.2 2 113 16 739 57 536 0.2 2 148 10 378 58 526 0.2 2 60 26 440 59 605 0.2 2 14 20 571 60 454 0.2 2 11 7 436 61 593 0.2 2 14 20 559 62 314 0.2 2 7 8 299 63 819 0.2 2 6 20 793 64 555 0.2 2 2 11 542 65 381 0.2 2 1 21 359 66 456 0.2 2 2 4 450 67 282 0.2 2 2 34 246 68 1668 0.2 2 1 91 1576 69 436 0.2 2 0 29 407 
70 597 0.2 2 0 8 589 71 184 0.2 2 0 5 179 72 187 0.2 2 0 5 182 73 651 0.2 2 0 33 618 74 633 0.2 2 0 9 624 75 300 0.2 2 0 27 273 76 241 0.2 2 10 4 227 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1301429 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.9% C: 35.9% G: 24.7% T: 24.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 53418 58613.9 0 53418 4 55982 14653.5 0 32196 23786 5 59035 3663.4 1 22605 36430 6 39549 915.8 1 23169 16380 7 26868 229.0 1 22021 4847 8 27840 57.2 1 23664 3662 514 9 27722 14.3 1 24775 762 2185 10 26041 3.6 2 23881 636 1524 11 26569 0.9 2 24730 684 1155 12 26813 0.2 2 25893 471 449 13 26967 0.2 2 26029 471 467 14 28234 0.2 2 27357 457 420 15 28624 0.2 2 27762 465 397 16 27224 0.2 2 26353 424 447 17 28693 0.2 2 27748 474 471 18 28074 0.2 2 27141 449 484 19 30151 0.2 2 29145 432 574 20 31351 0.2 2 30426 454 471 21 30841 0.2 2 29888 510 443 22 30943 0.2 2 30035 449 459 23 29514 0.2 2 28616 429 469 24 33395 0.2 2 32450 490 455 25 37893 0.2 2 37006 494 393 26 36839 0.2 2 35936 482 421 27 31743 0.2 2 30988 396 359 28 30899 0.2 2 30110 401 388 29 33559 0.2 2 32667 445 447 30 36253 0.2 2 35354 485 414 31 36170 0.2 2 35315 468 387 32 35821 0.2 2 34945 502 374 33 33792 0.2 2 32793 428 571 34 33693 0.2 2 32800 419 474 35 44562 0.2 2 43626 591 345 36 46356 0.2 2 45333 616 407 37 32396 0.2 2 31569 445 382 38 25635 0.2 2 24951 342 342 39 19375 0.2 2 18617 243 515 40 14206 0.2 2 13629 200 377 41 7963 0.2 2 7418 110 435 42 4378 0.2 2 3541 64 773 43 2486 0.2 2 2015 62 409 44 1807 0.2 2 1402 30 375 45 2112 0.2 2 1759 35 318 46 5728 0.2 2 5206 87 435 47 4012 0.2 2 3606 57 349 48 2355 0.2 2 2035 57 263 49 2045 0.2 2 1522 35 488 50 1360 0.2 2 1071 25 264 51 820 0.2 2 92 18 710 52 441 0.2 2 50 17 374 53 880 0.2 2 39 39 802 54 509 0.2 2 40 19 450 55 404 0.2 2 56 18 330 56 802 0.2 2 109 16 677 57 545 0.2 2 143 9 393 58 
511 0.2 2 55 28 428 59 579 0.2 2 16 23 540 60 430 0.2 2 8 9 413 61 555 0.2 2 10 17 528 62 329 0.2 2 4 5 320 63 725 0.2 2 7 10 708 64 586 0.2 2 2 12 572 65 387 0.2 2 1 21 365 66 464 0.2 2 2 5 457 67 329 0.2 2 2 26 301 68 1600 0.2 2 1 72 1527 69 411 0.2 2 0 27 384 70 573 0.2 2 0 7 566 71 179 0.2 2 0 1 178 72 185 0.2 2 0 0 185 73 671 0.2 2 0 22 649 74 642 0.2 2 0 14 628 75 290 0.2 2 0 23 267 76 296 0.2 2 11 4 281 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S32_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S32_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S32_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S32_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S32_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/DMSO_2_S32_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S32_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/DMSO_2_S32_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 1017.87 s (177 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 5,736,117 Read 1 with adapter: 2,035,434 (35.5%) Read 2 with adapter: 2,027,208 (35.3%) Pairs that were too short: 5,637 (0.1%) Pairs written (passing filters): 5,730,480 (99.9%) Total basepairs processed: 871,889,784 bp Read 1: 435,944,892 bp Read 2: 435,944,892 bp Total written (filtered): 787,651,289 bp (90.3%) Read 1: 393,764,726 bp Read 2: 393,886,563 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2035434 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.5% G: 24.7% T: 23.4% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 89437 89626.8 0 89437 4 94997 22406.7 0 56358 38639 5 99767 5601.7 1 42305 57462 6 67408 1400.4 1 42495 24913 7 47931 350.1 1 41587 6344 8 50533 87.5 1 42750 7179 604 9 47670 21.9 1 43327 1055 3288 10 45633 5.5 2 42683 814 2136 11 45197 1.4 2 42715 797 1685 12 44000 0.3 2 42788 609 603 13 45151 0.3 2 43966 590 595 14 47258 0.3 2 46152 586 520 15 48946 0.3 2 47929 563 454 16 47193 0.3 2 46086 511 596 17 48433 0.3 2 47265 583 585 18 47679 0.3 2 46629 523 527 19 49772 0.3 2 48441 553 778 20 51060 0.3 2 49989 575 496 21 50043 0.3 2 48915 562 566 22 48660 0.3 2 47522 534 604 23 46665 0.3 2 45579 442 644 24 53287 0.3 2 52165 608 514 25 62315 0.3 2 61257 618 440 26 61479 0.3 2 60321 630 528 27 52877 0.3 2 51888 501 488 28 49553 0.3 2 48554 484 515 29 52575 0.3 2 51579 488 508 30 54905 0.3 2 53953 521 431 31 54352 0.3 2 53390 522 440 32 52212 0.3 2 51218 529 465 33 47898 0.3 2 46703 435 760 34 45605 0.3 2 44657 383 565 35 59459 0.3 2 58478 555 426 36 61636 0.3 2 60583 558 495 37 40511 0.3 2 39718 385 408 38 29151 0.3 2 28527 253 371 39 20974 0.3 2 20172 204 598 40 15307 0.3 2 14677 140 490 41 8163 0.3 2 7564 87 512 42 4775 0.3 2 3623 94 1058 43 2951 0.3 2 2319 56 576 44 2165 0.3 2 1666 32 467 45 2535 0.3 2 2092 46 397 46 7087 0.3 2 6442 87 558 47 4787 0.3 2 4248 56 483 48 2433 0.3 2 2004 66 363 49 2244 0.3 2 1500 38 706 50 1368 0.3 2 1013 35 320 51 1157 0.3 2 123 28 1006 52 528 0.3 2 80 25 423 53 1283 0.3 2 66 57 1160 54 741 0.3 2 65 28 648 55 552 0.3 2 80 19 453 56 1146 0.3 2 127 28 991 57 673 0.3 2 121 19 533 58 804 0.3 2 60 32 712 59 847 0.3 2 29 36 782 60 622 0.3 2 21 14 587 61 743 0.3 2 28 34 681 62 457 0.3 2 17 11 429 63 1064 0.3 2 5 29 1030 64 739 0.3 2 2 18 719 65 556 0.3 2 3 34 519 66 595 0.3 2 4 12 579 67 433 0.3 2 4 38 391 68 2125 0.3 2 0 91 2034 69 575 
0.3 2 0 35 540 70 765 0.3 2 0 14 751 71 225 0.3 2 0 2 223 72 262 0.3 2 0 2 260 73 906 0.3 2 0 41 865 74 855 0.3 2 0 10 845 75 375 0.3 2 0 27 348 76 369 0.3 2 0 10 359 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2027208 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.6% C: 36.5% G: 25.1% T: 23.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 88750 89626.8 0 88750 4 94393 22406.7 0 55903 38490 5 98832 5601.7 1 40881 57951 6 67089 1400.4 1 41673 25416 7 47510 350.1 1 40313 7197 8 49923 87.5 1 42213 7005 705 9 47552 21.9 1 42941 1250 3361 10 45495 5.5 2 41963 1271 2261 11 44960 1.4 2 42143 1050 1767 12 43793 0.3 2 42406 745 642 13 44946 0.3 2 43551 749 646 14 47058 0.3 2 45720 712 626 15 48718 0.3 2 47514 719 485 16 47066 0.3 2 45630 709 727 17 48238 0.3 2 46833 728 677 18 47512 0.3 2 46174 723 615 19 49594 0.3 2 48072 695 827 20 50928 0.3 2 49627 719 582 21 49843 0.3 2 48546 714 583 22 48486 0.3 2 47158 638 690 23 46452 0.3 2 45207 618 627 24 53086 0.3 2 51800 716 570 25 62117 0.3 2 60759 807 551 26 61266 0.3 2 59856 796 614 27 52667 0.3 2 51504 668 495 28 49423 0.3 2 48228 617 578 29 52409 0.3 2 51111 715 583 30 54819 0.3 2 53605 690 524 31 54202 0.3 2 52981 696 525 32 52081 0.3 2 50854 695 532 33 47840 0.3 2 46424 551 865 34 45518 0.3 2 44332 532 654 35 59218 0.3 2 57992 764 462 36 61502 0.3 2 60186 726 590 37 40467 0.3 2 39436 515 516 38 29048 0.3 2 28306 350 392 39 20959 0.3 2 20040 265 654 40 15289 0.3 2 14556 202 531 41 8221 0.3 2 7512 119 590 42 4829 0.3 2 3583 105 1141 43 2957 0.3 2 2317 70 570 44 2247 0.3 2 1652 53 542 45 2540 0.3 2 2076 49 415 46 7055 0.3 2 6396 91 568 47 4811 0.3 2 4232 65 514 48 2429 0.3 2 1998 82 349 49 2227 0.3 2 1483 45 699 50 1396 0.3 2 1007 36 353 51 1137 0.3 2 132 36 969 52 560 0.3 2 76 30 454 53 1348 0.3 2 66 57 1225 54 710 0.3 2 70 27 613 55 568 0.3 2 81 24 463 56 1071 0.3 2 118 28 
925 57 672 0.3 2 118 16 538 58 734 0.3 2 53 43 638 59 788 0.3 2 26 34 728 60 600 0.3 2 19 19 562 61 729 0.3 2 24 24 681 62 470 0.3 2 10 9 451 63 1066 0.3 2 3 29 1034 64 775 0.3 2 3 14 758 65 543 0.3 2 2 38 503 66 608 0.3 2 3 11 594 67 443 0.3 2 2 50 391 68 2202 0.3 2 0 116 2086 69 573 0.3 2 0 41 532 70 748 0.3 2 0 13 735 71 232 0.3 2 0 5 227 72 263 0.3 2 0 5 258 73 948 0.3 2 0 52 896 74 858 0.3 2 0 15 843 75 434 0.3 2 0 44 390 76 367 0.3 2 0 10 357 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_1_S25_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_1_S25_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_1_S25_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_1_S25_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_1_S25_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_1_S25_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_1_S25_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_1_S25_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 955.53 s (176 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 5,429,648 Read 1 with adapter: 2,369,960 (43.6%) Read 2 with adapter: 2,356,104 (43.4%) Pairs that were too short: 5,205 (0.1%) Pairs written (passing filters): 5,424,443 (99.9%) Total basepairs processed: 825,306,496 bp Read 1: 412,653,248 bp Read 2: 412,653,248 bp Total written (filtered): 718,500,645 bp (87.1%) Read 1: 359,143,483 bp Read 2: 359,357,162 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2369960 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.9% C: 37.5% G: 24.5% T: 24.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 80395 84838.2 0 80395 4 84540 21209.6 0 52934 31606 5 89630 5302.4 1 41978 47652 6 63652 1325.6 1 42622 21030 7 48354 331.4 1 42586 5768 8 49970 82.8 1 44332 5043 595 9 48515 20.7 1 44841 986 2688 10 47965 5.2 2 45280 906 1779 11 48261 1.3 2 45864 907 1490 12 48050 0.3 2 46828 656 566 13 48099 0.3 2 46896 627 576 14 50525 0.3 2 49370 635 520 15 52241 0.3 2 51126 644 471 16 51831 0.3 2 50601 644 586 17 53690 0.3 2 52509 598 583 18 53192 0.3 2 52067 600 525 19 57457 0.3 2 56056 658 743 20 59017 0.3 2 57797 676 544 21 58513 0.3 2 57237 666 610 22 56992 0.3 2 55887 549 556 23 55631 0.3 2 54502 537 592 24 61379 0.3 2 60177 652 550 25 67901 0.3 2 66835 655 411 26 68842 0.3 2 67623 716 503 27 63354 0.3 2 62246 619 489 28 61427 0.3 2 60375 554 498 29 65827 0.3 2 64702 620 505 30 69913 0.3 2 68831 627 455 31 70374 0.3 2 69280 633 461 32 69544 0.3 2 68404 655 485 33 65164 0.3 2 63899 526 739 34 63034 0.3 2 61936 525 573 35 76719 0.3 2 75661 621 437 36 81897 0.3 2 80697 695 505 37 62256 0.3 2 61289 535 432 38 50439 0.3 2 49557 445 437 39 40300 0.3 2 39347 346 607 40 30480 0.3 2 29748 251 481 41 17759 0.3 2 17062 159 538 42 9685 0.3 2 8693 121 871 43 6064 0.3 2 5488 67 509 44 4118 0.3 2 3595 50 473 45 4446 0.3 2 4010 40 396 46 11461 0.3 2 10828 95 538 47 8884 0.3 2 8358 82 444 48 5428 0.3 2 5018 84 326 49 4937 0.3 2 4260 60 617 50 3448 0.3 2 3062 37 349 51 1157 0.3 2 214 30 913 52 556 0.3 2 99 20 437 53 1154 0.3 2 63 61 1030 54 672 0.3 2 91 19 562 55 553 0.3 2 86 20 447 56 1047 0.3 2 155 26 866 57 662 0.3 2 168 16 478 58 712 0.3 2 54 30 628 59 740 0.3 2 36 31 673 60 586 0.3 2 14 13 559 61 692 0.3 2 43 22 627 62 415 0.3 2 25 12 378 63 1011 0.3 2 7 26 978 64 708 0.3 2 6 21 681 65 557 0.3 2 2 34 521 66 586 0.3 2 3 6 577 67 426 0.3 2 5 42 379 68 2091 0.3 2 1 96 1994 69 
498 0.3 2 0 26 472 70 727 0.3 2 0 4 723 71 227 0.3 2 0 7 220 72 263 0.3 2 0 4 259 73 889 0.3 2 0 46 843 74 720 0.3 2 0 13 707 75 378 0.3 2 0 34 344 76 333 0.3 2 2 5 326 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2356104 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.2% C: 36.4% G: 24.9% T: 24.4% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 79421 84838.2 0 79421 4 82785 21209.6 0 52066 30719 5 87645 5302.4 1 39843 47802 6 62660 1325.6 1 41324 21336 7 47907 331.4 1 40703 7204 8 49618 82.8 1 43580 5362 676 9 48384 20.7 1 44307 1261 2816 10 47816 5.2 2 44247 1485 2084 11 48059 1.3 2 45003 1320 1736 12 47789 0.3 2 46100 1010 679 13 47814 0.3 2 46185 953 676 14 50222 0.3 2 48630 991 601 15 51957 0.3 2 50394 989 574 16 51582 0.3 2 49938 944 700 17 53408 0.3 2 51772 934 702 18 53007 0.3 2 51287 987 733 19 57168 0.3 2 55321 988 859 20 58758 0.3 2 57076 1000 682 21 58132 0.3 2 56505 960 667 22 56743 0.3 2 55137 904 702 23 55396 0.3 2 53788 870 738 24 61104 0.3 2 59444 999 661 25 67656 0.3 2 65960 1100 596 26 68565 0.3 2 66851 1077 637 27 63090 0.3 2 61522 986 582 28 61327 0.3 2 59677 898 752 29 65616 0.3 2 63947 1014 655 30 69706 0.3 2 68076 1009 621 31 70160 0.3 2 68499 1078 583 32 69296 0.3 2 67683 1049 564 33 64987 0.3 2 63186 916 885 34 62866 0.3 2 61266 894 706 35 76429 0.3 2 74823 1028 578 36 81640 0.3 2 79819 1140 681 37 62109 0.3 2 60638 848 623 38 50246 0.3 2 49034 736 476 39 40165 0.3 2 38941 545 679 40 30416 0.3 2 29422 429 565 41 17705 0.3 2 16904 247 554 42 9720 0.3 2 8600 159 961 43 6078 0.3 2 5408 114 556 44 4102 0.3 2 3552 74 476 45 4417 0.3 2 3983 71 363 46 11412 0.3 2 10713 176 523 47 8903 0.3 2 8239 152 512 48 5452 0.3 2 4969 111 372 49 4940 0.3 2 4216 74 650 50 3441 0.3 2 3039 53 349 51 1134 0.3 2 210 31 893 52 613 0.3 2 102 23 488 53 1155 0.3 2 67 58 1030 54 701 0.3 2 81 22 598 55 579 0.3 2 82 15 482 56 
1118 0.3 2 154 28 936 57 679 0.3 2 166 16 497 58 694 0.3 2 49 39 606 59 742 0.3 2 35 37 670 60 519 0.3 2 12 9 498 61 688 0.3 2 35 23 630 62 440 0.3 2 20 11 409 63 951 0.3 2 7 23 921 64 701 0.3 2 4 10 687 65 496 0.3 2 2 37 457 66 586 0.3 2 1 7 578 67 405 0.3 2 5 39 361 68 2062 0.3 2 1 95 1966 69 514 0.3 2 0 39 475 70 681 0.3 2 0 11 670 71 202 0.3 2 0 0 202 72 254 0.3 2 0 5 249 73 863 0.3 2 0 44 819 74 700 0.3 2 0 14 686 75 422 0.3 2 0 37 385 76 386 0.3 2 1 9 376 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_2_S26_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_2_S26_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_2_S26_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_2_S26_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_2_S26_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_2_S26_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_2_S26_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_2_S26_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 476.35 s (174 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 2,733,179 Read 1 with adapter: 1,129,658 (41.3%) Read 2 with adapter: 1,124,570 (41.1%) Pairs that were too short: 2,714 (0.1%) Pairs written (passing filters): 2,730,465 (99.9%) Total basepairs processed: 415,443,208 bp Read 1: 207,721,604 bp Read 2: 207,721,604 bp Total written (filtered): 365,538,502 bp (88.0%) Read 1: 182,733,955 bp Read 2: 182,804,547 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1129658 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.8% C: 37.9% G: 24.7% T: 23.6% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 40764 42705.9 0 40764 4 43067 10676.5 0 26643 16424 5 46218 2669.1 1 20849 25369 6 32276 667.3 1 21111 11165 7 23713 166.8 1 20765 2948 8 24834 41.7 1 21471 3080 283 9 23882 10.4 1 22017 555 1310 10 23208 2.6 2 21810 429 969 11 23285 0.7 2 22169 382 734 12 22900 0.2 2 22351 290 259 13 23585 0.2 2 22980 300 305 14 24746 0.2 2 24194 330 222 15 25599 0.2 2 25112 270 217 16 25164 0.2 2 24635 264 265 17 25889 0.2 2 25342 267 280 18 25762 0.2 2 25258 270 234 19 27278 0.2 2 26623 270 385 20 27879 0.2 2 27330 311 238 21 28016 0.2 2 27443 295 278 22 27352 0.2 2 26803 267 282 23 26301 0.2 2 25732 238 331 24 29431 0.2 2 28895 289 247 25 33552 0.2 2 33010 313 229 26 33246 0.2 2 32732 291 223 27 29789 0.2 2 29260 266 263 28 28426 0.2 2 27934 245 247 29 30714 0.2 2 30227 229 258 30 32861 0.2 2 32409 260 192 31 32598 0.2 2 32057 305 236 32 32674 0.2 2 32160 298 216 33 30654 0.2 2 30000 266 388 34 29233 0.2 2 28689 250 294 35 37501 0.2 2 36988 299 214 36 39048 0.2 2 38542 276 230 37 27637 0.2 2 27200 212 225 38 21333 0.2 2 21036 139 158 39 16408 0.2 2 15969 140 299 40 12598 0.2 2 12274 95 229 41 6991 0.2 2 6687 60 244 42 3958 0.2 2 3417 46 495 43 2485 0.2 2 2166 39 280 44 1722 0.2 2 1440 23 259 45 1877 0.2 2 1660 20 197 46 5077 0.2 2 4786 45 246 47 3651 0.2 2 3403 29 219 48 2007 0.2 2 1826 40 141 49 1827 0.2 2 1483 15 329 50 1224 0.2 2 1014 27 183 51 597 0.2 2 88 17 492 52 253 0.2 2 32 8 213 53 621 0.2 2 32 40 549 54 329 0.2 2 27 8 294 55 261 0.2 2 39 11 211 56 535 0.2 2 54 10 471 57 321 0.2 2 51 4 266 58 369 0.2 2 26 20 323 59 356 0.2 2 8 18 330 60 319 0.2 2 10 5 304 61 386 0.2 2 14 13 359 62 225 0.2 2 8 8 209 63 552 0.2 2 2 11 539 64 362 0.2 2 4 13 345 65 285 0.2 2 1 21 263 66 296 0.2 2 2 3 291 67 184 0.2 2 6 20 158 68 1035 0.2 2 0 55 980 69 270 0.2 2 0 17 253 70 422 
0.2 2 0 7 415 71 115 0.2 2 0 1 114 72 133 0.2 2 0 1 132 73 422 0.2 2 0 18 404 74 375 0.2 2 0 6 369 75 206 0.2 2 0 18 188 76 189 0.2 2 0 3 186 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1124570 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.1% C: 36.7% G: 25.1% T: 24.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 40392 42705.9 0 40392 4 42623 10676.5 0 26336 16287 5 45496 2669.1 1 19796 25700 6 31732 667.3 1 20576 11156 7 23451 166.8 1 19985 3466 8 24594 41.7 1 21177 3121 296 9 23729 10.4 1 21735 622 1372 10 23159 2.6 2 21358 713 1088 11 23229 0.7 2 21798 595 836 12 22801 0.2 2 22041 426 334 13 23419 0.2 2 22690 417 312 14 24672 0.2 2 23925 417 330 15 25511 0.2 2 24822 397 292 16 25067 0.2 2 24351 386 330 17 25764 0.2 2 25038 414 312 18 25695 0.2 2 24995 380 320 19 27167 0.2 2 26331 380 456 20 27805 0.2 2 27005 465 335 21 27935 0.2 2 27090 481 364 22 27262 0.2 2 26480 445 337 23 26254 0.2 2 25477 377 400 24 29355 0.2 2 28593 459 303 25 33406 0.2 2 32708 473 225 26 33185 0.2 2 32451 443 291 27 29721 0.2 2 28979 417 325 28 28343 0.2 2 27660 383 300 29 30640 0.2 2 29954 408 278 30 32825 0.2 2 32084 430 311 31 32467 0.2 2 31774 428 265 32 32608 0.2 2 31856 454 298 33 30537 0.2 2 29769 356 412 34 29175 0.2 2 28472 362 341 35 37385 0.2 2 36660 461 264 36 39002 0.2 2 38203 450 349 37 27604 0.2 2 26913 379 312 38 21308 0.2 2 20846 230 232 39 16362 0.2 2 15863 202 297 40 12598 0.2 2 12161 161 276 41 6983 0.2 2 6616 93 274 42 3981 0.2 2 3390 70 521 43 2491 0.2 2 2147 42 302 44 1689 0.2 2 1427 29 233 45 1861 0.2 2 1647 29 185 46 5098 0.2 2 4742 64 292 47 3667 0.2 2 3373 55 239 48 2029 0.2 2 1800 62 167 49 1834 0.2 2 1459 30 345 50 1217 0.2 2 1001 22 194 51 514 0.2 2 89 16 409 52 237 0.2 2 32 6 199 53 618 0.2 2 32 37 549 54 363 0.2 2 29 12 322 55 274 0.2 2 41 12 221 56 555 0.2 2 58 10 487 57 280 0.2 2 52 5 223 58 364 0.2 2 25 
16 323 59 374 0.2 2 10 15 349 60 293 0.2 2 12 8 273 61 392 0.2 2 9 11 372 62 235 0.2 2 6 7 222 63 532 0.2 2 1 19 512 64 359 0.2 2 4 9 346 65 290 0.2 2 1 24 265 66 264 0.2 2 1 9 254 67 214 0.2 2 6 22 186 68 1133 0.2 2 0 65 1068 69 264 0.2 2 0 17 247 70 390 0.2 2 0 4 386 71 108 0.2 2 0 1 107 72 129 0.2 2 0 3 126 73 467 0.2 2 0 31 436 74 385 0.2 2 0 11 374 75 236 0.2 2 0 20 216 76 172 0.2 2 0 2 170 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_300_S5_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_300_S5_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_300_S5_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_300_S5_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_300_S5_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_300_S5_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_300_S5_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_300_S5_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 903.88 s (178 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 5,089,207 Read 1 with adapter: 1,886,714 (37.1%) Read 2 with adapter: 1,878,128 (36.9%) Pairs that were too short: 5,385 (0.1%) Pairs written (passing filters): 5,083,822 (99.9%) Total basepairs processed: 773,559,464 bp Read 1: 386,779,732 bp Read 2: 386,779,732 bp Total written (filtered): 689,756,051 bp (89.2%) Read 1: 344,824,370 bp Read 2: 344,931,681 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1886714 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.2% C: 37.2% G: 24.4% T: 24.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 72683 79518.9 0 72683 4 76316 19879.7 0 44934 31382 5 82304 4969.9 1 33573 48731 6 55053 1242.5 1 34107 20946 7 38833 310.6 1 33299 5534 8 41207 77.7 1 35134 5435 638 9 40253 19.4 1 36469 966 2818 10 38797 4.9 2 36255 727 1815 11 38275 1.2 2 35908 856 1511 12 37389 0.3 2 36254 604 531 13 37990 0.3 2 36846 564 580 14 40385 0.3 2 39288 581 516 15 41502 0.3 2 40468 578 456 16 39922 0.3 2 38784 541 597 17 41146 0.3 2 40046 562 538 18 41021 0.3 2 39921 554 546 19 43741 0.3 2 42412 590 739 20 44887 0.3 2 43772 560 555 21 43867 0.3 2 42753 583 531 22 42908 0.3 2 41880 515 513 23 41967 0.3 2 40893 472 602 24 48203 0.3 2 47085 584 534 25 59110 0.3 2 57981 670 459 26 56477 0.3 2 55378 620 479 27 45921 0.3 2 44987 494 440 28 44050 0.3 2 43023 482 545 29 48743 0.3 2 47677 552 514 30 52346 0.3 2 51363 575 408 31 51689 0.3 2 50709 580 400 32 49663 0.3 2 48670 554 439 33 45714 0.3 2 44522 491 701 34 47053 0.3 2 46000 490 563 35 71171 0.3 2 69986 754 431 36 77524 0.3 2 76225 812 487 37 47963 0.3 2 47016 509 438 38 35028 0.3 2 34240 372 416 39 27803 0.3 2 26843 311 649 40 22216 0.3 2 21523 230 463 41 12196 0.3 2 11536 139 521 42 6630 0.3 2 5620 107 903 43 3776 0.3 2 3264 60 452 44 2739 0.3 2 2274 34 431 45 3677 0.3 2 3242 70 365 46 11552 0.3 2 10892 131 529 47 7493 0.3 2 6946 78 469 48 3524 0.3 2 3150 61 313 49 3145 0.3 2 2435 53 657 50 2087 0.3 2 1692 37 358 51 1066 0.3 2 133 29 904 52 475 0.3 2 60 9 406 53 1265 0.3 2 40 39 1186 54 663 0.3 2 40 19 604 55 486 0.3 2 60 17 409 56 1181 0.3 2 107 15 1059 57 615 0.3 2 140 11 464 58 709 0.3 2 37 26 646 59 708 0.3 2 18 14 676 60 598 0.3 2 17 8 573 61 728 0.3 2 5 15 708 62 407 0.3 2 4 14 389 63 1094 0.3 2 2 27 1065 64 744 0.3 2 2 11 731 65 586 0.3 2 2 33 551 66 627 0.3 2 1 11 615 67 357 0.3 2 5 28 324 68 2300 0.3 2 0 95 2205 69 
537 0.3 2 0 37 500 70 824 0.3 2 0 12 812 71 225 0.3 2 0 2 223 72 269 0.3 2 0 4 265 73 804 0.3 2 0 31 773 74 761 0.3 2 0 12 749 75 419 0.3 2 0 40 379 76 327 0.3 2 2 11 314 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1878128 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.7% C: 36.0% G: 24.8% T: 24.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 70722 79518.9 0 70722 4 75743 19879.7 0 44747 30996 5 80455 4969.9 1 32022 48433 6 54644 1242.5 1 33499 21145 7 38532 310.6 1 31829 6703 8 40790 77.7 1 34883 5198 709 9 40156 19.4 1 36281 987 2888 10 38796 4.9 2 36013 752 2031 11 38205 1.2 2 35564 1019 1622 12 37260 0.3 2 36155 549 556 13 37869 0.3 2 36677 586 606 14 40267 0.3 2 39088 621 558 15 41374 0.3 2 40277 608 489 16 39815 0.3 2 38672 541 602 17 41025 0.3 2 39843 597 585 18 40889 0.3 2 39742 587 560 19 43556 0.3 2 42244 606 706 20 44734 0.3 2 43620 581 533 21 43764 0.3 2 42614 615 535 22 42792 0.3 2 41704 541 547 23 41858 0.3 2 40685 546 627 24 48064 0.3 2 46854 655 555 25 58935 0.3 2 57670 786 479 26 56342 0.3 2 55103 723 516 27 45845 0.3 2 44829 559 457 28 43970 0.3 2 42860 548 562 29 48669 0.3 2 47550 570 549 30 52254 0.3 2 51112 641 501 31 51625 0.3 2 50503 636 486 32 49571 0.3 2 48512 632 427 33 45709 0.3 2 44376 548 785 34 46973 0.3 2 45822 583 568 35 71011 0.3 2 69759 795 457 36 77443 0.3 2 75981 877 585 37 47895 0.3 2 46867 548 480 38 34934 0.3 2 34110 422 402 39 27700 0.3 2 26719 358 623 40 22204 0.3 2 21402 289 513 41 12166 0.3 2 11512 144 510 42 6657 0.3 2 5604 105 948 43 3832 0.3 2 3266 68 498 44 2775 0.3 2 2264 54 457 45 3672 0.3 2 3232 56 384 46 11553 0.3 2 10856 148 549 47 7506 0.3 2 6943 79 484 48 3543 0.3 2 3146 70 327 49 3085 0.3 2 2433 50 602 50 2091 0.3 2 1682 46 363 51 1108 0.3 2 134 26 948 52 494 0.3 2 54 24 416 53 1210 0.3 2 45 55 1110 54 681 0.3 2 39 20 622 55 483 0.3 2 58 15 410 56 1189 0.3 2 105 
16 1068 57 609 0.3 2 138 12 459 58 670 0.3 2 38 30 602 59 639 0.3 2 14 24 601 60 616 0.3 2 11 9 596 61 729 0.3 2 4 16 709 62 440 0.3 2 4 7 429 63 1060 0.3 2 4 27 1029 64 676 0.3 2 0 15 661 65 556 0.3 2 1 26 529 66 606 0.3 2 1 10 595 67 382 0.3 2 5 36 341 68 2289 0.3 2 0 75 2214 69 493 0.3 2 0 37 456 70 872 0.3 2 0 8 864 71 248 0.3 2 0 3 245 72 256 0.3 2 0 3 253 73 908 0.3 2 0 32 876 74 846 0.3 2 1 13 832 75 431 0.3 2 0 29 402 76 367 0.3 2 1 8 358 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_3_S27_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_3_S27_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_3_S27_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_3_S27_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_3_S27_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_3_S27_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_3_S27_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_3_S27_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 376.27 s (179 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 2,101,615 Read 1 with adapter: 791,122 (37.6%) Read 2 with adapter: 787,508 (37.5%) Pairs that were too short: 2,164 (0.1%) Pairs written (passing filters): 2,099,451 (99.9%) Total basepairs processed: 319,445,480 bp Read 1: 159,722,740 bp Read 2: 159,722,740 bp Total written (filtered): 284,895,022 bp (89.2%) Read 1: 142,413,634 bp Read 2: 142,481,388 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 791122 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.1% C: 38.0% G: 24.8% T: 23.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 30965 32837.7 0 30965 4 32822 8209.4 0 19669 13153 5 34944 2052.4 1 14919 20025 6 24149 513.1 1 15204 8945 7 17372 128.3 1 15031 2341 8 18053 32.1 1 15424 2404 225 9 17675 8.0 1 16139 413 1123 10 16852 2.0 2 15771 289 792 11 16878 0.5 2 15908 331 639 12 16560 0.1 2 16120 216 224 13 16778 0.1 2 16293 253 232 14 16928 0.1 2 16459 233 236 15 17494 0.1 2 17081 211 202 16 17537 0.1 2 17052 192 293 17 18432 0.1 2 17991 191 250 18 18192 0.1 2 17743 226 223 19 18808 0.1 2 18292 184 332 20 19571 0.1 2 19116 202 253 21 19521 0.1 2 19003 261 257 22 19311 0.1 2 18866 199 246 23 18019 0.1 2 17530 201 288 24 19112 0.1 2 18676 203 233 25 20614 0.1 2 20241 203 170 26 20832 0.1 2 20403 212 217 27 19749 0.1 2 19391 169 189 28 19605 0.1 2 19200 195 210 29 21184 0.1 2 20779 189 216 30 21997 0.1 2 21634 195 168 31 21784 0.1 2 21379 238 167 32 21964 0.1 2 21565 229 170 33 20590 0.1 2 20126 170 294 34 19030 0.1 2 18632 154 244 35 21124 0.1 2 20779 177 168 36 22242 0.1 2 21822 193 227 37 18576 0.1 2 18241 169 166 38 16198 0.1 2 15909 129 160 39 13640 0.1 2 13283 105 252 40 11539 0.1 2 11240 105 194 41 7175 0.1 2 6894 70 211 42 4430 0.1 2 3972 58 400 43 2958 0.1 2 2677 28 253 44 1927 0.1 2 1681 27 219 45 1627 0.1 2 1439 27 161 46 2745 0.1 2 2510 35 200 47 2217 0.1 2 1992 23 202 48 1659 0.1 2 1490 29 140 49 1579 0.1 2 1245 26 308 50 818 0.1 2 646 15 157 51 423 0.1 2 75 11 337 52 205 0.1 2 28 5 172 53 483 0.1 2 25 33 425 54 271 0.1 2 21 9 241 55 243 0.1 2 30 8 205 56 421 0.1 2 44 17 360 57 261 0.1 2 25 6 230 58 260 0.1 2 10 14 236 59 272 0.1 2 11 10 251 60 245 0.1 2 10 7 228 61 277 0.1 2 12 10 255 62 174 0.1 2 3 5 166 63 386 0.1 2 5 10 371 64 295 0.1 2 3 9 283 65 231 0.1 2 2 6 223 66 252 0.1 2 0 3 249 67 184 0.1 2 3 17 164 68 805 0.1 2 0 43 762 69 228 0.1 2 1 15 212 70 257 0.1 2 0 2 
255 71 91 0.1 2 0 2 89 72 73 0.1 2 0 3 70 73 366 0.1 2 0 12 354 74 317 0.1 2 0 6 311 75 160 0.1 2 0 8 152 76 166 0.1 2 1 8 157 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 787508 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.5% C: 36.8% G: 25.2% T: 23.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 31050 32837.7 0 31050 4 32566 8209.4 0 19407 13159 5 34791 2052.4 1 14304 20487 6 23758 513.1 1 14790 8968 7 17224 128.3 1 14548 2676 8 17821 32.1 1 15225 2307 289 9 17627 8.0 1 16025 501 1101 10 16800 2.0 2 15413 487 900 11 16753 0.5 2 15717 400 636 12 16455 0.1 2 15906 287 262 13 16717 0.1 2 16121 291 305 14 16800 0.1 2 16295 281 224 15 17389 0.1 2 16922 264 203 16 17393 0.1 2 16864 280 249 17 18319 0.1 2 17767 272 280 18 18098 0.1 2 17532 304 262 19 18692 0.1 2 18059 289 344 20 19468 0.1 2 18908 304 256 21 19393 0.1 2 18800 329 264 22 19241 0.1 2 18673 284 284 23 17971 0.1 2 17360 270 341 24 19039 0.1 2 18497 266 276 25 20540 0.1 2 19970 330 240 26 20744 0.1 2 20182 311 251 27 19717 0.1 2 19178 298 241 28 19574 0.1 2 19018 281 275 29 21118 0.1 2 20572 294 252 30 21938 0.1 2 21410 311 217 31 21735 0.1 2 21207 300 228 32 21933 0.1 2 21414 285 234 33 20530 0.1 2 19927 267 336 34 18996 0.1 2 18477 228 291 35 21102 0.1 2 20598 270 234 36 22155 0.1 2 21631 286 238 37 18523 0.1 2 18087 245 191 38 16166 0.1 2 15737 221 208 39 13604 0.1 2 13158 165 281 40 11511 0.1 2 11144 155 212 41 7167 0.1 2 6851 80 236 42 4426 0.1 2 3928 81 417 43 2954 0.1 2 2659 32 263 44 1893 0.1 2 1667 29 197 45 1604 0.1 2 1421 28 155 46 2743 0.1 2 2471 55 217 47 2204 0.1 2 1973 32 199 48 1666 0.1 2 1471 49 146 49 1533 0.1 2 1236 22 275 50 804 0.1 2 643 7 154 51 435 0.1 2 73 11 351 52 224 0.1 2 30 11 183 53 458 0.1 2 27 28 403 54 261 0.1 2 19 10 232 55 187 0.1 2 23 9 155 56 391 0.1 2 38 22 331 57 227 0.1 2 23 4 200 58 243 0.1 2 9 10 224 59 335 0.1 2 8 13 
314 60 245 0.1 2 8 4 233 61 273 0.1 2 11 9 253 62 197 0.1 2 2 4 191 63 377 0.1 2 4 10 363 64 269 0.1 2 3 8 258 65 228 0.1 2 1 17 210 66 248 0.1 2 0 1 247 67 160 0.1 2 3 15 142 68 872 0.1 2 0 30 842 69 225 0.1 2 1 16 208 70 244 0.1 2 0 2 242 71 80 0.1 2 0 1 79 72 92 0.1 2 0 2 90 73 370 0.1 2 0 17 353 74 315 0.1 2 0 4 311 75 136 0.1 2 0 7 129 76 171 0.1 2 1 6 164 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_800_S11_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_800_S11_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_800_S11_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_800_S11_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_800_S11_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/It_800_S11_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_800_S11_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/It_800_S11_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 684.47 s (179 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 3,830,885 Read 1 with adapter: 1,356,970 (35.4%) Read 2 with adapter: 1,351,725 (35.3%) Pairs that were too short: 4,260 (0.1%) Pairs written (passing filters): 3,826,625 (99.9%) Total basepairs processed: 582,294,520 bp Read 1: 291,147,260 bp Read 2: 291,147,260 bp Total written (filtered): 521,774,410 bp (89.6%) Read 1: 260,844,343 bp Read 2: 260,930,067 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1356970 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.6% C: 36.5% G: 24.2% T: 24.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 54264 59857.6 0 54264 4 56477 14964.4 0 32466 24011 5 60298 3741.1 1 23845 36453 6 39784 935.3 1 24066 15718 7 27180 233.8 1 23249 3931 8 28898 58.5 1 24679 3775 444 9 28014 14.6 1 25149 634 2231 10 27178 3.7 2 25311 470 1397 11 27120 0.9 2 25445 459 1216 12 27071 0.2 2 26374 309 388 13 26910 0.2 2 26207 316 387 14 28390 0.2 2 27737 299 354 15 29128 0.2 2 28485 318 325 16 28144 0.2 2 27399 264 481 17 28447 0.2 2 27800 274 373 18 29035 0.2 2 28344 288 403 19 30890 0.2 2 30012 326 552 20 31737 0.2 2 31026 314 397 21 31198 0.2 2 30512 284 402 22 30774 0.2 2 30081 288 405 23 30104 0.2 2 29345 304 455 24 33926 0.2 2 33278 281 367 25 39990 0.2 2 39303 371 316 26 38500 0.2 2 37796 337 367 27 33154 0.2 2 32563 286 305 28 32321 0.2 2 31647 288 386 29 35487 0.2 2 34810 302 375 30 38132 0.2 2 37456 339 337 31 37982 0.2 2 37312 335 335 32 36997 0.2 2 36330 353 314 33 35111 0.2 2 34256 284 571 34 35525 0.2 2 34836 280 409 35 49380 0.2 2 48719 370 291 36 52439 0.2 2 51613 445 381 37 35247 0.2 2 34656 268 323 38 27349 0.2 2 26822 202 325 39 21591 0.2 2 20869 166 556 40 16702 0.2 2 16149 154 399 41 9185 0.2 2 8713 84 388 42 4949 0.2 2 4197 74 678 43 2962 0.2 2 2529 37 396 44 2159 0.2 2 1707 39 413 45 2565 0.2 2 2250 34 281 46 7604 0.2 2 7130 70 404 47 5181 0.2 2 4766 52 363 48 2784 0.2 2 2460 69 255 49 2476 0.2 2 1976 32 468 50 1654 0.2 2 1367 18 269 51 860 0.2 2 108 11 741 52 415 0.2 2 56 15 344 53 963 0.2 2 39 39 885 54 515 0.2 2 36 18 461 55 371 0.2 2 55 6 310 56 913 0.2 2 92 25 796 57 549 0.2 2 138 5 406 58 493 0.2 2 50 26 417 59 551 0.2 2 24 23 504 60 454 0.2 2 10 6 438 61 563 0.2 2 18 14 531 62 325 0.2 2 7 9 309 63 820 0.2 2 2 18 800 64 555 0.2 2 0 16 539 65 399 0.2 2 0 24 375 66 474 0.2 2 0 9 465 67 300 0.2 2 1 27 272 68 1721 0.2 2 1 76 1644 69 415 0.2 2 0 23 392 
70 616 0.2 2 0 10 606 71 201 0.2 2 0 1 200 72 207 0.2 2 0 5 202 73 676 0.2 2 1 25 650 74 593 0.2 2 0 6 587 75 344 0.2 2 0 39 305 76 284 0.2 2 19 4 261 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1351725 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.9% C: 35.4% G: 24.6% T: 25.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 54024 59857.6 0 54024 4 56071 14964.4 0 32113 23958 5 59273 3741.1 1 22899 36374 6 39801 935.3 1 23632 16169 7 26986 233.8 1 22487 4499 8 28600 58.5 1 24406 3745 449 9 27839 14.6 1 24944 681 2214 10 27114 3.7 2 24986 633 1495 11 26938 0.9 2 25201 544 1193 12 26991 0.2 2 26134 429 428 13 26820 0.2 2 25988 402 430 14 28270 0.2 2 27435 428 407 15 29026 0.2 2 28257 388 381 16 28039 0.2 2 27137 403 499 17 28337 0.2 2 27568 374 395 18 28927 0.2 2 28076 391 460 19 30794 0.2 2 29846 348 600 20 31638 0.2 2 30807 406 425 21 31084 0.2 2 30221 452 411 22 30661 0.2 2 29878 349 434 23 30033 0.2 2 29159 362 512 24 33869 0.2 2 32997 422 450 25 39891 0.2 2 39004 503 384 26 38347 0.2 2 37485 467 395 27 33082 0.2 2 32364 358 360 28 32234 0.2 2 31455 355 424 29 35461 0.2 2 34607 377 477 30 38051 0.2 2 37185 481 385 31 37868 0.2 2 37038 461 369 32 36906 0.2 2 36105 451 350 33 35037 0.2 2 34010 406 621 34 35448 0.2 2 34616 382 450 35 49253 0.2 2 48374 538 341 36 52299 0.2 2 51336 541 422 37 35138 0.2 2 34425 360 353 38 27276 0.2 2 26637 304 335 39 21511 0.2 2 20761 215 535 40 16617 0.2 2 16079 175 363 41 9202 0.2 2 8663 91 448 42 4968 0.2 2 4149 84 735 43 2983 0.2 2 2510 43 430 44 2123 0.2 2 1699 38 386 45 2573 0.2 2 2233 34 306 46 7623 0.2 2 7096 88 439 47 5174 0.2 2 4728 63 383 48 2797 0.2 2 2451 52 294 49 2516 0.2 2 1973 41 502 50 1646 0.2 2 1359 31 256 51 811 0.2 2 102 14 695 52 414 0.2 2 54 15 345 53 847 0.2 2 35 32 780 54 521 0.2 2 39 12 470 55 376 0.2 2 50 9 317 56 923 0.2 2 89 24 810 57 518 0.2 2 141 12 365 58 
556 0.2 2 49 22 485 59 517 0.2 2 21 24 472 60 470 0.2 2 10 11 449 61 576 0.2 2 10 12 554 62 371 0.2 2 7 4 360 63 808 0.2 2 1 18 789 64 560 0.2 2 0 17 543 65 410 0.2 2 0 17 393 66 475 0.2 2 0 7 468 67 316 0.2 2 1 23 292 68 1710 0.2 2 1 47 1662 69 400 0.2 2 0 23 377 70 585 0.2 2 0 3 582 71 226 0.2 2 0 3 223 72 212 0.2 2 0 1 211 73 705 0.2 2 0 17 688 74 615 0.2 2 0 3 612 75 338 0.2 2 0 29 309 76 306 0.2 2 19 5 282 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_1_S13_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_1_S13_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_1_S13_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_1_S13_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_1_S13_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_1_S13_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_1_S13_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_1_S13_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 860.36 s (170 us/read; 0.35 M reads/minute). === Summary === Total read pairs processed: 5,056,162 Read 1 with adapter: 2,148,139 (42.5%) Read 2 with adapter: 2,139,444 (42.3%) Pairs that were too short: 5,031 (0.1%) Pairs written (passing filters): 5,051,131 (99.9%) Total basepairs processed: 768,536,624 bp Read 1: 384,268,312 bp Read 2: 384,268,312 bp Total written (filtered): 671,709,502 bp (87.4%) Read 1: 335,778,558 bp Read 2: 335,930,944 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2148139 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.0% C: 37.7% G: 24.6% T: 23.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 72638 79002.5 0 72638 4 78287 19750.6 0 48434 29853 5 83594 4937.7 1 38121 45473 6 58884 1234.4 1 38882 20002 7 43883 308.6 1 38475 5408 8 45872 77.2 1 39700 5659 513 9 44233 19.3 1 40850 879 2504 10 42186 4.8 2 39759 660 1767 11 43036 1.2 2 41011 704 1321 12 41944 0.3 2 40913 507 524 13 43105 0.3 2 42070 545 490 14 44904 0.3 2 43970 500 434 15 47321 0.3 2 46409 515 397 16 47486 0.3 2 46439 518 529 17 49076 0.3 2 48067 511 498 18 48655 0.3 2 47686 464 505 19 51641 0.3 2 50509 480 652 20 52626 0.3 2 51630 498 498 21 52303 0.3 2 51324 478 501 22 50881 0.3 2 49846 522 513 23 48881 0.3 2 47865 442 574 24 54546 0.3 2 53554 491 501 25 61599 0.3 2 60629 582 388 26 63550 0.3 2 62519 578 453 27 57395 0.3 2 56522 475 398 28 55461 0.3 2 54540 466 455 29 59995 0.3 2 59046 521 428 30 63719 0.3 2 62829 503 387 31 63827 0.3 2 62863 533 431 32 63047 0.3 2 62048 599 400 33 59141 0.3 2 57990 499 652 34 57474 0.3 2 56430 503 541 35 71159 0.3 2 70247 545 367 36 74859 0.3 2 73850 583 426 37 55601 0.3 2 54725 493 383 38 45945 0.3 2 45193 386 366 39 36682 0.3 2 35840 319 523 40 28769 0.3 2 28071 215 483 41 16396 0.3 2 15807 126 463 42 8928 0.3 2 7925 99 904 43 5362 0.3 2 4800 61 501 44 3623 0.3 2 3188 40 395 45 3723 0.3 2 3353 44 326 46 9838 0.3 2 9222 91 525 47 7192 0.3 2 6690 68 434 48 4444 0.3 2 4044 81 319 49 4168 0.3 2 3524 58 586 50 2804 0.3 2 2470 44 290 51 1071 0.3 2 160 20 891 52 432 0.3 2 63 15 354 53 1182 0.3 2 29 54 1099 54 652 0.3 2 49 21 582 55 441 0.3 2 48 20 373 56 1102 0.3 2 75 14 1013 57 520 0.3 2 86 4 430 58 624 0.3 2 21 23 580 59 685 0.3 2 22 29 634 60 546 0.3 2 8 3 535 61 726 0.3 2 13 20 693 62 374 0.3 2 11 2 361 63 1036 0.3 2 3 14 1019 64 609 0.3 2 1 6 602 65 494 0.3 2 0 23 471 66 526 0.3 2 2 5 519 67 366 0.3 2 0 33 333 68 2121 0.3 2 1 76 2044 69 521 0.3 2 
0 27 494 70 768 0.3 2 0 11 757 71 210 0.3 2 0 1 209 72 272 0.3 2 0 7 265 73 793 0.3 2 0 30 763 74 722 0.3 2 0 8 714 75 342 0.3 2 0 11 331 76 321 0.3 2 2 11 308 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 2139444 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 36.7% G: 24.9% T: 24.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 72450 79002.5 0 72450 4 77851 19750.6 0 48145 29706 5 82486 4937.7 1 36578 45908 6 58373 1234.4 1 38003 20370 7 43210 308.6 1 37316 5894 8 45395 77.2 1 39241 5592 562 9 44030 19.3 1 40369 1060 2601 10 41960 4.8 2 39005 1090 1865 11 42826 1.2 2 40391 966 1469 12 41747 0.3 2 40492 670 585 13 42979 0.3 2 41695 660 624 14 44805 0.3 2 43553 651 601 15 47128 0.3 2 46001 662 465 16 47274 0.3 2 45979 666 629 17 48847 0.3 2 47618 659 570 18 48423 0.3 2 47257 649 517 19 51421 0.3 2 50018 647 756 20 52451 0.3 2 51140 712 599 21 52163 0.3 2 50841 707 615 22 50680 0.3 2 49405 666 609 23 48711 0.3 2 47471 613 627 24 54350 0.3 2 53145 681 524 25 61422 0.3 2 60095 807 520 26 63381 0.3 2 62047 775 559 27 57277 0.3 2 56128 695 454 28 55329 0.3 2 54140 641 548 29 59842 0.3 2 58600 714 528 30 63625 0.3 2 62351 738 536 31 63643 0.3 2 62379 768 496 32 62895 0.3 2 61649 750 496 33 59046 0.3 2 57663 631 752 34 57349 0.3 2 56078 645 626 35 70992 0.3 2 69705 807 480 36 74713 0.3 2 73265 868 580 37 55483 0.3 2 54314 667 502 38 45826 0.3 2 44884 512 430 39 36595 0.3 2 35575 421 599 40 28681 0.3 2 27838 348 495 41 16405 0.3 2 15683 186 536 42 8887 0.3 2 7871 122 894 43 5370 0.3 2 4773 73 524 44 3635 0.3 2 3171 42 422 45 3721 0.3 2 3337 46 338 46 9786 0.3 2 9184 103 499 47 7159 0.3 2 6639 85 435 48 4427 0.3 2 4027 87 313 49 4109 0.3 2 3490 61 558 50 2776 0.3 2 2451 51 274 51 1025 0.3 2 155 18 852 52 480 0.3 2 62 14 404 53 1211 0.3 2 30 65 1116 54 638 0.3 2 50 21 567 55 471 0.3 2 44 14 413 56 1085 0.3 2 72 26 987 57 
524 0.3 2 81 8 435 58 622 0.3 2 21 15 586 59 699 0.3 2 22 31 646 60 544 0.3 2 9 6 529 61 695 0.3 2 13 12 670 62 350 0.3 2 9 12 329 63 1073 0.3 2 2 11 1060 64 641 0.3 2 0 16 625 65 513 0.3 2 0 25 488 66 543 0.3 2 2 11 530 67 387 0.3 2 0 38 349 68 1981 0.3 2 1 82 1898 69 496 0.3 2 0 32 464 70 755 0.3 2 0 11 744 71 192 0.3 2 0 5 187 72 247 0.3 2 0 3 244 73 872 0.3 2 0 34 838 74 698 0.3 2 0 13 685 75 410 0.3 2 0 18 392 76 358 0.3 2 2 10 346 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_2_S14_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_2_S14_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_2_S14_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_2_S14_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_2_S14_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_2_S14_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_2_S14_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_2_S14_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 575.24 s (178 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 3,223,603 Read 1 with adapter: 918,651 (28.5%) Read 2 with adapter: 914,186 (28.4%) Pairs that were too short: 3,343 (0.1%) Pairs written (passing filters): 3,220,260 (99.9%) Total basepairs processed: 489,987,656 bp Read 1: 244,993,828 bp Read 2: 244,993,828 bp Total written (filtered): 453,910,300 bp (92.6%) Read 1: 226,921,762 bp Read 2: 226,988,538 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 918651 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.6% C: 37.1% G: 25.0% T: 23.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 50468 50368.8 0 50468 4 54033 12592.2 0 29869 24164 5 56531 3148.0 1 20966 35565 6 36404 787.0 1 21229 15175 7 24279 196.8 1 20476 3803 8 26246 49.2 1 21203 4696 347 9 23826 12.3 1 21246 617 1963 10 22565 3.1 2 20719 405 1441 11 21699 0.8 2 20420 350 929 12 20800 0.2 2 20189 245 366 13 20809 0.2 2 20202 258 349 14 20393 0.2 2 19881 224 288 15 21074 0.2 2 20614 228 232 16 20685 0.2 2 20133 214 338 17 21768 0.2 2 21176 233 359 18 21416 0.2 2 20922 212 282 19 21964 0.2 2 21273 227 464 20 22201 0.2 2 21581 282 338 21 21414 0.2 2 20845 250 319 22 20633 0.2 2 20069 206 358 23 19330 0.2 2 18797 174 359 24 19899 0.2 2 19378 209 312 25 21021 0.2 2 20616 203 202 26 21506 0.2 2 21008 220 278 27 20727 0.2 2 20298 186 243 28 20002 0.2 2 19506 174 322 29 20850 0.2 2 20345 195 310 30 21207 0.2 2 20752 189 266 31 21362 0.2 2 20910 219 233 32 20819 0.2 2 20369 229 221 33 19221 0.2 2 18580 188 453 34 17509 0.2 2 16963 146 400 35 19497 0.2 2 19128 176 193 36 20427 0.2 2 19988 178 261 37 16933 0.2 2 16507 158 268 38 14559 0.2 2 14189 133 237 39 11877 0.2 2 11410 96 371 40 9622 0.2 2 9225 87 310 41 5744 0.2 2 5367 53 324 42 3402 0.2 2 2681 59 662 43 2040 0.2 2 1636 42 362 44 1289 0.2 2 942 33 314 45 1173 0.2 2 896 29 248 46 2246 0.2 2 1861 44 341 47 1916 0.2 2 1568 32 316 48 1452 0.2 2 1195 51 206 49 1485 0.2 2 1066 25 394 50 1041 0.2 2 819 15 207 51 726 0.2 2 78 24 624 52 322 0.2 2 39 11 272 53 695 0.2 2 18 33 644 54 370 0.2 2 23 19 328 55 308 0.2 2 29 15 264 56 597 0.2 2 70 21 506 57 369 0.2 2 42 10 317 58 425 0.2 2 16 11 398 59 507 0.2 2 16 22 469 60 341 0.2 2 6 6 329 61 365 0.2 2 29 10 326 62 316 0.2 2 12 12 292 63 612 0.2 2 3 12 597 64 447 0.2 2 2 11 434 65 356 0.2 2 4 20 332 66 335 0.2 2 0 4 331 67 270 0.2 2 1 34 235 68 1313 0.2 2 1 59 1253 69 390 0.2 2 0 32 358 70 420 
0.2 2 0 6 414 71 132 0.2 2 0 0 132 72 155 0.2 2 0 2 153 73 549 0.2 2 0 20 529 74 545 0.2 2 0 7 538 75 211 0.2 2 0 9 202 76 211 0.2 2 0 6 205 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 914186 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.8% C: 36.5% G: 25.1% T: 23.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 50526 50368.8 0 50526 4 53781 12592.2 0 29652 24129 5 55360 3148.0 1 20027 35333 6 36064 787.0 1 20695 15369 7 23984 196.8 1 19606 4378 8 25933 49.2 1 20926 4547 460 9 23721 12.3 1 20974 751 1996 10 22396 3.1 2 20398 529 1469 11 21660 0.8 2 20072 525 1063 12 20729 0.2 2 19979 354 396 13 20680 0.2 2 19974 343 363 14 20293 0.2 2 19660 302 331 15 20982 0.2 2 20415 284 283 16 20601 0.2 2 19935 295 371 17 21679 0.2 2 20975 331 373 18 21331 0.2 2 20707 312 312 19 21886 0.2 2 21040 325 521 20 22073 0.2 2 21361 369 343 21 21330 0.2 2 20636 343 351 22 20559 0.2 2 19877 293 389 23 19256 0.2 2 18613 275 368 24 19837 0.2 2 19221 258 358 25 20970 0.2 2 20407 289 274 26 21433 0.2 2 20835 301 297 27 20710 0.2 2 20155 271 284 28 19928 0.2 2 19305 300 323 29 20813 0.2 2 20171 279 363 30 21154 0.2 2 20529 325 300 31 21339 0.2 2 20735 305 299 32 20819 0.2 2 20169 344 306 33 19229 0.2 2 18425 271 533 34 17504 0.2 2 16766 280 458 35 19480 0.2 2 18974 260 246 36 20368 0.2 2 19832 249 287 37 16921 0.2 2 16312 286 323 38 14538 0.2 2 14098 183 257 39 11856 0.2 2 11314 146 396 40 9630 0.2 2 9142 151 337 41 5783 0.2 2 5295 105 383 42 3443 0.2 2 2670 75 698 43 2048 0.2 2 1632 44 372 44 1282 0.2 2 932 29 321 45 1168 0.2 2 889 23 256 46 2244 0.2 2 1856 46 342 47 1926 0.2 2 1551 36 339 48 1457 0.2 2 1176 58 223 49 1481 0.2 2 1048 33 400 50 1019 0.2 2 798 25 196 51 659 0.2 2 80 19 560 52 336 0.2 2 46 15 275 53 727 0.2 2 23 44 660 54 373 0.2 2 26 16 331 55 347 0.2 2 23 11 313 56 566 0.2 2 65 17 484 57 358 0.2 2 37 7 314 58 367 0.2 2 17 11 
339 59 508 0.2 2 16 27 465 60 315 0.2 2 6 11 298 61 411 0.2 2 16 21 374 62 265 0.2 2 10 5 250 63 581 0.2 2 1 13 567 64 443 0.2 2 1 16 426 65 294 0.2 2 3 28 263 66 339 0.2 2 0 5 334 67 263 0.2 2 2 19 242 68 1269 0.2 2 0 52 1217 69 357 0.2 2 0 28 329 70 385 0.2 2 0 6 379 71 146 0.2 2 0 2 144 72 167 0.2 2 0 6 161 73 520 0.2 2 0 19 501 74 520 0.2 2 0 8 512 75 244 0.2 2 0 6 238 76 222 0.2 2 0 4 218 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_3_S15_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_3_S15_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_3_S15_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_3_S15_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_3_S15_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kt_3_S15_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_3_S15_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kt_3_S15_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 858.53 s (177 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 4,844,051 Read 1 with adapter: 1,953,412 (40.3%) Read 2 with adapter: 1,944,630 (40.1%) Pairs that were too short: 4,935 (0.1%) Pairs written (passing filters): 4,839,116 (99.9%) Total basepairs processed: 736,295,752 bp Read 1: 368,147,876 bp Read 2: 368,147,876 bp Total written (filtered): 651,711,868 bp (88.5%) Read 1: 325,782,777 bp Read 2: 325,929,091 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1953412 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 13.8% C: 38.5% G: 24.9% T: 22.9% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 73413 75688.3 0 73413 4 78270 18922.1 0 48003 30267 5 83256 4730.5 1 37611 45645 6 58174 1182.6 1 38039 20135 7 42904 295.7 1 37462 5442 8 44981 73.9 1 39209 5233 539 9 43713 18.5 1 40179 943 2591 10 41846 4.6 2 39364 659 1823 11 41939 1.2 2 39863 697 1379 12 41372 0.3 2 40374 473 525 13 41826 0.3 2 40816 485 525 14 42993 0.3 2 42091 434 468 15 45045 0.3 2 44138 488 419 16 44606 0.3 2 43532 495 579 17 46516 0.3 2 45472 482 562 18 46394 0.3 2 45471 439 484 19 48658 0.3 2 47511 428 719 20 49728 0.3 2 48736 483 509 21 49216 0.3 2 48201 506 509 22 48193 0.3 2 47226 435 532 23 45791 0.3 2 44765 425 601 24 50356 0.3 2 49366 468 522 25 56550 0.3 2 55658 511 381 26 56799 0.3 2 55806 533 460 27 50581 0.3 2 49745 447 389 28 49191 0.3 2 48264 399 528 29 52647 0.3 2 51667 477 503 30 55363 0.3 2 54459 456 448 31 55238 0.3 2 54304 521 413 32 53609 0.3 2 52697 516 396 33 50106 0.3 2 49007 418 681 34 47283 0.3 2 46350 356 577 35 58483 0.3 2 57617 484 382 36 60887 0.3 2 59997 464 426 37 44614 0.3 2 43743 410 461 38 36303 0.3 2 35686 300 317 39 28916 0.3 2 28079 243 594 40 21315 0.3 2 20662 185 468 41 12196 0.3 2 11610 117 469 42 6720 0.3 2 5710 103 907 43 4152 0.3 2 3546 62 544 44 2652 0.3 2 2149 42 461 45 2849 0.3 2 2444 38 367 46 7221 0.3 2 6706 71 444 47 5454 0.3 2 4920 48 486 48 3267 0.3 2 2881 72 314 49 3064 0.3 2 2426 26 612 50 1902 0.3 2 1549 22 331 51 1016 0.3 2 108 23 885 52 423 0.3 2 34 7 382 53 1100 0.3 2 29 64 1007 54 541 0.3 2 31 19 491 55 475 0.3 2 49 19 407 56 861 0.3 2 79 8 774 57 570 0.3 2 72 15 483 58 644 0.3 2 17 21 606 59 725 0.3 2 11 19 695 60 535 0.3 2 17 9 509 61 642 0.3 2 22 15 605 62 416 0.3 2 12 13 391 63 1010 0.3 2 4 20 986 64 704 0.3 2 2 20 682 65 491 0.3 2 0 31 460 66 498 0.3 2 0 7 491 67 413 0.3 2 2 41 370 68 2001 0.3 2 1 68 1932 69 465 0.3 2 
0 33 432 70 660 0.3 2 0 8 652 71 165 0.3 2 0 0 165 72 235 0.3 2 0 1 234 73 857 0.3 2 0 25 832 74 736 0.3 2 0 9 727 75 350 0.3 2 0 13 337 76 327 0.3 2 0 9 318 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1944630 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.1% C: 37.3% G: 25.3% T: 23.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 73321 75688.3 0 73321 4 77547 18922.1 0 47778 29769 5 82222 4730.5 1 36122 46100 6 57609 1182.6 1 37069 20540 7 42451 295.7 1 36411 6040 8 44637 73.9 1 38745 5337 555 9 43411 18.5 1 39792 1035 2584 10 41658 4.6 2 38561 1203 1894 11 41754 1.2 2 39372 881 1501 12 41166 0.3 2 39873 695 598 13 41657 0.3 2 40395 648 614 14 42786 0.3 2 41622 642 522 15 44826 0.3 2 43651 654 521 16 44365 0.3 2 43110 637 618 17 46258 0.3 2 45029 640 589 18 46204 0.3 2 44931 685 588 19 48415 0.3 2 47004 671 740 20 49561 0.3 2 48245 700 616 21 49067 0.3 2 47784 694 589 22 48016 0.3 2 46828 584 604 23 45629 0.3 2 44407 581 641 24 50168 0.3 2 48951 616 601 25 56387 0.3 2 55185 722 480 26 56615 0.3 2 55333 756 526 27 50462 0.3 2 49315 641 506 28 49015 0.3 2 47927 549 539 29 52456 0.3 2 51276 693 487 30 55216 0.3 2 54049 648 519 31 55068 0.3 2 53878 684 506 32 53505 0.3 2 52341 685 479 33 50024 0.3 2 48676 574 774 34 47135 0.3 2 45981 563 591 35 58342 0.3 2 57261 624 457 36 60756 0.3 2 59568 702 486 37 44479 0.3 2 43397 573 509 38 36291 0.3 2 35423 428 440 39 28783 0.3 2 27896 319 568 40 21283 0.3 2 20491 271 521 41 12215 0.3 2 11521 152 542 42 6760 0.3 2 5673 102 985 43 4154 0.3 2 3514 86 554 44 2630 0.3 2 2133 51 446 45 2806 0.3 2 2416 57 333 46 7276 0.3 2 6670 76 530 47 5413 0.3 2 4875 71 467 48 3279 0.3 2 2841 97 341 49 3062 0.3 2 2398 48 616 50 1906 0.3 2 1533 33 340 51 946 0.3 2 111 26 809 52 455 0.3 2 38 16 401 53 1130 0.3 2 34 62 1034 54 556 0.3 2 30 24 502 55 475 0.3 2 40 18 417 56 896 0.3 2 74 13 809 57 556 0.3 
2 68 8 480
58 622 0.3 2 16 19 587
59 745 0.3 2 12 39 694
60 512 0.3 2 15 9 488
61 621 0.3 2 16 24 581
62 396 0.3 2 10 4 382
63 928 0.3 2 4 24 900
64 688 0.3 2 3 18 667
65 450 0.3 2 0 25 425
66 556 0.3 2 0 10 546
67 384 0.3 2 2 44 338
68 1857 0.3 2 1 103 1753
69 523 0.3 2 0 44 479
70 627 0.3 2 0 10 617
71 231 0.3 2 0 3 228
72 223 0.3 2 0 3 220
73 798 0.3 2 0 31 767
74 731 0.3 2 0 13 718
75 361 0.3 2 0 22 339
76 317 0.3 2 0 10 307

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_300_S4_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_300_S4_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_300_S4_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_300_S4_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_300_S4_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_300_S4_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_300_S4_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_300_S4_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 1101.41 s (211 us/read; 0.28 M reads/minute).

=== Summary ===

Total read pairs processed: 5,220,330
  Read 1 with adapter: 1,922,258 (36.8%)
  Read 2 with adapter: 1,905,543 (36.5%)
Pairs that were too short: 5,776 (0.1%)
Pairs written (passing filters): 5,214,554 (99.9%)

Total basepairs processed: 793,490,160 bp
  Read 1: 396,745,080 bp
  Read 2: 396,745,080 bp
Total written (filtered): 710,669,935 bp (89.6%)
  Read 1: 355,307,410 bp
  Read 2: 355,362,525 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1922258 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.6% C: 36.8% G: 24.5% T: 24.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 76327 81567.7 0 76327 4 80272 20391.9 0 48305 31967 5 85381 5098.0 1 35147 50234 6 58852 1274.5 1 36272 22580 7 41980 318.6 1 35107 6873 8 44042 79.7 1 37382 5911 749 9 43094 19.9 1 38519 1603 2972 10 41939 5.0 2 38345 1277 2317 11 41904 1.2 2 38277 1823 1804 12 41235 0.3 2 39291 1270 674 13 42212 0.3 2 40264 1264 684 14 43184 0.3 2 41272 1253 659 15 43681 0.3 2 41788 1241 652 16 42431 0.3 2 40542 1153 736 17 44248 0.3 2 42433 1129 686 18 44404 0.3 2 42658 1062 684 19 47571 0.3 2 45513 1165 893 20 49191 0.3 2 47307 1176 708 21 48309 0.3 2 46453 1148 708 22 47565 0.3 2 45715 1122 728 23 45586 0.3 2 43807 1007 772 24 49452 0.3 2 47662 1110 680 25 53952 0.3 2 52166 1209 577 26 53609 0.3 2 51828 1152 629 27 48525 0.3 2 47011 995 519 28 47406 0.3 2 45799 990 617 29 51615 0.3 2 49931 1080 604 30 54496 0.3 2 52912 1032 552 31 53934 0.3 2 52325 1037 572 32 53962 0.3 2 52334 1093 535 33 50109 0.3 2 48261 920 928 34 47259 0.3 2 45636 917 706 35 56662 0.3 2 55041 1100 521 36 57812 0.3 2 56113 1078 621 37 43563 0.3 2 42178 854 531 38 35307 0.3 2 34130 710 467 39 26959 0.3 2 25765 491 703 40 20358 0.3 2 19428 390 540 41 10774 0.3 2 10047 200 527 42 5908 0.3 2 4739 132 1037 43 3703 0.3 2 3077 84 542 44 2571 0.3 2 1994 62 515 45 2720 0.3 2 2252 62 406 46 6107 0.3 2 5481 112 514 47 5039 0.3 2 4457 80 502 48 3042 0.3 2 2669 64 309 49 2875 0.3 2 2167 41 667 50 1738 0.3 2 1356 38 344 51 1109 0.3 2 106 18 985 52 529 0.3 2 53 12 464 53 1298 0.3 2 33 39 1226 54 678 0.3 2 38 16 624 55 529 0.3 2 35 14 480 56 1154 0.3 2 78 19 1057 57 577 0.3 2 97 9 471 58 772 0.3 2 38 31 703 59 715 0.3 2 13 20 682 60 574 0.3 2 3 4 567 61 761 0.3 2 10 13 738 62 428 0.3 2 2 9 417 63 1165 0.3 2 3 29 1133 64 821 0.3 2 1 15 805 65 513 0.3 2 2 25 486 66 606 0.3 2 2 6 598 67 426 0.3 2 0 31 395 68 2257 
0.3 2 1 77 2179 69 585 0.3 2 0 37 548 70 841 0.3 2 0 2 839 71 236 0.3 2 0 2 234 72 292 0.3 2 0 10 282 73 930 0.3 2 0 36 894 74 814 0.3 2 0 5 809 75 429 0.3 2 0 19 410 76 354 0.3 2 3 4 347 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1905543 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 15.0% C: 36.1% G: 24.4% T: 24.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 62638 81567.7 0 62638 4 79999 20391.9 0 48332 31667 5 83230 5098.0 1 30108 53122 6 58590 1274.5 1 34831 23759 7 42017 318.6 1 30778 11239 8 44201 79.7 1 37376 5808 1017 9 43437 19.9 1 38696 1491 3250 10 42264 5.0 2 38803 992 2469 11 42020 1.2 2 37475 2606 1939 12 41167 0.3 2 39638 897 632 13 42137 0.3 2 40647 887 603 14 43085 0.3 2 41636 931 518 15 43594 0.3 2 42196 906 492 16 42403 0.3 2 40937 847 619 17 44205 0.3 2 42700 915 590 18 44301 0.3 2 42943 827 531 19 47414 0.3 2 45874 847 693 20 49060 0.3 2 47611 915 534 21 48223 0.3 2 46729 906 588 22 47407 0.3 2 45993 865 549 23 45465 0.3 2 44051 776 638 24 49330 0.3 2 47928 880 522 25 53879 0.3 2 52575 843 461 26 53504 0.3 2 52116 902 486 27 48443 0.3 2 47293 738 412 28 47407 0.3 2 46043 776 588 29 51595 0.3 2 50256 823 516 30 54448 0.3 2 53171 786 491 31 53877 0.3 2 52590 816 471 32 53922 0.3 2 52602 852 468 33 49945 0.3 2 48492 726 727 34 47252 0.3 2 45914 708 630 35 56576 0.3 2 55342 835 399 36 57726 0.3 2 56461 773 492 37 43513 0.3 2 42452 595 466 38 35245 0.3 2 34387 463 395 39 26904 0.3 2 25910 347 647 40 20322 0.3 2 19530 271 521 41 10835 0.3 2 10112 155 568 42 5974 0.3 2 4777 103 1094 43 3722 0.3 2 3096 56 570 44 2551 0.3 2 2004 41 506 45 2702 0.3 2 2263 40 399 46 6176 0.3 2 5520 82 574 47 5024 0.3 2 4484 59 481 48 3064 0.3 2 2684 68 312 49 2793 0.3 2 2155 57 581 50 1734 0.3 2 1361 32 341 51 1175 0.3 2 104 24 1047 52 528 0.3 2 59 15 454 53 1325 0.3 2 37 47 1241 54 720 0.3 2 39 18 663 55 479 0.3 2 35 22 
422
56 1262 0.3 2 74 30 1158
57 630 0.3 2 97 11 522
58 744 0.3 2 36 37 671
59 756 0.3 2 13 24 719
60 625 0.3 2 4 10 611
61 786 0.3 2 10 21 755
62 455 0.3 2 2 11 442
63 1232 0.3 2 0 25 1207
64 786 0.3 2 1 15 770
65 558 0.3 2 1 24 533
66 654 0.3 2 2 5 647
67 400 0.3 2 0 30 370
68 2441 0.3 2 2 83 2356
69 579 0.3 2 0 33 546
70 920 0.3 2 0 12 908
71 207 0.3 2 0 4 203
72 320 0.3 2 0 6 314
73 898 0.3 2 0 27 871
74 871 0.3 2 0 9 862
75 464 0.3 2 0 23 441
76 408 0.3 2 3 3 402

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_800_S10_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_800_S10_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_800_S10_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_800_S10_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_800_S10_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Kz_800_S10_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_800_S10_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Kz_800_S10_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 729.54 s (177 us/read; 0.34 M reads/minute).

=== Summary ===

Total read pairs processed: 4,126,238
  Read 1 with adapter: 1,559,609 (37.8%)
  Read 2 with adapter: 1,552,098 (37.6%)
Pairs that were too short: 4,459 (0.1%)
Pairs written (passing filters): 4,121,779 (99.9%)

Total basepairs processed: 627,188,176 bp
  Read 1: 313,594,088 bp
  Read 2: 313,594,088 bp
Total written (filtered): 556,426,950 bp (88.7%)
  Read 1: 278,151,062 bp
  Read 2: 278,275,888 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1559609 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.2% G: 24.5% T: 24.0% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 58136 64472.5 0 58136 4 61465 16118.1 0 35748 25717 5 65896 4029.5 1 26581 39315 6 43660 1007.4 1 26419 17241 7 30311 251.8 1 25949 4362 8 31559 63.0 1 27389 3687 483 9 31109 15.7 1 28230 696 2183 10 29668 3.9 2 27650 480 1538 11 29834 1.0 2 28119 473 1242 12 29492 0.2 2 28676 367 449 13 30327 0.2 2 29496 363 468 14 32360 0.2 2 31615 352 393 15 33792 0.2 2 33047 367 378 16 31639 0.2 2 30818 362 459 17 32405 0.2 2 31672 310 423 18 32295 0.2 2 31577 332 386 19 35169 0.2 2 34233 354 582 20 36644 0.2 2 35854 356 434 21 36003 0.2 2 35186 387 430 22 35018 0.2 2 34246 319 453 23 34231 0.2 2 33446 274 511 24 40282 0.2 2 39502 358 422 25 48208 0.2 2 47413 434 361 26 46280 0.2 2 45463 413 404 27 38147 0.2 2 37495 308 344 28 36962 0.2 2 36251 298 413 29 40743 0.2 2 40040 298 405 30 44039 0.2 2 43327 361 351 31 43822 0.2 2 43084 375 363 32 43179 0.2 2 42400 401 378 33 40494 0.2 2 39565 362 567 34 41594 0.2 2 40733 336 525 35 61256 0.2 2 60461 460 335 36 66066 0.2 2 65106 537 423 37 43005 0.2 2 42312 344 349 38 32147 0.2 2 31572 235 340 39 25485 0.2 2 24808 172 505 40 20202 0.2 2 19645 163 394 41 10938 0.2 2 10446 74 418 42 5885 0.2 2 5092 81 712 43 3456 0.2 2 2940 40 476 44 2382 0.2 2 1972 24 386 45 3380 0.2 2 2979 39 362 46 10296 0.2 2 9787 91 418 47 6850 0.2 2 6376 66 408 48 3273 0.2 2 2930 53 290 49 2923 0.2 2 2378 35 510 50 2001 0.2 2 1659 32 310 51 912 0.2 2 139 20 753 52 435 0.2 2 54 11 370 53 931 0.2 2 39 35 857 54 533 0.2 2 53 17 463 55 450 0.2 2 64 13 373 56 904 0.2 2 122 14 768 57 584 0.2 2 157 13 414 58 582 0.2 2 49 21 512 59 583 0.2 2 16 27 540 60 502 0.2 2 11 7 484 61 577 0.2 2 13 9 555 62 349 0.2 2 10 6 333 63 847 0.2 2 7 24 816 64 598 0.2 2 2 13 583 65 453 0.2 2 2 24 427 66 508 0.2 2 3 4 501 67 351 0.2 2 1 27 323 68 1758 0.2 2 0 65 1693 69 478 0.2 2 0 
30 448 70 567 0.2 2 0 5 562 71 198 0.2 2 0 1 197 72 227 0.2 2 0 3 224 73 739 0.2 2 0 27 712 74 650 0.2 2 0 11 639 75 287 0.2 2 0 15 272 76 298 0.2 2 5 5 288 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1552098 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.7% C: 36.0% G: 24.8% T: 24.4% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 58218 64472.5 0 58218 4 60459 16118.1 0 35255 25204 5 64407 4029.5 1 25696 38711 6 43215 1007.4 1 25847 17368 7 30010 251.8 1 25168 4842 8 31323 63.0 1 27082 3740 501 9 31069 15.7 1 28016 741 2312 10 29537 3.9 2 27197 735 1605 11 29680 1.0 2 27804 600 1276 12 29341 0.2 2 28371 446 524 13 30172 0.2 2 29188 465 519 14 32242 0.2 2 31287 475 480 15 33605 0.2 2 32726 456 423 16 31510 0.2 2 30553 440 517 17 32234 0.2 2 31347 424 463 18 32176 0.2 2 31281 436 459 19 35033 0.2 2 33938 467 628 20 36514 0.2 2 35556 456 502 21 35871 0.2 2 34927 477 467 22 34884 0.2 2 33994 418 472 23 34144 0.2 2 33138 415 591 24 40154 0.2 2 39182 508 464 25 48069 0.2 2 47074 559 436 26 46151 0.2 2 45142 547 462 27 38034 0.2 2 37220 419 395 28 36850 0.2 2 35977 426 447 29 40644 0.2 2 39758 444 442 30 43908 0.2 2 43065 435 408 31 43735 0.2 2 42834 468 433 32 43015 0.2 2 42142 476 397 33 40410 0.2 2 39322 461 627 34 41466 0.2 2 40468 465 533 35 61089 0.2 2 60049 604 436 36 65901 0.2 2 64679 700 522 37 42905 0.2 2 42018 479 408 38 32068 0.2 2 31403 302 363 39 25457 0.2 2 24628 258 571 40 20160 0.2 2 19528 205 427 41 10933 0.2 2 10365 121 447 42 5947 0.2 2 5054 80 813 43 3455 0.2 2 2922 47 486 44 2399 0.2 2 1960 33 406 45 3325 0.2 2 2968 31 326 46 10272 0.2 2 9742 93 437 47 6777 0.2 2 6332 74 371 48 3270 0.2 2 2925 68 277 49 2924 0.2 2 2362 39 523 50 1991 0.2 2 1646 48 297 51 849 0.2 2 128 20 701 52 451 0.2 2 55 14 382 53 911 0.2 2 41 42 828 54 527 0.2 2 61 11 455 55 397 0.2 2 56 10 331 56 919 0.2 2 112 16 791 57 591 0.2 2 
156 15 420
58 479 0.2 2 47 20 412
59 585 0.2 2 15 19 551
60 456 0.2 2 9 5 442
61 559 0.2 2 11 15 533
62 370 0.2 2 8 6 356
63 840 0.2 2 5 19 816
64 581 0.2 2 1 18 562
65 400 0.2 2 2 24 374
66 508 0.2 2 3 5 500
67 330 0.2 2 1 24 305
68 1844 0.2 2 0 74 1770
69 443 0.2 2 0 25 418
70 643 0.2 2 0 14 629
71 198 0.2 2 0 2 196
72 240 0.2 2 0 6 234
73 678 0.2 2 0 23 655
74 671 0.2 2 0 14 657
75 330 0.2 2 0 20 310
76 345 0.2 2 5 10 330

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_1_S19_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_1_S19_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_1_S19_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_1_S19_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_1_S19_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_1_S19_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_1_S19_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_1_S19_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 472.52 s (178 us/read; 0.34 M reads/minute).

=== Summary ===

Total read pairs processed: 2,650,649
  Read 1 with adapter: 879,369 (33.2%)
  Read 2 with adapter: 874,929 (33.0%)
Pairs that were too short: 2,998 (0.1%)
Pairs written (passing filters): 2,647,651 (99.9%)

Total basepairs processed: 402,898,648 bp
  Read 1: 201,449,324 bp
  Read 2: 201,449,324 bp
Total written (filtered): 367,152,117 bp (91.1%)
  Read 1: 183,545,505 bp
  Read 2: 183,606,612 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 879369 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.6% G: 24.9% T: 23.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 42959 41416.4 0 42959 4 44814 10354.1 0 26042 18772 5 46933 2588.5 1 19383 27550 6 31837 647.1 1 19649 12188 7 22601 161.8 1 19295 3306 8 23132 40.4 1 19741 3033 358 9 22196 10.1 1 20171 533 1492 10 21063 2.5 2 19570 358 1135 11 20887 0.6 2 19689 356 842 12 19959 0.2 2 19377 241 341 13 19898 0.2 2 19348 223 327 14 19796 0.2 2 19310 216 270 15 20081 0.2 2 19618 216 247 16 20272 0.2 2 19719 231 322 17 21167 0.2 2 20663 182 322 18 21058 0.2 2 20541 229 288 19 21921 0.2 2 21276 227 418 20 22184 0.2 2 21639 245 300 21 21249 0.2 2 20736 250 263 22 20798 0.2 2 20277 196 325 23 19469 0.2 2 18920 200 349 24 20045 0.2 2 19585 169 291 25 21488 0.2 2 21073 201 214 26 22266 0.2 2 21768 222 276 27 20868 0.2 2 20446 186 236 28 20296 0.2 2 19858 171 267 29 21448 0.2 2 20974 196 278 30 21853 0.2 2 21410 210 233 31 21723 0.2 2 21258 215 250 32 20954 0.2 2 20477 243 234 33 19181 0.2 2 18606 152 423 34 17457 0.2 2 16944 157 356 35 19370 0.2 2 19020 164 186 36 20871 0.2 2 20434 182 255 37 17795 0.2 2 17382 168 245 38 15831 0.2 2 15448 155 228 39 12959 0.2 2 12490 109 360 40 10005 0.2 2 9612 102 291 41 5651 0.2 2 5302 53 296 42 3083 0.2 2 2487 50 546 43 1766 0.2 2 1398 26 342 44 1160 0.2 2 822 26 312 45 1053 0.2 2 780 21 252 46 1838 0.2 2 1504 27 307 47 1747 0.2 2 1436 29 282 48 1413 0.2 2 1155 48 210 49 1469 0.2 2 1038 31 400 50 1110 0.2 2 866 22 222 51 625 0.2 2 75 17 533 52 304 0.2 2 35 11 258 53 729 0.2 2 27 49 653 54 394 0.2 2 30 14 350 55 299 0.2 2 43 7 249 56 558 0.2 2 50 15 493 57 366 0.2 2 34 11 321 58 380 0.2 2 25 19 336 59 447 0.2 2 16 19 412 60 307 0.2 2 7 5 295 61 384 0.2 2 22 15 347 62 266 0.2 2 12 14 240 63 579 0.2 2 1 13 565 64 405 0.2 2 1 9 395 65 290 0.2 2 2 13 275 66 337 0.2 2 0 7 330 67 232 0.2 2 2 23 207 68 1154 0.2 2 2 64 1088 69 343 0.2 2 1 24 318 70 387 
0.2 2 0 10 377 71 111 0.2 2 0 2 109 72 136 0.2 2 0 2 134 73 492 0.2 2 0 26 466 74 473 0.2 2 0 5 468 75 198 0.2 2 0 10 188 76 199 0.2 2 0 8 191 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 874929 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.7% C: 36.6% G: 25.2% T: 23.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 42484 41416.4 0 42484 4 44168 10354.1 0 25517 18651 5 46650 2588.5 1 18741 27909 6 31335 647.1 1 19214 12121 7 22316 161.8 1 18776 3540 8 22761 40.4 1 19558 2825 378 9 22166 10.1 1 20037 564 1565 10 20903 2.5 2 19265 484 1154 11 20836 0.6 2 19540 422 874 12 19871 0.2 2 19210 303 358 13 19850 0.2 2 19194 297 359 14 19703 0.2 2 19107 289 307 15 20025 0.2 2 19471 270 284 16 20187 0.2 2 19535 303 349 17 21090 0.2 2 20484 262 344 18 20967 0.2 2 20419 257 291 19 21846 0.2 2 21147 260 439 20 22072 0.2 2 21490 266 316 21 21241 0.2 2 20581 316 344 22 20694 0.2 2 20128 245 321 23 19413 0.2 2 18832 219 362 24 19999 0.2 2 19425 262 312 25 21450 0.2 2 20960 251 239 26 22204 0.2 2 21639 258 307 27 20824 0.2 2 20318 239 267 28 20285 0.2 2 19769 203 313 29 21412 0.2 2 20870 239 303 30 21832 0.2 2 21294 266 272 31 21653 0.2 2 21140 264 249 32 20885 0.2 2 20345 291 249 33 19139 0.2 2 18499 207 433 34 17399 0.2 2 16846 191 362 35 19367 0.2 2 18937 193 237 36 20891 0.2 2 20367 217 307 37 17800 0.2 2 17299 220 281 38 15802 0.2 2 15400 160 242 39 12958 0.2 2 12411 151 396 40 10003 0.2 2 9583 94 326 41 5701 0.2 2 5277 69 355 42 3123 0.2 2 2475 62 586 43 1757 0.2 2 1382 45 330 44 1163 0.2 2 818 24 321 45 1028 0.2 2 774 28 226 46 1820 0.2 2 1498 23 299 47 1762 0.2 2 1430 34 298 48 1381 0.2 2 1148 58 175 49 1417 0.2 2 1032 29 356 50 1092 0.2 2 853 27 212 51 611 0.2 2 71 21 519 52 302 0.2 2 40 12 250 53 685 0.2 2 33 48 604 54 390 0.2 2 30 13 347 55 344 0.2 2 39 8 297 56 585 0.2 2 46 19 520 57 304 0.2 2 33 11 260 58 366 0.2 2 26 12 
328
59 468 0.2 2 10 26 432
60 281 0.2 2 4 2 275
61 403 0.2 2 16 17 370
62 222 0.2 2 9 9 204
63 550 0.2 2 1 15 534
64 399 0.2 2 1 12 386
65 315 0.2 2 2 16 297
66 340 0.2 2 0 5 335
67 255 0.2 2 3 20 232
68 1079 0.2 2 2 61 1016
69 316 0.2 2 1 20 295
70 370 0.2 2 1 6 363
71 119 0.2 2 0 0 119
72 146 0.2 2 0 2 144
73 500 0.2 2 0 21 479
74 450 0.2 2 0 5 445
75 192 0.2 2 0 11 181
76 212 0.2 2 0 3 209

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_2_S20_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_2_S20_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_2_S20_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_2_S20_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_2_S20_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_2_S20_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_2_S20_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_2_S20_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 307.84 s (184 us/read; 0.33 M reads/minute).

=== Summary ===

Total read pairs processed: 1,674,209
  Read 1 with adapter: 541,900 (32.4%)
  Read 2 with adapter: 539,230 (32.2%)
Pairs that were too short: 1,731 (0.1%)
Pairs written (passing filters): 1,672,478 (99.9%)

Total basepairs processed: 254,479,768 bp
  Read 1: 127,239,884 bp
  Read 2: 127,239,884 bp
Total written (filtered): 233,014,939 bp (91.6%)
  Read 1: 116,489,984 bp
  Read 2: 116,524,955 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 541900 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.4% C: 37.8% G: 25.0% T: 22.8% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 27378 26159.5 0 27378 4 29438 6539.9 0 17341 12097 5 30030 1635.0 1 12346 17684 6 20687 408.7 1 12868 7819 7 14616 102.2 1 12557 2059 8 15216 25.5 1 12739 2279 198 9 14069 6.4 1 12779 309 981 10 13620 1.6 2 12658 252 710 11 13564 0.4 2 12803 244 517 12 12828 0.1 2 12467 168 193 13 13081 0.1 2 12729 176 176 14 12466 0.1 2 12125 168 173 15 12873 0.1 2 12554 161 158 16 12727 0.1 2 12384 136 207 17 13213 0.1 2 12872 151 190 18 12936 0.1 2 12619 155 162 19 13668 0.1 2 13242 143 283 20 13488 0.1 2 13188 137 163 21 13121 0.1 2 12818 141 162 22 12604 0.1 2 12272 143 189 23 11700 0.1 2 11422 108 170 24 12286 0.1 2 12020 124 142 25 12861 0.1 2 12619 132 110 26 12909 0.1 2 12635 138 136 27 12201 0.1 2 11952 119 130 28 12203 0.1 2 11948 103 152 29 12804 0.1 2 12502 126 176 30 12899 0.1 2 12648 124 127 31 12940 0.1 2 12652 116 172 32 12334 0.1 2 12058 130 146 33 11299 0.1 2 10919 113 267 34 10305 0.1 2 10027 78 200 35 11384 0.1 2 11174 88 122 36 12006 0.1 2 11762 103 141 37 10203 0.1 2 9971 98 134 38 9048 0.1 2 8845 75 128 39 7350 0.1 2 7087 51 212 40 5761 0.1 2 5546 46 169 41 3192 0.1 2 2965 38 189 42 1874 0.1 2 1487 28 359 43 1040 0.1 2 825 27 188 44 676 0.1 2 479 14 183 45 565 0.1 2 416 14 135 46 1057 0.1 2 877 19 161 47 915 0.1 2 734 12 169 48 748 0.1 2 602 18 128 49 835 0.1 2 587 9 239 50 586 0.1 2 434 20 132 51 383 0.1 2 49 12 322 52 214 0.1 2 31 14 169 53 479 0.1 2 24 27 428 54 216 0.1 2 24 5 187 55 202 0.1 2 39 9 154 56 378 0.1 2 57 14 307 57 185 0.1 2 25 9 151 58 221 0.1 2 16 7 198 59 288 0.1 2 14 11 263 60 201 0.1 2 10 1 190 61 227 0.1 2 16 9 202 62 150 0.1 2 17 4 129 63 335 0.1 2 6 3 326 64 253 0.1 2 5 3 245 65 177 0.1 2 5 8 164 66 171 0.1 2 1 3 167 67 146 0.1 2 1 14 131 68 712 0.1 2 0 27 685 69 192 0.1 2 0 10 182 70 218 0.1 2 0 5 213 71 77 0.1 2 0 1 76 72 
73 0.1 2 0 1 72 73 275 0.1 2 0 8 267 74 307 0.1 2 0 7 300 75 107 0.1 2 0 2 105 76 109 0.1 2 0 6 103 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 539230 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.8% C: 36.8% G: 25.3% T: 23.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 27238 26159.5 0 27238 4 29025 6539.9 0 17077 11948 5 29844 1635.0 1 11769 18075 6 20305 408.7 1 12510 7795 7 14474 102.2 1 12064 2410 8 14975 25.5 1 12526 2211 238 9 14057 6.4 1 12616 442 999 10 13546 1.6 2 12360 453 733 11 13541 0.4 2 12586 340 615 12 12787 0.1 2 12283 269 235 13 12994 0.1 2 12529 257 208 14 12407 0.1 2 11954 248 205 15 12809 0.1 2 12378 250 181 16 12675 0.1 2 12203 240 232 17 13144 0.1 2 12673 240 231 18 12875 0.1 2 12432 262 181 19 13611 0.1 2 13075 220 316 20 13467 0.1 2 13040 209 218 21 13088 0.1 2 12619 255 214 22 12566 0.1 2 12146 186 234 23 11697 0.1 2 11273 202 222 24 12279 0.1 2 11881 187 211 25 12848 0.1 2 12489 206 153 26 12905 0.1 2 12512 194 199 27 12165 0.1 2 11811 183 171 28 12177 0.1 2 11813 185 179 29 12751 0.1 2 12366 181 204 30 12879 0.1 2 12506 183 190 31 12866 0.1 2 12500 199 167 32 12277 0.1 2 11937 187 153 33 11268 0.1 2 10819 148 301 34 10307 0.1 2 9934 134 239 35 11347 0.1 2 11011 182 154 36 11993 0.1 2 11657 162 174 37 10206 0.1 2 9885 135 186 38 9023 0.1 2 8763 120 140 39 7337 0.1 2 7001 94 242 40 5727 0.1 2 5463 94 170 41 3192 0.1 2 2936 53 203 42 1915 0.1 2 1459 47 409 43 1034 0.1 2 817 25 192 44 671 0.1 2 475 16 180 45 575 0.1 2 411 18 146 46 1082 0.1 2 875 24 183 47 933 0.1 2 724 21 188 48 758 0.1 2 589 37 132 49 810 0.1 2 581 22 207 50 565 0.1 2 427 23 115 51 372 0.1 2 48 9 315 52 190 0.1 2 28 15 147 53 433 0.1 2 24 29 380 54 227 0.1 2 25 7 195 55 203 0.1 2 38 9 156 56 363 0.1 2 55 10 298 57 202 0.1 2 24 6 172 58 243 0.1 2 14 12 217 59 286 0.1 2 12 14 260 60 188 0.1 2 9 6 173 61 245 0.1 2 16 9 220 
62 157 0.1 2 14 7 136
63 310 0.1 2 2 4 304
64 251 0.1 2 2 10 239
65 200 0.1 2 5 12 183
66 151 0.1 2 1 3 147
67 154 0.1 2 2 16 136
68 709 0.1 2 0 51 658
69 170 0.1 2 0 14 156
70 221 0.1 2 0 1 220
71 79 0.1 2 0 0 79
72 83 0.1 2 0 2 81
73 269 0.1 2 0 9 260
74 268 0.1 2 0 5 263
75 123 0.1 2 0 6 117
76 118 0.1 2 0 1 117

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_300_S2_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_300_S2_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_300_S2_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_300_S2_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_300_S2_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_300_S2_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_300_S2_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_300_S2_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 922.66 s (176 us/read; 0.34 M reads/minute).

=== Summary ===

Total read pairs processed: 5,232,364
  Read 1 with adapter: 1,953,901 (37.3%)
  Read 2 with adapter: 1,942,182 (37.1%)
Pairs that were too short: 5,583 (0.1%)
Pairs written (passing filters): 5,226,781 (99.9%)

Total basepairs processed: 795,319,328 bp
  Read 1: 397,659,664 bp
  Read 2: 397,659,664 bp
Total written (filtered): 709,291,488 bp (89.2%)
  Read 1: 354,572,896 bp
  Read 2: 354,718,592 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1953901 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.5% C: 36.9% G: 24.4% T: 24.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 76155 81755.7 0 76155 4 79645 20438.9 0 46758 32887 5 84777 5109.7 1 34824 49953 6 57153 1277.4 1 35126 22027 7 40110 319.4 1 34469 5641 8 42414 79.8 1 36573 5227 614 9 41488 20.0 1 37515 941 3032 10 40282 5.0 2 37644 708 1930 11 40657 1.2 2 38264 830 1563 12 39894 0.3 2 38808 519 567 13 40756 0.3 2 39550 572 634 14 41852 0.3 2 40762 557 533 15 42053 0.3 2 41012 532 509 16 40712 0.3 2 39632 518 562 17 42263 0.3 2 41233 482 548 18 43381 0.3 2 42300 574 507 19 46693 0.3 2 45415 540 738 20 48550 0.3 2 47456 541 553 21 48199 0.3 2 47007 608 584 22 47577 0.3 2 46459 515 603 23 45533 0.3 2 44437 487 609 24 50418 0.3 2 49352 542 524 25 55611 0.3 2 54591 605 415 26 55066 0.3 2 53991 577 498 27 48896 0.3 2 48001 489 406 28 48402 0.3 2 47401 493 508 29 53063 0.3 2 51990 534 539 30 57554 0.3 2 56489 590 475 31 56839 0.3 2 55781 594 464 32 56517 0.3 2 55491 583 443 33 53298 0.3 2 52060 502 736 34 50821 0.3 2 49708 484 629 35 62975 0.3 2 61952 610 413 36 65212 0.3 2 64087 592 533 37 47089 0.3 2 46055 515 519 38 38183 0.3 2 37329 401 453 39 29711 0.3 2 28742 319 650 40 22524 0.3 2 21764 246 514 41 12022 0.3 2 11390 132 500 42 6641 0.3 2 5536 98 1007 43 4090 0.3 2 3498 61 531 44 2783 0.3 2 2272 37 474 45 3232 0.3 2 2740 50 442 46 8249 0.3 2 7594 81 574 47 5988 0.3 2 5431 63 494 48 3662 0.3 2 3237 83 342 49 3331 0.3 2 2633 41 657 50 2177 0.3 2 1811 32 334 51 1127 0.3 2 137 13 977 52 572 0.3 2 82 18 472 53 1309 0.3 2 59 43 1207 54 709 0.3 2 43 8 658 55 521 0.3 2 58 19 444 56 1156 0.3 2 128 13 1015 57 671 0.3 2 129 17 525 58 768 0.3 2 58 27 683 59 743 0.3 2 31 16 696 60 604 0.3 2 13 11 580 61 739 0.3 2 19 18 702 62 396 0.3 2 13 8 375 63 1096 0.3 2 2 17 1077 64 726 0.3 2 2 15 709 65 557 0.3 2 0 18 539 66 581 0.3 2 2 7 572 67 430 0.3 2 4 36 390 68 2308 0.3 2 0 80 2228 69 563 
0.3 2 0 39 524 70 837 0.3 2 0 9 828 71 198 0.3 2 0 3 195 72 313 0.3 2 0 2 311 73 888 0.3 2 1 21 866 74 815 0.3 2 0 6 809 75 391 0.3 2 0 17 374 76 385 0.3 2 2 8 375 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1942182 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.9% C: 35.9% G: 24.6% T: 24.6% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 74191 81755.7 0 74191 4 78264 20438.9 0 46343 31921 5 82701 5109.7 1 32763 49938 6 56465 1277.4 1 34308 22157 7 39711 319.4 1 32487 7224 8 42122 79.8 1 36087 5269 766 9 41251 20.0 1 37119 1080 3052 10 40029 5.0 2 37174 812 2043 11 40560 1.2 2 37672 1082 1806 12 39690 0.3 2 38422 677 591 13 40504 0.3 2 39138 738 628 14 41652 0.3 2 40345 715 592 15 41824 0.3 2 40663 658 503 16 40594 0.3 2 39264 648 682 17 42069 0.3 2 40868 618 583 18 43228 0.3 2 41905 705 618 19 46534 0.3 2 45021 702 811 20 48405 0.3 2 47016 738 651 21 48046 0.3 2 46659 758 629 22 47373 0.3 2 46036 716 621 23 45421 0.3 2 44056 656 709 24 50227 0.3 2 48922 714 591 25 55450 0.3 2 54149 799 502 26 54933 0.3 2 53535 811 587 27 48796 0.3 2 47649 679 468 28 48335 0.3 2 47068 649 618 29 52914 0.3 2 51541 772 601 30 57439 0.3 2 56040 817 582 31 56694 0.3 2 55394 744 556 32 56406 0.3 2 55023 825 558 33 53204 0.3 2 51645 701 858 34 50738 0.3 2 49350 695 693 35 62856 0.3 2 61465 870 521 36 65031 0.3 2 63621 837 573 37 46930 0.3 2 45786 621 523 38 38060 0.3 2 37014 567 479 39 29676 0.3 2 28518 422 736 40 22491 0.3 2 21654 302 535 41 12020 0.3 2 11286 185 549 42 6707 0.3 2 5484 130 1093 43 4087 0.3 2 3478 60 549 44 2818 0.3 2 2243 49 526 45 3149 0.3 2 2717 67 365 46 8211 0.3 2 7519 126 566 47 5962 0.3 2 5381 86 495 48 3639 0.3 2 3215 82 342 49 3314 0.3 2 2609 43 662 50 2185 0.3 2 1801 33 351 51 1147 0.3 2 134 22 991 52 565 0.3 2 82 13 470 53 1205 0.3 2 50 44 1111 54 697 0.3 2 36 16 645 55 537 0.3 2 54 12 471 56 1137 0.3 2 124 18 995 
57 697 0.3 2 119 23 555
58 721 0.3 2 54 18 649
59 695 0.3 2 32 20 643
60 591 0.3 2 11 10 570
61 788 0.3 2 18 23 747
62 412 0.3 2 11 12 389
63 1089 0.3 2 0 23 1066
64 734 0.3 2 1 17 716
65 557 0.3 2 0 26 531
66 638 0.3 2 2 5 631
67 381 0.3 2 4 30 347
68 2256 0.3 2 0 84 2172
69 558 0.3 2 0 33 525
70 832 0.3 2 0 10 822
71 245 0.3 2 0 3 242
72 259 0.3 2 0 4 255
73 879 0.3 2 1 22 856
74 881 0.3 2 0 14 867
75 412 0.3 2 0 16 396
76 363 0.3 2 2 5 356

cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_3_S21_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_3_S21_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_3_S21_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_3_S21_R2.fastq.gz
This is cutadapt 1.10 with Python 3.5.2
Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_3_S21_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_3_S21_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_3_S21_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_3_S21_R2.fastq.gz
Trimming 2 adapters with at most 20.0% errors in paired-end mode ...
Finished in 281.83 s (181 us/read; 0.33 M reads/minute).

=== Summary ===

Total read pairs processed: 1,560,422
  Read 1 with adapter: 538,739 (34.5%)
  Read 2 with adapter: 535,960 (34.3%)
Pairs that were too short: 1,655 (0.1%)
Pairs written (passing filters): 1,558,767 (99.9%)

Total basepairs processed: 237,184,144 bp
  Read 1: 118,592,072 bp
  Read 2: 118,592,072 bp
Total written (filtered): 215,627,164 bp (90.9%)
  Read 1: 107,797,251 bp
  Read 2: 107,829,913 bp

=== First read: Adapter 1 ===

Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 538739 times.

No.
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.2% C: 38.0% G: 25.1% T: 22.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 26575 24381.6 0 26575 4 27726 6095.4 0 16454 11272 5 28819 1523.8 1 12708 16111 6 19632 381.0 1 12617 7015 7 14362 95.2 1 12461 1901 8 14954 23.8 1 12869 1910 175 9 14112 6.0 1 12935 349 828 10 13470 1.5 2 12609 256 605 11 13150 0.4 2 12475 210 465 12 12825 0.1 2 12478 166 181 13 12655 0.1 2 12303 171 181 14 12461 0.1 2 12152 158 151 15 12784 0.1 2 12505 145 134 16 12916 0.1 2 12599 130 187 17 13179 0.1 2 12862 124 193 18 13069 0.1 2 12776 144 149 19 13551 0.1 2 13165 141 245 20 13673 0.1 2 13362 141 170 21 13261 0.1 2 12917 174 170 22 12467 0.1 2 12193 123 151 23 11979 0.1 2 11673 139 167 24 12318 0.1 2 12031 114 173 25 13015 0.1 2 12767 130 118 26 13426 0.1 2 13163 126 137 27 12362 0.1 2 12139 106 117 28 12296 0.1 2 12053 104 139 29 12433 0.1 2 12146 126 161 30 12928 0.1 2 12682 112 134 31 12962 0.1 2 12707 121 134 32 12408 0.1 2 12122 145 141 33 11444 0.1 2 11110 108 226 34 10642 0.1 2 10363 85 194 35 11808 0.1 2 11605 85 118 36 12529 0.1 2 12313 99 117 37 10296 0.1 2 10072 100 124 38 9085 0.1 2 8872 97 116 39 7430 0.1 2 7171 73 186 40 6046 0.1 2 5847 51 148 41 3406 0.1 2 3187 22 197 42 1916 0.1 2 1525 35 356 43 1126 0.1 2 915 17 194 44 722 0.1 2 546 13 163 45 566 0.1 2 439 10 117 46 1077 0.1 2 900 17 160 47 1010 0.1 2 821 8 181 48 779 0.1 2 634 35 110 49 816 0.1 2 604 9 203 50 571 0.1 2 452 16 103 51 304 0.1 2 37 10 257 52 160 0.1 2 31 6 123 53 429 0.1 2 21 31 377 54 227 0.1 2 17 9 201 55 174 0.1 2 29 10 135 56 343 0.1 2 54 15 274 57 199 0.1 2 27 10 162 58 224 0.1 2 19 7 198 59 263 0.1 2 9 14 240 60 172 0.1 2 6 3 163 61 206 0.1 2 14 4 188 62 125 0.1 2 14 8 103 63 312 0.1 2 7 7 298 64 229 0.1 2 0 7 222 65 150 0.1 2 1 11 138 66 170 0.1 2 2 4 164 67 122 0.1 2 1 15 106 68 644 0.1 2 0 20 624 69 173 0.1 2 0 16 157 70 184 0.1 2 0 4 180 71 71 0.1 2 0 0 71 72 
75 0.1 2 0 1 74 73 272 0.1 2 0 10 262 74 259 0.1 2 0 3 256 75 118 0.1 2 0 6 112 76 97 0.1 2 1 2 94 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 535960 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.6% C: 37.1% G: 25.4% T: 22.9% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 26350 24381.6 0 26350 4 27315 6095.4 0 16205 11110 5 28456 1523.8 1 12021 16435 6 19421 381.0 1 12245 7176 7 14269 95.2 1 11896 2373 8 14744 23.8 1 12651 1881 212 9 14067 6.0 1 12793 366 908 10 13385 1.5 2 12337 405 643 11 13115 0.4 2 12266 334 515 12 12787 0.1 2 12318 246 223 13 12564 0.1 2 12109 265 190 14 12419 0.1 2 12003 225 191 15 12676 0.1 2 12301 232 143 16 12864 0.1 2 12420 236 208 17 13098 0.1 2 12667 231 200 18 13006 0.1 2 12591 238 177 19 13476 0.1 2 12997 230 249 20 13627 0.1 2 13203 238 186 21 13210 0.1 2 12801 224 185 22 12421 0.1 2 12029 195 197 23 11952 0.1 2 11539 190 223 24 12264 0.1 2 11912 182 170 25 12973 0.1 2 12571 253 149 26 13392 0.1 2 13000 201 191 27 12340 0.1 2 12021 176 143 28 12290 0.1 2 11906 185 199 29 12393 0.1 2 11998 191 204 30 12895 0.1 2 12527 210 158 31 12918 0.1 2 12580 184 154 32 12373 0.1 2 11984 216 173 33 11440 0.1 2 11002 149 289 34 10625 0.1 2 10252 157 216 35 11792 0.1 2 11470 167 155 36 12505 0.1 2 12156 192 157 37 10295 0.1 2 9943 178 174 38 9030 0.1 2 8788 135 107 39 7431 0.1 2 7086 114 231 40 6066 0.1 2 5777 96 193 41 3387 0.1 2 3137 66 184 42 1957 0.1 2 1516 48 393 43 1101 0.1 2 901 17 183 44 728 0.1 2 541 17 170 45 571 0.1 2 429 19 123 46 1079 0.1 2 900 16 163 47 989 0.1 2 803 20 166 48 792 0.1 2 635 25 132 49 822 0.1 2 597 12 213 50 574 0.1 2 449 12 113 51 324 0.1 2 40 9 275 52 188 0.1 2 27 7 154 53 376 0.1 2 26 21 329 54 181 0.1 2 13 5 163 55 178 0.1 2 24 8 146 56 332 0.1 2 51 19 262 57 189 0.1 2 26 4 159 58 245 0.1 2 20 9 216 59 276 0.1 2 7 9 260 60 154 0.1 2 5 3 146 61 216 0.1 2 9 10 197 62 
152 0.1 2 13 5 134 63 298 0.1 2 5 10 283 64 243 0.1 2 0 8 235 65 155 0.1 2 1 6 148 66 193 0.1 2 2 5 186 67 136 0.1 2 1 15 120 68 618 0.1 2 0 35 583 69 169 0.1 2 0 11 158 70 186 0.1 2 0 4 182 71 72 0.1 2 0 1 71 72 65 0.1 2 0 2 63 73 272 0.1 2 0 9 263 74 259 0.1 2 0 6 253 75 107 0.1 2 0 5 102 76 132 0.1 2 1 3 128 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_800_S8_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_800_S8_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_800_S8_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_800_S8_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_800_S8_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/Mz_800_S8_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_800_S8_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/Mz_800_S8_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 765.53 s (177 us/read; 0.34 M reads/minute). === Summary === Total read pairs processed: 4,330,281 Read 1 with adapter: 1,571,080 (36.3%) Read 2 with adapter: 1,562,084 (36.1%) Pairs that were too short: 4,658 (0.1%) Pairs written (passing filters): 4,325,623 (99.9%) Total basepairs processed: 658,202,712 bp Read 1: 329,101,356 bp Read 2: 329,101,356 bp Total written (filtered): 588,373,788 bp (89.4%) Read 1: 294,113,880 bp Read 2: 294,259,908 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1571080 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.4% C: 36.9% G: 24.5% T: 24.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 62390 67660.6 0 62390 4 65327 16915.2 0 37538 27789 5 69188 4228.8 1 27498 41690 6 45991 1057.2 1 27473 18518 7 31802 264.3 1 27102 4700 8 32926 66.1 1 28464 3932 530 9 32912 16.5 1 29564 797 2551 10 31401 4.1 2 29287 530 1584 11 32015 1.0 2 30145 496 1374 12 31291 0.3 2 30420 367 504 13 32062 0.3 2 31203 346 513 14 33147 0.3 2 32350 334 463 15 33981 0.3 2 33252 345 384 16 32575 0.3 2 31739 340 496 17 33500 0.3 2 32714 345 441 18 33502 0.3 2 32703 346 453 19 36133 0.3 2 35139 355 639 20 37250 0.3 2 36419 380 451 21 37083 0.3 2 36194 412 477 22 35957 0.3 2 35130 344 483 23 35453 0.3 2 34557 317 579 24 39981 0.3 2 39160 367 454 25 45403 0.3 2 44688 367 348 26 44360 0.3 2 43529 411 420 27 38444 0.3 2 37747 335 362 28 37404 0.3 2 36623 309 472 29 41183 0.3 2 40363 333 487 30 44258 0.3 2 43496 387 375 31 43574 0.3 2 42840 367 367 32 43354 0.3 2 42571 376 407 33 41540 0.3 2 40571 336 633 34 40568 0.3 2 39750 314 504 35 54448 0.3 2 53664 409 375 36 57935 0.3 2 57049 446 440 37 41283 0.3 2 40503 358 422 38 32829 0.3 2 32196 260 373 39 25865 0.3 2 25072 204 589 40 19869 0.3 2 19227 189 453 41 10907 0.3 2 10348 90 469 42 6006 0.3 2 5101 69 836 43 3551 0.3 2 3006 52 493 44 2319 0.3 2 1844 26 449 45 2653 0.3 2 2267 43 343 46 7356 0.3 2 6822 76 458 47 5526 0.3 2 4997 73 456 48 3290 0.3 2 2883 65 342 49 2814 0.3 2 2258 47 509 50 1941 0.3 2 1606 30 305 51 962 0.3 2 119 12 831 52 472 0.3 2 68 8 396 53 1040 0.3 2 44 32 964 54 572 0.3 2 50 19 503 55 488 0.3 2 46 14 428 56 948 0.3 2 116 16 816 57 676 0.3 2 155 17 504 58 622 0.3 2 52 21 549 59 632 0.3 2 24 18 590 60 518 0.3 2 16 6 496 61 676 0.3 2 20 9 647 62 389 0.3 2 9 8 372 63 934 0.3 2 3 12 919 64 628 0.3 2 2 12 614 65 502 0.3 2 1 22 479 66 535 0.3 2 2 8 525 67 375 0.3 2 3 43 329 68 1928 0.3 2 1 71 1856 69 484 0.3 2 0 29 
455 70 615 0.3 2 0 10 605 71 205 0.3 2 0 0 205 72 215 0.3 2 0 5 210 73 764 0.3 2 0 16 748 74 711 0.3 2 0 12 699 75 334 0.3 2 0 21 313 76 308 0.3 2 8 6 294 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1562084 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.9% C: 35.9% G: 24.7% T: 24.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 61919 67660.6 0 61919 4 63928 16915.2 0 36903 27025 5 67926 4228.8 1 26204 41722 6 45427 1057.2 1 26969 18458 7 31529 264.3 1 26018 5511 8 32608 66.1 1 28115 3902 591 9 32632 16.5 1 29204 892 2536 10 31263 4.1 2 28730 786 1747 11 31858 1.0 2 29747 700 1411 12 31104 0.3 2 30094 495 515 13 31840 0.3 2 30855 482 503 14 33006 0.3 2 32037 486 483 15 33878 0.3 2 32905 506 467 16 32460 0.3 2 31446 455 559 17 33380 0.3 2 32307 551 522 18 33357 0.3 2 32352 496 509 19 35969 0.3 2 34811 493 665 20 37130 0.3 2 36090 488 552 21 36941 0.3 2 35875 529 537 22 35799 0.3 2 34787 485 527 23 35345 0.3 2 34264 425 656 24 39831 0.3 2 38784 542 505 25 45210 0.3 2 44213 587 410 26 44270 0.3 2 43206 574 490 27 38329 0.3 2 37422 493 414 28 37293 0.3 2 36299 480 514 29 41030 0.3 2 40045 490 495 30 44141 0.3 2 43145 536 460 31 43522 0.3 2 42503 529 490 32 43250 0.3 2 42202 568 480 33 41458 0.3 2 40292 492 674 34 40505 0.3 2 39391 496 618 35 54290 0.3 2 53191 673 426 36 57779 0.3 2 56520 760 499 37 41162 0.3 2 40174 504 484 38 32737 0.3 2 31937 390 410 39 25808 0.3 2 24841 332 635 40 19791 0.3 2 19053 247 491 41 10907 0.3 2 10227 162 518 42 6074 0.3 2 5045 107 922 43 3548 0.3 2 2976 72 500 44 2331 0.3 2 1833 49 449 45 2681 0.3 2 2245 58 378 46 7338 0.3 2 6776 91 471 47 5471 0.3 2 4944 88 439 48 3240 0.3 2 2856 70 314 49 2818 0.3 2 2237 48 533 50 1949 0.3 2 1592 33 324 51 912 0.3 2 116 19 777 52 446 0.3 2 65 12 369 53 981 0.3 2 42 38 901 54 574 0.3 2 51 15 508 55 503 0.3 2 43 15 445 56 869 0.3 2 110 18 741 57 599 0.3 2 150 
6 443 58 562 0.3 2 51 15 496 59 591 0.3 2 24 20 547 60 520 0.3 2 13 10 497 61 620 0.3 2 16 10 594 62 397 0.3 2 8 15 374 63 895 0.3 2 3 20 872 64 664 0.3 2 2 11 651 65 444 0.3 2 1 11 432 66 533 0.3 2 1 9 523 67 393 0.3 2 3 34 356 68 1876 0.3 2 1 75 1800 69 494 0.3 2 0 34 460 70 603 0.3 2 0 12 591 71 210 0.3 2 0 3 207 72 198 0.3 2 0 2 196 73 797 0.3 2 0 22 775 74 666 0.3 2 0 15 651 75 344 0.3 2 0 31 313 76 331 0.3 2 9 8 314 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_1_S28_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_1_S28_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_1_S28_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_1_S28_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_1_S28_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_1_S28_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_1_S28_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_1_S28_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 348.24 s (182 us/read; 0.33 M reads/minute). === Summary === Total read pairs processed: 1,917,485 Read 1 with adapter: 697,187 (36.4%) Read 2 with adapter: 693,170 (36.1%) Pairs that were too short: 2,040 (0.1%) Pairs written (passing filters): 1,915,445 (99.9%) Total basepairs processed: 291,457,720 bp Read 1: 145,728,860 bp Read 2: 145,728,860 bp Total written (filtered): 261,766,586 bp (89.8%) Read 1: 130,850,903 bp Read 2: 130,915,683 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 697187 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.5% C: 36.8% G: 24.3% T: 24.4% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 28013 29960.7 0 28013 4 29715 7490.2 0 17720 11995 5 31923 1872.5 1 13458 18465 6 21545 468.1 1 13598 7947 7 15826 117.0 1 13626 2200 8 16216 29.3 1 14171 1800 245 9 15567 7.3 1 14121 380 1066 10 15492 1.8 2 14424 283 785 11 15412 0.5 2 14533 262 617 12 14870 0.1 2 14410 208 252 13 14752 0.1 2 14331 205 216 14 15213 0.1 2 14812 202 199 15 15759 0.1 2 15366 201 192 16 15758 0.1 2 15332 176 250 17 16652 0.1 2 16168 252 232 18 16731 0.1 2 16328 180 223 19 17820 0.1 2 17304 190 326 20 18224 0.1 2 17766 217 241 21 17987 0.1 2 17540 199 248 22 17239 0.1 2 16827 179 233 23 16774 0.1 2 16311 176 287 24 17310 0.1 2 16877 199 234 25 18617 0.1 2 18228 201 188 26 19495 0.1 2 19100 182 213 27 18932 0.1 2 18557 180 195 28 18687 0.1 2 18256 187 244 29 19872 0.1 2 19468 185 219 30 20852 0.1 2 20462 202 188 31 20342 0.1 2 19945 215 182 32 19425 0.1 2 19055 196 174 33 17847 0.1 2 17415 178 254 34 15767 0.1 2 15382 155 230 35 17231 0.1 2 16897 188 146 36 17948 0.1 2 17614 155 179 37 14508 0.1 2 14199 132 177 38 12385 0.1 2 12100 113 172 39 9890 0.1 2 9575 100 215 40 7210 0.1 2 6941 59 210 41 4041 0.1 2 3787 32 222 42 2192 0.1 2 1882 23 287 43 1437 0.1 2 1194 25 218 44 1013 0.1 2 786 25 202 45 953 0.1 2 759 28 166 46 1666 0.1 2 1431 27 208 47 1621 0.1 2 1403 9 209 48 1279 0.1 2 1110 19 150 49 1201 0.1 2 923 23 255 50 817 0.1 2 650 16 151 51 419 0.1 2 66 8 345 52 216 0.1 2 31 10 175 53 410 0.1 2 22 21 367 54 241 0.1 2 31 8 202 55 230 0.1 2 33 6 191 56 373 0.1 2 42 7 324 57 281 0.1 2 49 5 227 58 261 0.1 2 16 10 235 59 286 0.1 2 23 11 252 60 244 0.1 2 2 8 234 61 277 0.1 2 16 8 253 62 199 0.1 2 4 2 193 63 401 0.1 2 4 11 386 64 267 0.1 2 2 8 257 65 217 0.1 2 1 20 196 66 268 0.1 2 0 4 264 67 152 0.1 2 2 11 139 68 813 0.1 2 0 27 786 69 182 0.1 2 0 10 172 70 289 0.1 2 0 6 283 71 100 
0.1 2 0 0 100 72 99 0.1 2 0 1 98 73 334 0.1 2 0 10 324 74 270 0.1 2 0 5 265 75 147 0.1 2 0 6 141 76 185 0.1 2 0 4 181 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 693170 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.8% C: 35.9% G: 24.6% T: 24.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 27814 29960.7 0 27814 4 29305 7490.2 0 17218 12087 5 31257 1872.5 1 12736 18521 6 21247 468.1 1 13272 7975 7 15641 117.0 1 12939 2702 8 16115 29.3 1 13954 1903 258 9 15487 7.3 1 13981 449 1057 10 15406 1.8 2 14164 406 836 11 15298 0.5 2 14263 396 639 12 14787 0.1 2 14246 260 281 13 14705 0.1 2 14138 285 282 14 15178 0.1 2 14620 289 269 15 15665 0.1 2 15173 272 220 16 15693 0.1 2 15142 269 282 17 16580 0.1 2 16016 289 275 18 16666 0.1 2 16120 280 266 19 17762 0.1 2 17113 280 369 20 18135 0.1 2 17537 316 282 21 17934 0.1 2 17333 305 296 22 17177 0.1 2 16617 287 273 23 16666 0.1 2 16113 265 288 24 17221 0.1 2 16688 271 262 25 18519 0.1 2 18015 292 212 26 19438 0.1 2 18876 314 248 27 18898 0.1 2 18359 281 258 28 18607 0.1 2 18059 296 252 29 19776 0.1 2 19229 306 241 30 20809 0.1 2 20245 313 251 31 20283 0.1 2 19745 318 220 32 19413 0.1 2 18852 307 254 33 17849 0.1 2 17267 260 322 34 15748 0.1 2 15222 239 287 35 17187 0.1 2 16733 241 213 36 17887 0.1 2 17412 252 223 37 14449 0.1 2 14027 249 173 38 12367 0.1 2 11991 154 222 39 9888 0.1 2 9498 120 270 40 7183 0.1 2 6858 105 220 41 4051 0.1 2 3737 73 241 42 2215 0.1 2 1858 42 315 43 1476 0.1 2 1177 30 269 44 970 0.1 2 782 21 167 45 931 0.1 2 758 17 156 46 1687 0.1 2 1423 22 242 47 1574 0.1 2 1378 23 173 48 1269 0.1 2 1088 36 145 49 1168 0.1 2 912 27 229 50 841 0.1 2 649 21 171 51 384 0.1 2 62 11 311 52 239 0.1 2 33 10 196 53 375 0.1 2 17 26 332 54 262 0.1 2 31 7 224 55 258 0.1 2 30 12 216 56 371 0.1 2 37 14 320 57 270 0.1 2 48 7 215 58 268 0.1 2 15 10 243 59 254 0.1 2 20 10 224 60 232 0.1 
2 2 6 224 61 278 0.1 2 11 8 259 62 188 0.1 2 5 4 179 63 359 0.1 2 1 14 344 64 276 0.1 2 2 5 269 65 183 0.1 2 1 8 174 66 238 0.1 2 0 3 235 67 182 0.1 2 3 15 164 68 787 0.1 2 0 28 759 69 196 0.1 2 0 9 187 70 262 0.1 2 0 3 259 71 81 0.1 2 0 3 78 72 99 0.1 2 0 3 96 73 369 0.1 2 0 12 357 74 250 0.1 2 0 3 247 75 145 0.1 2 0 13 132 76 142 0.1 2 0 2 140 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_2_S29_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_2_S29_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_2_S29_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_2_S29_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_2_S29_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_2_S29_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_2_S29_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_2_S29_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 913.01 s (185 us/read; 0.32 M reads/minute). === Summary === Total read pairs processed: 4,925,730 Read 1 with adapter: 1,809,720 (36.7%) Read 2 with adapter: 1,801,647 (36.6%) Pairs that were too short: 5,191 (0.1%) Pairs written (passing filters): 4,920,539 (99.9%) Total basepairs processed: 748,710,960 bp Read 1: 374,355,480 bp Read 2: 374,355,480 bp Total written (filtered): 671,354,120 bp (89.7%) Read 1: 335,615,405 bp Read 2: 335,738,715 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1809720 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.3% C: 37.4% G: 24.5% T: 23.8% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 72824 76964.5 0 72824 4 77949 19241.1 0 45882 32067 5 82485 4810.3 1 34411 48074 6 56157 1202.6 1 35308 20849 7 40136 300.6 1 34501 5635 8 42087 75.2 1 35670 5855 562 9 40690 18.8 1 36988 945 2757 10 39143 4.7 2 36461 714 1968 11 39376 1.2 2 37066 778 1532 12 38188 0.3 2 37077 560 551 13 38927 0.3 2 37847 535 545 14 38416 0.3 2 37370 539 507 15 40287 0.3 2 39324 511 452 16 40555 0.3 2 39497 518 540 17 43196 0.3 2 42036 527 633 18 42752 0.3 2 41644 515 593 19 45377 0.3 2 43986 551 840 20 46833 0.3 2 45661 567 605 21 46375 0.3 2 45217 565 593 22 44723 0.3 2 43581 505 637 23 42929 0.3 2 41756 476 697 24 45002 0.3 2 43995 450 557 25 48165 0.3 2 47239 534 392 26 49894 0.3 2 48889 517 488 27 48133 0.3 2 47172 486 475 28 48013 0.3 2 46998 470 545 29 51449 0.3 2 50386 553 510 30 53870 0.3 2 52908 511 451 31 53580 0.3 2 52585 550 445 32 51599 0.3 2 50588 578 433 33 47469 0.3 2 46344 449 676 34 42764 0.3 2 41754 392 618 35 45762 0.3 2 44964 455 343 36 47351 0.3 2 46457 425 469 37 38589 0.3 2 37708 419 462 38 32433 0.3 2 31728 302 403 39 25561 0.3 2 24724 239 598 40 19136 0.3 2 18503 185 448 41 10722 0.3 2 10125 102 495 42 6193 0.3 2 5133 111 949 43 3981 0.3 2 3338 47 596 44 2821 0.3 2 2304 43 474 45 2611 0.3 2 2146 49 416 46 5031 0.3 2 4458 62 511 47 4484 0.3 2 3947 64 473 48 3377 0.3 2 2916 77 384 49 2894 0.3 2 2215 61 618 50 2067 0.3 2 1690 34 343 51 1046 0.3 2 167 32 847 52 498 0.3 2 87 11 400 53 1097 0.3 2 56 60 981 54 639 0.3 2 58 18 563 55 521 0.3 2 78 18 425 56 1003 0.3 2 146 27 830 57 589 0.3 2 116 15 458 58 689 0.3 2 46 20 623 59 748 0.3 2 35 33 680 60 517 0.3 2 21 6 490 61 647 0.3 2 24 22 601 62 433 0.3 2 21 6 406 63 963 0.3 2 8 24 931 64 651 0.3 2 5 16 630 65 551 0.3 2 2 31 518 66 548 0.3 2 1 12 535 67 409 0.3 2 2 34 373 68 1969 0.3 2 0 92 1877 69 507 0.3 
2 0 31 476 70 629 0.3 2 0 6 623 71 185 0.3 2 0 1 184 72 256 0.3 2 0 2 254 73 784 0.3 2 0 26 758 74 782 0.3 2 0 15 767 75 359 0.3 2 0 26 333 76 344 0.3 2 0 8 336 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 1801647 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 14.7% C: 36.2% G: 24.9% T: 24.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 73496 76964.5 0 73496 4 76608 19241.1 0 45294 31314 5 81520 4810.3 1 32999 48521 6 55608 1202.6 1 34212 21396 7 39781 300.6 1 33480 6301 8 41446 75.2 1 35215 5645 586 9 40357 18.8 1 36575 1153 2629 10 38923 4.7 2 35543 1289 2091 11 39063 1.2 2 36630 909 1524 12 38000 0.3 2 36655 711 634 13 38751 0.3 2 37351 755 645 14 38269 0.3 2 36992 651 626 15 40115 0.3 2 38895 681 539 16 40399 0.3 2 39070 692 637 17 43012 0.3 2 41593 701 718 18 42565 0.3 2 41167 737 661 19 45145 0.3 2 43574 691 880 20 46611 0.3 2 45188 770 653 21 46169 0.3 2 44746 738 685 22 44487 0.3 2 43145 678 664 23 42793 0.3 2 41398 610 785 24 44882 0.3 2 43552 664 666 25 48042 0.3 2 46843 679 520 26 49766 0.3 2 48448 714 604 27 47940 0.3 2 46710 720 510 28 47901 0.3 2 46580 675 646 29 51311 0.3 2 50016 685 610 30 53729 0.3 2 52463 706 560 31 53427 0.3 2 52179 702 546 32 51410 0.3 2 50147 765 498 33 47366 0.3 2 45944 662 760 34 42611 0.3 2 41390 575 646 35 45607 0.3 2 44585 605 417 36 47258 0.3 2 46127 584 547 37 38500 0.3 2 37382 591 527 38 32346 0.3 2 31442 435 469 39 25457 0.3 2 24508 328 621 40 19124 0.3 2 18366 245 513 41 10711 0.3 2 10032 141 538 42 6214 0.3 2 5100 122 992 43 4002 0.3 2 3312 80 610 44 2805 0.3 2 2292 50 463 45 2601 0.3 2 2135 44 422 46 5024 0.3 2 4431 90 503 47 4520 0.3 2 3909 74 537 48 3385 0.3 2 2885 103 397 49 2908 0.3 2 2212 52 644 50 2067 0.3 2 1671 49 347 51 1041 0.3 2 158 24 859 52 544 0.3 2 92 25 427 53 1109 0.3 2 56 78 975 54 627 0.3 2 52 18 557 55 555 0.3 2 63 22 470 56 992 0.3 2 135 30 827 57 
627 0.3 2 118 12 497 58 610 0.3 2 41 27 542 59 734 0.3 2 33 37 664 60 577 0.3 2 19 13 545 61 639 0.3 2 22 18 599 62 430 0.3 2 16 12 402 63 951 0.3 2 6 25 920 64 698 0.3 2 3 27 668 65 534 0.3 2 2 23 509 66 518 0.3 2 1 12 505 67 413 0.3 2 2 51 360 68 1932 0.3 2 0 103 1829 69 504 0.3 2 0 34 470 70 669 0.3 2 0 10 659 71 243 0.3 2 0 4 239 72 282 0.3 2 0 4 278 73 847 0.3 2 0 40 807 74 781 0.3 2 0 12 769 75 389 0.3 2 0 17 372 76 369 0.3 2 0 5 364 cutadapt -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_3_S30_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_3_S30_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_3_S30_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_3_S30_R2.fastq.gz This is cutadapt 1.10 with Python 3.5.2 Command line parameters: -m 5 -e 0.20 -a CTGTCTCTTATA -A CTGTCTCTTATA -o /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_3_S30_R1.trimmed.fastq.gz -p /srv/scratch/training_camp/tc2016/user23/analysis//trimmed/U_3_S30_R2.trimmed.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_3_S30_R1.fastq.gz /srv/scratch/training_camp/tc2016/user23/data/fastq/U_3_S30_R2.fastq.gz Trimming 2 adapters with at most 20.0% errors in paired-end mode ... Finished in 465.42 s (188 us/read; 0.32 M reads/minute). === Summary === Total read pairs processed: 2,477,411 Read 1 with adapter: 708,887 (28.6%) Read 2 with adapter: 705,248 (28.5%) Pairs that were too short: 2,676 (0.1%) Pairs written (passing filters): 2,474,735 (99.9%) Total basepairs processed: 376,566,472 bp Read 1: 188,283,236 bp Read 2: 188,283,236 bp Total written (filtered): 349,490,345 bp (92.8%) Read 1: 174,724,891 bp Read 2: 174,765,454 bp === First read: Adapter 1 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 708887 times. No. 
of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 15.0% C: 36.9% G: 24.8% T: 23.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 39614 38709.5 0 39614 4 42315 9677.4 0 23533 18782 5 43779 2419.3 1 16591 27188 6 28535 604.8 1 16646 11889 7 19241 151.2 1 16010 3231 8 20145 37.8 1 16446 3414 285 9 18517 9.5 1 16471 538 1508 10 17799 2.4 2 16362 358 1079 11 17742 0.6 2 16527 386 829 12 16936 0.1 2 16411 234 291 13 16846 0.1 2 16298 247 301 14 16690 0.1 2 16188 238 264 15 17205 0.1 2 16729 251 225 16 16895 0.1 2 16345 235 315 17 17495 0.1 2 16948 239 308 18 16771 0.1 2 16276 250 245 19 17555 0.1 2 16963 215 377 20 17762 0.1 2 17248 223 291 21 17473 0.1 2 16971 227 275 22 16681 0.1 2 16179 218 284 23 15577 0.1 2 15094 175 308 24 16679 0.1 2 16223 194 262 25 17719 0.1 2 17266 249 204 26 17801 0.1 2 17342 217 242 27 16108 0.1 2 15705 176 227 28 15660 0.1 2 15251 167 242 29 16298 0.1 2 15821 197 280 30 16726 0.1 2 16296 184 246 31 16559 0.1 2 16170 182 207 32 15937 0.1 2 15488 223 226 33 14467 0.1 2 13889 187 391 34 13295 0.1 2 12834 162 299 35 14480 0.1 2 14114 171 195 36 14387 0.1 2 13979 177 231 37 10768 0.1 2 10384 145 239 38 8518 0.1 2 8220 104 194 39 6388 0.1 2 5984 76 328 40 4723 0.1 2 4417 63 243 41 2725 0.1 2 2438 47 240 42 1720 0.1 2 1146 41 533 43 1060 0.1 2 751 27 282 44 781 0.1 2 511 20 250 45 736 0.1 2 486 19 231 46 1619 0.1 2 1305 31 283 47 1207 0.1 2 929 23 255 48 733 0.1 2 527 38 168 49 790 0.1 2 418 15 357 50 471 0.1 2 292 19 160 51 546 0.1 2 41 14 491 52 257 0.1 2 9 7 241 53 597 0.1 2 14 33 550 54 301 0.1 2 12 6 283 55 271 0.1 2 18 9 244 56 453 0.1 2 23 8 422 57 315 0.1 2 25 10 280 58 339 0.1 2 13 10 316 59 379 0.1 2 9 18 352 60 240 0.1 2 2 5 233 61 313 0.1 2 4 16 293 62 244 0.1 2 4 6 234 63 518 0.1 2 1 12 505 64 345 0.1 2 3 16 326 65 271 0.1 2 0 18 253 66 279 0.1 2 1 1 277 67 180 0.1 2 0 10 170 68 1062 0.1 2 0 48 1014 69 266 0.1 2 0 23 243 70 325 0.1 2 0 4 321 71 91 0.1 2 
0 1 90 72 125 0.1 2 0 5 120 73 477 0.1 2 0 23 454 74 389 0.1 2 0 3 386 75 179 0.1 2 1 3 175 76 197 0.1 2 1 3 193 === Second read: Adapter 2 === Sequence: CTGTCTCTTATA; Type: regular 3'; Length: 12; Trimmed: 705248 times. No. of allowed errors: 0-4 bp: 0; 5-9 bp: 1; 10-12 bp: 2 Bases preceding removed adapters: A: 15.2% C: 36.0% G: 25.1% T: 23.7% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 3 39457 38709.5 0 39457 4 41735 9677.4 0 23259 18476 5 43301 2419.3 1 15908 27393 6 28168 604.8 1 16240 11928 7 18967 151.2 1 15467 3500 8 19885 37.8 1 16287 3264 334 9 18482 9.5 1 16310 609 1563 10 17773 2.4 2 16112 475 1186 11 17650 0.6 2 16301 492 857 12 16888 0.1 2 16207 331 350 13 16754 0.1 2 16152 297 305 14 16621 0.1 2 16046 298 277 15 17127 0.1 2 16573 297 257 16 16792 0.1 2 16144 332 316 17 17416 0.1 2 16799 286 331 18 16721 0.1 2 16159 283 279 19 17484 0.1 2 16825 265 394 20 17705 0.1 2 17090 282 333 21 17458 0.1 2 16816 323 319 22 16616 0.1 2 16029 279 308 23 15541 0.1 2 14991 219 331 24 16601 0.1 2 16063 250 288 25 17657 0.1 2 17121 288 248 26 17784 0.1 2 17191 290 303 27 16074 0.1 2 15602 243 229 28 15612 0.1 2 15129 222 261 29 16225 0.1 2 15663 271 291 30 16673 0.1 2 16153 262 258 31 16515 0.1 2 16040 256 219 32 15910 0.1 2 15369 294 247 33 14417 0.1 2 13787 207 423 34 13260 0.1 2 12727 203 330 35 14464 0.1 2 13985 250 229 36 14361 0.1 2 13877 220 264 37 10741 0.1 2 10304 172 265 38 8526 0.1 2 8172 135 219 39 6337 0.1 2 5911 124 302 40 4724 0.1 2 4384 69 271 41 2769 0.1 2 2409 50 310 42 1700 0.1 2 1140 45 515 43 1101 0.1 2 747 28 326 44 832 0.1 2 508 18 306 45 717 0.1 2 484 24 209 46 1597 0.1 2 1291 37 269 47 1222 0.1 2 928 28 266 48 764 0.1 2 526 43 195 49 779 0.1 2 411 15 353 50 467 0.1 2 293 15 159 51 499 0.1 2 41 13 445 52 256 0.1 2 10 8 238 53 576 0.1 2 11 45 520 54 331 0.1 2 10 10 311 55 237 0.1 2 19 11 207 56 456 0.1 2 23 6 427 57 294 0.1 2 25 2 267 58 322 0.1 2 11 8 303 59 404 0.1 2 6 19 379 60 280 0.1 2 3 3 274 61 
318 0.1 2 2 14 302 62 217 0.1 2 2 5 210 63 469 0.1 2 1 12 456 64 376 0.1 2 3 16 357 65 290 0.1 2 0 14 276 66 295 0.1 2 1 4 290 67 217 0.1 2 0 20 197 68 1042 0.1 2 0 56 986 69 274 0.1 2 0 22 252 70 289 0.1 2 0 4 285 71 125 0.1 2 0 2 123 72 121 0.1 2 0 7 114 73 433 0.1 2 0 9 424 74 400 0.1 2 0 7 393 75 182 0.1 2 0 10 172 76 175 0.1 2 1 7 167
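The cutadapt reports above are long, and most of what matters per sample sits in the "=== Summary ===" block. Below is a minimal sketch for pulling out the key numbers, assuming the report for one sample has been saved as text. The trimming loop in this tutorial prints the report to the screen instead, so the saved report and the function name `summarize_cutadapt` are hypothetical. The example input is copied from the Mz_3_S21 summary above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a cutadapt paired-end report on standard input and
# print the number of pairs processed, pairs written, and the percentage kept.
summarize_cutadapt() {
    awk -F': *' '
        /^Total read pairs processed/ {
            gsub(/,/, "", $2)              # drop thousands separators
            total = $2 + 0
        }
        /^Pairs written \(passing filters\)/ {
            split($2, parts, " ")          # parts[1] is the count, parts[2] the percentage
            gsub(/,/, "", parts[1])
            written = parts[1] + 0
        }
        END {
            if (total > 0)
                printf "processed=%d written=%d kept=%.1f%%\n", total, written, 100 * written / total
        }'
}

# Summary lines copied from the Mz_3_S21 report
summarize_cutadapt <<'EOF'
Total read pairs processed: 1,560,422
Read 1 with adapter: 538,739 (34.5%)
Read 2 with adapter: 535,960 (34.3%)
Pairs that were too short: 1,655 (0.1%)
Pairs written (passing filters): 1,558,767 (99.9%)
EOF
```

For this input the function prints `processed=1560422 written=1558767 kept=99.9%`, which agrees with the percentage cutadapt itself reports.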
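Now we're ready to align our trimmed reads to the yeast sacCer3 reference genome.
We'll use Bowtie2, a short-read aligner built on a Burrows-Wheeler (FM-index) representation of the genome. Bowtie2 is not splice-aware, which is fine here because ATAC-seq reads come from genomic DNA rather than spliced transcripts.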
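Bowtie2 outputs a SAM (Sequence Alignment/Map) file, which is a standard plain-text format for alignments. To save space, we'll pipe the output through samtools view -b, which encodes it as a BAM file, the compressed binary equivalent of SAM. (The extra -S in the command below tells older samtools versions that the input is SAM; newer versions detect this automatically.)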
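To make "plain-text format" concrete: each alignment in a SAM file is one tab-separated line whose first five columns are the read name, a bit-field FLAG, the reference sequence name, the position, and the mapping quality. The sketch below splits one such line in Bash. The record itself is made up for illustration (it is not taken from this dataset), and `describe_flag` is a hypothetical helper that decodes only three of the FLAG bits.

```shell
#!/usr/bin/env bash
# A made-up SAM alignment line (tab-separated); the read name, position and
# sequence are invented for illustration only.
sam_line=$'read1\t99\tchrI\t1000\t42\t10M\t=\t1100\t110\tACGTACGTAC\tIIIIIIIIII'

# First five mandatory SAM columns: QNAME, FLAG, RNAME, POS, MAPQ
IFS=$'\t' read -r qname flag rname pos mapq _rest <<< "$sam_line"

# FLAG is a bit field: 0x1 = read is paired, 0x2 = mapped in a proper pair,
# 0x4 = read is unmapped (other bits are ignored in this sketch)
describe_flag() {
    local f="$1" labels=()
    (( f & 1 )) && labels+=("paired")
    (( f & 2 )) && labels+=("proper_pair")
    (( f & 4 )) && labels+=("unmapped")
    echo "${labels[*]}"
}

echo "$qname maps to $rname:$pos (MAPQ $mapq): $(describe_flag "$flag")"
```

A BAM file holds the same records in compressed binary form, so it cannot be read line by line like this without first decoding it with samtools view.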
#set the bowtie index
export bowtie_index=$YEAST_INDEX
echo $bowtie_index
/srv/scratch/training_camp/saccer3/bowtie2_index/saccer3
#create a directory to store the aligned data
export ALIGNMENT_DIR="$ANALYSIS_DIR/aligned/"
mkdir -p "$ALIGNMENT_DIR"
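The alignment loop pairs each R1 file with its R2 mate and names the output BAM by rewriting the file name. Before launching a long run, it can help to dry-run that name derivation on its own. The sketch below does so with Bash parameter expansion rather than sed; the function name `derive_names` and the relative example paths are hypothetical, and no files are read or written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given an R1 FASTQ path and an output directory,
# print the matching R2 path and the BAM path that would be written.
derive_names() {
    local fq1="$1" out_dir="$2"
    local dir base fq2 bam
    dir="$(dirname "$fq1")"
    base="$(basename "$fq1")"
    # Swap _R1 for _R2 in the file name only, so directory names are never touched
    fq2="${dir}/${base/_R1/_R2}"
    # Strip the .fastq.gz suffix and add .bam
    bam="${out_dir%/}/${base%.fastq.gz}.bam"
    printf '%s\n%s\n' "$fq2" "$bam"
}

derive_names "trimmed/U_3_S30_R1.trimmed.fastq.gz" "aligned/"
```

The call prints `trimmed/U_3_S30_R2.trimmed.fastq.gz` followed by `aligned/U_3_S30_R1.trimmed.bam`. Working on the base name only is a design choice: it avoids rewriting a stray `_R1` that happens to appear in a directory name.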
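#align each trimmed read pair and write the alignments as BAM
for trimmed_fq1 in "${TRIMMED_DIR}"*_R1*fastq.gz; do
    #matching R2 file for this R1 file
    trimmed_fq2=$(echo "$trimmed_fq1" | sed -e 's/_R1/_R2/')
    #output BAM: same base name, with .fastq.gz replaced by .bam
    bam=$(echo "${ALIGNMENT_DIR}${trimmed_fq1##*/}" | sed -e 's/\.fastq\.gz$/.bam/')
    #-X2000 allows fragments up to 2000 bp; --mm memory-maps the index
    bowtie2 -X2000 --mm --threads 35 -x "$bowtie_index" -1 "$trimmed_fq1" -2 "$trimmed_fq2" | samtools view -bS - > "$bam"
done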
[samopen] SAM header is present: 17 sequences.
3925710 reads; of these:
  3925710 (100.00%) were paired; of these:
    273596 (6.97%) aligned concordantly 0 times
    1778827 (45.31%) aligned concordantly exactly 1 time
    1873287 (47.72%) aligned concordantly >1 times
    ----
    273596 pairs aligned concordantly 0 times; of these:
      26806 (9.80%) aligned discordantly 1 time
    ----
    246790 pairs aligned 0 times concordantly or discordantly; of these:
      493580 mates make up the pairs; of these:
        402256 (81.50%) aligned 0 times
        17544 (3.55%) aligned exactly 1 time
        73780 (14.95%) aligned >1 times
94.88% overall alignment rate
[samopen] SAM header is present: 17 sequences.
3025310 reads; of these:
  3025310 (100.00%) were paired; of these:
    205484 (6.79%) aligned concordantly 0 times
    1367132 (45.19%) aligned concordantly exactly 1 time
    1452694 (48.02%) aligned concordantly >1 times
    ----
    205484 pairs aligned concordantly 0 times; of these:
      22358 (10.88%) aligned discordantly 1 time
    ----
    183126 pairs aligned 0 times concordantly or discordantly; of these:
      366252 mates make up the pairs; of these:
        296087 (80.84%) aligned 0 times
        12368 (3.38%) aligned exactly 1 time
        57797 (15.78%) aligned >1 times
95.11% overall alignment rate
[samopen] SAM header is present: 17 sequences.
5265654 reads; of these:
  5265654 (100.00%) were paired; of these:
    313855 (5.96%) aligned concordantly 0 times
    2411180 (45.79%) aligned concordantly exactly 1 time
    2540619 (48.25%) aligned concordantly >1 times
    ----
    313855 pairs aligned concordantly 0 times; of these:
      35277 (11.24%) aligned discordantly 1 time
    ----
    278578 pairs aligned 0 times concordantly or discordantly; of these:
      557156 mates make up the pairs; of these:
        442853 (79.48%) aligned 0 times
        23691 (4.25%) aligned exactly 1 time
        90612 (16.26%) aligned >1 times
95.79% overall alignment rate
[samopen] SAM header is present: 17 sequences.
2645906 reads; of these:
  2645906 (100.00%) were paired; of these:
    158288 (5.98%) aligned concordantly 0 times
    1181733 (44.66%) aligned concordantly exactly 1 time
    1305885 (49.35%) aligned concordantly >1 times
    ----
    158288 pairs aligned concordantly 0 times; of these:
      16937 (10.70%) aligned discordantly 1 time
    ----
    141351 pairs aligned 0 times concordantly or discordantly; of these:
      282702 mates make up the pairs; of these:
        223749 (79.15%) aligned 0 times
        10568 (3.74%) aligned exactly 1 time
        48385 (17.12%) aligned >1 times
95.77% overall alignment rate
[samopen] SAM header is present: 17 sequences.
3672298 reads; of these:
  3672298 (100.00%) were paired; of these:
    237694 (6.47%) aligned concordantly 0 times
    1865980 (50.81%) aligned concordantly exactly 1 time
    1568624 (42.72%) aligned concordantly >1 times
    ----
    237694 pairs aligned concordantly 0 times; of these:
      25612 (10.78%) aligned discordantly 1 time
    ----
    212082 pairs aligned 0 times concordantly or discordantly; of these:
      424164 mates make up the pairs; of these:
        350929 (82.73%) aligned 0 times
        18495 (4.36%) aligned exactly 1 time
        54740 (12.91%) aligned >1 times
95.22% overall alignment rate
[samopen] SAM header is present: 17 sequences.
2637195 reads; of these:
  2637195 (100.00%) were paired; of these:
    156805 (5.95%) aligned concordantly 0 times
    1265909 (48.00%) aligned concordantly exactly 1 time
    1214481 (46.05%) aligned concordantly >1 times
    ----
    156805 pairs aligned concordantly 0 times; of these:
      19467 (12.41%) aligned discordantly 1 time
    ----
    137338 pairs aligned 0 times concordantly or discordantly; of these:
      274676 mates make up the pairs; of these:
        215033 (78.29%) aligned 0 times
        9808 (3.57%) aligned exactly 1 time
        49835 (18.14%) aligned >1 times
95.92% overall alignment rate
[samopen] SAM header is present: 17 sequences.
5110225 reads; of these: 5110225 (100.00%) were paired; of these: 293961 (5.75%) aligned concordantly 0 times 2379119 (46.56%) aligned concordantly exactly 1 time 2437145 (47.69%) aligned concordantly >1 times ---- 293961 pairs aligned concordantly 0 times; of these: 31228 (10.62%) aligned discordantly 1 time ---- 262733 pairs aligned 0 times concordantly or discordantly; of these: 525466 mates make up the pairs; of these: 416062 (79.18%) aligned 0 times 22271 (4.24%) aligned exactly 1 time 87133 (16.58%) aligned >1 times 95.93% overall alignment rate [samopen] SAM header is present: 17 sequences. 5138423 reads; of these: 5138423 (100.00%) were paired; of these: 299033 (5.82%) aligned concordantly 0 times 2343253 (45.60%) aligned concordantly exactly 1 time 2496137 (48.58%) aligned concordantly >1 times ---- 299033 pairs aligned concordantly 0 times; of these: 36866 (12.33%) aligned discordantly 1 time ---- 262167 pairs aligned 0 times concordantly or discordantly; of these: 524334 mates make up the pairs; of these: 409772 (78.15%) aligned 0 times 23123 (4.41%) aligned exactly 1 time 91439 (17.44%) aligned >1 times 96.01% overall alignment rate [samopen] SAM header is present: 17 sequences. 2799988 reads; of these: 2799988 (100.00%) were paired; of these: 157614 (5.63%) aligned concordantly 0 times 1387372 (49.55%) aligned concordantly exactly 1 time 1255002 (44.82%) aligned concordantly >1 times ---- 157614 pairs aligned concordantly 0 times; of these: 21960 (13.93%) aligned discordantly 1 time ---- 135654 pairs aligned 0 times concordantly or discordantly; of these: 271308 mates make up the pairs; of these: 209028 (77.04%) aligned 0 times 12964 (4.78%) aligned exactly 1 time 49316 (18.18%) aligned >1 times 96.27% overall alignment rate [samopen] SAM header is present: 17 sequences. 
3433770 reads; of these: 3433770 (100.00%) were paired; of these: 201065 (5.86%) aligned concordantly 0 times 1669279 (48.61%) aligned concordantly exactly 1 time 1563426 (45.53%) aligned concordantly >1 times ---- 201065 pairs aligned concordantly 0 times; of these: 22635 (11.26%) aligned discordantly 1 time ---- 178430 pairs aligned 0 times concordantly or discordantly; of these: 356860 mates make up the pairs; of these: 285968 (80.13%) aligned 0 times 16991 (4.76%) aligned exactly 1 time 53901 (15.10%) aligned >1 times 95.84% overall alignment rate [samopen] SAM header is present: 17 sequences. 5271326 reads; of these: 5271326 (100.00%) were paired; of these: 325327 (6.17%) aligned concordantly 0 times 2349759 (44.58%) aligned concordantly exactly 1 time 2596240 (49.25%) aligned concordantly >1 times ---- 325327 pairs aligned concordantly 0 times; of these: 37285 (11.46%) aligned discordantly 1 time ---- 288042 pairs aligned 0 times concordantly or discordantly; of these: 576084 mates make up the pairs; of these: 456636 (79.27%) aligned 0 times 21751 (3.78%) aligned exactly 1 time 97697 (16.96%) aligned >1 times 95.67% overall alignment rate [samopen] SAM header is present: 17 sequences. 4933160 reads; of these: 4933160 (100.00%) were paired; of these: 322681 (6.54%) aligned concordantly 0 times 2618997 (53.09%) aligned concordantly exactly 1 time 1991482 (40.37%) aligned concordantly >1 times ---- 322681 pairs aligned concordantly 0 times; of these: 41494 (12.86%) aligned discordantly 1 time ---- 281187 pairs aligned 0 times concordantly or discordantly; of these: 562374 mates make up the pairs; of these: 460072 (81.81%) aligned 0 times 24569 (4.37%) aligned exactly 1 time 77733 (13.82%) aligned >1 times 95.34% overall alignment rate [samopen] SAM header is present: 17 sequences. 
3747206 reads; of these: 3747206 (100.00%) were paired; of these: 234734 (6.26%) aligned concordantly 0 times 1803562 (48.13%) aligned concordantly exactly 1 time 1708910 (45.60%) aligned concordantly >1 times ---- 234734 pairs aligned concordantly 0 times; of these: 25823 (11.00%) aligned discordantly 1 time ---- 208911 pairs aligned 0 times concordantly or discordantly; of these: 417822 mates make up the pairs; of these: 339022 (81.14%) aligned 0 times 19023 (4.55%) aligned exactly 1 time 59777 (14.31%) aligned >1 times 95.48% overall alignment rate [samopen] SAM header is present: 17 sequences. 5730480 reads; of these: 5730480 (100.00%) were paired; of these: 386018 (6.74%) aligned concordantly 0 times 2621844 (45.75%) aligned concordantly exactly 1 time 2722618 (47.51%) aligned concordantly >1 times ---- 386018 pairs aligned concordantly 0 times; of these: 41425 (10.73%) aligned discordantly 1 time ---- 344593 pairs aligned 0 times concordantly or discordantly; of these: 689186 mates make up the pairs; of these: 562137 (81.57%) aligned 0 times 23883 (3.47%) aligned exactly 1 time 103166 (14.97%) aligned >1 times 95.10% overall alignment rate [samopen] SAM header is present: 17 sequences. 5424443 reads; of these: 5424443 (100.00%) were paired; of these: 395726 (7.30%) aligned concordantly 0 times 2701382 (49.80%) aligned concordantly exactly 1 time 2327335 (42.90%) aligned concordantly >1 times ---- 395726 pairs aligned concordantly 0 times; of these: 45820 (11.58%) aligned discordantly 1 time ---- 349906 pairs aligned 0 times concordantly or discordantly; of these: 699812 mates make up the pairs; of these: 584734 (83.56%) aligned 0 times 23084 (3.30%) aligned exactly 1 time 91994 (13.15%) aligned >1 times 94.61% overall alignment rate [samopen] SAM header is present: 17 sequences. 
2730465 reads; of these: 2730465 (100.00%) were paired; of these: 197625 (7.24%) aligned concordantly 0 times 1285887 (47.09%) aligned concordantly exactly 1 time 1246953 (45.67%) aligned concordantly >1 times ---- 197625 pairs aligned concordantly 0 times; of these: 20746 (10.50%) aligned discordantly 1 time ---- 176879 pairs aligned 0 times concordantly or discordantly; of these: 353758 mates make up the pairs; of these: 294213 (83.17%) aligned 0 times 10886 (3.08%) aligned exactly 1 time 48659 (13.75%) aligned >1 times 94.61% overall alignment rate [samopen] SAM header is present: 17 sequences. 5083822 reads; of these: 5083822 (100.00%) were paired; of these: 321386 (6.32%) aligned concordantly 0 times 2434488 (47.89%) aligned concordantly exactly 1 time 2327948 (45.79%) aligned concordantly >1 times ---- 321386 pairs aligned concordantly 0 times; of these: 36012 (11.21%) aligned discordantly 1 time ---- 285374 pairs aligned 0 times concordantly or discordantly; of these: 570748 mates make up the pairs; of these: 463098 (81.14%) aligned 0 times 23280 (4.08%) aligned exactly 1 time 84370 (14.78%) aligned >1 times 95.45% overall alignment rate [samopen] SAM header is present: 17 sequences. 2099451 reads; of these: 2099451 (100.00%) were paired; of these: 147418 (7.02%) aligned concordantly 0 times 1012484 (48.23%) aligned concordantly exactly 1 time 939549 (44.75%) aligned concordantly >1 times ---- 147418 pairs aligned concordantly 0 times; of these: 14783 (10.03%) aligned discordantly 1 time ---- 132635 pairs aligned 0 times concordantly or discordantly; of these: 265270 mates make up the pairs; of these: 217635 (82.04%) aligned 0 times 10259 (3.87%) aligned exactly 1 time 37376 (14.09%) aligned >1 times 94.82% overall alignment rate [samopen] SAM header is present: 17 sequences. 
3826625 reads; of these: 3826625 (100.00%) were paired; of these: 204408 (5.34%) aligned concordantly 0 times 1954683 (51.08%) aligned concordantly exactly 1 time 1667534 (43.58%) aligned concordantly >1 times ---- 204408 pairs aligned concordantly 0 times; of these: 27036 (13.23%) aligned discordantly 1 time ---- 177372 pairs aligned 0 times concordantly or discordantly; of these: 354744 mates make up the pairs; of these: 279888 (78.90%) aligned 0 times 18421 (5.19%) aligned exactly 1 time 56435 (15.91%) aligned >1 times 96.34% overall alignment rate [samopen] SAM header is present: 17 sequences. 5051131 reads; of these: 5051131 (100.00%) were paired; of these: 386296 (7.65%) aligned concordantly 0 times 2359493 (46.71%) aligned concordantly exactly 1 time 2305342 (45.64%) aligned concordantly >1 times ---- 386296 pairs aligned concordantly 0 times; of these: 37784 (9.78%) aligned discordantly 1 time ---- 348512 pairs aligned 0 times concordantly or discordantly; of these: 697024 mates make up the pairs; of these: 586570 (84.15%) aligned 0 times 23608 (3.39%) aligned exactly 1 time 86846 (12.46%) aligned >1 times 94.19% overall alignment rate [samopen] SAM header is present: 17 sequences. 3220260 reads; of these: 3220260 (100.00%) were paired; of these: 195199 (6.06%) aligned concordantly 0 times 1415289 (43.95%) aligned concordantly exactly 1 time 1609772 (49.99%) aligned concordantly >1 times ---- 195199 pairs aligned concordantly 0 times; of these: 19854 (10.17%) aligned discordantly 1 time ---- 175345 pairs aligned 0 times concordantly or discordantly; of these: 350690 mates make up the pairs; of these: 277732 (79.20%) aligned 0 times 13971 (3.98%) aligned exactly 1 time 58987 (16.82%) aligned >1 times 95.69% overall alignment rate [samopen] SAM header is present: 17 sequences. 
4839116 reads; of these: 4839116 (100.00%) were paired; of these: 338946 (7.00%) aligned concordantly 0 times 2414054 (49.89%) aligned concordantly exactly 1 time 2086116 (43.11%) aligned concordantly >1 times ---- 338946 pairs aligned concordantly 0 times; of these: 39298 (11.59%) aligned discordantly 1 time ---- 299648 pairs aligned 0 times concordantly or discordantly; of these: 599296 mates make up the pairs; of these: 496228 (82.80%) aligned 0 times 20916 (3.49%) aligned exactly 1 time 82152 (13.71%) aligned >1 times 94.87% overall alignment rate [samopen] SAM header is present: 17 sequences. 5214554 reads; of these: 5214554 (100.00%) were paired; of these: 343663 (6.59%) aligned concordantly 0 times 2410902 (46.23%) aligned concordantly exactly 1 time 2459989 (47.18%) aligned concordantly >1 times ---- 343663 pairs aligned concordantly 0 times; of these: 45537 (13.25%) aligned discordantly 1 time ---- 298126 pairs aligned 0 times concordantly or discordantly; of these: 596252 mates make up the pairs; of these: 466541 (78.25%) aligned 0 times 25543 (4.28%) aligned exactly 1 time 104168 (17.47%) aligned >1 times 95.53% overall alignment rate [samopen] SAM header is present: 17 sequences. 4121779 reads; of these: 4121779 (100.00%) were paired; of these: 251510 (6.10%) aligned concordantly 0 times 2110313 (51.20%) aligned concordantly exactly 1 time 1759956 (42.70%) aligned concordantly >1 times ---- 251510 pairs aligned concordantly 0 times; of these: 28803 (11.45%) aligned discordantly 1 time ---- 222707 pairs aligned 0 times concordantly or discordantly; of these: 445414 mates make up the pairs; of these: 361999 (81.27%) aligned 0 times 20948 (4.70%) aligned exactly 1 time 62467 (14.02%) aligned >1 times 95.61% overall alignment rate [samopen] SAM header is present: 17 sequences. 
2647651 reads; of these: 2647651 (100.00%) were paired; of these: 170893 (6.45%) aligned concordantly 0 times 1362003 (51.44%) aligned concordantly exactly 1 time 1114755 (42.10%) aligned concordantly >1 times ---- 170893 pairs aligned concordantly 0 times; of these: 20633 (12.07%) aligned discordantly 1 time ---- 150260 pairs aligned 0 times concordantly or discordantly; of these: 300520 mates make up the pairs; of these: 244439 (81.34%) aligned 0 times 11547 (3.84%) aligned exactly 1 time 44534 (14.82%) aligned >1 times 95.38% overall alignment rate [samopen] SAM header is present: 17 sequences. 1672478 reads; of these: 1672478 (100.00%) were paired; of these: 105539 (6.31%) aligned concordantly 0 times 801738 (47.94%) aligned concordantly exactly 1 time 765201 (45.75%) aligned concordantly >1 times ---- 105539 pairs aligned concordantly 0 times; of these: 13070 (12.38%) aligned discordantly 1 time ---- 92469 pairs aligned 0 times concordantly or discordantly; of these: 184938 mates make up the pairs; of these: 147183 (79.59%) aligned 0 times 6447 (3.49%) aligned exactly 1 time 31308 (16.93%) aligned >1 times 95.60% overall alignment rate [samopen] SAM header is present: 17 sequences. 5226781 reads; of these: 5226781 (100.00%) were paired; of these: 324036 (6.20%) aligned concordantly 0 times 2518219 (48.18%) aligned concordantly exactly 1 time 2384526 (45.62%) aligned concordantly >1 times ---- 324036 pairs aligned concordantly 0 times; of these: 38437 (11.86%) aligned discordantly 1 time ---- 285599 pairs aligned 0 times concordantly or discordantly; of these: 571198 mates make up the pairs; of these: 458937 (80.35%) aligned 0 times 24576 (4.30%) aligned exactly 1 time 87685 (15.35%) aligned >1 times 95.61% overall alignment rate [samopen] SAM header is present: 17 sequences. 
1558767 reads; of these: 1558767 (100.00%) were paired; of these: 98538 (6.32%) aligned concordantly 0 times 770113 (49.41%) aligned concordantly exactly 1 time 690116 (44.27%) aligned concordantly >1 times ---- 98538 pairs aligned concordantly 0 times; of these: 13586 (13.79%) aligned discordantly 1 time ---- 84952 pairs aligned 0 times concordantly or discordantly; of these: 169904 mates make up the pairs; of these: 134340 (79.07%) aligned 0 times 5663 (3.33%) aligned exactly 1 time 29901 (17.60%) aligned >1 times 95.69% overall alignment rate [samopen] SAM header is present: 17 sequences. 4325623 reads; of these: 4325623 (100.00%) were paired; of these: 259095 (5.99%) aligned concordantly 0 times 2212008 (51.14%) aligned concordantly exactly 1 time 1854520 (42.87%) aligned concordantly >1 times ---- 259095 pairs aligned concordantly 0 times; of these: 31029 (11.98%) aligned discordantly 1 time ---- 228066 pairs aligned 0 times concordantly or discordantly; of these: 456132 mates make up the pairs; of these: 369996 (81.12%) aligned 0 times 20944 (4.59%) aligned exactly 1 time 65192 (14.29%) aligned >1 times 95.72% overall alignment rate [samopen] SAM header is present: 17 sequences. 1915445 reads; of these: 1915445 (100.00%) were paired; of these: 135333 (7.07%) aligned concordantly 0 times 1061050 (55.39%) aligned concordantly exactly 1 time 719062 (37.54%) aligned concordantly >1 times ---- 135333 pairs aligned concordantly 0 times; of these: 16205 (11.97%) aligned discordantly 1 time ---- 119128 pairs aligned 0 times concordantly or discordantly; of these: 238256 mates make up the pairs; of these: 199821 (83.87%) aligned 0 times 10413 (4.37%) aligned exactly 1 time 28022 (11.76%) aligned >1 times 94.78% overall alignment rate [samopen] SAM header is present: 17 sequences. 
4920539 reads; of these: 4920539 (100.00%) were paired; of these: 292247 (5.94%) aligned concordantly 0 times 2455727 (49.91%) aligned concordantly exactly 1 time 2172565 (44.15%) aligned concordantly >1 times ---- 292247 pairs aligned concordantly 0 times; of these: 37831 (12.94%) aligned discordantly 1 time ---- 254416 pairs aligned 0 times concordantly or discordantly; of these: 508832 mates make up the pairs; of these: 403061 (79.21%) aligned 0 times 24560 (4.83%) aligned exactly 1 time 81211 (15.96%) aligned >1 times 95.90% overall alignment rate [samopen] SAM header is present: 17 sequences. 2474735 reads; of these: 2474735 (100.00%) were paired; of these: 162967 (6.59%) aligned concordantly 0 times 1192526 (48.19%) aligned concordantly exactly 1 time 1119242 (45.23%) aligned concordantly >1 times ---- 162967 pairs aligned concordantly 0 times; of these: 17612 (10.81%) aligned discordantly 1 time ---- 145355 pairs aligned 0 times concordantly or discordantly; of these: 290710 mates make up the pairs; of these: 236342 (81.30%) aligned 0 times 11767 (4.05%) aligned exactly 1 time 42601 (14.65%) aligned >1 times 95.22% overall alignment rate
During library preparation, DNA fragments are amplified by PCR, which can lead to duplicate reads. In many kinds of DNA sequencing, we want to remove duplicates so that we don't double-count signal originating from the same molecule.
To do so, we use a tool called sambamba, whose markdup command looks for reads that mapped to exactly the same places in the genome and marks them as duplicates. We also need to sort the aligned files before we can mark duplicates, since we need reads aligned to the same position to be next to each other in the file.
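The idea of sorting first and then comparing neighbours can be illustrated with a toy sketch. This is not what sambamba does internally (real duplicate marking also considers the mate's position and chooses which copy to keep); the function names and the five example reads below are made up for illustration.

```shell
# Toy alignments (hypothetical data): chromosome, 5' position, strand, read name
toy_alignments() {
    printf '%s\t%s\t%s\t%s\n' \
        chrI 100 + readA \
        chrI 100 + readB \
        chrI 100 - readC \
        chrII 250 + readD \
        chrI 100 + readE
}

# Sort by chromosome, position and strand so that reads with identical
# placements become adjacent, then label every repeat of the previous
# placement as "dup" and the first occurrence as "first".
mark_toy_duplicates() {
    sort -k1,1 -k2,2n -k3,3 | awk 'BEGIN { FS = OFS = "\t" }
        {
            key = $1 FS $2 FS $3
            print $0, (key == prev ? "dup" : "first")
            prev = key
        }'
}

toy_alignments | mark_toy_duplicates
```

Because each read is only compared with the one directly before it, the sketch gives wrong answers on unsorted input, which is why the sort step comes first.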
Bowtie2 also sets certain labels (or "flags") in the resulting alignment file to indicate information like whether a read aligned, the orientation of both mates of the fragment, and other details. It also reports a separate mapping quality score for each alignment.
We can use these flags as a way to discard low-quality reads. This website provides a convenient way to interpret the meaning of these bitwise flags; for convenience, a combination of flags is encoded as a single number.
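Here, we want to filter out reads that fall into any of the following categories (together these make up the flag value 1804 used below): the read is unmapped (4), its mate is unmapped (8), the alignment is not primary (256), the read fails platform/vendor quality checks (512), or the read is a PCR or optical duplicate (1024). We additionally keep only properly paired reads (-f 2) with a mapping quality of at least 30 (-q 30).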
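for bam_file in "${ALIGNMENT_DIR}"*.trimmed.bam; do
    # Derive the output names from the input name (x.trimmed.bam -> x.trimmed.sorted.bam, ...)
    bam_file_sorted="${bam_file%.bam}.sorted.bam"
    bam_file_dup="${bam_file%.bam}.sorted.dup.bam"
    nodup_bam_file="${bam_file%.bam}.nodup.bam"
    # Sort by coordinate; by default sambamba writes the result to <input name>.sorted.bam
    sambamba sort -m 4G -t 35 -u "$bam_file"
    # Mark duplicates (this sets the duplicate flag; it does not remove any reads)
    sambamba markdup -l 0 -t 35 "$bam_file_sorted" "$bam_file_dup"
    # Drop reads with any flag in 1804 (including duplicates); keep proper pairs with MAPQ >= 30
    samtools view -F 1804 -f 2 -q 30 -b "$bam_file_dup" > "$nodup_bam_file"
done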
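The value passed to -F in the samtools command above is a sum of individual flag bits. As a minimal sketch, the helper below (decode_sam_flag is a hypothetical name, not part of samtools) splits any flag value into its component bits, using the bit meanings defined in the SAM format specification.

```shell
# Print every SAM flag bit that is set in the given number, one per line.
decode_sam_flag() {
    flag=$1
    while IFS=: read -r bit meaning; do
        # Bitwise AND is non-zero only when this bit is part of the flag
        if [ $(( flag & bit )) -ne 0 ]; then
            printf '%s\t%s\n' "$bit" "$meaning"
        fi
    done <<'EOF'
1:read paired
2:read mapped in proper pair
4:read unmapped
8:mate unmapped
16:read on reverse strand
32:mate on reverse strand
64:first in pair
128:second in pair
256:not primary alignment
512:read fails platform/vendor quality checks
1024:read is PCR or optical duplicate
2048:supplementary alignment
EOF
}

decode_sam_flag 1804
```

Running decode_sam_flag 2 in the same way shows the single bit required by -f 2.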
finding positions of the duplicate reads in the file... sorted 3708826 end pairs and 31512 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 554 ms found 2776275 duplicates collected list of positions in 0 min 15 sec marking duplicates... total time elapsed: 0 min 26 sec finding positions of the duplicate reads in the file... sorted 2865285 end pairs and 23963 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 454 ms found 1966322 duplicates collected list of positions in 0 min 11 sec marking duplicates... total time elapsed: 0 min 21 sec finding positions of the duplicate reads in the file... sorted 5022160 end pairs and 44135 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 747 ms found 3827108 duplicates collected list of positions in 0 min 22 sec marking duplicates... total time elapsed: 0 min 38 sec finding positions of the duplicate reads in the file... sorted 2523868 end pairs and 20327 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 382 ms found 1700748 duplicates collected list of positions in 0 min 10 sec marking duplicates... total time elapsed: 0 min 19 sec finding positions of the duplicate reads in the file... sorted 3480866 end pairs and 31935 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 500 ms found 2030581 duplicates collected list of positions in 0 min 14 sec marking duplicates... total time elapsed: 0 min 25 sec finding positions of the duplicate reads in the file... sorted 2520482 end pairs and 18393 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 369 ms found 1622875 duplicates collected list of positions in 0 min 12 sec marking duplicates... total time elapsed: 0 min 20 sec finding positions of the duplicate reads in the file... 
sorted 4881409 end pairs and 41570 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 720 ms found 3611821 duplicates collected list of positions in 0 min 22 sec marking duplicates... total time elapsed: 0 min 38 sec finding positions of the duplicate reads in the file... sorted 4912698 end pairs and 41678 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 744 ms found 3705367 duplicates collected list of positions in 0 min 21 sec marking duplicates... total time elapsed: 0 min 37 sec finding positions of the duplicate reads in the file... sorted 2683945 end pairs and 23058 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 396 ms found 1533235 duplicates collected list of positions in 0 min 10 sec marking duplicates... total time elapsed: 0 min 21 sec finding positions of the duplicate reads in the file... sorted 3275693 end pairs and 30186 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 493 ms found 2003168 duplicates collected list of positions in 0 min 14 sec marking duplicates... total time elapsed: 0 min 25 sec finding positions of the duplicate reads in the file... sorted 5022433 end pairs and 41150 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 807 ms found 4065974 duplicates collected list of positions in 0 min 21 sec marking duplicates... total time elapsed: 0 min 36 sec finding positions of the duplicate reads in the file... sorted 4682094 end pairs and 42060 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 671 ms found 2886654 duplicates collected list of positions in 0 min 19 sec marking duplicates... total time elapsed: 0 min 35 sec finding positions of the duplicate reads in the file... 
sorted 3560579 end pairs and 34232 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 525 ms found 2277758 duplicates collected list of positions in 0 min 15 sec marking duplicates... total time elapsed: 0 min 26 sec finding positions of the duplicate reads in the file... sorted 5426962 end pairs and 44899 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 841 ms found 4303588 duplicates collected list of positions in 0 min 24 sec marking duplicates... total time elapsed: 0 min 42 sec finding positions of the duplicate reads in the file... sorted 5112141 end pairs and 39870 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 768 ms found 3738001 duplicates collected list of positions in 0 min 20 sec marking duplicates... total time elapsed: 0 min 36 sec finding positions of the duplicate reads in the file... sorted 2573575 end pairs and 19567 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 396 ms found 1656756 duplicates collected list of positions in 0 min 10 sec marking duplicates... total time elapsed: 0 min 19 sec finding positions of the duplicate reads in the file... sorted 4831273 end pairs and 42000 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 720 ms found 3466360 duplicates collected list of positions in 0 min 21 sec marking duplicates... total time elapsed: 0 min 36 sec finding positions of the duplicate reads in the file... sorted 1981568 end pairs and 18131 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 309 ms found 1107574 duplicates collected list of positions in 0 min 7 sec marking duplicates... total time elapsed: 0 min 14 sec finding positions of the duplicate reads in the file... 
sorted 3670288 end pairs and 32786 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 531 ms found 2220120 duplicates collected list of positions in 0 min 16 sec marking duplicates... total time elapsed: 0 min 28 sec finding positions of the duplicate reads in the file... sorted 4737370 end pairs and 40952 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 734 ms found 3499777 duplicates collected list of positions in 0 min 19 sec marking duplicates... total time elapsed: 0 min 37 sec finding positions of the duplicate reads in the file... sorted 3067807 end pairs and 27174 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 454 ms found 2142556 duplicates collected list of positions in 0 min 14 sec marking duplicates... total time elapsed: 0 min 24 sec finding positions of the duplicate reads in the file... sorted 4573445 end pairs and 35114 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 683 ms found 3228321 duplicates collected list of positions in 0 min 19 sec marking duplicates... total time elapsed: 0 min 34 sec finding positions of the duplicate reads in the file... sorted 4959133 end pairs and 44301 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 742 ms found 3804377 duplicates collected list of positions in 0 min 21 sec marking duplicates... total time elapsed: 0 min 38 sec finding positions of the duplicate reads in the file... sorted 3922938 end pairs and 35683 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 572 ms found 2420679 duplicates collected list of positions in 0 min 16 sec marking duplicates... total time elapsed: 0 min 28 sec finding positions of the duplicate reads in the file... 
sorted 2515406 end pairs and 20051 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 369 ms found 1395162 duplicates collected list of positions in 0 min 9 sec marking duplicates... total time elapsed: 0 min 18 sec finding positions of the duplicate reads in the file... sorted 1593007 end pairs and 11759 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 241 ms found 863095 duplicates collected list of positions in 0 min 8 sec marking duplicates... total time elapsed: 0 min 14 sec finding positions of the duplicate reads in the file... sorted 4975314 end pairs and 43997 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 767 ms found 3507605 duplicates collected list of positions in 0 min 22 sec marking duplicates... total time elapsed: 0 min 40 sec finding positions of the duplicate reads in the file... sorted 1486480 end pairs and 10234 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 220 ms found 786623 duplicates collected list of positions in 0 min 6 sec marking duplicates... total time elapsed: 0 min 11 sec finding positions of the duplicate reads in the file... sorted 4122374 end pairs and 36502 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 601 ms found 2518389 duplicates collected list of positions in 0 min 17 sec marking duplicates... total time elapsed: 0 min 32 sec finding positions of the duplicate reads in the file... sorted 1806945 end pairs and 17179 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 264 ms found 742469 duplicates collected list of positions in 0 min 6 sec marking duplicates... total time elapsed: 0 min 13 sec finding positions of the duplicate reads in the file... sorted 4697951 end pairs and 42115 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... 
done in 689 ms found 3255644 duplicates collected list of positions in 0 min 19 sec marking duplicates... total time elapsed: 0 min 35 sec finding positions of the duplicate reads in the file... sorted 2346096 end pairs and 20936 single ends (among them 0 unmatched pairs) collecting indices of duplicate reads... done in 346 ms found 1381321 duplicates collected list of positions in 0 min 9 sec marking duplicates... total time elapsed: 0 min 16 sec
Now that we've aligned our reads to the genome and filtered the alignments, we want to identify regions of locally enriched signal, or "peaks".
For ATAC-seq, peaks correspond to accessible regions. They can include promoters, enhancers, and other regulatory regions.
We'll call peaks using MACS2.
# Create a directory to store the tagAlign data
TAGALIGN_DIR="${ANALYSIS_DIR}tagAlign/"
mkdir -p "$TAGALIGN_DIR"
# Create a directory to store the MACS peaks
PEAKS_DIR="${ANALYSIS_DIR}peaks/"
mkdir -p "$PEAKS_DIR"
echo "$PEAKS_DIR"
/srv/scratch/training_camp/tc2016/user23/analysis/peaks/
SacCer3GenSz=12157105 # The sum of the sizes of the chromosomes in the SacCer3 genome
Macs2PvalThresh="0.05" # The p-value threshold for calling peaks
Macs2SmoothWindow=150 # The window size to smooth alignment signal over
Macs2ShiftSize=$((Macs2SmoothWindow / 2)) # Half of the smoothing window
for nodup_bam_file in "${ALIGNMENT_DIR}"*.nodup.bam; do
    # First, we need to convert each bam to a .tagAlign,
    # which just contains the start/end positions of each read.
    tagalign_file="${TAGALIGN_DIR}$(basename "$nodup_bam_file" .bam).tagAlign.gz"
    # The conversion is commented out here; uncomment it if the .tagAlign.gz files do not exist yet.
    #bedtools bamtobed -i "$nodup_bam_file" | awk 'BEGIN{OFS="\t"}{$4="N";$5="1000";print $0}' | gzip -c > "$tagalign_file"
    # Now, we can run MACS:
    output_prefix="${PEAKS_DIR}$(basename "$tagalign_file" .tagAlign.gz)"
    macs2 callpeak \
        -t "$tagalign_file" -f BED -n "$output_prefix" -g "$SacCer3GenSz" -p "$Macs2PvalThresh" \
        --nomodel --shift "-$Macs2ShiftSize" --extsize "$Macs2SmoothWindow" -B --SPMR --keep-dup all --call-summits
    # We also generate a fold-enrichment file comparing the sample pileup to the local background
    # (control lambda) estimated by MACS2; no separate control sample is passed to callpeak here.
    macs2 bdgcmp -t "${output_prefix}_treat_pileup.bdg" -c "${output_prefix}_control_lambda.bdg" -o "${output_prefix}_FE.bdg" -m FE
done
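The combination of --shift and --extsize used above moves the 5' end of each read upstream by half the smoothing window and then extends it by the full window, so each read contributes a window centered on its 5' end. The sketch below reproduces that arithmetic on tagAlign-style lines. It is an illustration only: centered_windows is a hypothetical helper, the two example reads are made up, and the off-by-one details of BED coordinates are ignored.

```shell
# Read tagAlign-style lines (chrom, start, end, name, score, strand) on stdin
# and print the window centered on each read's 5' end.
# $1 is the window size, e.g. the value of Macs2SmoothWindow.
centered_windows() {
    awk -v w="$1" 'BEGIN { FS = OFS = "\t"; half = int(w / 2) }
        {
            # The 5-prime end is the start for "+" reads and the end for "-" reads
            five_prime = ($6 == "+") ? $2 : $3
            start = five_prime - half
            if (start < 0) start = 0   # do not run off the chromosome start
            print $1, start, five_prime + half, $6
        }'
}

printf 'chrI\t100\t160\tN\t1000\t+\nchrI\t300\t360\tN\t1000\t-\n' | centered_windows 150
```

With a window of 150, the "+" read starting at 100 gives the window 25 to 175, and the "-" read ending at 360 gives 285 to 435.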
INFO @ Tue, 13 Sep 2016 15:11:45: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_1_S22_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_1_S22_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:11:45: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:11:45: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:11:47: 1000000 INFO @ Tue, 13 Sep 2016 15:11:48: 2000000 INFO @ Tue, 13 Sep 2016 15:11:50: 3000000 INFO @ Tue, 13 Sep 2016 15:11:50: #1 tag size is determined as 60 bps INFO @ Tue, 13 Sep 2016 15:11:50: #1 tag size = 60 INFO @ Tue, 13 Sep 2016 15:11:50: #1 total tags in treatment: 3134658 INFO @ Tue, 13 Sep 2016 15:11:50: #1 finished! INFO @ Tue, 13 Sep 2016 15:11:50: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:11:50: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:11:50: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:11:50: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:11:50: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:11:50: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:11:50: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:11:50: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:12:06: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:12:06: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:12:06: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:12:06: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:12:06: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:12:16: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:12:16: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:12:16: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:12:16: Done! INFO @ Tue, 13 Sep 2016 15:12:17: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:12:20: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:12:22: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:12:23: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:13:04: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:13:09: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_1_S22_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:13:11: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_2_S23_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_2_S23_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:13:11: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:13:11: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:13:13: 1000000 INFO @ Tue, 13 Sep 2016 15:13:14: 2000000 INFO @ Tue, 13 Sep 2016 15:13:15: #1 tag size is determined as 58 bps INFO @ Tue, 13 Sep 2016 15:13:15: #1 tag size = 58 INFO @ Tue, 13 Sep 2016 15:13:15: #1 total tags in treatment: 2487090 INFO @ Tue, 13 Sep 2016 15:13:15: #1 finished! INFO @ Tue, 13 Sep 2016 15:13:15: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:13:15: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:13:15: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:13:15: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:13:15: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:13:15: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:13:15: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:13:15: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:13:27: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:13:27: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:13:27: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:13:27: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:13:27: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:13:36: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:13:36: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:13:36: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:13:36: Done! INFO @ Tue, 13 Sep 2016 15:13:37: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:13:40: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:13:41: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:13:42: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:14:19: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:14:24: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_2_S23_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:14:26: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_300_S3_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_300_S3_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:14:26: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:14:26: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:14:27: 1000000 INFO @ Tue, 13 Sep 2016 15:14:29: 2000000 INFO @ Tue, 13 Sep 2016 15:14:31: 3000000 INFO @ Tue, 13 Sep 2016 15:14:32: 4000000 INFO @ Tue, 13 Sep 2016 15:14:33: #1 tag size is determined as 62 bps INFO @ Tue, 13 Sep 2016 15:14:33: #1 tag size = 62 INFO @ Tue, 13 Sep 2016 15:14:33: #1 total tags in treatment: 4276010 INFO @ Tue, 13 Sep 2016 15:14:33: #1 finished! INFO @ Tue, 13 Sep 2016 15:14:33: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:14:33: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:14:33: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:14:33: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:14:33: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:14:33: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:14:33: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:14:33: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:14:55: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:14:55: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:14:55: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:14:55: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:14:55: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:15:08: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:15:08: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:15:08: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:15:08: Done! INFO @ Tue, 13 Sep 2016 15:15:09: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:15:13: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:15:15: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:15:16: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:16:07: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:16:14: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_300_S3_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:16:16: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_3_S24_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_3_S24_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:16:16: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:16:16: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:16:17: 1000000 INFO @ Tue, 13 Sep 2016 15:16:19: 2000000 INFO @ Tue, 13 Sep 2016 15:16:19: #1 tag size is determined as 56 bps INFO @ Tue, 13 Sep 2016 15:16:19: #1 tag size = 56 INFO @ Tue, 13 Sep 2016 15:16:19: #1 total tags in treatment: 2141156 INFO @ Tue, 13 Sep 2016 15:16:19: #1 finished! INFO @ Tue, 13 Sep 2016 15:16:19: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:16:19: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:16:19: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:16:19: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:16:19: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:16:19: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:16:19: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:16:19: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:16:30: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:16:30: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:16:30: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:16:30: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:16:30: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:16:37: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:16:37: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:16:37: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:16:37: Done! INFO @ Tue, 13 Sep 2016 15:16:39: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:16:41: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:16:42: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:16:43: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:17:13: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:17:18: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_3_S24_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:17:19: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_800_S9_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Ct_800_S9_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:17:19: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:17:19: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:17:21: 1000000 INFO @ Tue, 13 Sep 2016 15:17:22: 2000000 INFO @ Tue, 13 Sep 2016 15:17:24: 3000000 INFO @ Tue, 13 Sep 2016 15:17:25: #1 tag size is determined as 66 bps INFO @ Tue, 13 Sep 2016 15:17:25: #1 tag size = 66 INFO @ Tue, 13 Sep 2016 15:17:25: #1 total tags in treatment: 3428526 INFO @ Tue, 13 Sep 2016 15:17:25: #1 finished! INFO @ Tue, 13 Sep 2016 15:17:25: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:17:25: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:17:25: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:17:25: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:17:25: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:17:25: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:17:25: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:17:25: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:17:41: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:17:41: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:17:41: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:17:41: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:17:41: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:17:52: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:17:52: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:17:52: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:17:52: Done! INFO @ Tue, 13 Sep 2016 15:17:54: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:17:57: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:17:59: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:18:00: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:18:48: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:18:55: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Ct_800_S9_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:18:56: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_1_S16_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_1_S16_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:18:56: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:18:56: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:18:58: 1000000 INFO @ Tue, 13 Sep 2016 15:18:59: 2000000 INFO @ Tue, 13 Sep 2016 15:19:00: #1 tag size is determined as 70 bps INFO @ Tue, 13 Sep 2016 15:19:00: #1 tag size = 70 INFO @ Tue, 13 Sep 2016 15:19:00: #1 total tags in treatment: 2313296 INFO @ Tue, 13 Sep 2016 15:19:00: #1 finished! INFO @ Tue, 13 Sep 2016 15:19:00: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:19:00: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:19:00: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:19:00: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:19:00: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:19:00: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:19:00: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:19:00: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:19:10: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:19:10: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:19:10: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:19:10: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:19:10: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:19:19: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:19:19: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:19:19: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:19:19: Done! INFO @ Tue, 13 Sep 2016 15:19:21: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:19:23: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:19:24: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:19:25: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:19:59: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:20:05: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_1_S16_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:20:06: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_2_S17_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_2_S17_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:20:06: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:20:06: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:20:08: 1000000 INFO @ Tue, 13 Sep 2016 15:20:10: 2000000 INFO @ Tue, 13 Sep 2016 15:20:11: 3000000 INFO @ Tue, 13 Sep 2016 15:20:13: 4000000 INFO @ Tue, 13 Sep 2016 15:20:13: #1 tag size is determined as 69 bps INFO @ Tue, 13 Sep 2016 15:20:13: #1 tag size = 69 INFO @ Tue, 13 Sep 2016 15:20:13: #1 total tags in treatment: 4261284 INFO @ Tue, 13 Sep 2016 15:20:13: #1 finished! INFO @ Tue, 13 Sep 2016 15:20:13: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:20:13: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:20:13: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:20:13: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:20:13: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:20:13: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:20:13: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:20:13: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:20:34: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:20:34: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:20:34: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:20:34: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:20:34: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:20:47: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:20:48: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:20:48: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:20:48: Done! INFO @ Tue, 13 Sep 2016 15:20:49: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:20:53: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:20:55: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:20:57: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:21:47: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:21:55: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_2_S17_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:21:56: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_300_S1_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_300_S1_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:21:56: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:21:56: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:21:58: 1000000 INFO @ Tue, 13 Sep 2016 15:21:59: 2000000 INFO @ Tue, 13 Sep 2016 15:22:01: 3000000 INFO @ Tue, 13 Sep 2016 15:22:03: 4000000 INFO @ Tue, 13 Sep 2016 15:22:03: #1 tag size is determined as 67 bps INFO @ Tue, 13 Sep 2016 15:22:03: #1 tag size = 67 INFO @ Tue, 13 Sep 2016 15:22:03: #1 total tags in treatment: 4203982 INFO @ Tue, 13 Sep 2016 15:22:03: #1 finished! INFO @ Tue, 13 Sep 2016 15:22:03: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:22:03: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:22:03: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:22:03: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:22:03: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:22:03: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:22:03: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:22:03: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:22:24: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:22:24: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:22:24: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:22:24: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:22:24: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:22:36: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:22:36: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:22:36: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:22:36: Done! INFO @ Tue, 13 Sep 2016 15:22:38: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:22:42: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:22:43: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:22:45: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:23:38: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:23:45: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_300_S1_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:23:47: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_3_S18_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_3_S18_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:23:47: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:23:47: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:23:49: 1000000 INFO @ Tue, 13 Sep 2016 15:23:50: 2000000 INFO @ Tue, 13 Sep 2016 15:23:51: #1 tag size is determined as 64 bps INFO @ Tue, 13 Sep 2016 15:23:51: #1 tag size = 64 INFO @ Tue, 13 Sep 2016 15:23:51: #1 total tags in treatment: 2600876 INFO @ Tue, 13 Sep 2016 15:23:51: #1 finished! INFO @ Tue, 13 Sep 2016 15:23:51: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:23:51: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:23:51: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:23:51: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:23:51: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:23:51: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:23:51: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:23:51: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:24:01: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:24:01: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:24:01: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:24:01: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:24:01: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:24:12: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:24:12: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:24:12: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:24:12: Done! INFO @ Tue, 13 Sep 2016 15:24:13: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:24:16: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:24:18: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:24:19: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:25:02: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:25:09: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_3_S18_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:25:10: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_800_S7_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Cz_800_S7_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:25:10: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:25:10: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:25:12: 1000000 INFO @ Tue, 13 Sep 2016 15:25:14: 2000000 INFO @ Tue, 13 Sep 2016 15:25:15: 3000000 INFO @ Tue, 13 Sep 2016 15:25:16: #1 tag size is determined as 68 bps INFO @ Tue, 13 Sep 2016 15:25:16: #1 tag size = 68 INFO @ Tue, 13 Sep 2016 15:25:16: #1 total tags in treatment: 3080562 INFO @ Tue, 13 Sep 2016 15:25:16: #1 finished! INFO @ Tue, 13 Sep 2016 15:25:16: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:25:16: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:25:16: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:25:16: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:25:16: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:25:16: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:25:16: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:25:16: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:25:30: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:25:30: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:25:30: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:25:30: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:25:30: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:25:41: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:25:41: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:25:41: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:25:41: Done! INFO @ Tue, 13 Sep 2016 15:25:42: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:25:45: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:25:47: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:25:48: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:26:31: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:26:37: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Cz_800_S7_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:26:39: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_1_S31_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_1_S31_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:26:39: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:26:39: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:26:41: 1000000 INFO @ Tue, 13 Sep 2016 15:26:42: 2000000 INFO @ Tue, 13 Sep 2016 15:26:44: 3000000 INFO @ Tue, 13 Sep 2016 15:26:45: 4000000 INFO @ Tue, 13 Sep 2016 15:26:46: #1 tag size is determined as 54 bps INFO @ Tue, 13 Sep 2016 15:26:46: #1 tag size = 54 INFO @ Tue, 13 Sep 2016 15:26:46: #1 total tags in treatment: 4070582 INFO @ Tue, 13 Sep 2016 15:26:46: #1 finished! INFO @ Tue, 13 Sep 2016 15:26:46: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:26:46: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:26:46: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:26:46: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:26:46: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:26:46: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:26:46: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:26:46: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:27:08: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:27:08: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:27:08: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:27:08: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:27:08: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:27:20: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:27:20: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:27:20: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:27:20: Done! INFO @ Tue, 13 Sep 2016 15:27:21: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:27:24: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:27:26: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:27:28: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:28:11: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:28:17: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S31_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:28:19: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_1_S6_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_1_S6_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:28:19: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:28:19: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:28:21: 1000000 INFO @ Tue, 13 Sep 2016 15:28:22: 2000000 INFO @ Tue, 13 Sep 2016 15:28:24: 3000000 INFO @ Tue, 13 Sep 2016 15:28:26: 4000000 INFO @ Tue, 13 Sep 2016 15:28:27: #1 tag size is determined as 55 bps INFO @ Tue, 13 Sep 2016 15:28:27: #1 tag size = 55 INFO @ Tue, 13 Sep 2016 15:28:27: #1 total tags in treatment: 4729346 INFO @ Tue, 13 Sep 2016 15:28:27: #1 finished! INFO @ Tue, 13 Sep 2016 15:28:27: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:28:27: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:28:27: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:28:27: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:28:27: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:28:27: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:28:27: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:28:27: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:28:50: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:28:50: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:28:50: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:28:50: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:28:50: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:29:04: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:29:04: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:29:04: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:29:04: Done! INFO @ Tue, 13 Sep 2016 15:29:06: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:29:10: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:29:12: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:29:14: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:30:09: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:30:17: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_1_S6_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:30:18: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_2_S12_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_2_S12_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:30:18: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:30:18: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:30:20: 1000000 INFO @ Tue, 13 Sep 2016 15:30:21: 2000000 INFO @ Tue, 13 Sep 2016 15:30:23: 3000000 INFO @ Tue, 13 Sep 2016 15:30:24: #1 tag size is determined as 70 bps INFO @ Tue, 13 Sep 2016 15:30:24: #1 tag size = 70 INFO @ Tue, 13 Sep 2016 15:30:24: #1 total tags in treatment: 3280996 INFO @ Tue, 13 Sep 2016 15:30:24: #1 finished! INFO @ Tue, 13 Sep 2016 15:30:24: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:30:24: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:30:24: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:30:24: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:30:24: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:30:24: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:30:24: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:30:24: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:30:40: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:30:40: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:30:40: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:30:40: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:30:40: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:30:51: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:30:51: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:30:51: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:30:51: Done! INFO @ Tue, 13 Sep 2016 15:30:52: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:30:56: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:30:57: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:30:58: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:31:40: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:31:46: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S12_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:31:47: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_2_S32_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/DMSO_2_S32_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:31:47: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:31:47: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:31:49: 1000000 INFO @ Tue, 13 Sep 2016 15:31:51: 2000000 INFO @ Tue, 13 Sep 2016 15:31:52: 3000000 INFO @ Tue, 13 Sep 2016 15:31:54: 4000000 INFO @ Tue, 13 Sep 2016 15:31:55: #1 tag size is determined as 65 bps INFO @ Tue, 13 Sep 2016 15:31:55: #1 tag size = 65 INFO @ Tue, 13 Sep 2016 15:31:55: #1 total tags in treatment: 4567168 INFO @ Tue, 13 Sep 2016 15:31:55: #1 finished! INFO @ Tue, 13 Sep 2016 15:31:55: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:31:55: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:31:55: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:31:55: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:31:55: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:31:55: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:31:55: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:31:55: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:32:19: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:32:19: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:32:19: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:32:19: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:32:19: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:32:32: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:32:32: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:32:32: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:32:32: Done! INFO @ Tue, 13 Sep 2016 15:32:34: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:32:38: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:32:40: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:32:41: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:33:33: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:33:41: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/DMSO_2_S32_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:33:42: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_1_S25_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_1_S25_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:33:42: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:33:42: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:33:44: 1000000 INFO @ Tue, 13 Sep 2016 15:33:45: 2000000 INFO @ Tue, 13 Sep 2016 15:33:47: 3000000 INFO @ Tue, 13 Sep 2016 15:33:49: 4000000 INFO @ Tue, 13 Sep 2016 15:33:50: #1 tag size is determined as 63 bps INFO @ Tue, 13 Sep 2016 15:33:50: #1 tag size = 63 INFO @ Tue, 13 Sep 2016 15:33:50: #1 total tags in treatment: 4712826 INFO @ Tue, 13 Sep 2016 15:33:50: #1 finished! INFO @ Tue, 13 Sep 2016 15:33:50: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:33:50: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:33:50: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:33:50: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:33:50: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:33:50: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:33:50: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:33:50: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:34:15: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:34:15: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:34:15: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:34:15: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:34:15: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:34:29: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:34:29: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:34:29: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:34:29: Done! INFO @ Tue, 13 Sep 2016 15:34:30: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:34:34: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:34:36: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:34:38: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:34:49: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:34:57: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_1_S25_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:34:59: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_2_S26_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_2_S26_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:34:59: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:34:59: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:35:00: 1000000 INFO @ Tue, 13 Sep 2016 15:35:02: 2000000 INFO @ Tue, 13 Sep 2016 15:35:02: #1 tag size is determined as 70 bps INFO @ Tue, 13 Sep 2016 15:35:02: #1 tag size = 70 INFO @ Tue, 13 Sep 2016 15:35:02: #1 total tags in treatment: 2334276 INFO @ Tue, 13 Sep 2016 15:35:02: #1 finished! INFO @ Tue, 13 Sep 2016 15:35:02: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:35:02: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:35:02: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:35:02: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:35:02: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:35:02: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:35:02: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:35:02: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:35:13: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:35:13: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:35:13: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:35:13: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:35:13: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:35:22: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:35:22: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:35:22: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:35:22: Done! INFO @ Tue, 13 Sep 2016 15:35:23: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:35:26: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:35:27: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:35:28: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:35:35: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:35:40: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_2_S26_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:35:42: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_300_S5_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_300_S5_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:35:42: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:35:42: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:35:43: 1000000 INFO @ Tue, 13 Sep 2016 15:35:45: 2000000 INFO @ Tue, 13 Sep 2016 15:35:47: 3000000 INFO @ Tue, 13 Sep 2016 15:35:48: 4000000 INFO @ Tue, 13 Sep 2016 15:35:49: #1 tag size is determined as 67 bps INFO @ Tue, 13 Sep 2016 15:35:49: #1 tag size = 67 INFO @ Tue, 13 Sep 2016 15:35:49: #1 total tags in treatment: 4322888 INFO @ Tue, 13 Sep 2016 15:35:49: #1 finished! INFO @ Tue, 13 Sep 2016 15:35:49: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:35:49: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:35:49: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:35:49: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:35:49: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:35:49: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:35:49: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:35:49: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:36:11: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:36:11: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:36:11: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:36:11: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:36:11: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:36:24: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:36:24: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:36:24: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:36:24: Done! INFO @ Tue, 13 Sep 2016 15:36:25: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:36:29: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:36:31: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:36:32: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:36:43: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:36:51: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_300_S5_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:36:52: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_3_S27_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_3_S27_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:36:52: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:36:52: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:36:54: 1000000 INFO @ Tue, 13 Sep 2016 15:36:55: #1 tag size is determined as 63 bps INFO @ Tue, 13 Sep 2016 15:36:55: #1 tag size = 63 INFO @ Tue, 13 Sep 2016 15:36:55: #1 total tags in treatment: 1877922 INFO @ Tue, 13 Sep 2016 15:36:55: #1 finished! INFO @ Tue, 13 Sep 2016 15:36:55: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:36:55: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:36:55: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:36:55: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:36:55: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:36:55: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:36:55: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:36:55: #3 Pre-compute pvalue-qvalue table... 
INFO @ Tue, 13 Sep 2016 15:37:04: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:37:04: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:37:04: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:37:04: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:37:04: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:37:11: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:37:11: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:37:11: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:37:11: Done! INFO @ Tue, 13 Sep 2016 15:37:13: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:37:15: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:37:16: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:37:16: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:37:22: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:37:26: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_3_S27_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:37:28:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_800_S11_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/It_800_S11_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:37:28: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:37:28: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:37:29: 1000000
INFO @ Tue, 13 Sep 2016 15:37:31: 2000000
INFO @ Tue, 13 Sep 2016 15:37:33: 3000000
INFO @ Tue, 13 Sep 2016 15:37:34: #1 tag size is determined as 67 bps
INFO @ Tue, 13 Sep 2016 15:37:34: #1 tag size = 67
INFO @ Tue, 13 Sep 2016 15:37:34: #1 total tags in treatment: 3547310
INFO @ Tue, 13 Sep 2016 15:37:34: #1 finished!
INFO @ Tue, 13 Sep 2016 15:37:34: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:37:34: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:37:34: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:37:34: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:37:34: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:37:34: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:37:34: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:37:34: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:37:51: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:37:51: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:37:51: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:37:51: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:37:51: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:38:03: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:38:03: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:38:03: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:38:03: Done!
INFO @ Tue, 13 Sep 2016 15:38:04: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:38:08: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:38:09: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:38:10: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:38:20: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:38:27: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/It_800_S11_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:38:29:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_1_S13_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_1_S13_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:38:29: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:38:29: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:38:30: 1000000
INFO @ Tue, 13 Sep 2016 15:38:32: 2000000
INFO @ Tue, 13 Sep 2016 15:38:33: 3000000
INFO @ Tue, 13 Sep 2016 15:38:35: 4000000
INFO @ Tue, 13 Sep 2016 15:38:35: #1 tag size is determined as 60 bps
INFO @ Tue, 13 Sep 2016 15:38:35: #1 tag size = 60
INFO @ Tue, 13 Sep 2016 15:38:35: #1 total tags in treatment: 4188178
INFO @ Tue, 13 Sep 2016 15:38:35: #1 finished!
INFO @ Tue, 13 Sep 2016 15:38:35: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:38:35: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:38:35: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:38:35: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:38:35: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:38:35: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:38:35: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:38:35: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:38:55: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:38:55: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:38:55: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:38:55: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:38:55: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:39:08: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:39:08: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:39:08: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:39:08: Done!
INFO @ Tue, 13 Sep 2016 15:39:10: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:39:14: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:39:16: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:39:17: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:39:29: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:39:36: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_1_S13_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:39:38:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_2_S14_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_2_S14_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:39:38: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:39:38: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:39:39: 1000000
INFO @ Tue, 13 Sep 2016 15:39:41: 2000000
INFO @ Tue, 13 Sep 2016 15:39:42: #1 tag size is determined as 63 bps
INFO @ Tue, 13 Sep 2016 15:39:42: #1 tag size = 63
INFO @ Tue, 13 Sep 2016 15:39:42: #1 total tags in treatment: 2566722
INFO @ Tue, 13 Sep 2016 15:39:42: #1 finished!
INFO @ Tue, 13 Sep 2016 15:39:42: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:39:42: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:39:42: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:39:42: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:39:42: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:39:42: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:39:42: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:39:42: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:39:54: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:39:54: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:39:54: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:39:54: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:39:54: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:40:03: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:40:03: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:40:03: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:40:03: Done!
INFO @ Tue, 13 Sep 2016 15:40:04: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:40:07: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:40:08: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:40:09: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:40:16: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:40:21: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_2_S14_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:40:22:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_3_S15_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kt_3_S15_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:40:22: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:40:22: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:40:24: 1000000
INFO @ Tue, 13 Sep 2016 15:40:25: 2000000
INFO @ Tue, 13 Sep 2016 15:40:27: 3000000
INFO @ Tue, 13 Sep 2016 15:40:29: 4000000
INFO @ Tue, 13 Sep 2016 15:40:29: #1 tag size is determined as 65 bps
INFO @ Tue, 13 Sep 2016 15:40:29: #1 tag size = 65
INFO @ Tue, 13 Sep 2016 15:40:29: #1 total tags in treatment: 4264092
INFO @ Tue, 13 Sep 2016 15:40:29: #1 finished!
INFO @ Tue, 13 Sep 2016 15:40:29: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:40:29: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:40:29: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:40:29: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:40:29: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:40:29: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:40:29: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:40:29: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:40:51: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:40:51: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:40:51: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:40:51: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:40:51: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:41:04: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:41:04: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:41:04: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:41:04: Done!
INFO @ Tue, 13 Sep 2016 15:41:05: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:41:09: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:41:11: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:41:12: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:41:24: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:41:31: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kt_3_S15_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:41:33:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kz_300_S4_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kz_300_S4_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:41:33: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:41:33: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:41:34: 1000000
INFO @ Tue, 13 Sep 2016 15:41:36: 2000000
INFO @ Tue, 13 Sep 2016 15:41:37: 3000000
INFO @ Tue, 13 Sep 2016 15:41:39: 4000000
INFO @ Tue, 13 Sep 2016 15:41:40: #1 tag size is determined as 61 bps
INFO @ Tue, 13 Sep 2016 15:41:40: #1 tag size = 61
INFO @ Tue, 13 Sep 2016 15:41:40: #1 total tags in treatment: 4235332
INFO @ Tue, 13 Sep 2016 15:41:40: #1 finished!
INFO @ Tue, 13 Sep 2016 15:41:40: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:41:40: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:41:40: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:41:40: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:41:40: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:41:40: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:41:40: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:41:40: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:42:02: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:42:02: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:42:02: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:42:02: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:42:02: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:42:15: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:42:15: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:42:15: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:42:15: Done!
INFO @ Tue, 13 Sep 2016 15:42:16: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:42:20: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:42:22: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:42:23: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:42:34: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:42:41: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_300_S4_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:42:42:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kz_800_S10_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Kz_800_S10_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:42:42: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:42:42: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:42:44: 1000000
INFO @ Tue, 13 Sep 2016 15:42:46: 2000000
INFO @ Tue, 13 Sep 2016 15:42:47: 3000000
INFO @ Tue, 13 Sep 2016 15:42:49: #1 tag size is determined as 67 bps
INFO @ Tue, 13 Sep 2016 15:42:49: #1 tag size = 67
INFO @ Tue, 13 Sep 2016 15:42:49: #1 total tags in treatment: 3843008
INFO @ Tue, 13 Sep 2016 15:42:49: #1 finished!
INFO @ Tue, 13 Sep 2016 15:42:49: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:42:49: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:42:49: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:42:49: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:42:49: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:42:49: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:42:49: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:42:49: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:43:08: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:43:08: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:43:08: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:43:08: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:43:08: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:43:20: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:43:20: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:43:20: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:43:20: Done!
INFO @ Tue, 13 Sep 2016 15:43:22: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:43:26: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:43:27: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:43:29: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:43:40: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:43:46: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Kz_800_S10_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:43:48:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_1_S19_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_1_S19_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:43:48: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:43:48: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:43:50: 1000000
INFO @ Tue, 13 Sep 2016 15:43:51: 2000000
INFO @ Tue, 13 Sep 2016 15:43:52: #1 tag size is determined as 66 bps
INFO @ Tue, 13 Sep 2016 15:43:52: #1 tag size = 66
INFO @ Tue, 13 Sep 2016 15:43:52: #1 total tags in treatment: 2511910
INFO @ Tue, 13 Sep 2016 15:43:52: #1 finished!
INFO @ Tue, 13 Sep 2016 15:43:52: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:43:52: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:43:52: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:43:52: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:43:52: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:43:52: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:43:52: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:43:52: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:44:04: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:44:04: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:44:04: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:44:04: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:44:04: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:44:13: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:44:13: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:44:13: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:44:13: Done!
INFO @ Tue, 13 Sep 2016 15:44:14: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:44:17: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:44:18: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:44:19: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:44:27: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:44:32: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_1_S19_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:44:34:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_2_S20_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_2_S20_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:44:34: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:44:34: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:44:35: 1000000
INFO @ Tue, 13 Sep 2016 15:44:36: #1 tag size is determined as 69 bps
INFO @ Tue, 13 Sep 2016 15:44:36: #1 tag size = 69
INFO @ Tue, 13 Sep 2016 15:44:36: #1 total tags in treatment: 1493892
INFO @ Tue, 13 Sep 2016 15:44:36: #1 finished!
INFO @ Tue, 13 Sep 2016 15:44:36: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:44:36: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:44:36: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:44:36: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:44:36: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:44:36: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:44:36: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:44:36: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:44:42: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:44:42: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:44:42: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:44:42: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:44:42: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:44:49: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:44:49: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:44:49: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:44:49: Done!
INFO @ Tue, 13 Sep 2016 15:44:50: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:44:52: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:44:53: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:44:53: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:44:59: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:45:03: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_2_S20_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:45:04:
# Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_300_S2_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits
# ARGUMENTS LIST:
# name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup
# format = BED
# ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_300_S2_R1.trimmed.nodup.tagAlign.gz']
# control file = None
# effective genome size = 1.22e+07
# band width = 300
# model fold = [5, 50]
# pvalue cutoff = 5.00e-02
# qvalue will not be calculated and reported as -1 in the final output.
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Searching for subpeak summits is on
# MACS will save fragment pileup signal per million reads
INFO @ Tue, 13 Sep 2016 15:45:04: #1 read tag files...
INFO @ Tue, 13 Sep 2016 15:45:04: #1 read treatment tags...
INFO @ Tue, 13 Sep 2016 15:45:06: 1000000
INFO @ Tue, 13 Sep 2016 15:45:07: 2000000
INFO @ Tue, 13 Sep 2016 15:45:09: 3000000
INFO @ Tue, 13 Sep 2016 15:45:11: 4000000
INFO @ Tue, 13 Sep 2016 15:45:12: #1 tag size is determined as 68 bps
INFO @ Tue, 13 Sep 2016 15:45:12: #1 tag size = 68
INFO @ Tue, 13 Sep 2016 15:45:12: #1 total tags in treatment: 4540092
INFO @ Tue, 13 Sep 2016 15:45:12: #1 finished!
INFO @ Tue, 13 Sep 2016 15:45:12: #2 Build Peak Model...
INFO @ Tue, 13 Sep 2016 15:45:12: #2 Skipped...
INFO @ Tue, 13 Sep 2016 15:45:12: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Tue, 13 Sep 2016 15:45:12: #2 Use 150 as fragment length
INFO @ Tue, 13 Sep 2016 15:45:12: #3 Call peaks...
INFO @ Tue, 13 Sep 2016 15:45:12: #3 Going to call summits inside each peak ...
INFO @ Tue, 13 Sep 2016 15:45:12: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ...
INFO @ Tue, 13 Sep 2016 15:45:12: #3 Pre-compute pvalue-qvalue table...
INFO @ Tue, 13 Sep 2016 15:45:34: #3 In the peak calling step, the following will be performed simultaneously:
INFO @ Tue, 13 Sep 2016 15:45:34: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_treat_pileup.bdg
INFO @ Tue, 13 Sep 2016 15:45:34: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_control_lambda.bdg
INFO @ Tue, 13 Sep 2016 15:45:34: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads.
INFO @ Tue, 13 Sep 2016 15:45:34: #3 Call peaks for each chromosome...
INFO @ Tue, 13 Sep 2016 15:45:47: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_peaks.xls
INFO @ Tue, 13 Sep 2016 15:45:47: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_peaks.narrowPeak
INFO @ Tue, 13 Sep 2016 15:45:48: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_summits.bed
INFO @ Tue, 13 Sep 2016 15:45:48: Done!
INFO @ Tue, 13 Sep 2016 15:45:49: Read and build treatment bedGraph...
INFO @ Tue, 13 Sep 2016 15:45:53: Read and build control bedGraph...
INFO @ Tue, 13 Sep 2016 15:45:55: Build scoreTrackII...
INFO @ Tue, 13 Sep 2016 15:45:57: Calculate scores comparing treatment and control by 'FE'...
INFO @ Tue, 13 Sep 2016 15:46:09: Write bedGraph of scores...
INFO @ Tue, 13 Sep 2016 15:46:17: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_300_S2_R1.trimmed.nodup_FE.bdg'!
INFO @ Tue, 13 Sep 2016 15:46:19: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_3_S21_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_3_S21_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:46:19: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:46:19: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:46:20: 1000000 INFO @ Tue, 13 Sep 2016 15:46:21: #1 tag size is determined as 61 bps INFO @ Tue, 13 Sep 2016 15:46:21: #1 tag size = 61 INFO @ Tue, 13 Sep 2016 15:46:21: #1 total tags in treatment: 1428896 INFO @ Tue, 13 Sep 2016 15:46:21: #1 finished! INFO @ Tue, 13 Sep 2016 15:46:21: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:46:21: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:46:21: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:46:21: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:46:21: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:46:21: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:46:21: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:46:21: #3 Pre-compute pvalue-qvalue table... 
INFO @ Tue, 13 Sep 2016 15:46:27: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:46:27: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:46:27: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:46:27: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:46:27: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:46:33: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:46:33: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:46:33: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:46:33: Done! INFO @ Tue, 13 Sep 2016 15:46:35: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:46:36: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:46:37: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:46:38: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:46:43: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:46:47: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_3_S21_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:46:48: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_800_S8_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/Mz_800_S8_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:46:48: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:46:48: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:46:50: 1000000 INFO @ Tue, 13 Sep 2016 15:46:51: 2000000 INFO @ Tue, 13 Sep 2016 15:46:53: 3000000 INFO @ Tue, 13 Sep 2016 15:46:55: 4000000 INFO @ Tue, 13 Sep 2016 15:46:55: #1 tag size is determined as 70 bps INFO @ Tue, 13 Sep 2016 15:46:55: #1 tag size = 70 INFO @ Tue, 13 Sep 2016 15:46:55: #1 total tags in treatment: 4051824 INFO @ Tue, 13 Sep 2016 15:46:55: #1 finished! INFO @ Tue, 13 Sep 2016 15:46:55: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:46:55: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:46:55: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:46:55: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:46:55: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:46:55: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:46:55: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:46:55: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:47:15: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:47:15: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:47:15: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:47:15: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:47:15: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:47:28: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:47:28: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:47:28: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:47:28: Done! INFO @ Tue, 13 Sep 2016 15:47:29: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:47:33: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:47:35: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:47:36: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:47:48: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:47:55: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/Mz_800_S8_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:47:57: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_1_S28_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_1_S28_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:47:57: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:47:57: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:47:59: 1000000 INFO @ Tue, 13 Sep 2016 15:48:00: 2000000 INFO @ Tue, 13 Sep 2016 15:48:00: #1 tag size is determined as 65 bps INFO @ Tue, 13 Sep 2016 15:48:00: #1 tag size = 65 INFO @ Tue, 13 Sep 2016 15:48:00: #1 total tags in treatment: 2002674 INFO @ Tue, 13 Sep 2016 15:48:00: #1 finished! INFO @ Tue, 13 Sep 2016 15:48:00: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:48:00: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:48:00: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:48:00: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:48:00: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:48:00: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:48:00: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:48:00: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:48:09: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:48:09: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:48:09: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:48:09: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:48:09: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:48:17: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:48:17: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:48:17: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:48:17: Done! INFO @ Tue, 13 Sep 2016 15:48:19: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:48:21: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:48:22: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:48:23: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:48:31: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:48:36: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_1_S28_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:48:37: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_2_S29_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_2_S29_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:48:37: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:48:37: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:48:39: 1000000 INFO @ Tue, 13 Sep 2016 15:48:40: 2000000 INFO @ Tue, 13 Sep 2016 15:48:42: 3000000 INFO @ Tue, 13 Sep 2016 15:48:44: 4000000 INFO @ Tue, 13 Sep 2016 15:48:44: #1 tag size is determined as 67 bps INFO @ Tue, 13 Sep 2016 15:48:44: #1 tag size = 67 INFO @ Tue, 13 Sep 2016 15:48:44: #1 total tags in treatment: 4330064 INFO @ Tue, 13 Sep 2016 15:48:44: #1 finished! INFO @ Tue, 13 Sep 2016 15:48:44: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:48:44: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:48:44: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:48:44: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:48:44: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:48:44: #3 Going to call summits inside each peak ... 
INFO @ Tue, 13 Sep 2016 15:48:44: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... INFO @ Tue, 13 Sep 2016 15:48:44: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:49:08: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:49:08: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:49:08: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:49:08: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:49:08: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:49:20: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:49:20: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:49:20: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:49:20: Done! INFO @ Tue, 13 Sep 2016 15:49:22: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:49:25: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:49:27: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:49:28: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:49:39: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:49:46: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_2_S29_R1.trimmed.nodup_FE.bdg'! 
INFO @ Tue, 13 Sep 2016 15:49:47: # Command line: callpeak -t /srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_3_S30_R1.trimmed.nodup.tagAlign.gz -f BED -n /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup -g 12157105 -p 0.05 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup all --call-summits # ARGUMENTS LIST: # name = /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup # format = BED # ChIP-seq file = ['/srv/scratch/training_camp/tc2016/user23/analysis/tagAlign/U_3_S30_R1.trimmed.nodup.tagAlign.gz'] # control file = None # effective genome size = 1.22e+07 # band width = 300 # model fold = [5, 50] # pvalue cutoff = 5.00e-02 # qvalue will not be calculated and reported as -1 in the final output. # Larger dataset will be scaled towards smaller dataset. # Range for calculating regional lambda is: 10000 bps # Broad region calling is off # Searching for subpeak summits is on # MACS will save fragment pileup signal per million reads INFO @ Tue, 13 Sep 2016 15:49:47: #1 read tag files... INFO @ Tue, 13 Sep 2016 15:49:47: #1 read treatment tags... INFO @ Tue, 13 Sep 2016 15:49:49: 1000000 INFO @ Tue, 13 Sep 2016 15:49:51: 2000000 INFO @ Tue, 13 Sep 2016 15:49:51: #1 tag size is determined as 66 bps INFO @ Tue, 13 Sep 2016 15:49:51: #1 tag size = 66 INFO @ Tue, 13 Sep 2016 15:49:51: #1 total tags in treatment: 2183346 INFO @ Tue, 13 Sep 2016 15:49:51: #1 finished! INFO @ Tue, 13 Sep 2016 15:49:51: #2 Build Peak Model... INFO @ Tue, 13 Sep 2016 15:49:51: #2 Skipped... INFO @ Tue, 13 Sep 2016 15:49:51: #2 Sequencing ends will be shifted towards 5' by 75 bp(s) INFO @ Tue, 13 Sep 2016 15:49:51: #2 Use 150 as fragment length INFO @ Tue, 13 Sep 2016 15:49:51: #3 Call peaks... INFO @ Tue, 13 Sep 2016 15:49:51: #3 Going to call summits inside each peak ... INFO @ Tue, 13 Sep 2016 15:49:51: #3 Call peaks with given -log10pvalue cutoff: 1.30103 ... 
INFO @ Tue, 13 Sep 2016 15:49:51: #3 Pre-compute pvalue-qvalue table... INFO @ Tue, 13 Sep 2016 15:50:01: #3 In the peak calling step, the following will be performed simultaneously: INFO @ Tue, 13 Sep 2016 15:50:01: #3 Write bedGraph files for treatment pileup (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_treat_pileup.bdg INFO @ Tue, 13 Sep 2016 15:50:01: #3 Write bedGraph files for control lambda (after scaling if necessary)... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_control_lambda.bdg INFO @ Tue, 13 Sep 2016 15:50:01: #3 --SPMR is requested, so pileup will be normalized by sequencing depth in million reads. INFO @ Tue, 13 Sep 2016 15:50:01: #3 Call peaks for each chromosome... INFO @ Tue, 13 Sep 2016 15:50:09: #4 Write output xls file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_peaks.xls INFO @ Tue, 13 Sep 2016 15:50:09: #4 Write peak in narrowPeak format file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_peaks.narrowPeak INFO @ Tue, 13 Sep 2016 15:50:09: #4 Write summits bed file... /srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_summits.bed INFO @ Tue, 13 Sep 2016 15:50:09: Done! INFO @ Tue, 13 Sep 2016 15:50:11: Read and build treatment bedGraph... INFO @ Tue, 13 Sep 2016 15:50:13: Read and build control bedGraph... INFO @ Tue, 13 Sep 2016 15:50:14: Build scoreTrackII... INFO @ Tue, 13 Sep 2016 15:50:15: Calculate scores comparing treatment and control by 'FE'... INFO @ Tue, 13 Sep 2016 15:50:21: Write bedGraph of scores... INFO @ Tue, 13 Sep 2016 15:50:26: Finished 'FE'! Please check '/srv/scratch/training_camp/tc2016/user23/analysis/peaks/U_3_S30_R1.trimmed.nodup_FE.bdg'!
Finally, we merge the peaks across all conditions to create a master list of peaks for analysis.
cd $PEAKS_DIR
#concatenate all .narrowPeak files together
cat *narrowPeak > all.peaks.bed
#sort the concatenated file
bedtools sort -i all.peaks.bed > all.peaks.sorted.bed
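#merge the sorted, concatenated file to join overlapping peaks;
#sed and paste then append each merged peak's line number as a fourth column (a simple peak ID)
bedtools merge -i all.peaks.sorted.bed | sed -n 'p;=' | paste -d"\t" - - > all_merged.peaks.bed
#compress the merged peak list (-f overwrites any existing all_merged.peaks.bed.gz)
gzip -f all_merged.peaks.bed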
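To make the sort, merge, and numbering steps above concrete, here is a minimal sketch on toy data that runs without bedtools. The four intervals are invented for illustration, `sort` stands in for `bedtools sort`, and the small `awk` program is only an approximation of what `bedtools merge` does with its default settings (joining overlapping or directly adjacent intervals on the same chromosome); it is not how bedtools is implemented. The final `sed`/`paste` step is the same idiom used in the pipeline above.

```shell
# Toy peaks, as if concatenated from several hypothetical samples (unsorted).
peaks="$(printf 'chrI\t400\t520\nchrI\t100\t250\nchrI\t200\t300\nchrII\t30\t90\n')"

# Sort by chromosome, then numerically by start coordinate.
# Merging only works on sorted input, because each interval is compared
# with the one immediately before it.
sorted="$(printf '%s\n' "$peaks" | sort -k1,1 -k2,2n)"

# awk stand-in for bedtools merge (an assumption for illustration only):
# keep extending the current interval while the next one starts at or
# before its end on the same chromosome; otherwise print it and start anew.
merged="$(printf '%s\n' "$sorted" | awk 'BEGIN { OFS = "\t" }
  NR == 1              { c = $1; s = $2; e = $3; next }
  $1 == c && $2 <= e   { if ($3 > e) e = $3; next }
                       { print c, s, e; c = $1; s = $2; e = $3 }
  END                  { if (NR > 0) print c, s, e }')"

# sed -n 'p;=' prints each line followed by its line number on the next line;
# paste - - joins every pair of lines with a tab, giving a numeric ID column.
numbered="$(printf '%s\n' "$merged" | sed -n 'p;=' | paste - -)"

printf '%s\n' "$numbered"
```

In this toy case the two overlapping chrI intervals (100-250 and 200-300) collapse into a single interval 100-300, so three merged peaks remain, numbered 1 to 3 in the fourth column.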