2.2.3 Organization of a Data Processing Directory
Once data processing is complete and the metrics and report files have been generated, the resulting directory structure can be quite complex.
Table 12 below can be used to locate specific files within their expected directories, provided the default locations were used.
Organization of a Data Processing Directory ("D_")

File name                                     A  B  C  D  E  Notes
gsRunProcessor.log
gsRunProcessor_err.log                                       1
454DataProcessingDir.xml                                     2
region.KEYL.454Reads.fna                                     3,7
region.KEYL.454Reads.qual                                    3,7
region.KEYC.454Reads.fna                                     3,6,7
region.KEYC.454Reads.qual                                    3,6,7
454BaseCallerMetrics.csv                                     3,4,7
454BaseCallerMetrics.txt                                     3,7
454QualityFilterMetrics.csv                                  3,4,7
454QualityFilterMetrics.txt                                  3,7
454RuntimeMetricsAll.csv                                     3,4,7
454RuntimeMetricsAll.txt                                     3,7
454AllControlMetrics.txt                                     3,7
454AllControlMetrics.txt                                     3,7
analysisParms.parse                                          3,5
revisedRegions.parse                                         4
454BaseCallerThresholds.txt
error.baseCaller2
error.bbcSelfTrain
error.cafieCorrection
regions/
  region0X.cwf
  region0X.metrics.xml
  region0X.meta.xml
  region0X.processingHistory.xml
  region0X.wells
  region0X.wells.KEY.454RuntimeMetrics.csv                   3,4,7
  region0X.wells.KEY.454RuntimeMetrics.txt                   3,7
  region0X.wells.cafieMetrics.csv                            3,4,7
  region0X.wells.cfValues                                    3,7
  region0X.wells.droopEstimate                               3,7
  region0X.wells.incValues                                   3,7
  region0X.wells.mleCorrectionInfo                           3,7
  region0X.wells.trimInfo                                    3,7
sff/
  ACCNOPREX.sff
Table 12: Organization, in the ‘D_’ directory, of the files output by the GS Run Processor and the GS Reporter applications. The following codes are used in the file names: ‘region’ is the region number, ‘0X’ is the zero-padded region number, ‘KEYL’ is the 3-letter library sequencing key, ‘KEYC’ is the 3-letter control sequencing key, and ‘ACCNOPRE’ is the accession number prefix.
A: Generated by gsRunProcessor During Processing
B: Generated by Default
C: Generated Directly by gsReporter From CWF Files
D: Generated by gsReporter’s “--legacy” Option
E: No Longer Generated
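The file-name codes described in the Table 12 caption can be expanded mechanically when a script needs to look for a particular file. The following is a minimal sketch in Python, not part of the GS software. The function name `expand_names` and the dictionary labels are hypothetical, the key values passed in are placeholders, and two-digit zero padding is an assumption based on the ‘region0X’ pattern.

```python
def expand_names(region, keyl, keyc):
    """Expand a few Table 12 file-name patterns for one region.

    region: integer region number (the 'region' code)
    keyl, keyc: strings substituted for the KEYL and KEYC codes
    """
    # Assumption: the zero-padded region number is two digits wide,
    # as suggested by the 'region0X' pattern in Table 12.
    padded = f"{region:02d}"
    return {
        "keyl_fna": f"{region}.{keyl}.454Reads.fna",
        "keyl_qual": f"{region}.{keyl}.454Reads.qual",
        "keyc_fna": f"{region}.{keyc}.454Reads.fna",
        "keyc_qual": f"{region}.{keyc}.454Reads.qual",
        "cwf": f"regions/region{padded}.cwf",
        "metrics_xml": f"regions/region{padded}.metrics.xml",
        "meta_xml": f"regions/region{padded}.meta.xml",
    }


if __name__ == "__main__":
    # "AAA" and "BBB" are placeholder key values, not real sequencing keys.
    for label, name in expand_names(1, "AAA", "BBB").items():
        print(label, name)
```

The returned paths are relative to the ‘D_’ directory, so they can be joined to whatever location holds the processed Run.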
Notes:
1 - This file is only generated if there are warnings or errors generated during processing.
2 - This file is only created after the basecalling step is run and signifies to the data analysis software that the sff directory contains files suitable for further data analysis.
3 - The default generation of these files can be controlled by adjusting the gsReporter options in the pipeline processing scripts.
4 - These files are generated for legacy purposes only. It is recommended that new applications use X.metrics.xml and X.meta.xml, extracted from the CWF file via gsRunProcessor, for report generation.
5 - This file is deprecated and generated for legacy purposes only. X.processingHistory.xml and X.meta.xml are the canonical record of parameters used to process a Run and a Run's metadata, respectively.
6 - These .fna and .qual files were generated by default in previous versions of the software.
7 - These files can only be generated after all processing is complete. Specifically, the data needed to create these files are not available after the image-processing-only step of gsRunProcessor.
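Note 2 describes 454DataProcessingDir.xml as the signal that the sff directory is ready for further analysis. A downstream script could honor that convention with a check like the one sketched below. This is an illustration only: the function name `sff_files_if_ready` is hypothetical, and the sketch assumes the marker file sits at the top level of the ‘D_’ directory with the .sff files directly under sff/, as laid out in Table 12.

```python
from pathlib import Path

# File name taken from Table 12; its top-level location is an assumption.
MARKER = "454DataProcessingDir.xml"


def sff_files_if_ready(d_dir):
    """Return sorted .sff paths under d_dir/sff, or [] if not ready.

    "Ready" here means only that the marker file from note 2 exists;
    the contents of the marker file are not inspected.
    """
    d_dir = Path(d_dir)
    if not (d_dir / MARKER).is_file():
        return []
    sff_dir = d_dir / "sff"
    if not sff_dir.is_dir():
        return []
    return sorted(sff_dir.glob("*.sff"))
```

Returning an empty list when the marker is missing keeps the caller from reading partially written output, at the cost of not distinguishing "not ready" from "ready but empty".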