1.3.2 Data Processing (GS Run Processor) Results: the Data Processing Folder
The organization of a generic Data Processing folder (‘D_’) is depicted in Figure 3. D_ folders are created by the GS Run Processor application, within the R_ folder of the sequencing Run whose data is being processed. Since a data set can be re-processed multiple times (via the GS Run Browser; see Part B, Section 3 of this manual), a given R_ folder can contain multiple D_ folders. All the processed data generated on-instrument for the selected processing type (see section 1.2), such as basecalls and quality scores, Run metrics, and log files, remain in temporary local storage on the GS FLX+ Instrument or the GS Junior Attendant PC. In addition, if “Backup” is selected during Run setup, raw and processed data files from the Run can be transferred to a network location specified by the System Administrator, for long-term storage. See Part B, Section 1 for full file descriptions and for more information on the GS Run Processor application.
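Because one R_ folder can hold several D_ folders, it can be handy to enumerate them before choosing which processing result to inspect. The following is a minimal Python sketch, not part of the 454 software: the function name `list_processing_folders` is hypothetical, and the only thing it relies on is the ‘D_’ prefix described above.

```python
from pathlib import Path


def list_processing_folders(run_folder):
    """Return the D_ sub-folders found directly inside an R_ folder.

    Assumption: Data Processing folders are recognized purely by the
    'D_' prefix of their name; files and other folders are ignored.
    The result is sorted by folder name.
    """
    run_path = Path(run_folder)
    return sorted(
        (p for p in run_path.iterdir() if p.is_dir() and p.name.startswith("D_")),
        key=lambda p: p.name,
    )
```

If the timestamp fields in the folder names are zero-padded (an assumption), sorting by name also lists the folders from the oldest to the most recent processing.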
The GS FLX+ Instrument or the GS Junior Attendant PC processes the sequencing data “on the fly,” i.e., the data is processed (to the extent specified in the processing type selected) and deposited in the Data Processing folder concurrently with the Run. When the “Run Completed” window appears on the screen, processing of the sequencing Run is complete and the results are ready for further processing or transfer.
R_yyyy_mm_dd_hh_min_sec_machineName_userName_uniqueRunName/¹
    […]
    D_yyyy_mm_dd_hh_min_sec_machineName_analysisType/²
        gsRunProcessor.log
        gsRunProcessor_err.log
        dataRunParams.xml
        regions/²
            region.cwf²
        sff/²
            uaccnoRegion.sff²
Figure 3: General organization of a “Data Processing” folder. Data Processing (‘D_’) folders are created within the corresponding Run’s ‘R_’ folder; an R_ folder can contain multiple D_ folders if the data set was re-processed. Words in italics are generic. The superscripts indicate the application by which the folders and files are generated: GS Sequencer or GS Junior Sequencer¹, GS Run Processor². The set of SFF files is generated during the signal processing step of the GS Run Processor, using the “universal” accession prefix described in section 2.3.7. See Part B, Section 1 of this manual for full file descriptions.
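The naming pattern and layout in Figure 3 can be checked programmatically. The sketch below is an illustration only, under stated assumptions: the six timestamp fields are taken to be zero-padded digits, analysisType is assumed to contain no underscore (machineName may), and the names `parse_processing_folder_name` and `missing_entries` are hypothetical helpers, not part of the GS Run Processor.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumption: D_yyyy_mm_dd_hh_min_sec_machineName_analysisType, with
# zero-padded timestamp fields and no underscore inside analysisType.
D_FOLDER_RE = re.compile(
    r"^D_(\d{4})_(\d{2})_(\d{2})_(\d{2})_(\d{2})_(\d{2})"
    r"_(?P<machine>.+)_(?P<analysis>[^_]+)$"
)

# Top-level entries of a D_ folder as drawn in Figure 3.
EXPECTED_ENTRIES = (
    "gsRunProcessor.log",
    "gsRunProcessor_err.log",
    "dataRunParams.xml",
    "regions",
    "sff",
)


def parse_processing_folder_name(name):
    """Split a D_ folder name into timestamp, machine name and analysis type.

    Returns None if the name does not follow the assumed pattern or the
    timestamp fields do not form a valid date and time.
    """
    match = D_FOLDER_RE.match(name)
    if match is None:
        return None
    try:
        timestamp = datetime(*(int(field) for field in match.groups()[:6]))
    except ValueError:
        return None
    return {
        "timestamp": timestamp,
        "machine_name": match.group("machine"),
        "analysis_type": match.group("analysis"),
    }


def missing_entries(processing_folder):
    """List the Figure 3 entries that are absent from a D_ folder."""
    folder = Path(processing_folder)
    return [entry for entry in EXPECTED_ENTRIES if not (folder / entry).exists()]
```

A non-empty result from `missing_entries` is not necessarily an error, since the files produced depend on the processing type selected for the Run.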