2.15.1 Working with Project Folders and Data Files
Since the mapping computation is often performed on a pool of sequencing Runs (or Read Data files) rather than on any single Run, the result files it generates are not deposited in a Run folder. Two general cases exist.
•
If the mapping is performed using the “one-step” command runMapping, a folder with a ‘P_’ prefix (for ‘P’ost-Run Analysis) is created to contain these files. The folder is created either in the user’s current working directory on the DataRig at the time the application is launched, or in a directory specified by the user on the command line (or its GUI equivalent). The folder name has the following structure:
P_yyyy_mm_dd_hh_min_sec_runMapping
•
For “incremental”, or “project-based” mapping, using the GUI application or using the newMapping and related commands, the output is placed in a “project” folder. A user can specify any name for a project folder; it will be recognized as a project folder by virtue of the “454Project.xml” file that will be automatically created within it. If a directory name is not specified using the newMapping command line, the software will use the same default name as the runMapping command (as above).
The incremental mapping folder contains additional folders and files that mark the folder as a project folder and that store configuration information and internal data for the GS Reference Mapper application. A project folder consists of two sub-folders: a ‘mapping’ sub-folder, which contains the project state and output files, and an ‘sff’ sub-folder, which contains the copies of and/or symbolic links to the SFF files used as input to the project. The 454Project.xml file identifies the folder as a 454 project folder.
•
The “-o” option can be specified on the command line to change the directory where the output files should be written. If the specified directory exists, the output files will be written to that directory. If the specified directory does not exist, the program will create it, if possible.
When the -o option is used with the runMapping or newMapping command, the mapper will not overwrite the contents of the specified project directory, if any exist. To force an overwrite of an existing directory, use the -force option with the runMapping or newMapping command.
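The default folder name pattern and the 454Project.xml marker described above can be sketched in Python. This is a minimal illustration and not part of the GS Reference Mapper software: the function names are hypothetical, and the zero-padded, 24-hour timestamp fields are an assumption about how the yyyy_mm_dd_hh_min_sec pattern is filled in.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_folder_name(when: Optional[datetime] = None) -> str:
    """Build a name following the P_yyyy_mm_dd_hh_min_sec_runMapping pattern.

    Assumption: each field is zero-padded and the hour uses a 24-hour clock.
    """
    when = when or datetime.now()
    return when.strftime("P_%Y_%m_%d_%H_%M_%S_runMapping")


def is_project_folder(folder: Path) -> bool:
    """Treat a folder as a project folder if it contains 454Project.xml."""
    return (folder / "454Project.xml").is_file()
```

A script that post-processes mapping results could use a check like is_project_folder to tell project folders apart from other directories without relying on the folder name, since a project folder can have any name.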
External Files
•
GS Read Data files (SFF files) are not actually copied to the project directory. Rather, a symbolic link to the file is created and placed in the project’s sff directory. Consequently, if the original file is moved or erased from the file system, the project will not operate correctly. This applies whether the files are added via the GUI or the command line.
•
FASTA/FASTQ files are not actually copied to the project directory. Rather, the file path is simply recorded in the project setup. Consequently, if the original file is moved or erased from the file system, the project will not operate correctly. This applies to all instances of FASTA/FASTQ files that can be added to a Mapping project, whether done via the GUI or the command line:
◦
Reference files
◦
FASTA/FASTQ reads files, including Sanger reads
◦
Trimming database files
◦
Screening database files
Unlike GS Read Data files, FASTA/FASTQ files do not get a symbolic link in the project folder.
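Because a project depends on files that live outside the project folder, it can be useful to check that they are still present before running a computation. The following Python sketch is one possible check and is not part of the GS Reference Mapper software. The function names are hypothetical. Since the format of the project setup is not described here, the FASTA/FASTQ paths are assumed to be supplied by the caller rather than read from the project.

```python
from pathlib import Path
from typing import Iterable, List


def broken_sff_links(project_dir: Path) -> List[Path]:
    """Return symbolic links in <project>/sff whose targets no longer exist."""
    sff_dir = project_dir / "sff"
    if not sff_dir.is_dir():
        return []
    # Path.exists() follows the link, so it is False for a dangling link.
    return sorted(
        p for p in sff_dir.iterdir() if p.is_symlink() and not p.exists()
    )


def missing_files(paths: Iterable[Path]) -> List[Path]:
    """Return the caller-supplied FASTA/FASTQ paths that are no longer present."""
    return [p for p in paths if not p.is_file()]
```

An empty result from both functions indicates that no linked SFF file and no listed FASTA/FASTQ file has been moved or erased since it was added.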