1.13.2 One-Step Assembly: the runAssembly Command
If all the reads to be assembled are available at once, the GS De Novo Assembler application can process them with a single command, which has the following command line structure:
runAssembly [options] [MIDList@]filedesc…
In this structure, each “MIDList” is a multiplexing information string used to filter the set of reads taken from the file for the assembly (see section 4.7 for the format of the MIDList information). If MIDs were used in the generation of the data file, an MIDList string must be specified so that the assembler handles the file’s reads properly.
The runAssembly command is actually a wrapper program around the newAssembly and related commands used in incremental assembly (see section 1.13.3). After the incremental assembly command is completed, runAssembly then transforms the project files and folders into this one-step form (which more closely matches the output structure of older versions of the “Assembly” application). The actions taken are:
If the -nrm option is given, runAssembly does not perform this transformation, and the resulting folder can be used for further project-based assembly (the structure of the files/folders will match that of an incremental assembly project).
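As an illustration of the command line structure described above, the following Python sketch builds the argument list for a runAssembly call without executing it. The helper names (`file_argument`, `run_assembly_argv`), the file names, and the `<MIDList>` placeholder are assumptions made for this example; the actual MIDList format is defined in section 4.7 and is passed through here unchanged.

```python
from typing import List, Optional, Sequence


def file_argument(filedesc: str, mid_list: Optional[str] = None) -> str:
    """Return one '[MIDList@]filedesc' argument.

    mid_list is used verbatim; its format (section 4.7) is not validated here.
    """
    if mid_list:
        return f"{mid_list}@{filedesc}"
    return filedesc


def run_assembly_argv(file_args: Sequence[str], keep_project: bool = False) -> List[str]:
    """Build the argument vector for: runAssembly [options] [MIDList@]filedesc..."""
    if not file_args:
        raise ValueError("at least one filedesc is required")
    argv = ["runAssembly"]
    if keep_project:
        # -nrm: skip the transformation to the one-step layout,
        # leaving a folder usable for project-based assembly.
        argv.append("-nrm")
    argv.extend(file_args)
    return argv


if __name__ == "__main__":
    # Hypothetical inputs: both file names and the MIDList placeholder are illustrative.
    args = [
        file_argument("reads_A.sff"),
        file_argument("reads_B.sff", mid_list="<MIDList>"),
    ]
    print(run_assembly_argv(args, keep_project=True))
```

The resulting list could be handed to `subprocess.run` on a system where runAssembly is installed; building the list separately makes it easy to inspect the command before running it.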
Any combination of explicit SFF files and data directory paths (with optional region lists) may be specified on the command line. For each data directory path given, the runAssembly command reads the existing SFF files in the “sff” sub-directory of that data directory. If no SFF files are present in a data directory (e.g. for a Run whose data was processed with a version of the 454 Sequencing System software earlier than 1.0.52), the signal processing step of the GS Run Processor application must be rerun on the Data Processing folder.
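Before launching an assembly, it can be useful to check whether a data directory actually contains SFF files. The sketch below is one possible way to do this in Python; the function name `sff_files_in_data_dir` is hypothetical, and the code assumes that SFF files carry the `.sff` extension.

```python
from pathlib import Path
from typing import List


def sff_files_in_data_dir(data_dir: str) -> List[Path]:
    """List the SFF files found in the 'sff' sub-directory of a data directory.

    Returns an empty list when the sub-directory is missing or holds no
    *.sff files (assumption: SFF files use the '.sff' extension).
    """
    sff_dir = Path(data_dir) / "sff"
    if not sff_dir.is_dir():
        return []
    return sorted(
        p for p in sff_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".sff"
    )
```

An empty result suggests that the signal processing step needs to be rerun on that Data Processing folder before the directory can be used as input.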
The data analysis software installation package contains a default MID configuration file, normally located at Installation_path/454/config/MIDConfig.parse. The GS De Novo Assembler reads this file and uses it to match MID set names and MID names with their multiplexing information. Users can edit this file to add their own MID sets (following the format and syntax described in the file), or can copy it to create a separate MID configuration file and then use the “-mcf” option to specify that copy as the MID configuration file to be used.
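The copy-and-specify workflow for a separate MID configuration file might be scripted as in the following sketch. The helper names (`copy_mid_config`, `mcf_option`) are hypothetical, and the code assumes that the configuration file path follows “-mcf” as a separate command line argument. Installation_path is left as a placeholder, as in the text above.

```python
import shutil
from pathlib import Path
from typing import List


def copy_mid_config(default_config: str, destination: str) -> Path:
    """Copy the default MID configuration file so it can be edited separately."""
    dest = Path(destination)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(default_config, dest)
    return dest


def mcf_option(config_path: Path) -> List[str]:
    """Return the option pair selecting a custom MID configuration file.

    Assumption: the path is given as the argument immediately after -mcf.
    """
    return ["-mcf", str(config_path)]


if __name__ == "__main__":
    # Placeholder source path; substitute the real installation path.
    source = "Installation_path/454/config/MIDConfig.parse"
    target = "my_project/MyMIDConfig.parse"  # hypothetical destination
    if Path(source).is_file():
        print(mcf_option(copy_mid_config(source, target)))
    else:
        print("Default MID configuration file not found at", source)
```

Working on a copy leaves the installed default untouched, so custom MID sets can be added without affecting other users of the same installation.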