1.13.3
Incremental Assembly: the newAssembly and Related Commands
This section describes the main commands of the GS De Novo Assembler application, to be used when one or more Runs are to be assembled as part of an assembly project. These commands allow you to add and remove Runs over time and incrementally update the assembly. With these commands, the execution of the assembly algorithms and generation of the results can be controlled by command line options and configuration parameters (paralleling the equivalent controls in the GUI application). Such incremental assembly can be useful when, for example, you want to see intermediate results on existing sequencing Runs to determine if you need to carry out further Runs to reach a desired depth of coverage, or just to monitor the project. Incremental assembly is also useful if you simply wish to create output using different output parameter settings.
1.13.3.1
The newAssembly Command
newAssembly [option] [dir]
[option] -cdna may be used; see Section 4.1
[dir] is the optional name of the project directory
1.13.3.2
The addRun Command
addRun [options] [MIDList@]filedesc…
“[options]” are zero or more of the command line options described in section 4.3
each “filedesc” is one of the following:
and each “MIDList” is a list of multiplexing information used to filter the set of file reads used in the assembly (see section 4.7 for the format of the MIDList information). If MIDs were used in the generation of the data file, then an MIDList string must be specified in order for the GS De Novo Assembler to properly handle the file’s reads.
Input read size constraints: The reads input for the assembly computation must be shorter than 2000 bases per read (longer sequences are ignored) and longer than 50 bases (shorter sequences are ignored). When Paired End 454 Reads (SFF reads) are part of the project, reads with lengths between the value of the minlen parameter and 50 bp will be mapped onto contigs formed by assembling longer reads during a later stage in the assembly.
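The length rules above can be sketched as a small classifier. This is an illustration only, not the assembler's own code: the function name, the return labels, and the handling of reads of exactly 50 or exactly 2000 bases (treated here as accepted) are assumptions, since the text does not specify the boundary cases.

```python
from typing import Optional

MAX_READ_LEN = 2000      # reads longer than this are ignored
MIN_ASSEMBLY_LEN = 50    # reads shorter than this are not assembled directly


def classify_read(length: int,
                  has_paired_end_sff: bool,
                  minlen: Optional[int] = None) -> str:
    """Return 'assembled', 'mapped_later', or 'ignored' (hypothetical labels).

    Assumption: lengths of exactly 50 and exactly 2000 count as accepted.
    """
    if length > MAX_READ_LEN:
        return "ignored"
    if length >= MIN_ASSEMBLY_LEN:
        return "assembled"
    # Short reads are only rescued when Paired End 454 (SFF) reads are in
    # the project and the read is at least `minlen` bases long.
    if has_paired_end_sff and minlen is not None and length >= minlen:
        return "mapped_later"
    return "ignored"
```

For example, with Paired End reads in the project and minlen set to 20, a 30-base read would be classified as mapped later, while the same read in a project without Paired End reads would be ignored.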
Order of addition of Read data may affect assembly results: In general, the reads should be added to an incremental assembly in the following order to achieve the best contigging and scaffolding results:
1.13.3.3
The removeRun Command
removeRun [dir] (sffname or readfastafilepath)… 
This command is more conveniently carried out from the GUI application. When called from the command line, it requires the SFF file name(s) given in the project sff sub-directory or the FASTA/FASTQ file name(s) given in the project configuration file 454AssemblyProject.xml. These names may not match the original names of the corresponding files or may not be known to the user, especially if the software had to rename any of them to ensure name uniqueness. (The filenames of the SFF files in the sff sub-directory are assigned by the GS De Novo Assembler application to ensure uniqueness for all the files, while trying to preserve the original names when possible.)
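Because removeRun needs the names as stored in the project rather than the original file names, it can help to list them first. The following is a minimal sketch, assuming the sub-directory is literally named `sff` and sits directly under the project directory; the helper name is hypothetical and is not part of the GS De Novo Assembler software.

```python
from pathlib import Path
from typing import List


def list_project_sff_names(project_dir: str) -> List[str]:
    """Return the sorted entry names found in <project_dir>/sff.

    Assumption: the project's SFF sub-directory is named 'sff'.
    Returns an empty list if that directory does not exist.
    """
    sff_dir = Path(project_dir) / "sff"
    if not sff_dir.is_dir():
        return []
    return sorted(entry.name for entry in sff_dir.iterdir())
```

The names returned are the ones to pass to removeRun for SFF files. FASTA/FASTQ file names must still be looked up in 454AssemblyProject.xml.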
1.13.3.4
The runProject Command
runProject [options] [dir]