The Input sub-tab for genomic projects is shown on Figure 56 (for cDNA projects, see Section 2.7.4). It allows you to adjust the following settings:
When the Nimblegen sequence capture checkbox is checked, the read data will automatically be primer-trimmed assuming NimbleGen Sequence Capture adapters are present at the beginning of the reads (this option defaults to unchecked). If the NimbleGen gSel protocol has been used, this option must be checked. There is no need to use this option with NimbleGen’s optimized protocol for the GS FLX Titanium chemistry.
If the Automatic trimming checkbox is checked, the ends of new read files added to the project will be trimmed the next time a mapping computation is performed.
Expected depth allows you to specify the number of reads expected to uniquely map to a single position of the reference genome. For high-depth mapping (where the expected coverage of a position in the genome could reach hundreds or thousands of reads), this option can be used to prevent low-frequency variations from being reported. If this option isn’t used, many variations that are actually due to sequencing errors may be reported. The default value for this option is 0. A value of 0 or greater is allowed. If this option is set to a very low value (below 25), any variations that appear in only two reads may be output to the 454AllDiffs.txt file.
The Minimum Read Length option can be used to change the minimum accepted read length to be used in the mapping (default is 20 bp, allowed value range is 15-45). This option is only applied if Paired End data is used.
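The allowed ranges for the two numeric settings above can be checked before a project is set up. The following is only a minimal sketch of such a check: the `InputSettings` class and the `validate` function are hypothetical names introduced for illustration and are not part of the GS Reference Mapper software. The defaults and ranges are the ones stated above.

```python
from dataclasses import dataclass


@dataclass
class InputSettings:
    """Hypothetical container for the numeric settings of the Input sub-tab."""
    expected_depth: int = 0        # default 0; any value of 0 or greater is allowed
    minimum_read_length: int = 20  # default 20 bp; allowed range is 15-45


def validate(settings: InputSettings) -> list:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    if settings.expected_depth < 0:
        problems.append("Expected depth must be 0 or greater")
    if not 15 <= settings.minimum_read_length <= 45:
        problems.append("Minimum Read Length must be between 15 and 45")
    return problems


if __name__ == "__main__":
    print(validate(InputSettings()))
    print(validate(InputSettings(expected_depth=-1, minimum_read_length=50)))
```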
The input files section of the tab allows you to specify the locations of several files that may be used in the mapping.
An optional Trimming database is used to trim the ends of input reads (for cloning vectors, primers, adapters or other end sequences). Specify the path to a FASTA file of sequences to be used for this trimming (see Section 4.10).
To use the Screening database option, set the path to a FASTA file of sequences to be used to screen the input reads for contaminants. A read that almost completely aligns against a sequence in the screening database is removed so that it is not used in the computation; if at least 15 bases of a read do not align to the screening sequence, no action is taken (see Section 4.10).
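The screening rule described above can be expressed as a simple decision on the number of unaligned bases. The sketch below is a simplified illustration, not the mapper's actual implementation: the function name `is_screened_out` and its parameters are hypothetical, and it assumes that the number of read bases aligned to a screening sequence is already known.

```python
def is_screened_out(read_length: int, aligned_bases: int,
                    min_unaligned: int = 15) -> bool:
    """Decide whether a read would be removed by the screening step.

    Simplified reading of the rule: a read is removed when it aligns to a
    screening sequence and fewer than `min_unaligned` of its bases are left
    unaligned. With 15 or more unaligned bases, no action is taken.
    """
    if not 0 <= aligned_bases <= read_length:
        raise ValueError("aligned_bases must be between 0 and read_length")
    if aligned_bases == 0:
        # Assumption: a read with no alignment to the screening database is kept.
        return False
    unaligned = read_length - aligned_bases
    return unaligned < min_unaligned


if __name__ == "__main__":
    print(is_screened_out(100, 95))  # 5 unaligned bases
    print(is_screened_out(100, 85))  # 15 unaligned bases
```

In this sketch a 100-base read with 95 aligned bases is removed, while the same read with only 85 aligned bases (15 unaligned) is kept.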
The Targeted regions option can be used to direct the GS Reference Mapper to report output only for the regions of the reference that you specify and for the reads and contigs that overlap those regions. The reads are mapped against the complete reference, but the output files will only report on the specified regions. See Section 4.17 for more details on target regions and extended target regions. The string value can be
To use the Genome annotation option, enter the path to an annotation file describing the gene/coding-region annotations for the reference sequences, so that the variation detection (AllDiffs, HCDiffs, and Structural Variations) can report gene names and protein translations of any identified variations in gene regions. The format of this file must match that of the GoldenPath “refGene.txt” file. See appendix 4.10.2 for details of the refGene.txt file format.
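To check an annotation file before pointing the project at it, it can help to parse a few records. The sketch below assumes the 16-column, tab-separated layout of the UCSC GoldenPath refGene table dump (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, and three further columns); consult the file format description referenced above for the authoritative layout. The `GeneAnnotation` type, the `parse_refgene_line` function, and the sample record are hypothetical and for illustration only.

```python
from typing import NamedTuple


class GeneAnnotation(NamedTuple):
    """Hypothetical record holding the fields used in this sketch."""
    transcript: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: list  # list of (start, end) tuples
    gene: str


def parse_refgene_line(line: str) -> GeneAnnotation:
    """Parse one tab-separated line, assuming the UCSC refGene column order."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 13:
        raise ValueError("expected at least 13 tab-separated columns")
    # exonStarts and exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    if len(starts) != len(ends) or len(starts) != int(fields[8]):
        raise ValueError("exon lists do not match exonCount")
    return GeneAnnotation(
        transcript=fields[1], chrom=fields[2], strand=fields[3],
        tx_start=int(fields[4]), tx_end=int(fields[5]),
        cds_start=int(fields[6]), cds_end=int(fields[7]),
        exons=list(zip(starts, ends)), gene=fields[12],
    )


if __name__ == "__main__":
    # Made-up sample record, not taken from a real annotation file.
    sample = ("585\tNM_000001\tchr1\t+\t100\t500\t150\t450\t2\t"
              "100,300,\t200,500,\t0\tGENE1\tcmpl\tcmpl\t0,1,")
    print(parse_refgene_line(sample))
```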
The Known SNP option is used to specify the path to an annotation file describing the known SNP information for the reference sequences, so that the variation detection (HCDiffs tab) can link identified variations to the known SNP information. The format of this file must match that of the GoldenPath snp130.txt file. See appendix 4.10.3 for details of the snp130.txt file format.
The Accno Renaming File option can be used to specify a renaming file to incorporate gene and transcript descriptions into the output files and/or to provide more meaningful names for genes and transcripts. For a description of the renaming file, see Section 4.9.
The Include and Exclude filter file options can be used to specify a file of sequences to specifically include in or exclude from the computation. The file format and functionality are the same as for the -i and -e options of sfffile (see Section 3.1).