1. Trim Read Data: for each Read Data Set, the Primer sequences and MIDs (if used) of all the reads are identified for demultiplexing purposes, and the trim points for the “Target” sequences are noted. (As mentioned in section 1.1.1.3, trimming the Primers is important because any variations found within them would have no biological significance and therefore should not be reported by the AVA software.)
a. splitting the reads of a Read Data Set over multiple Samples, e.g. if the experiment was set up such that one or more Amplicons (which are associated with different Samples either directly, or through the use of MIDs) were present in a PicoTiterPlate Device (GS Junior Instrument) or PicoTiterPlate region (GS FLX+ System) of a sequencing Run; and/or
b. joining the reads from multiple Read Data Sets into any given Sample, e.g. if the experiment was set up such that multiple regions of a PicoTiterPlate Device (GS FLX+ System) and/or sequencing Runs contributed reads to the Amplicons that are associated with the Sample.
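The trimming, splitting and joining described above can be pictured with a minimal Python sketch. It is not the AVA software's implementation: the function names, the read layout (MID, then forward Primer, then Target, then reverse Primer) and the exact-match comparison are all assumptions made for illustration, and the real software may tolerate sequencing errors in the MIDs and Primers.

```python
from collections import defaultdict


def trim_read(read, mid_to_sample, fwd_primer, rev_primer_rc):
    """Return (sample, target) for one read, or None if it cannot be assigned.

    Assumed read layout: MID + forward Primer + Target + reverse Primer
    (reverse-complemented, as it appears in the read). Matching is exact.
    """
    for mid, sample in mid_to_sample.items():
        prefix = mid + fwd_primer
        if not read.startswith(prefix):
            continue
        start = len(prefix)               # left trim point
        end = read.rfind(rev_primer_rc)   # right trim point
        if end < start:                   # reverse Primer missing or misplaced
            return None
        return sample, read[start:end]
    return None


def demultiplex(read_data_sets, mid_to_sample, fwd_primer, rev_primer_rc):
    """Split each Read Data Set over Samples and join reads per Sample."""
    per_sample = defaultdict(list)
    for reads in read_data_sets.values():
        for read in reads:
            hit = trim_read(read, mid_to_sample, fwd_primer, rev_primer_rc)
            if hit is not None:
                sample, target = hit
                per_sample[sample].append(target)
    return dict(per_sample)


# Hypothetical MIDs, Primers and reads, for illustration only.
mids = {"ACGAG": "SampleA", "TCTCA": "SampleB"}
read_data_sets = {
    "region1": ["ACGAGTTGACCAAAGGGTTTGGATCA", "TCTCATTGACCAAAGCGTTTGGATCA"],
    "region2": ["ACGAGTTGACCAAAGGGTTTGGATCA", "GGGGGTTGACCAAAGGGTTTGGATCA"],
}
trimmed = demultiplex(read_data_sets, mids, "TTGACC", "GGATCA")
```

In this sketch, one Read Data Set ("region1") is split over two Samples, while "SampleA" collects reads from both Read Data Sets. An experiment without MIDs could be modeled by passing a single empty-string key, e.g. `{"": "SampleA"}`.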
5. In addition, the AVA software automatically searches for potential Variants that were not explicitly entered into the system. These Auto-Detected “Putative” Variants receive “Intelligent Names” that are unique and compact, yet descriptive (see section 4.2). They are not automatically loaded into the Project, but can be loaded manually, based on particular selection criteria, from the main Variants Tab (see section 1.5.2.6). The automatic search covers substitution (SNP) Variants and block deletion Variants of 3 bp or more (>=3 bp). Insertion Variants and block deletion Variants shorter than 3 bp are not currently detected automatically, so manual browsing of the alignments may be needed if these types of Variants are anticipated. Because Auto-Detected Variants can be selectively loaded and then sorted and filtered by their frequencies and read-orientation support, the vast majority of interesting Variants can be easily discovered and evaluated from the Sample – Variants Table of the Variants Tab. The ability to edit the Variant Status, and to filter by that Status, gives the AVA software a simple Discovery Workflow for tracking which Variants have been evaluated and what the outcome of each evaluation was. See section 1.5.2.7 for more details on the proposed Variant Discovery Workflow.
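As an illustration of the selection logic described in this step, the following Python sketch models which Variant types are auto-detected and how Putative Variants might be filtered and sorted by frequency and read-orientation support. The class, the field names and the 1% frequency threshold are hypothetical; they are not taken from the AVA software, whose actual selection criteria are set on the Variants Tab.

```python
from dataclasses import dataclass


def is_auto_detected(kind, length=1):
    """Mirror the rule stated above: substitutions and block deletions of
    3 bp or more are auto-detected; insertions and shorter deletions are not."""
    if kind == "substitution":
        return True
    if kind == "deletion":
        return length >= 3
    return False


@dataclass
class PutativeVariant:
    name: str              # placeholder name, not a real "Intelligent Name"
    forward_variant: int   # forward reads showing the Variant
    forward_total: int     # forward reads covering the position
    reverse_variant: int   # reverse reads showing the Variant
    reverse_total: int     # reverse reads covering the position

    @property
    def frequency(self):
        total = self.forward_total + self.reverse_total
        hits = self.forward_variant + self.reverse_variant
        return hits / total if total else 0.0

    @property
    def bidirectional(self):
        # Supported by reads in both orientations.
        return self.forward_variant > 0 and self.reverse_variant > 0


def select_variants(variants, min_frequency=0.01, require_both_orientations=True):
    """Keep Variants passing the (hypothetical) criteria, highest frequency first."""
    kept = [
        v for v in variants
        if v.frequency >= min_frequency
        and (v.bidirectional or not require_both_orientations)
    ]
    return sorted(kept, key=lambda v: v.frequency, reverse=True)


candidates = [
    PutativeVariant("var1", 5, 100, 4, 100),    # both orientations
    PutativeVariant("var2", 10, 100, 0, 100),   # forward reads only
    PutativeVariant("var3", 1, 1000, 1, 1000),  # very low frequency
]
selected = select_variants(candidates)
```

With these made-up counts, only "var1" passes both criteria; relaxing `require_both_orientations` would also admit "var2", which is the kind of trade-off the sorting and filtering controls are meant to expose.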
Interrupted computations: If a computation (or re-computation) is interrupted, there is a risk that part of the output may not match the state of the saved Project. While the AVA software withholds the potentially corrupted results from the data that was being processed at the time of the interruption, it also maintains the results from previous computations that had not yet been altered at the time of the interruption. Be aware that those older results may not be consistent with more recent updates to the Project. The outcome of this is similar to the case described in the “Caution” at the end of section 1.3 (editing a Project in a manner that is germane to previously computed results). If you find that the data in these tabs does not reflect the current state of the Project, try re-computing it. The only way to be completely sure that the Project is consistent is to allow the computation to run to completion, without interruption.
• The message indicating that the further messages in the Computation Warning window are based on the Project in its current state, i.e. as computation would see it if it were saved. This warns you of problems in a Project even before you save it to disk. If your Project is up to date on disk, the other messages below may still occur, but they then concern the Project as saved.
◦ the Reference Sequence itself is incompletely defined (e.g. it was given a name but no actual sequence; in this case the Amplicon would likely not have any Target Start/End set either).
◦ the Variant pattern is inconsistent with the Reference Sequence [e.g. a substitute constraint specifies a substituted nucleotide that is the same as the one already in the Reference Sequence; this should be a match constraint instead].
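The two conditions above can be expressed as simple consistency checks. The Python sketch below is only an illustration: the function name, the dictionary representation of substitute constraints (1-based position mapped to the substituted nucleotide) and the message texts are assumptions, not the AVA software's own data model or wording.

```python
def check_variant(reference, substitutions):
    """Return a list of warning strings for one Variant definition.

    reference:     the Reference Sequence as a string ("" if only named).
    substitutions: hypothetical mapping of 1-based position -> nucleotide.
    """
    warnings = []
    if not reference:
        # Reference Sequence was given a name but no actual sequence.
        warnings.append("Reference Sequence is incompletely defined")
        return warnings
    for pos, base in sorted(substitutions.items()):
        if not 1 <= pos <= len(reference):
            # Guard added for this sketch; not a documented AVA warning.
            warnings.append(f"position {pos} is outside the Reference Sequence")
        elif reference[pos - 1].upper() == base.upper():
            warnings.append(
                f"substitution at position {pos} specifies {base.upper()}, "
                "which is already in the Reference Sequence; "
                "use a match constraint instead"
            )
    return warnings
```

For example, `check_variant("ACGT", {2: "C"})` yields one warning because position 2 of the reference is already C, whereas `check_variant("ACGT", {2: "T"})` yields none.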