We started the project with only one predefined Variant. As part of the computation done to measure our defined Variant, the AVA software also examined the alignments in the Project to propose potential Variants. We can access them via the main Variants tab. If we look again at the view of the Variants tab in Figure 2‑25, we can see that the “Load” button at the bottom left of the Variants Frequency Table filter control box states that there are 12 Variants to load. The automated Variant detector is sensitive and likely to include false positives, so it is wise to use some of the filters to narrow down the potential set of variants rather than just importing them all.

After right-clicking on one of the frequency cells for the new Variant (893:T/G) in the Sample_1 column, we can use the “Global Align” option in the menu to load the Global Align tab with the reads covering the Variant position for Sample_1. The global alignment (Figure 2‑32) reveals that the Variant is covered by the EGFR_21_2 Amplicon and that there is an imbalance between forward and reverse read representation for this Amplicon. However, the Variant is present in both forward and reverse reads and has a combined frequency of over 12%, so it could well be a legitimate Variant. By right-clicking on the forward consensus containing the Variant, we can navigate to the consensus alignment.

The Consensus Align tab (see Figure 2‑33) shows the 6 (forward) reads that comprise this Consensus, all of which contain the Variant of interest. However, one of those reads has an additional variation (an “A” to “G” substitution at position 915). The automated Variant detection does not scan for haplotypic variations (except for contiguous deletions), so even if this haplotype is real, we would never see it in the Variants Frequency Table unless we introduce the haplotypic variation to the Project manually (although we might encounter the parts of a haplotype in the table, individually).


The Flowgrams tab view (Figure 2‑35) shows that, based on the actual flows of the read, the haplotype appears convincing (or would be if it were supported by multiple reads): the original “893:T/G” substitution Variant exhibits a flow cycle shift (the gray column in the middle flowgram for the read). Furthermore, the flow values for the original Variant and the new Variant (“915:A/G”) are not marginal: the difference flowgram at the bottom shows the reference bases for both Variants decreasing by a solid value of ‘1’ each, and likewise, the replacement bases are each increased by ‘1’. Although we have seen only one instance of this haplotype, we will go ahead and set it up as a Project Variant so we can see whether any other instances are to be found (it is conceivable that the haplotype could be hidden in other consensi).

To add our haplotype to the Project as a Variant, we can return to the Consensus Align tab where we have already made the appropriate filter selections (Figure 2‑34). We click on the “Declare project variant” button to the left of the alignment, and the “Approve new variant” window opens (Figure 2‑36). The automatically created default name for the variant is a sensible concatenation of the two individual Variant names, sorted by position (“893:T/G,915:A/G”), and the rest of the defaults are reasonable so we can click “OK” to define the haplotype as a Variant to be searched for in subsequent rounds of computation.
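The default naming convention described above can be illustrated with a small sketch. The function `haplotype_name` is hypothetical and is not part of the AVA software; it only reproduces the observed result (individual Variant names joined with commas, ordered by position) and assumes that each name starts with its numeric position followed by a colon.

```python
def haplotype_name(variant_names):
    """Join single-Variant names such as '893:T/G' into one haplotype name.

    Hypothetical helper: assumes each name is '<position>:<ref>/<alt>'.
    """
    def position(name):
        # The integer before the first colon is taken as the position.
        return int(name.split(":", 1)[0])

    # Sort numerically by position, then concatenate with commas.
    return ",".join(sorted(variant_names, key=position))


print(haplotype_name(["915:A/G", "893:T/G"]))  # 893:T/G,915:A/G
```

Sorting on the parsed integer rather than on the raw string matters once positions differ in digit count (for example, 95 versus 893).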

Clicking on the “OK” button to define the haplotype Variant for the Project is a little anti-climactic because the creation takes place behind the scenes and we remain on the same tab we were on when we submitted the Variant. To view the Variant we just created, we can select the Variants sub-tab of the Project Tab to see the Variant definition table. That view also enables us to edit the Status of individual Variants in the Project. The Var_1 Variant that we entered manually (section 2.2.6) was automatically marked as “Accepted”. The Auto-Detected Variant that we loaded into the Project (“893:T/G”) defaulted to “Putative”. The haplotype Variant that we manually created from filter selections defaulted to “Accepted”. Now that we have had a chance to look at the data, we can make some reasonable Status changes by double-clicking on the Status field of Variants in the table (Figure 2‑37). We should set the Auto-Detected Variant to “Accepted” rather than “Putative” since we saw ample evidence that it is real. The haplotype, however, is very questionable because it is supported by a single read, so we will demote it to “Putative”. Alternatively, we could have created the haplotype with the “Putative” status from the start, by changing the default “Accepted” status to “Putative” in the “Status” drop-down menu (as seen in Figure 2‑36) before clicking the “OK” button in the “Approve new variant” popup.

The computation should finish very quickly (Figure 2‑39). Note that the computation made use of cached results from the previous computation (section 2.3.1). Except for the demultiplexing step, which is rerun with every computation, the only novel work the computation had to do was the “Search for Variants” step.

After the computation is complete, we can click on the main Variants Tab to see the frequencies of our Variants in our Sample. In the initial view of the Variants Frequency Table, the haplotype Variant we defined appears not to have been detected at all (Figure 2‑40; frequency of 0.00% with a total of 65 reads, and grayed out row). This is because the haplotype Variant was defined from an individual read that was buried inside a consensus sequence, while the “Alignment Read Type” filter happens to be set to “Consensus” in the current view.

If we toggle the “Alignment Read Type” to “Individual”, we can see that the haplotype Variant was not missing entirely (Figure 2‑41). The frequency of 1.54% out of 65 reads for this Variant reveals that only one read was found with the haplotype (the very one we used to define it). Without further supporting evidence, this haplotype Variant should probably not be considered legitimate despite the fact that the flowgram evidence was good: it is most likely a read that had a PCR error at position 915.
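The arithmetic behind this reading can be checked with a short sketch. The function names are hypothetical and the calculation assumes that the displayed frequency is simply the number of reads carrying the Variant divided by the total read count, expressed as a percentage and rounded to two decimals.

```python
def variant_frequency(variant_reads, total_reads):
    """Percentage of reads that carry the Variant (assumed definition)."""
    return 100.0 * variant_reads / total_reads


def reads_from_frequency(percent, total_reads):
    """Recover the supporting read count from a displayed percentage."""
    return round(percent / 100.0 * total_reads)


print(f"{variant_frequency(1, 65):.2f}%")  # 1.54%
print(reads_from_frequency(1.54, 65))      # 1
```

With 65 reads, each read contributes about 1.54 percentage points, so a frequency of 1.54% corresponds to exactly one supporting read.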



Eleven Variants is a manageable number that we can reasonably load and examine at one time, so we click the “Load” button to import them. Once we do, the new Variants are all visible as white rows in the Variants Frequency Table because the “Variant Status” filter is set to “Putative”, the default for Auto-Detected Variants (Figure 2‑44). The frequencies for these Variants are filled in automatically and are valid as of the completion of the last computation: we don’t need to run another round of computation to update the frequencies until we make changes to the Project that would impact the calculation of the frequencies (such as new Samples or Read Data, or any change in the Reference Sequence). This is different from manually defined Variants, which require a round of computation after their definition in order to appear in the Table.

The next phase of workflow control for the newly added Variants is to select the “Compact table” option while we leave the “Variant status” filter set to “Putative”. This hides any Variant rows where the Status is either “Accepted” or “Rejected”. In this case the immediate effect is to hide the rows of the two Variants that we have already validated and set to “Accepted” (Figure 2‑45). Under this configuration of the Variants Frequency Table, we can right-click any Sample-Variant frequency cell to expose the “Global Align” navigation link as we did before for the first Auto-Detected Variant we loaded. As we investigate the “Putative” Variants visible in the table and edit their Status to either “Accepted” or “Rejected”, they drop out of view. In this case, we have already decided that the haplotype Variant probably isn’t real, so we can go ahead and mark it as “Rejected”. The Status of a Variant can be changed via a sub-menu available when we right-click on a Variant cell in the Variants Frequency Table (this is shown for the haplotype Variant in Figure 2‑45), or by editing the Status field of the Variant in the definition table, in the Variants sub-tab of the Project Tab.

After we mark the haplotype Variant as “Rejected”, it immediately disappears from view (Figure 2‑46). Note that marking a Variant as “Rejected” rather than deleting it outright from the Project can be useful because this keeps the system from subsequently re-proposing it and forcing us to validate it more than once. Similarly, if we investigate one of the Auto-Detected Variants and determine that it is valid, we can change its Status to “Accepted” and it will also drop from view. In this way we can continue to work through Variants until all have been evaluated and the table is empty. At that point we could set the “Variant Status” filter to “Accepted” to display only the Variants in which we are confident, generating a convenient report table that we can export.
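The triage workflow described above can be modelled as a simple filter over Variant statuses. The sketch below is a hypothetical illustration, not AVA code: the `Variant` class and `visible_rows` function are invented for the example, and it assumes that with the compact option enabled only rows whose Status matches the filter remain in view, while with it disabled all rows stay listed. The eleven newly loaded Auto-Detected Variants are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    status: str  # "Putative", "Accepted" or "Rejected"


def visible_rows(variants, status_filter, compact):
    """Return the names of rows left in view (hypothetical model)."""
    if not compact:
        # Without the compact option, every row stays listed.
        return [v.name for v in variants]
    # With the compact option, only rows matching the filter remain.
    return [v.name for v in variants if v.status == status_filter]


project = [
    Variant("Var_1", "Accepted"),
    Variant("893:T/G", "Accepted"),
    Variant("893:T/G,915:A/G", "Putative"),
]

# Triage view: only the Variants still awaiting a decision are shown.
print(visible_rows(project, "Putative", compact=True))

# Rejecting the haplotype removes it from the triage view.
project[2].status = "Rejected"
print(visible_rows(project, "Putative", compact=True))

# Report view: only the Variants we are confident in.
print(visible_rows(project, "Accepted", compact=True))
```

Keeping the rejected haplotype in the list with a “Rejected” status, rather than removing it, mirrors the point made above: the record of the decision is retained even though the row is hidden.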