Instructions

Welcome

You have successfully loaded and selected a dataset and you are now ready to explore it using CompClust Web. The Navigation Menu on the left provides links to the analysis tools available for this dataset.

Overview

Edit: Edit allows one to modify the set of annotations (labelings) attached to the dataset. It also allows one to select the "Primary" and "Secondary" row annotations. These are special annotations that the various plots use to uniquely identify a dataset row.
Clustering: Clustering allows one to run various clustering algorithms on the dataset, such as KMeans, KMedians, and EM Mixture of Gaussians. These algorithms attempt to group rows (genes) that behave similarly.
Cluster Trajectories: Allows one to see the trajectories for any clustering (or other annotation) that groups various rows together.
Confusion Matrix: Confusion Matrices allow one to compare the similarities and differences between two different clusterings.
ROC Analysis: The receiver operating characteristic (ROC) curve allows one to visualize the overlap between a particular cluster of a clustering and the surrounding vectors.
PCA Eigen Vectors: Allows one to observe all of the eigenvectors (basis vectors), one per principal component. The basis vectors are ordered by the percentage of variance captured by each principal component.
PC Extreme Gene Lists: Displays a list of the most extreme "high" and "low" data points (e.g. genes) for a principal component of the user's choosing, indicating the relative position of each extreme gene along that principal component's axis.
PCA Projection Plots: Creates a principal component projection scatter plot of the dataset using two user-selected principal components. The most extreme data points (e.g. genes) along the Y axis are highlighted.
PC Condition Lists: Displays a list of conditions that show significantly different (e.g. "up" or "down") expression when comparing the "high" data points to the "low" data points for a particular principal component. The conditions are ordered by the difference between the mean of the "high" data points and the mean of the "low" data points, in order to emphasize the conditions that most affect that principal component.
PC Condition Covariate Scores: For a given principal component PCn, displays a score for each condition (e.g. tissue or sample) covariate, indicating the degree to which that covariate is correlated with PCn's grouping of significant conditions into Up/Flat/Down.
PCEG Trajectories in Native Order: Shows the data trajectories for the "high" and "low" extreme gene vectors. The trajectories are plotted across the conditions in their original (native) column order.
PCEG Trajectories in Significance Order: Shows the data trajectories for the "high" and "low" extreme gene vectors. The trajectories are plotted across the conditions after reordering them by the difference between the mean "high" and mean "low" expression, in order to emphasize the conditions that most affect that principal component.
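
The extreme gene lists and the significance ordering of conditions described above can be illustrated with a small sketch. This is not CompClust code. The function names (extreme_genes, order_conditions), the toy expression values, and the per-gene principal component scores are all hypothetical, and the scores are assumed to have been computed elsewhere. The sketch also assumes that conditions are ranked from the largest to the smallest "high" minus "low" mean difference. The direction the tool actually uses may differ.

```python
from statistics import mean

def extreme_genes(scores, n):
    """Return (high, low): the n genes with the largest and the smallest scores."""
    ranked = sorted(scores, key=scores.get)  # ascending by score
    return ranked[-n:][::-1], ranked[:n]

def order_conditions(data, conditions, high, low):
    """Rank conditions by mean(high) - mean(low), largest difference first (assumed direction)."""
    diffs = {}
    for j, name in enumerate(conditions):
        diffs[name] = mean(data[g][j] for g in high) - mean(data[g][j] for g in low)
    return sorted(conditions, key=diffs.get, reverse=True), diffs

# Toy expression matrix: gene -> values across conditions (all values made up).
conditions = ["liver", "brain", "heart"]
data = {
    "geneA": [9.0, 1.0, 5.0],
    "geneB": [8.0, 2.0, 5.0],
    "geneC": [5.0, 5.0, 5.0],
    "geneD": [2.0, 8.0, 4.0],
    "geneE": [1.0, 9.0, 6.0],
}

# Hypothetical projection of each gene onto one principal component.
scores = {"geneA": 5.7, "geneB": 4.2, "geneC": 0.0, "geneD": -4.3, "geneE": -5.6}

high, low = extreme_genes(scores, 2)
ordered, diffs = order_conditions(data, conditions, high, low)

print("high:", high)
print("low:", low)
print("conditions in significance order:", ordered)
```

In this toy data the "high" genes are strongly expressed in liver and the "low" genes in brain, so those two conditions land at opposite ends of the ordering, while heart, where both groups have the same mean, falls in the middle.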