<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
  <title>CompClust</title>
  <link rel="stylesheet" href="/compclust_css" type="text/css"
  	tal:attributes="href compclust_css" />
</head>
<body>
<span id="compclustMenu" tal:replace="structure menu"/>
<div id="compclustBody">


<h1><a name="instructions">Instructions</a></h1>

<h2> Welcome </h2>

<p>
  You have successfully loaded and selected a dataset and you are now ready to explore
  it using CompClust Web.   The <em>Navigation Menu</em> on the left
  provides links to the analysis tools available for this dataset.  
</p>

<h2>Overview</h2>

<dl class="compclustGlossary">
  <dt>Edit</dt>
  <dd>
    Edit allows one to modify the set of annotations (labelings) attached to the dataset.
    It also allows one to select the "Primary" and "Secondary" row
    annotations. These are special annotations that the various plots use to
    uniquely identify a dataset row.
  </dd>
  <dt>Clustering</dt>
  <dd>
    Clustering allows one to run various clustering algorithms on the dataset,
    such as KMeans, KMedians, and EM Mixture of Gaussians. These clustering algorithms
    attempt to group rows (genes) that behave similarly.
  </dd>
  <dt>Cluster Trajectories</dt>
  <dd>
    Allows one to see the trajectories for each group of rows defined by a
    clustering (or any other annotation that groups rows together).
  </dd>
  <dt>Confusion Matrix</dt>
  <dd>
    Confusion Matrices allow one to compare the similarities and differences between 
    two different clusterings.
  </dd>
  <dt>ROC Analysis</dt>
  <dd>
    The Receiver Operating Characteristic (ROC) curve allows one to visualize the
    overlap between a particular cluster of a clustering and the surrounding vectors.
  </dd>  
  <!-- this is currently broken
  <dt>PCA Projection</dt>
  <dd>
  Allows one to explore the dataset by choosing any two principal components to
  project onto a two-dimensional scatter plot.
  </dd>
  -->
  <dt>PCA Eigen Vectors</dt>
  <dd>
    Allows one to view the eigenvectors (basis vectors) of the
    principal components. The basis vectors are ordered by the percentage of
    variance captured by each principal component.
  </dd>
  <dt>PC Extreme Gene Lists</dt>
  <dd>
    Displays a list of the most extreme "high" and "low" data points (e.g.
    genes) for a principal component of the user's choosing, indicating the
    relative position of each extreme gene along that principal
    component's axis.
  </dd>
  <dt>PCA Projection Plots</dt>
  <dd>
    Creates a principal component projection scatter plot of the dataset using
    two user-selected principal components. The most extreme data points
    (e.g. genes) along the Y axis are highlighted.
  </dd>
  <dt>PC Condition Lists</dt>
  <dd>
    Displays a list of conditions that show significantly different (e.g.
    "up" or "down") expression when comparing the "high" data points to
    the "low" data points for a particular principal component. The conditions
    are ordered by the difference between the mean of the "high" data points and
    the mean of the "low" data points, in order to emphasize the conditions
    that most affect that principal component.
  </dd>
  <dt>PC Condition Covariate Scores</dt>
  <dd>
    For a given principal component PCn, displays a score for each condition
    covariate (e.g. tissue or sample) indicating the degree to which that
    covariate is correlated with PCn's grouping of significant conditions
    into Up/Flat/Down.
  </dd>
  <dt>PCEG Trajectories in Native Order</dt>
  <dd>
    Shows the data trajectories for the "high" and "low" extreme gene
    vectors. The trajectories are plotted across the conditions in
    their original (native) column order.
  </dd>
  <dt>PCEG Trajectories in Significance Order</dt>
  <dd>
    Shows the data trajectories for the "high" and "low" extreme gene
    vectors. The trajectories are plotted across the
    conditions after reordering them by the difference between mean "high"
    and mean "low" expression, in order to emphasize the conditions
    that most affect that principal component.
  </dd>
</dl>
</div>
</body>
</html>
