Welcome to PHATE’s documentation
Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE)
-
class phate.phate.PHATE(n_components=2, a=10, k=5, t=30, mds='classic', knn_dist='euclidean', mds_dist='euclidean', njobs=1, random_state=None)
Bases: sklearn.base.BaseEstimator
Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE)
Embeds high dimensional single-cell data into two or three dimensions for visualization of biological progressions.
Parameters: data : ndarray [n, p]
2 dimensional input data array with n cells and p dimensions
n_components : int, optional, default: 2
number of dimensions in which the data will be embedded
a : int, optional, default: 10
sets decay rate of kernel tails
k : int, optional, default: 5
used to set epsilon while autotuning kernel bandwidth
t : int, optional, default: 30
power to which the diffusion operator is raised; sets the level of diffusion
mds : string, optional, default: ‘classic’
MDS algorithm used for dimensionality reduction; choose from [‘classic’, ‘metric’, ‘nonmetric’]
knn_dist : string, optional, default: ‘euclidean’
distance metric for building the kNN graph. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.
mds_dist : string, optional, default: ‘euclidean’
distance metric for MDS. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.
njobs : integer, optional, default: 1
The number of jobs to use for the computation. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For njobs below -1, (n_cpus + 1 + njobs) are used. Thus for njobs = -2, all CPUs but one are used.
random_state : integer or numpy.RandomState, optional
The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
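To make the roles of a, k, and t concrete, here is a minimal NumPy sketch. It assumes the alpha-decaying kernel described in the cited paper [R1]: each point's bandwidth is its distance to its k-th nearest neighbor, a is the exponent that controls how fast the kernel tails decay, and t is the power applied to the row-normalized kernel. The name diffusion_sketch and the toy data are illustrative and not part of the phate package, whose implementation may differ.

```python
import numpy as np


def diffusion_sketch(data, a=10, k=5, t=30):
    """Illustrative only: kernel -> Markov matrix -> t-step diffusion."""
    # Pairwise Euclidean distances, shape [n, n].
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Adaptive bandwidth: distance from each point to its k-th nearest
    # neighbor (index 0 of each sorted row is the point itself).
    eps = np.sort(dist, axis=1)[:, k]
    # Kernel whose tails decay as exp(-(d / eps) ** a), then symmetrized.
    kernel = np.exp(-(dist / eps[:, None]) ** a)
    kernel = 0.5 * (kernel + kernel.T)
    # Row-normalize so that each row is a probability distribution.
    diff_op = kernel / kernel.sum(axis=1, keepdims=True)
    # Raising the operator to the power t diffuses for t steps.
    return np.linalg.matrix_power(diff_op, t)


rng = np.random.RandomState(0)
data = rng.rand(20, 3)  # 20 toy "cells" with 3 features
diffused = diffusion_sketch(data, a=10, k=5, t=30)
```

Under these assumptions, a larger a gives a sharper cutoff around each point's neighborhood, and a larger t spreads probability mass further along the graph.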
References
[R1] Moon KR, van Dijk D, Zheng W, et al. (2017). “PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data”. bioRxiv.
Attributes
embedding (array-like, shape [n_samples, n_dimensions]): Stores the position of the dataset in the embedding space
diff_op (array-like, shape [n_samples, n_samples]): The diffusion operator fit on the input data
diff_potential (array-like, shape [n_samples, n_samples]): Precomputed diffusion potential
Methods
fit(X): Computes the position of the cells in the embedding space
fit_transform(X): Computes the position of the cells in the embedding space
get_params([deep]): Get parameters for this estimator.
reset_diffusion([t])
reset_mds([mds, mds_dist])
set_params(**params): Set the parameters of this estimator.
-
fit(X)
Computes the position of the cells in the embedding space
Parameters: X : array, shape=[n_samples, n_features]
Input data.
diff_op : array, shape=[n_samples, n_samples], optional
Precomputed diffusion operator
-
fit_transform(X)
Computes the position of the cells in the embedding space
Parameters: X : array, shape=[n_samples, n_features]
Input data.
diff_op : array, shape=[n_samples, n_samples], optional
Precomputed diffusion operator
Returns: embedding : array, shape=[n_samples, n_dimensions]
The cells embedded in a lower dimensional space using PHATE
-
get_params(deep=True)
Get parameters for this estimator.
Parameters: deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.
-
reset_diffusion(t=30)
-
reset_mds(mds='classic', mds_dist='euclidean')
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: self
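The <component>__<parameter> convention can be illustrated with a small pure-Python sketch. ToyEstimator is a hypothetical stand-in written for this example; it is not scikit-learn's implementation and only mimics the delegation idea.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator that holds parameters."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "<component>__<parameter>" at the first double underscore.
            name, sep, sub_key = key.partition("__")
            if sep:
                # Nested form: forward the remainder to the named component.
                self.params[name].set_params(**{sub_key: value})
            else:
                # Simple form: set the parameter on this object.
                self.params[name] = value
        # Returning self allows calls to be chained.
        return self


inner = ToyEstimator(t=30, k=5)
outer = ToyEstimator(embedder=inner, verbose=False)
outer.set_params(embedder__t=50, verbose=True)
```

Here embedder__t=50 updates t on the nested object, while verbose=True is applied to the outer object directly.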
-
-
phate.phate.embed_phate(data, n_components=2, a=10, k=5, t=30, mds='classic', knn_dist='euclidean', mds_dist='euclidean', diff_op=None, diff_potential=None, njobs=1, random_state=None)
Embeds high dimensional single-cell data into two or three dimensions for visualization of biological progressions.
Parameters: data : ndarray [n, p]
2 dimensional input data array with n cells and p dimensions
n_components : int, optional, default: 2
number of dimensions in which the data will be embedded
a : int, optional, default: 10
sets decay rate of kernel tails
k : int, optional, default: 5
used to set epsilon while autotuning kernel bandwidth
t : int, optional, default: 30
power to which the diffusion operator is raised; sets the level of diffusion
mds : string, optional, default: ‘classic’
multidimensional scaling (MDS) algorithm used for dimensionality reduction; choose from [‘classic’, ‘metric’, ‘nonmetric’]
knn_dist : string, optional, default: ‘euclidean’
distance metric for building the kNN graph. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.
mds_dist : string, optional, default: ‘euclidean’
distance metric for MDS. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.
diff_op : ndarray, optional [n, n], default: None
Precomputed diffusion operator
diff_potential : ndarray, optional [n, n], default: None
Precomputed diffusion potential
random_state : integer or numpy.RandomState, optional
The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
Returns: embedding : ndarray [n_samples, n_components]
PHATE embedding in low dimensional space.
diff_op : ndarray [n_samples, n_samples]
The diffusion operator fit on the input data.
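The two recommended values for knn_dist and mds_dist behave quite differently. The sketch below writes both out in plain Python, using the usual definitions (cosine distance taken as one minus cosine similarity, as in scipy.spatial.distance). The helper names are illustrative and not part of phate.

```python
import math


def euclidean(u, v):
    # Straight-line distance; sensitive to vector magnitude.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine(u, v):
    # One minus cosine similarity; depends only on direction.
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


# Two vectors pointing the same way but with different magnitudes.
u, v = (1.0, 2.0), (2.0, 4.0)
d_euclidean = euclidean(u, v)
d_cosine = cosine(u, v)
```

For these two vectors the Euclidean distance is nonzero while the cosine distance is zero up to rounding, which is why the cosine metric can be a reasonable choice when overall magnitude should be ignored.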
References
[R2] Moon KR, van Dijk D, Zheng W, et al. (2017). “PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data”. bioRxiv.
-
phate.mds.cmdscale(D)
Classical multidimensional scaling (MDS)
Copyright © 2014-7 Francis Song, New York University
http://www.nervouscomputer.com/hfs/cmdscale-in-python/
Parameters: D : (n, n) array
Symmetric distance matrix.
Returns: Y : (n, p) array
Configuration matrix. Each column represents a dimension. Only the p dimensions corresponding to positive eigenvalues of B are returned. Note that each dimension is only determined up to an overall sign, corresponding to a reflection.
e : (n,) array
Eigenvalues of B.
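The description above refers to a matrix B without defining it. In textbook classical MDS, B is the double-centered matrix of squared distances, and the sketch below follows that standard recipe. The name classical_mds_sketch is illustrative; this is not the library's source and may differ from it in details.

```python
import numpy as np


def classical_mds_sketch(D):
    """Textbook classical MDS on a symmetric (n, n) distance matrix."""
    n = D.shape[0]
    # Centering matrix H = I - (1/n) * ones.
    H = np.eye(n) - np.ones((n, n)) / n
    # Double-centered squared distances.
    B = -0.5 * H @ (D ** 2) @ H
    # B is symmetric, so eigh applies; sort eigenvalues in descending order.
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Keep only dimensions with positive eigenvalues; each column of Y
    # is one dimension, determined up to an overall sign.
    positive = evals > 0
    Y = evecs[:, positive] * np.sqrt(evals[positive])
    return Y, evals


# Distances between the corners of a 3-4-5 right triangle.
D = np.array([[0.0, 3.0, 4.0],
              [3.0, 0.0, 5.0],
              [4.0, 5.0, 0.0]])
Y, e = classical_mds_sketch(D)
# Pairwise distances recomputed from the configuration matrix.
recovered = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
```

Because the input distances are exactly Euclidean, the distances recomputed from Y match D up to floating-point error.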
-
phate.mds.embed_MDS(X, ndim=2, how='classic', distance_metric='euclidean', njobs=1, seed=None)
Performs classic, metric, and non-metric MDS
Parameters: X : ndarray [n_samples, n_samples]
2 dimensional input data array with n_samples rows and columns. embed_MDS does not check for matrix squareness, but this is necessary for PHATE.
ndim : int, optional, default: 2
number of dimensions in which the data will be embedded
how : string, optional, default: ‘classic’
MDS algorithm used for dimensionality reduction; choose from [‘classic’, ‘metric’, ‘nonmetric’]
distance_metric : string, optional, default: ‘euclidean’
distance metric for MDS; choose from [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]
njobs : integer, optional, default: 1
The number of jobs to use for the computation. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For njobs below -1, (n_cpus + 1 + njobs) are used. Thus for njobs = -2, all CPUs but one are used.
seed : integer or numpy.RandomState, optional
The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
Returns: Y : ndarray [n_samples, ndim]
low dimensional embedding of X using MDS
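The njobs rule described in the parameters above is simple arithmetic, sketched here as a small helper. The function effective_jobs is hypothetical and not part of phate; the clamp to a minimum of one worker is an added safeguard assumed for this example.

```python
def effective_jobs(njobs, n_cpus):
    """Number of workers implied by njobs on a machine with n_cpus CPUs."""
    if njobs < 0:
        # -1 means all CPUs, -2 means all but one, and so on.
        # Clamping to 1 is an assumption made for this sketch.
        return max(n_cpus + 1 + njobs, 1)
    # Positive values are used as given; 1 means no parallelism.
    return njobs


all_cpus = effective_jobs(-1, n_cpus=8)      # 8 + 1 - 1
all_but_one = effective_jobs(-2, n_cpus=8)   # 8 + 1 - 2
serial = effective_jobs(1, n_cpus=8)
```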