Welcome to PHATE’s documentation

Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE)

class phate.phate.PHATE(n_components=2, a=10, k=5, t=30, mds='classic', knn_dist='euclidean', mds_dist='euclidean', njobs=1, random_state=None)

Bases: sklearn.base.BaseEstimator

Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE)

Embeds high dimensional single-cell data into two or three dimensions for visualization of biological progressions.

Parameters:

data : ndarray [n, p]

2 dimensional input data array with n cells and p dimensions

n_components : int, optional, default: 2

number of dimensions in which the data will be embedded

a : int, optional, default: 10

sets decay rate of kernel tails

k : int, optional, default: 5

used to set epsilon while autotuning kernel bandwidth

t : int, optional, default: 30

power to which the diffusion operator is raised; sets the level of diffusion

mds : string, optional, default: ‘classic’

MDS algorithm used for dimensionality reduction. Choose from [‘classic’, ‘metric’, ‘nonmetric’].

knn_dist : string, optional, default: ‘euclidean’

Distance metric for building the kNN graph. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.

mds_dist : string, optional, default: ‘euclidean’

Distance metric for MDS. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.

njobs : integer, optional, default: 1

The number of jobs to use for the computation. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For njobs below -1, (n_cpus + 1 + njobs) CPUs are used. Thus for njobs = -2, all CPUs but one are used.

random_state : integer or numpy.RandomState, optional

The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
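The roles of a and k can be illustrated with a minimal NumPy sketch of an adaptive-bandwidth kernel. This is not the library's implementation: the helper name alpha_decay_operator, the kernel form exp(-(d / eps)^a), the choice of the k-th nearest-neighbor distance as eps, and the symmetrization step are all assumptions made for illustration.

```python
import numpy as np

def alpha_decay_operator(data, a=10, k=5):
    """Illustrative only: build a row-stochastic diffusion operator."""
    # Pairwise Euclidean distances between all cells (rows of data).
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Adaptive bandwidth: distance from each cell to its k-th nearest neighbor.
    # Column 0 of the sorted distances is the cell itself (distance 0).
    eps = np.sort(dist, axis=1)[:, k]
    # Kernel tails decay as exp(-(d / eps)^a); a larger a gives a sharper cutoff.
    kernel = np.exp(-(dist / eps[:, None]) ** a)
    # Symmetrize, then normalize each row so it sums to 1.
    kernel = kernel + kernel.T
    return kernel / kernel.sum(axis=1, keepdims=True)

rng = np.random.RandomState(0)
data = rng.normal(size=(50, 3))   # 50 cells, 3 dimensions
diff_op = alpha_decay_operator(data, a=10, k=5)
```

Under these assumptions, each row of diff_op is a probability distribution over the other cells, which is the property a diffusion operator needs before it is raised to the power t.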

References

[R1] Moon KR, van Dijk D, Zheng W, et al. (2017). “PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data”. bioRxiv.

Attributes

embedding (array-like, shape [n_samples, n_dimensions]) Stores the position of the dataset in the embedding space
diff_op (array-like, shape [n_samples, n_samples]) The diffusion operator fit on the input data
diff_potential (array-like, shape [n_samples, n_samples]) Precomputed diffusion potential
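The relationship between diff_op, t, and diff_potential can be sketched as follows. The helper name diffusion_potential, the negative-log transform, and the small constant eps are assumptions for illustration and are not taken from the library source.

```python
import numpy as np

def diffusion_potential(diff_op, t=30, eps=1e-7):
    """Illustrative only: power the operator, then take a negative log."""
    # Raising the operator to the t-th power corresponds to t random-walk steps.
    powered = np.linalg.matrix_power(diff_op, t)
    # eps is an assumed constant that keeps log() finite for zero entries.
    return -np.log(powered + eps)

# A small 3-state row-stochastic matrix standing in for diff_op.
diff_op = np.array([[0.8, 0.2, 0.0],
                    [0.1, 0.8, 0.1],
                    [0.0, 0.2, 0.8]])
potential = diffusion_potential(diff_op, t=30)
```

Both arrays have shape [n_samples, n_samples], matching the attribute shapes listed above.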

Methods

fit(X) Computes the position of the cells in the embedding space
fit_transform(X) Computes the position of the cells in the embedding space
get_params([deep]) Get parameters for this estimator.
reset_diffusion([t])
reset_mds([mds, mds_dist])
set_params(**params) Set the parameters of this estimator.

fit(X)

Computes the position of the cells in the embedding space

Parameters:

X : array, shape=[n_samples, n_features]

Input data.

diff_op : array, shape=[n_samples, n_samples], optional

Precomputed diffusion operator

fit_transform(X)

Computes the position of the cells in the embedding space

Parameters:

X : array, shape=[n_samples, n_features]

Input data.

diff_op : array, shape=[n_samples, n_samples], optional

Precomputed diffusion operator

Returns:

embedding : array, shape=[n_samples, n_dimensions]

The cells embedded in a lower dimensional space using PHATE

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

reset_diffusion(t=30)

reset_mds(mds='classic', mds_dist='euclidean')

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
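The <component>__<parameter> naming convention can be illustrated with a small pure-Python sketch. The helper split_param_name and the component name "embedder" are hypothetical and exist only to show how such a key separates into its two parts.

```python
def split_param_name(name):
    """Split '<component>__<parameter>' into (component, parameter)."""
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        # No double underscore: the parameter belongs to the estimator itself.
        return None, name
    # Only the first "__" is split, so deeper nesting stays in `parameter`.
    return component, parameter

top_level = split_param_name("t")
nested = split_param_name("embedder__mds")
```

Here top_level is (None, "t") and nested is ("embedder", "mds"): a plain name addresses the estimator directly, while a prefixed name is routed to the named component.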

Returns:

self

phate.phate.embed_phate(data, n_components=2, a=10, k=5, t=30, mds='classic', knn_dist='euclidean', mds_dist='euclidean', diff_op=None, diff_potential=None, njobs=1, random_state=None)

Embeds high dimensional single-cell data into two or three dimensions for visualization of biological progressions.

Parameters:

data : ndarray [n, p]

2 dimensional input data array with n cells and p dimensions

n_components : int, optional, default: 2

number of dimensions in which the data will be embedded

a : int, optional, default: 10

sets decay rate of kernel tails

k : int, optional, default: 5

used to set epsilon while autotuning kernel bandwidth

t : int, optional, default: 30

power to which the diffusion operator is raised; sets the level of diffusion

mds : string, optional, default: ‘classic’

Multidimensional scaling algorithm used for dimensionality reduction. Choose from [‘classic’, ‘metric’, ‘nonmetric’].

knn_dist : string, optional, default: ‘euclidean’

Distance metric for building the kNN graph. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.

mds_dist : string, optional, default: ‘euclidean’

Distance metric for MDS. Recommended values: ‘euclidean’ and ‘cosine’. Any metric from scipy.spatial.distance can be used.

diff_op : ndarray, optional [n, n], default: None

Precomputed diffusion operator

diff_potential : ndarray, optional [n, n], default: None

Precomputed diffusion potential

random_state : integer or numpy.RandomState, optional

The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

Returns:

embedding : ndarray [n_samples, n_components]

PHATE embedding in low dimensional space.

diff_op : ndarray [n_samples, n_samples]

The diffusion operator fit on the input data.
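The three accepted forms of random_state (an integer, a numpy RandomState, or nothing) can be handled along the following lines. This is a sketch under assumptions: resolve_random_state is a hypothetical helper, not a function exported by the library, and the error raised for other types is an illustrative choice.

```python
import numpy as np

def resolve_random_state(random_state=None):
    """Illustrative only: map the documented inputs to a generator."""
    if random_state is None:
        # The np.random module functions draw from NumPy's global generator.
        return np.random
    if isinstance(random_state, (int, np.integer)):
        # An integer fixes the seed of a fresh generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used as-is.
        return random_state
    raise ValueError("random_state must be None, an int, or a RandomState")

first = resolve_random_state(42).uniform(size=3)
second = resolve_random_state(42).uniform(size=3)
```

Because both calls use the same integer seed, first and second hold identical draws, which is what makes a SMACOF initialization reproducible.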

References

[R2] Moon KR, van Dijk D, Zheng W, et al. (2017). “PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data”. bioRxiv.

phate.mds.cmdscale(D)

Classical multidimensional scaling (MDS).

Copyright © 2014-7 Francis Song, New York University

http://www.nervouscomputer.com/hfs/cmdscale-in-python/

Parameters:

D : (n, n) array

Symmetric distance matrix.

Returns:

Y : (n, p) array

Configuration matrix. Each column represents a dimension. Only the p dimensions corresponding to positive eigenvalues of B are returned. Note that each dimension is only determined up to an overall sign, corresponding to a reflection.

e : (n,) array

Eigenvalues of B.
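A minimal NumPy sketch of classical MDS may help make Y, e, and B concrete. It assumes B is the double-centered squared-distance matrix, B = -1/2 H D^2 H with H = I - (1/n) 11^T, as in the standard formulation. The function name classical_mds is hypothetical, and the code is not copied from phate.mds.cmdscale.

```python
import numpy as np

def classical_mds(D):
    """Illustrative only: classical MDS on a symmetric distance matrix."""
    n = D.shape[0]
    # Centering matrix H = I - (1/n) * ones.
    H = np.eye(n) - np.ones((n, n)) / n
    # Double-centered squared distances.
    B = -H.dot(D ** 2).dot(H) / 2
    # B is symmetric, so eigh applies; sort eigenvalues in descending order.
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Keep only the dimensions with positive eigenvalues.
    positive = evals > 0
    Y = evecs[:, positive] * np.sqrt(evals[positive])
    return Y, evals

points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [5.0, 1.0]])
D = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
Y, e = classical_mds(D)
```

For Euclidean input such as this, the leading columns of Y reproduce the original pairwise distances, although the configuration itself may be rotated or reflected relative to points.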

phate.mds.embed_MDS(X, ndim=2, how='classic', distance_metric='euclidean', njobs=1, seed=None)

Performs classic, metric, and non-metric MDS

Parameters:

X : ndarray [n_samples, n_samples]

2 dimensional input data array with n_samples rows and columns. embed_MDS does not check for matrix squareness, but a square matrix is necessary for PHATE.

ndim : int, optional, default: 2

number of dimensions in which the data will be embedded

how : string, optional, default: ‘classic’

MDS algorithm used for dimensionality reduction. Choose from [‘classic’, ‘metric’, ‘nonmetric’].

distance_metric : string, optional, default: ‘euclidean’

Distance metric for MDS. Choose from [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’].

njobs : integer, optional, default: 1

The number of jobs to use for the computation. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For njobs below -1, (n_cpus + 1 + njobs) CPUs are used. Thus for njobs = -2, all CPUs but one are used.

seed : integer or numpy.RandomState, optional

The generator used to initialize SMACOF (metric, nonmetric) MDS. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

Returns:

Y : ndarray [n_samples, ndim]

low dimensional embedding of X using MDS
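The njobs convention described in the parameter list can be checked with a short worked example. The helper effective_jobs is hypothetical, and both the rejection of njobs = 0 and the floor of one worker are assumptions, since the text above does not say how those cases are handled.

```python
import os

def effective_jobs(njobs, n_cpus=None):
    """Illustrative only: number of workers implied by njobs."""
    if n_cpus is None:
        # os.cpu_count() can return None, so fall back to a single CPU.
        n_cpus = os.cpu_count() or 1
    if njobs == 0:
        # Assumption: zero is treated as invalid.
        raise ValueError("njobs must be nonzero")
    if njobs < 0:
        # -1 -> all CPUs, -2 -> all but one, following n_cpus + 1 + njobs.
        # Assumption: never go below one worker.
        return max(n_cpus + 1 + njobs, 1)
    return njobs

all_cpus = effective_jobs(-1, n_cpus=8)
all_but_one = effective_jobs(-2, n_cpus=8)
```

On a machine with 8 CPUs, all_cpus is 8 and all_but_one is 7, matching the rule stated for njobs = -1 and njobs = -2.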