paste3.paste.pairwise_align
- paste3.paste.pairwise_align(a_slice, b_slice, overlap_fraction=None, exp_dissim_matrix=None, alpha=0.1, exp_dissim_metric='kl', pi_init=None, a_spots_weight=None, b_spots_weight=None, norm=False, numItermax=200, use_gpu=True, maxIter=1000, optimizeTheta=True, eps=0.0001, do_histology=False)
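Returns a mapping $\Pi = [\pi_{ij}]$ between spots in one slice and spots in another slice while preserving gene expression and spatial distances of mapped spots, where $\pi_{ij}$ describes the probability that a spot $i$ in the first slice is aligned to a spot $j$ in the second slice.

Given slices $(X, D, g)$ and $(X', D', g')$ containing $n$ and $n'$ spots, respectively, over the same $p$ genes, an expression cost function $c$, and a parameter $0 \leq \alpha \leq 1$, this function finds a mapping $\Pi \in \Gamma(g, g')$ that minimizes the following transport cost:

$$F(\Pi; X, D, X', D', c, \alpha) = (1 - \alpha) \sum_{i,j} c(x_i, x'_j)\, \pi_{ij} + \alpha \sum_{i,j,k,l} (d_{ik} - d'_{jl})^2\, \pi_{ij}\, \pi_{kl} \tag{1}$$

subject to the regularity constraint that $\Pi$ has to be a probabilistic coupling between $g$ and $g'$:

$$\Gamma(g, g') = \left\{ \Pi \in \mathbb{R}^{n \times n'} \mid \Pi \geq 0,\ \Pi 1_{n'} = g,\ \Pi^T 1_n = g' \right\} \tag{2}$$

Where:
- $X$ and $X'$ represent the gene expression data for each slice,
- $D$ and $D'$ represent the spatial distance matrices for each slice,
- $c$ is a cost function applied to expression differences,
- $\alpha$ is a parameter that balances expression and spatial distance preservation in the mapping, and
- $g$ and $g'$ represent probability distributions over the spots in slices $X$ and $X'$, respectively.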
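The transport cost described above can be evaluated directly for small inputs. The following is a minimal NumPy sketch, not the library's implementation: the helper name `transport_cost` and the toy matrices are made up for illustration, and the expression cost is assumed to be already tabulated in a matrix `C` with `C[i, j] = c(x_i, x'_j)`.

```python
import numpy as np


def transport_cost(pi, C, D_a, D_b, alpha):
    """Evaluate (1 - alpha) * sum_ij C_ij pi_ij
    + alpha * sum_ijkl (d_ik - d'_jl)^2 pi_ij pi_kl for a given coupling pi.

    Hypothetical helper for illustration only.
    C        : (n, n') expression cost matrix
    D_a, D_b : (n, n) and (n', n') within-slice spatial distance matrices
    """
    expression_term = np.sum(C * pi)
    # diff_sq[i, j, k, l] = (D_a[i, k] - D_b[j, l]) ** 2, shape (n, n', n, n').
    # This dense tensor is only practical for very small n and n'.
    diff_sq = (D_a[:, None, :, None] - D_b[None, :, None, :]) ** 2
    spatial_term = np.einsum("ijkl,ij,kl->", diff_sq, pi, pi)
    return (1 - alpha) * expression_term + alpha * spatial_term


# Toy data: two spots per slice, uniform weights g = g' = (0.5, 0.5).
C = np.array([[0.0, 1.0], [1.0, 0.0]])
D_a = np.array([[0.0, 1.0], [1.0, 0.0]])
D_b = np.array([[0.0, 2.0], [2.0, 0.0]])

pi_identity = np.array([[0.5, 0.0], [0.0, 0.5]])  # spot i -> spot i
pi_swapped = np.array([[0.0, 0.5], [0.5, 0.0]])   # spot 0 <-> spot 1

cost_identity = transport_cost(pi_identity, C, D_a, D_b, alpha=0.1)
cost_swapped = transport_cost(pi_swapped, C, D_a, D_b, alpha=0.1)
```

In this toy case both couplings incur the same spatial term, so the identity mapping wins purely on expression cost. Setting `alpha=1` would make the two couplings indistinguishable.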
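Note
When a value for overlap_fraction is provided, this function solves the partial pairwise slice alignment problem by minimizing the same objective function as Equation (1), but with a different set of constraints that allow for unmapped spots. Given a parameter $s \in [0, 1]$ describing the fraction of mass to transport between $g$ and $g'$, we define a set $\mathcal{P}(g, g', s)$ of $s$-partial couplings between distributions $g$ and $g'$ as:

$$\mathcal{P}(g, g', s) = \left\{ \Pi \in \mathbb{R}^{n \times n'} \mid \Pi \geq 0,\ \Pi 1_{n'} \leq g,\ \Pi^T 1_n \leq g',\ 1_n^T \Pi 1_{n'} = s \right\} \tag{3}$$

Where:
- $s \in [0, 1]$ is the overlap fraction between the two slices to align. (The constraint $1_n^T \Pi 1_{n'} = s$ ensures that only the fraction $s$ of the probability mass is transported.)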
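To make the partial-coupling constraints concrete, here is a small NumPy sketch that checks whether a candidate matrix satisfies them: non-negative entries, row sums no larger than the first slice's weights, column sums no larger than the second slice's weights, and total mass equal to the overlap fraction. The function name `is_partial_coupling` and the tolerance are assumptions for illustration, not part of the library.

```python
import numpy as np


def is_partial_coupling(pi, g_a, g_b, s, tol=1e-9):
    """Return True if pi is an s-partial coupling of g_a and g_b (up to tol).

    Hypothetical helper for illustration only.
    """
    pi = np.asarray(pi, dtype=float)
    nonneg = np.all(pi >= -tol)                    # pi >= 0
    rows_ok = np.all(pi.sum(axis=1) <= g_a + tol)  # row sums <= g
    cols_ok = np.all(pi.sum(axis=0) <= g_b + tol)  # column sums <= g'
    mass_ok = abs(pi.sum() - s) <= tol             # total mass == s
    return bool(nonneg and rows_ok and cols_ok and mass_ok)


# Three spots in the first slice, two in the second, uniform weights.
g_a = np.full(3, 1 / 3)
g_b = np.full(2, 1 / 2)

# Spots 0 and 1 of the first slice are mapped; spot 2 stays unmapped.
pi_partial = np.array([[1 / 3, 0.0],
                       [0.0, 1 / 3],
                       [0.0, 0.0]])

# The independent coupling transports all of the mass.
pi_full = np.outer(g_a, g_b)

partial_ok = is_partial_coupling(pi_partial, g_a, g_b, s=2 / 3)
full_ok = is_partial_coupling(pi_full, g_a, g_b, s=1.0)
```

A matrix that transports only two thirds of the mass passes the check for `s=2/3` but fails it for `s=1.0`, which is the sense in which the partial problem allows unmapped spots.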
- Parameters:
a_slice (AnnData) -- AnnData object containing data for the first slice.
b_slice (AnnData) -- AnnData object containing data for the second slice.
overlap_fraction (float, optional) -- Fraction of overlap between the two slices; must be between 0 and 1. If None, full alignment is performed.
exp_dissim_matrix (np.ndarray, optional) -- Precomputed expression dissimilarity matrix between two slices. If None, it will be computed.
alpha (float, default=0.1) -- Regularization parameter balancing transcriptional dissimilarity and spatial distance among aligned spots. Setting alpha = 0 uses only transcriptional information, while alpha = 1 uses only spatial coordinates.
exp_dissim_metric (str, default="kl") -- Metric used to compute the expression dissimilarity, with the following options:
    - 'kl' for Kullback-Leibler divergence between slices,
    - 'euc' for Euclidean distance,
    - 'gkl' for generalized Kullback-Leibler divergence,
    - 'selection_kl' for a selection-based KL approach,
    - 'pca' for Principal Component Analysis,
    - 'glmpca' for Generalized Linear Model PCA.
pi_init (np.ndarray, optional) -- Initial transport plan. If None, it will be computed.
a_spots_weight (np.ndarray, optional) -- Weight distribution for the spots in the first slice. If None, uniform weights are used.
b_spots_weight (np.ndarray, optional) -- Weight distribution for the spots in the second slice. If None, uniform weights are used.
norm (bool, default=False) -- If True, normalizes spatial distances.
numItermax (int, default=200) -- Maximum number of iterations for the optimization.
use_gpu (bool, default=True) -- Whether to use GPU for computations. If True but no GPU is available, the computation falls back to CPU.
maxIter (int, default=1000) -- Maximum number of iterations for the dissimilarity calculation.
optimizeTheta (bool, default=True) -- Whether to optimize theta during dissimilarity calculation.
eps (float, default=1e-4) -- Tolerance level for convergence.
do_histology (bool, default=False) -- If True, incorporates RGB dissimilarity from histology data.
- Returns:
pi : np.ndarray Optimal transport plan for aligning the two slices.
info : Optional[int] Information on the optimization process (if return_obj is True), else None.
- Return type:
Tuple[np.ndarray, Optional[int]]
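One common way to read a transport plan is to pick, for each spot in the first slice, the spot in the second slice that receives the most mass. The sketch below works on a hand-written toy matrix rather than on real output, and it assumes `pi` is a NumPy array as the return type above states. The helper `best_matches` and the `min_mass` threshold are illustrative choices, not part of the library.

```python
import numpy as np


def best_matches(pi, min_mass=0.0):
    """For each row i of pi, return the column j with the largest pi[i, j].

    Rows whose total mass is <= min_mass are reported as -1 (unmapped),
    which can happen when a partial alignment leaves spots out.
    Hypothetical helper for illustration only.
    """
    pi = np.asarray(pi, dtype=float)
    row_mass = pi.sum(axis=1)
    matches = pi.argmax(axis=1)
    matches[row_mass <= min_mass] = -1
    return matches


# Toy plan: 3 spots in the first slice, 3 in the second.
# The last row carries no mass, as an unmapped spot would.
pi_toy = np.array([[0.30, 0.03, 0.00],
                   [0.02, 0.00, 0.31],
                   [0.00, 0.00, 0.00]])

matches = best_matches(pi_toy)
```

Taking the row-wise argmax discards the probabilistic nature of the plan, so it is best treated as a quick summary. Analyses that need the full coupling should use `pi` itself.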