{ "cells": [ { "cell_type": "code", "execution_count": null, "id": "30d3de80-424b-48e8-abd6-ef22da03611d", "metadata": {}, "outputs": [], "source": [ "# This cell is to allow automatic notebook generation for docs\n", "# You may want to comment this out if you have paste3 installed\n", "import sys\n", "from pathlib import Path\n", "\n", "sys.path.insert(0, str(Path.cwd().parent.parent.parent / \"src\"))" ] }, { "cell_type": "markdown", "id": "0f721a16-a25c-4c32-b562-d0e454e8f4e8", "metadata": {}, "source": [ "# Using the PASTE2 algorithm\n", "\n", "This noteook highlights the creation of slices (Anndata objects), usage of the `pairwise_align` and `center_align` functions of `paste3`, along with stacking and plotting functionalities.\n", "\n", "**This notebook primarily highlights how you would use the `paste3` package in `PASTE2` (i.e. partial alignment) mode, when adjacent slices do not fully overlap in space or have different cell type compositions.**" ] }, { "cell_type": "code", "execution_count": null, "id": "56e5cd44", "metadata": {}, "outputs": [], "source": [ "import matplotlib.patches as mpatches\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import scanpy as sc\n", "import seaborn as sns\n", "\n", "from paste3 import paste, visualization" ] }, { "cell_type": "markdown", "id": "6354bd1e", "metadata": {}, "source": [ "# Read in Spatial Transcriptomics slices as AnnData objects\n", "\n", "We provide four example ST slices from DLPFC patient 3, cropped to form partially overlapping subslices (See Figure 3A of our paper). Each slice is stored in an [AnnData](https://anndata.readthedocs.io/en/latest/) object." ] }, { "cell_type": "code", "execution_count": null, "id": "bda4d8de", "metadata": {}, "outputs": [], "source": [ "sliceA_filename = \"../../../tests/data/151673.h5ad\"\n", "sliceB_filename = \"../../../tests/data/151674.h5ad\"\n", "sliceC_filename = \"../../../tests/data/151675.h5ad\"\n", "sliceD_filename = \"../../../tests/data/151676.h5ad\"\n", "sliceA = sc.read_h5ad(sliceA_filename)\n", "sliceB = sc.read_h5ad(sliceB_filename)\n", "sliceC = sc.read_h5ad(sliceC_filename)\n", "sliceD = sc.read_h5ad(sliceD_filename)" ] }, { "cell_type": "markdown", "id": "e7aeee15", "metadata": {}, "source": [ "Each AnnData object consists of a gene expression matrx and spatial coordinate matrix. The gene expression matrix is stored in the .X field. The spatial coordiante matrix is stored in the .obsm['spatial'] field." ] }, { "cell_type": "code", "execution_count": null, "id": "d5a29610", "metadata": {}, "outputs": [], "source": [ "sliceA.X" ] }, { "cell_type": "code", "execution_count": null, "id": "41776ad1", "metadata": {}, "outputs": [], "source": [ "sliceA.obsm[\"spatial\"]" ] }, { "cell_type": "markdown", "id": "1b9fa55b", "metadata": {}, "source": [ "The rows of the AnnData objects are spots. The columns are genes." ] }, { "cell_type": "code", "execution_count": null, "id": "6e04f35a", "metadata": {}, "outputs": [], "source": [ "sliceA.obs" ] }, { "cell_type": "code", "execution_count": null, "id": "5e11e5e7", "metadata": {}, "outputs": [], "source": [ "sliceA.var" ] }, { "cell_type": "markdown", "id": "d1711fae", "metadata": {}, "source": [ "We can visualize the slices using [scanpy](https://scanpy.readthedocs.io/en/stable/). In this case, the .obs[\"layer_guess_reordered\"] field stores the layer annotation of each slice, so we use this field to color each spot." ] }, { "cell_type": "code", "execution_count": null, "id": "2d937bd4", "metadata": {}, "outputs": [], "source": [ "sc.pl.spatial(sliceA, color=\"layer_guess_reordered\", spot_size=125, frameon=False)\n", "sc.pl.spatial(sliceB, color=\"layer_guess_reordered\", spot_size=125, frameon=False)\n", "sc.pl.spatial(sliceC, color=\"layer_guess_reordered\", spot_size=125, frameon=False)\n", "sc.pl.spatial(sliceD, color=\"layer_guess_reordered\", spot_size=125, frameon=False)" ] }, { "cell_type": "markdown", "id": "304f646a", "metadata": {}, "source": [ "# Compute partial pairwise alignment using PASTE2\n", "\n", "Give a pair of partially overlapping slices, we can use `partial_pairwise_align()` to find an alignment matrix. To call the function, you need to input the AnnData objects of the two slices, as well as a parameter s, which indicates the overlap percentage of the two slices. In this tutorial, each pair of cropped subslices overlap at 70% of the areas, so we set `overlap_fraction=0.7`. For your own datasets you should visualize the slices and manually determine the approximate overlap percentage (this parameter does not have to be very accurate).\n", "\n", "Now we compute an alignment matrix between each pair of slices in our example dataset.\n", "\n", "**In the calls to `pairwise_align` below, we're using maxIter=20 here to specify a maximum of 20 iterations for pairwise_align. This is only to allow this demo to run in a resonable amount of time. In a real alignment scenario, you should not include this argument**." ] }, { "cell_type": "code", "execution_count": null, "id": "c1d136f0", "metadata": {}, "outputs": [], "source": [ "pi_AB, _ = paste.pairwise_align(sliceA, sliceB, overlap_fraction=0.7, maxIter=20)" ] }, { "cell_type": "code", "execution_count": null, "id": "3e6a6ae1", "metadata": {}, "outputs": [], "source": [ "pi_BC, _ = paste.pairwise_align(sliceB, sliceC, overlap_fraction=0.7, maxIter=20)" ] }, { "cell_type": "code", "execution_count": null, "id": "b582c420", "metadata": {}, "outputs": [], "source": [ "pi_CD, _ = paste.pairwise_align(sliceC, sliceD, overlap_fraction=0.7, maxIter=20)" ] }, { "cell_type": "markdown", "id": "8f50f2a0", "metadata": {}, "source": [ "Let's check the shape of each alignment matrix. For aligning a slice with n1 spots and a slice with n2 spots, the alignment matrix should be of shape (n1 * n2)" ] }, { "cell_type": "code", "execution_count": null, "id": "058bd830", "metadata": {}, "outputs": [], "source": [ "print(pi_AB.shape)\n", "print(pi_BC.shape)\n", "print(pi_CD.shape)" ] }, { "cell_type": "markdown", "id": "d09a7834", "metadata": {}, "source": [ "There are other optional parameters to PASTE2.partial_pairwise_align() as well. You can checkout the original function signature in the souce code with documentation.\n", "\n", "Let's visualize the alignment between sliceA and sliceB:" ] }, { "cell_type": "code", "execution_count": null, "id": "aa69f37f", "metadata": {}, "outputs": [], "source": [ "def largest_indices(ary, n):\n", " \"\"\"Returns the n largest indices from a numpy array.\"\"\"\n", " flat = ary.flatten()\n", " indices = np.argpartition(flat, -n)[-n:]\n", " indices = indices[np.argsort(-flat[indices])]\n", " return np.unravel_index(indices, ary.shape)\n", "\n", "\n", "def plot2D_samples_mat(xs, xt, G, alpha=0.2, top=1000, weight_alpha=False, **kwargs):\n", " if (\"color\" not in kwargs) and (\"c\" not in kwargs):\n", " kwargs[\"color\"] = \"k\"\n", " mx = G.max().item()\n", " # idx = np.where(G/mx>=thr)\n", " idx = largest_indices(G.cpu().numpy(), top)\n", " for i in range(len(idx[0])):\n", " plt.plot(\n", " [xs[idx[0][i], 0], xt[idx[1][i], 0]],\n", " [xs[idx[0][i], 1], xt[idx[1][i], 1]],\n", " alpha=alpha * (1 - weight_alpha)\n", " + (weight_alpha * G[idx[0][i], idx[1][i]].item() / mx),\n", " c=\"k\",\n", " )\n", "\n", "\n", "def plot_slice_pairwise_alignment(\n", " slice1, slice2, pi, alpha=0.05, top=1000, weight_alpha=False\n", "):\n", " coordinates1, coordinates2 = slice1.obsm[\"spatial\"], slice2.obsm[\"spatial\"]\n", " offset = (coordinates1[:, 0].max() - coordinates2[:, 0].min()) * 1.1\n", " temp = np.zeros(coordinates2.shape)\n", " temp[:, 0] = offset\n", " plt.figure(figsize=(20, 10))\n", " plot2D_samples_mat(\n", " coordinates1,\n", " coordinates2 + temp,\n", " pi,\n", " c=\"k\",\n", " alpha=alpha,\n", " top=top,\n", " weight_alpha=weight_alpha,\n", " )\n", " plt.scatter(\n", " coordinates1[:, 0],\n", " coordinates1[:, 1],\n", " linewidth=0,\n", " s=100,\n", " marker=\".\",\n", " color=list(\n", " slice1.obs[\"layer_guess_reordered\"].map(\n", " dict(\n", " zip(\n", " slice1.obs[\"layer_guess_reordered\"].cat.categories,\n", " slice1.uns[\"layer_guess_reordered_colors\"],\n", " strict=False,\n", " )\n", " )\n", " )\n", " ),\n", " )\n", " plt.scatter(\n", " coordinates2[:, 0] + offset,\n", " coordinates2[:, 1],\n", " linewidth=0,\n", " s=100,\n", " marker=\".\",\n", " color=list(\n", " slice2.obs[\"layer_guess_reordered\"].map(\n", " dict(\n", " zip(\n", " slice2.obs[\"layer_guess_reordered\"].cat.categories,\n", " slice2.uns[\"layer_guess_reordered_colors\"],\n", " strict=False,\n", " )\n", " )\n", " )\n", " ),\n", " )\n", " plt.gca().invert_yaxis()\n", " plt.axis(\"off\")\n", " plt.show()\n", "\n", "\n", "plot_slice_pairwise_alignment(sliceA, sliceB, pi_AB)" ] }, { "cell_type": "markdown", "id": "99813ce8", "metadata": {}, "source": [ "# Project all slices onto the same coordiante system according to the alignment\n", "\n", "Once the alignment matrix between each pair of adjacent slices in a sequence of consecutive slices are computed, we can use this information to project all slices onto the same 2D coordinate system. 3D reconstruction can be done by assiging a z-coordiante to each slice after the projection.\n", "\n", "Specifically, we use visualization.partial_stack_slices_pairwise( ):" ] }, { "cell_type": "code", "execution_count": null, "id": "ba8ff201", "metadata": {}, "outputs": [], "source": [ "pis = [pi_AB, pi_BC, pi_CD]\n", "slices = [sliceA, sliceB, sliceC, sliceD]\n", "\n", "new_slices, _, _ = visualization.stack_slices_pairwise(slices, pis, is_partial=True)" ] }, { "cell_type": "markdown", "id": "13061316", "metadata": {}, "source": [ "Now let's plot the coordinates of all slices after the projection:" ] }, { "cell_type": "code", "execution_count": null, "id": "36ab1b6b", "metadata": {}, "outputs": [], "source": [ "layer_to_color_map = {f\"Layer{i + 1}\": sns.color_palette()[i] for i in range(6)}\n", "layer_to_color_map[\"WM\"] = sns.color_palette()[6]\n", "\n", "\n", "def plot_slices_overlap(slices, layer_to_color_map=layer_to_color_map):\n", " plt.figure(figsize=(10, 10))\n", " for i in range(len(slices)):\n", " adata = slices[i]\n", " colors = list(\n", " adata.obs[\"layer_guess_reordered\"].astype(\"str\").map(layer_to_color_map)\n", " )\n", " plt.scatter(\n", " adata.obsm[\"spatial\"][:, 0],\n", " adata.obsm[\"spatial\"][:, 1],\n", " linewidth=0,\n", " s=100,\n", " marker=\".\",\n", " color=colors,\n", " )\n", " plt.legend(\n", " handles=[\n", " mpatches.Patch(\n", " color=layer_to_color_map[\n", " adata.obs[\"layer_guess_reordered\"].cat.categories[i]\n", " ],\n", " label=adata.obs[\"layer_guess_reordered\"].cat.categories[i],\n", " )\n", " for i in range(len(adata.obs[\"layer_guess_reordered\"].cat.categories))\n", " ],\n", " fontsize=10,\n", " title=\"Cortex layer\",\n", " title_fontsize=15,\n", " bbox_to_anchor=(1, 1),\n", " )\n", " plt.gca().invert_yaxis()\n", " plt.axis(\"off\")\n", " plt.show()\n", "\n", "\n", "plot_slices_overlap(new_slices)" ] }, { "cell_type": "markdown", "id": "9c3b98a2", "metadata": {}, "source": [ "Or just the first two, which reproduces Figure 3C of the paper:" ] }, { "cell_type": "code", "execution_count": null, "id": "388f5444", "metadata": {}, "outputs": [], "source": [ "plot_slices_overlap(new_slices[:2])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }