spDDB’s Bi-variate Spatial and Non-spatial evaluation metrics - DLPFC 151508
!git clone https://github.com/Zafar-Lab/spDDB.git
%cd spDDB/Experiments/_Deconvolution_Metrics_Calculation/
fatal: destination path 'spDDB' already exists and is not an empty directory.
/content/spDDB/Experiments/_Deconvolution_Metrics_Calculation
Mounting Google Drive to access the input data
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
!pip install scanpy
Requirement already satisfied: scanpy in /usr/local/lib/python3.12/dist-packages (1.12.1)
Importing Libraries
import pandas as pd
import numpy as np
import scanpy as sc
import scipy
from scipy.spatial.distance import pdist, squareform
import matplotlib.pyplot as plt
from pathlib import Path
from scipy.stats import entropy
from scipy.spatial.distance import jensenshannon
import seaborn as sns
import os
import pickle
from matplotlib import rcParams
from scipy.spatial import distance
from metrics import *
from create_update_metrics import *
"""
Create_new_evaluation: Function to compute all the metrics for all the methods on a dataset.
Update_method_evaluation: Function to update the metric values of one particular method on a dataset.
Update_metric_evaluation: Function to update the value of one metric for all the methods.
Note: l and col are dataset specific; l = 1.2 is the default for Visium datasets.
"""
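The spatial metrics used in this notebook (get_morans_R, compute_Lee_stats, compute_geary, ...) live in the repository's metrics module and are not shown here. To give a feel for what a bivariate spatial statistic measures, here is a minimal sketch of a bivariate Moran-style score between a ground-truth and a predicted proportion vector. Everything in it is an assumption made for illustration: the function name bivariate_moran_sketch is hypothetical, and treating l as the length scale of a Gaussian distance kernel is a guess about its role, not the repository's definition.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def bivariate_moran_sketch(x, y, coords, l=1.2, eps=1e-8):
    """Illustrative bivariate Moran-style statistic (NOT the spDDB implementation).

    x, y   : 1-D arrays with one value per spot (e.g. true and predicted proportions)
    coords : (n_spots, 2) array of spot coordinates
    l      : ASSUMED here to be the length scale of a Gaussian distance kernel
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Pairwise Euclidean distances between spots.
    dist = squareform(pdist(np.asarray(coords, dtype=float)))

    # Gaussian kernel weights; a spot is not its own neighbour.
    w = np.exp(-(dist ** 2) / (2.0 * l ** 2))
    np.fill_diagonal(w, 0.0)

    # Row-normalise so that each spot's neighbour weights sum to ~1.
    w = w / (w.sum(axis=1, keepdims=True) + eps)

    # Standardise both variables; eps guards against constant vectors.
    zx = (x - x.mean()) / (x.std() + eps)
    zy = (y - y.mean()) / (y.std() + eps)

    # Average product of x at a spot with the neighbour-weighted mean of y.
    return float(zx @ w @ zy / len(x))


# Toy 5 x 5 grid with a left-to-right gradient as the "signal".
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
signal = grid[:, 0]
print(bivariate_moran_sketch(signal, signal, grid))
```

A prediction that reproduces the spatial pattern of the ground truth gives a positive score, while a sign-flipped prediction gives the same magnitude with the opposite sign.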
def evaluate_method(gt, pred, coords, method, di, dataset_name, l, eps, co, metrics):
    ### RMSE Computation; joint excel sheet not working
    """
    rmse_path = "/data/Ajita/Spatial/Datasets/Spatial_Deconvolution/_Evaluation/Metrics_Calculation/rmse.csv"
    jsd_path = "/data/Ajita/Spatial/Datasets/Spatial_Deconvolution/_Evaluation/Metrics_Calculation/jsd.csv"
    rmse = pd.read_csv(rmse_path, index_col=0)
    jsd = pd.read_csv(jsd_path, index_col=0)
    """
    gt, pred, coords = preprocess_predictions(gt, pred, method, coords)
    di["RMSE"].loc[1, method] = get_rmse(pred, gt)
    di["JSD"].loc[1, method] = get_jsd(pred, gt)
    """
    rmse.loc[dataset_name, method] = di["RMSE"].loc[1, method]
    jsd.loc[dataset_name, method] = di["JSD"].loc[1, method]
    rmse.to_csv(rmse_path)
    jsd.to_csv(jsd_path)
    """
    # Other metrics computation
    # Check whether only RMSE and JSD need to be updated. RMSE and JSD are always updated by default.
    # A subset test is used so that a mixed request such as ["RMSE", "pearson"] is not cut short.
    if len(metrics) > 0 and set(metrics) <= {"RMSE", "JSD"}:
        print ("only RMSE and JSD updated")
        return di
    for col in np.array(gt.columns):
        for m in metrics:
            if (m == "pearson"):
                val = get_pearson(gt[col], pred[col], eps = eps)
                di["pearson"].loc[col, method] = val
                #if (method == "STRIDE"):
                #    print ("pearson", val)
            elif (m == "cosine_sim"):
                val = get_cosine_sim(gt[col], pred[col], eps)
                di["cosine_sim"].loc[col, method] = val
                #print ("cosine similarity", val)
            elif (m == "morans_r"):
                val = get_morans_R(gt[col], pred[col], coords, l=l, co=co, eps = eps)
                di["morans_r"].loc[col, method] = val
                #if (method == "STRIDE"):
                #print ("Moran's R", val)
                # di['spearman'].loc[col, method] = get_spearman(gt[col],pred[col],coords,l = l, co = co)
            elif (m == "spatial_pearson"):
                val = get_spatial_pearson(gt[col], pred[col], coords, l=l, co=co, eps = eps)
                di["spatial_pearson"].loc[col, method] = val
                #print ("spatial pearson", val)
            elif (m == "ssim"):
                val = compute_ssim(gt[col], pred[col], eps = eps)
                di["ssim"].loc[col, method] = val
                #print ("ssim", val)
            elif (m == "lee_stat"):
                val = compute_Lee_stats(gt[col], pred[col], coords, l=l, co=co, eps = eps)
                di["lee_stat"].loc[col, method] = val
                #print ("lee stats", val)
            elif (m == "geary_c"):
                val = compute_geary(gt[col], pred[col], coords, l=l, co=co, eps = eps)
                di["geary_c"].loc[col, method] = val
                #print ("geary_c", val)
            elif (m == "AUPR"):
                compute_AUPR(gt[col], pred[col], coords, l=l, co=co, eps = eps)
    print (method + "done")
    return di
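The two global metrics stored in di["RMSE"] and di["JSD"] are computed by get_rmse and get_jsd from the repository's metrics module, whose code is not part of this notebook. The sketch below shows one common way to compute such scores between two spots x cell-types proportion tables. The helper names rmse_sketch and mean_jsd_sketch are hypothetical, and the choices made here (RMSE over all entries, Jensen-Shannon divergence in base 2 averaged over spots) are assumptions that may differ from what spDDB does.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import jensenshannon


def rmse_sketch(pred, gt):
    """Root mean squared error over all spot x cell-type entries (illustrative)."""
    diff = pred.to_numpy(dtype=float) - gt.to_numpy(dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mean_jsd_sketch(pred, gt):
    """Mean per-spot Jensen-Shannon divergence in base 2 (illustrative).

    scipy's jensenshannon returns the JS *distance*, i.e. the square root
    of the divergence, so the result is squared before averaging.
    """
    p = pred.to_numpy(dtype=float)
    q = gt.to_numpy(dtype=float)
    js_distance = jensenshannon(p, q, axis=1, base=2)
    return float(np.mean(js_distance ** 2))


# Toy proportions for two spots and two cell types (made-up values).
gt_toy = pd.DataFrame([[1.0, 0.0], [0.0, 1.0]], columns=["type1", "type2"])
pred_toy = pd.DataFrame([[0.0, 1.0], [0.0, 1.0]], columns=["type1", "type2"])

print(rmse_sketch(pred_toy, gt_toy))
print(mean_jsd_sketch(pred_toy, gt_toy))
```

Both scores are 0 for a perfect prediction. With base 2 the per-spot divergence is bounded by 1, which it reaches when the predicted and true proportions of a spot do not overlap at all.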
def create_new_evaluation(path_to_outputs, ground_truth_path, rmse_path, jsd_path, dataset_name, col, methods_all, celltype_metrics,
                          global_metrics, l = 1.2, eps = 1e-8, co = 0):
    gt, coords, adata = preprocess_groundtruth(ground_truth_path, col, dataset_name)
    # One table per metric: cell-type metrics are indexed by cell type, global metrics by a single row.
    di = {}
    for metric in celltype_metrics:
        di[metric] = pd.DataFrame(columns = methods_all, index = adata.var_names)
    for metric in global_metrics:
        di[metric] = pd.DataFrame(columns = methods_all, index = [1])
    for method in methods_all:
        print (method)
        try:
            if method == "STRIDE":
                pred = pd.read_table(
                    path_to_outputs + "output_" + method + ".csv", index_col=0, sep="\t"
                )
            elif method == "Polaris":
                pred = pd.read_table(
                    path_to_outputs + "output_" + method + ".tsv", index_col=0
                )
            else:
                pred = pd.read_csv(
                    path_to_outputs + "output_" + method + ".csv", index_col=0
                )
        except FileNotFoundError:
            print(method + " output not found")
            continue
        # When a new method is evaluated, all metrics are computed for it.
        di = evaluate_method(gt, pred, coords, method, di, dataset_name, l, eps, co, celltype_metrics + global_metrics)
    # Make sure the output folder exists before writing the results.
    os.makedirs(path_to_outputs + "Metrics", exist_ok=True)
    with open(path_to_outputs + "Metrics/eval.pkl", "wb") as f:
        pickle.dump(di, f)
    for metric in (celltype_metrics + global_metrics):
        di[metric].to_csv(path_to_outputs + "Metrics/" + metric + '.csv')
eps = 0.00000001
co = 0
# Sample run on the Autogenes output
l=1.2
dataset_name = "DLPFC151508"
col = "Average_SynthST_ReX_Norm"
celltype_metrics = ['cosine_sim','morans_r','spatial_pearson','pearson','lee_stat','ssim','geary_c']
global_metrics = ['JSD','RMSE']
data_path = "/content/drive/MyDrive/Major_project/Benchmarking_Shared/spDDB_tutorials/4_data/"
root_path = data_path + "151508/"
#gt_path = data_path + "Simulated_cell_type_proportion_DLPFC_151508.csv"
gt_path = "/content/drive/MyDrive/Major_project/Benchmarking_Shared/spDDB_tutorials/1_data/output_CTP/simulated_st.h5ad"
rmse_path = root_path + "RMSE.csv"
jsd_path = root_path + "JSD.csv"
create_new_evaluation(root_path, gt_path, rmse_path, jsd_path, dataset_name, col, ["Autogenes"],
                      celltype_metrics, global_metrics, l = l, eps = eps, co = co)
Autogenes
Num cells dropped:0
preprocessing finished (4381, 17) and (4381, 17)
Autogenesdone
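Once the run has finished, the results are a dictionary with one table per metric: cell-type metrics have one row per cell type and one column per method, and global metrics have a single row labelled 1. As a possible follow-up, the sketch below collapses such a dictionary into one methods x metrics summary table. The helper summarize_metrics and the toy values are hypothetical and only mimic that layout; with real results the dictionary would be the one saved to Metrics/eval.pkl.

```python
import pandas as pd


def summarize_metrics(di, celltype_metrics, global_metrics):
    """Build a methods x metrics table from a dict of per-metric DataFrames."""
    columns = {}
    for metric in celltype_metrics:
        # Average over cell types; methods without values stay NaN.
        columns[metric] = di[metric].astype(float).mean(axis=0)
    for metric in global_metrics:
        # Global metrics hold a single row with index label 1.
        columns[metric] = di[metric].astype(float).loc[1]
    return pd.DataFrame(columns)


# Toy dictionary with the same layout (values are made up for illustration).
toy_di = {
    "pearson": pd.DataFrame(
        {"MethodA": [0.8, 0.6], "MethodB": [0.5, 0.7]}, index=["type1", "type2"]
    ),
    "RMSE": pd.DataFrame({"MethodA": [0.10], "MethodB": [0.20]}, index=[1]),
}

summary = summarize_metrics(toy_di, ["pearson"], ["RMSE"])
print(summary)
```

Averaging over cell types is only one way to condense the cell-type metrics; a median or a per-cell-type comparison may be more informative when a few cell types dominate.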