Tutorial 4: scDREAMER atlas-level unsupervised integration of a cross-species dataset (human and mouse)

Importing Libraries

[2]:
import warnings
warnings.filterwarnings('ignore')

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # make only GPU 0 visible to TensorFlow; use "1" to select the second GPU

import numpy as np
import tensorflow as tf2
import random
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import scipy.io
from sklearn.decomposition import PCA
import pdb

import pandas as pd
import scanpy as sc


import scipy.sparse
from sklearn.metrics.cluster import normalized_mutual_info_score as nmi
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from scipy import stats
from scipy import *
import datetime
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder

os.getpid()

2023-09-26 13:02:52.426478: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-26 13:02:53.358652: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
[2]:
5884

Visualization of the human and mouse data

[ ]:
adata = sc.read_h5ad("/home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad")
sc.pp.neighbors(adata, use_rep = "X")
sc.tl.umap(adata)

[4]:
sc.pl.umap(adata, color = "celltype", frameon = False)
sc.pl.umap(adata, color = "batch", frameon = False)
_images/scDREAMER_Atlas_level_integration_6_0.png
_images/scDREAMER_Atlas_level_integration_6_1.png

Setting seed for reproducibility

[2]:

np.random.seed(0)
tf.set_random_seed(0)
random.seed(0)
tf2.random.set_seed(0)
tf2.keras.utils.set_random_seed(0)
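As a minimal illustration of what seeding provides, the sketch below uses only Python's standard `random` module: seeding again with the same value reproduces the same draws, while a different seed gives a different sequence. The helper `draw_sample` is a hypothetical name introduced for this example and is not part of scDREAMER. The NumPy and TensorFlow seeds in the cell above serve the same purpose for their own generators.

```python
import random


def draw_sample(seed, n=5):
    """Seed the global generator, then return n pseudo-random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


first = draw_sample(0)
second = draw_sample(0)   # same seed as `first`
other = draw_sample(1)    # different seed

print(first == second)    # identical sequences when the seed matches
print(first == other)     # sequences differ when the seed differs
```

Note that seeding fixes the random number streams only; some GPU operations can still be non-deterministic, so small run-to-run differences may remain.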

Building model

[3]:
name = "Human_Mouse"
[4]:

""" Specify path of the input data here... """
data_path = {
    "Human_Mouse" : "/home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad",
}

batch_key_dict = {
    'Human_Mouse' : 'batch',
}

cell_type_key_dict = {
    'Human_Mouse' : "celltype",
}

learning_rate = {
    'Immune_Human' : {"lr_ae" : 0.0002, "lr_dis": 0.0007},   # Small Datasets
    'Human_Mouse' : {"lr_ae" : 0.0001, "lr_dis": 0.00001},   # Big Datasets >= 0.5 million cells
}
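Because every dictionary is indexed by the same dataset name, a missing entry only surfaces as a `KeyError` when the model is built. The sketch below shows one possible way to check this up front. The helper `missing_entries` and the `example_` variables are hypothetical names for illustration, and the file path is a placeholder rather than a real location.

```python
def missing_entries(dataset_name, **tables):
    """Return the labels of the dictionaries that lack a key for dataset_name."""
    return [label for label, table in tables.items() if dataset_name not in table]


# Placeholder dictionaries that mirror the structure used in this tutorial.
example_data_path = {"Human_Mouse": "path/to/merged.h5ad"}  # hypothetical path
example_batch_key = {"Human_Mouse": "batch"}
example_cell_type_key = {"Human_Mouse": "celltype"}
example_learning_rate = {
    "Immune_Human": {"lr_ae": 0.0002, "lr_dis": 0.0007},
    "Human_Mouse": {"lr_ae": 0.0001, "lr_dis": 0.00001},
}

example_tables = dict(
    data_path=example_data_path,
    batch_key_dict=example_batch_key,
    cell_type_key_dict=example_cell_type_key,
    learning_rate=example_learning_rate,
)

for candidate in ("Human_Mouse", "Immune_Human"):
    # An empty list means the dataset name is configured everywhere.
    print(candidate, missing_entries(candidate, **example_tables))
```

With the placeholder values, `Human_Mouse` is fully configured, whereas `Immune_Human` has learning rates but no data path, batch key, or cell type key.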

Run setting

[5]:
from scDREAMER import scDREAMER


run_config = tf.ConfigProto()

run_config.gpu_options.per_process_gpu_memory_fraction = 0.333
run_config.gpu_options.allow_growth = True

with tf.Session(config = run_config) as sess:

    dreamer = scDREAMER(
        sess,
        epoch = 100,
        z_dim = 20,
        dataset_name = data_path[name],
        batch = batch_key_dict[name],
        cell_type = cell_type_key_dict[name],
        name = name,
        lr_ae = learning_rate[name]['lr_ae'],
        lr_dis = learning_rate[name]['lr_dis']
        )

    dreamer.train_cluster()


2023-09-22 17:51:09.671582: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5367 MB memory:  -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:af:00.0, compute capability: 7.5
Reading data
encoder input shape  Tensor("concat:0", shape=(?, 2002), dtype=float32)
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
decoder input shape  Tensor("concat_2:0", shape=(?, 22), dtype=float32)
KL gaussian z Tensor("mul_10:0", shape=(?,), dtype=float32)
KL gaussian l Tensor("mul_9:0", shape=(?,), dtype=float32)
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

scDREAMER on DataSet /home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad ...
2023-09-22 17:51:42.264559: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
Epoch : [0] ,  a_loss = 99.1765
Epoch : [10] ,  a_loss = 81.1942
Epoch : [20] ,  a_loss = 80.7632
Epoch : [30] ,  a_loss = 80.5576
Epoch : [40] ,  a_loss = 80.2687
Epoch : [50] ,  a_loss = 80.0525
Epoch : [60] ,  a_loss = 79.9368
Epoch : [70] ,  a_loss = 79.9967
Epoch : [80] ,  a_loss = 80.2679
Epoch : [90] ,  a_loss = 80.2539
latent_matrix shape (933704, 20)
(933704,)
_images/scDREAMER_Atlas_level_integration_13_4.png
None
_images/scDREAMER_Atlas_level_integration_13_6.png
Done !
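
The `a_loss` values printed during training can be read back from the log to see where the loss bottoms out. The following is a minimal sketch that parses the lines shown in the output above with a regular expression. The pattern and the helper `parse_losses` are assumptions based only on the printed format, not on scDREAMER's API, and the source does not state what `a_loss` measures.

```python
import re

# Log lines copied from the training output above.
LOG = """\
Epoch : [0] ,  a_loss = 99.1765
Epoch : [10] ,  a_loss = 81.1942
Epoch : [20] ,  a_loss = 80.7632
Epoch : [30] ,  a_loss = 80.5576
Epoch : [40] ,  a_loss = 80.2687
Epoch : [50] ,  a_loss = 80.0525
Epoch : [60] ,  a_loss = 79.9368
Epoch : [70] ,  a_loss = 79.9967
Epoch : [80] ,  a_loss = 80.2679
Epoch : [90] ,  a_loss = 80.2539
"""

# Assumed line format: "Epoch : [<int>] ,  a_loss = <float>"
PATTERN = re.compile(r"Epoch : \[(\d+)\]\s*,\s*a_loss = ([0-9.]+)")


def parse_losses(log_text):
    """Return a list of (epoch, a_loss) pairs found in the log text."""
    return [(int(epoch), float(loss)) for epoch, loss in PATTERN.findall(log_text)]


losses = parse_losses(LOG)
best_epoch, best_loss = min(losses, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # prints: 60 79.9368
```

In this run the logged loss drops sharply over the first 10 epochs, reaches its lowest logged value at epoch 60, and stays close to 80 afterwards.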