Tutorial 4: scDREAMER atlas-level unsupervised integration of a cross-species dataset (human and mouse)
Importing Libraries
[2]:
import warnings
warnings.filterwarnings('ignore')
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # index of the GPU to use, e.g. "0" or "1" on a two-GPU machine
import numpy as np
import tensorflow as tf2
import random
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import scipy.io
from sklearn.decomposition import PCA
import pdb
import pandas as pd
import scanpy as sc
import scipy.sparse
from sklearn.metrics.cluster import normalized_mutual_info_score as nmi
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from scipy import stats
from scipy import *
import datetime
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
os.getpid()
2023-09-26 13:02:52.426478: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-26 13:02:53.358652: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
[2]:
5884
Visualization of the human and mouse data
[ ]:
adata = sc.read_h5ad("/home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad")
sc.pp.neighbors(adata, use_rep = "X")
sc.tl.umap(adata)
[4]:
sc.pl.umap(adata, color = "celltype", frameon = False)
sc.pl.umap(adata, color = "batch", frameon = False)
Setting seed for reproducibility
[2]:
np.random.seed(0)
tf.set_random_seed(0)
random.seed(0)
tf2.random.set_seed(0)
tf2.keras.utils.set_random_seed(0)
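The seeding calls above can be gathered into one helper so that every run starts from the same state. The following is a minimal sketch, not part of scDREAMER: the name `seed_everything` is hypothetical, and NumPy and TensorFlow are seeded only if they can be imported.

```python
import random


def seed_everything(seed=0):
    """Seed the Python RNG, plus NumPy and TensorFlow when they are installed.

    Hypothetical helper for illustration; returns the names of the libraries seeded.
    """
    seeded = ["random"]
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy is not installed; skip it
    try:
        import tensorflow as tf2
        tf2.random.set_seed(seed)
        seeded.append("tensorflow")
    except ImportError:
        pass  # TensorFlow is not installed; skip it
    return seeded


# Re-seeding with the same value reproduces the same random draw.
seed_everything(0)
first_draw = random.random()
seed_everything(0)
second_draw = random.random()
print(first_draw == second_draw)
```

Calling the helper once at the top of the notebook, before the model is built, keeps the seeding in one place.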
Building model
[3]:
name = "Human_Mouse"
[4]:
"""
Specify path of the input data here...
"""
data_path = {
    "Human_Mouse" : "/home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad",
}
batch_key_dict = {
    'Human_Mouse' : 'batch',
}
cell_type_key_dict = {
    'Human_Mouse' : "celltype",
}
learning_rate = {
    'Immune_Human' : {"lr_ae" : 0.0002, "lr_dis": 0.0007}, # Small datasets
    'Human_Mouse' : {"lr_ae" : 0.0001, "lr_dis": 0.00001}, # Big datasets (>= 0.5 million cells)
}
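The comments in the learning-rate dictionary suggest a simple rule: datasets with at least 0.5 million cells get the smaller learning rates. A minimal sketch of that rule follows. The function `pick_learning_rates` and the constant names are hypothetical and not part of scDREAMER; the numeric values are copied from the dictionary above.

```python
# Values copied from the tutorial's learning_rate dictionary.
SMALL_DATASET_LR = {"lr_ae": 0.0002, "lr_dis": 0.0007}
BIG_DATASET_LR = {"lr_ae": 0.0001, "lr_dis": 0.00001}
BIG_DATASET_MIN_CELLS = 500_000  # "big" means >= 0.5 million cells


def pick_learning_rates(n_cells):
    """Return a copy of the learning-rate setting that matches the dataset size."""
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    if n_cells >= BIG_DATASET_MIN_CELLS:
        return dict(BIG_DATASET_LR)
    return dict(SMALL_DATASET_LR)


# The merged human and mouse dataset in this tutorial has 933704 cells.
rates = pick_learning_rates(933704)
print(rates["lr_ae"], rates["lr_dis"])
```

With a loaded AnnData object, `adata.n_obs` would supply the cell count. Treat the threshold as a starting point only, since the tutorial gives just these two settings.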
Run setting
[5]:
from scDREAMER import scDREAMER
run_config = tf.ConfigProto()
run_config.gpu_options.per_process_gpu_memory_fraction = 0.333
run_config.gpu_options.allow_growth = True
with tf.Session(config = run_config) as sess:
    dreamer = scDREAMER(
        sess,
        epoch = 100,
        z_dim = 20,
        dataset_name = data_path[name],
        batch = batch_key_dict[name],
        cell_type = cell_type_key_dict[name],
        name = name,
        lr_ae = learning_rate[name]['lr_ae'],
        lr_dis = learning_rate[name]['lr_dis']
    )
    dreamer.train_cluster()
2023-09-22 17:51:09.671582: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5367 MB memory: -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:af:00.0, compute capability: 7.5
Reading data
encoder input shape Tensor("concat:0", shape=(?, 2002), dtype=float32)
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
decoder input shape Tensor("concat_2:0", shape=(?, 22), dtype=float32)
KL gaussian z Tensor("mul_10:0", shape=(?,), dtype=float32)
KL gaussian l Tensor("mul_9:0", shape=(?,), dtype=float32)
WARNING:tensorflow:From /home/ajita/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py:1176: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.
See `tf.nn.softmax_cross_entropy_with_logits_v2`.
scDREAMER on DataSet /home/ajita/Documents/data_integration/hum_mou/hcl_mca_merged.h5ad ...
2023-09-22 17:51:42.264559: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
Epoch : [0] , a_loss = 99.1765
Epoch : [10] , a_loss = 81.1942
Epoch : [20] , a_loss = 80.7632
Epoch : [30] , a_loss = 80.5576
Epoch : [40] , a_loss = 80.2687
Epoch : [50] , a_loss = 80.0525
Epoch : [60] , a_loss = 79.9368
Epoch : [70] , a_loss = 79.9967
Epoch : [80] , a_loss = 80.2679
Epoch : [90] , a_loss = 80.2539
latent_matrix shape (933704, 20)
(933704,)
None
Done !
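The training log above prints `a_loss` every 10 epochs. To inspect the trend without rereading the output, the lines can be parsed with a regular expression. This is a minimal sketch that assumes the log keeps the exact `Epoch : [N] , a_loss = X` format shown above; the helper name `parse_losses` is hypothetical.

```python
import re

# Log lines copied from the training output above.
LOG = """\
Epoch : [0] , a_loss = 99.1765
Epoch : [10] , a_loss = 81.1942
Epoch : [20] , a_loss = 80.7632
Epoch : [30] , a_loss = 80.5576
Epoch : [40] , a_loss = 80.2687
Epoch : [50] , a_loss = 80.0525
Epoch : [60] , a_loss = 79.9368
Epoch : [70] , a_loss = 79.9967
Epoch : [80] , a_loss = 80.2679
Epoch : [90] , a_loss = 80.2539
"""

PATTERN = re.compile(r"Epoch : \[(\d+)\] , a_loss = ([0-9.]+)")


def parse_losses(text):
    """Return a list of (epoch, a_loss) pairs found in the log text."""
    return [(int(epoch), float(loss)) for epoch, loss in PATTERN.findall(text)]


losses = parse_losses(LOG)
best_epoch, best_loss = min(losses, key=lambda pair: pair[1])
print(f"lowest logged a_loss: {best_loss} at epoch {best_epoch}")
```

In this run the logged `a_loss` drops sharply in the first 10 epochs and then stays near 80 for the rest of training.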