In silico perturbation consumes all the GPU memory - OutOfMemoryError: CUDA out of memory.

#365
by vivekanand4 - opened

I am attempting to run in silico perturbations of single and multiple genes. Despite reducing the forward batch size and the number of cells used, my GPU memory is consistently maxed out. I have access to eight Tesla V100-SXM2 GPUs with 16GB of memory each, but only one GPU is utilized for the perturbations, leading to memory exhaustion.

from geneformer import InSilicoPerturber

isp = InSilicoPerturber(
    perturb_type="delete",           # delete each gene from the rank value encoding
    perturb_rank_shift=None,
    genes_to_perturb="all",
    combos=0,
    anchor_gene=None,
    model_type="CellClassifier",     # fine-tuned cell classification model
    num_classes=3,
    emb_mode="cell",
    cell_emb_style="mean_pool",
    filter_data=filter_data_dict,
    cell_states_to_model=cell_states_to_model,
    state_embs_dict=state_embs_dict,
    max_ncells=20,                   # already reduced to 20 cells
    emb_layer=0,
    forward_batch_size=4,            # already reduced to 4
    nproc=1,
)
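For context, the OOM is raised during the downstream perturb_data call. A sketch of that call follows, with the argument order assumed from the Geneformer example notebooks and the paths as placeholders for my own files:

# Paths below are placeholders, not my actual paths.
isp.perturb_data(
    "path/to/fine_tuned_cell_classifier",   # model_directory
    "path/to/tokenized_dataset.dataset",    # input_data_file
    "path/to/output_dir",                   # output_directory
    "perturb_output",                       # output_prefix
)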

Is there a direct solution for this? Or would implementing multi-GPU inference with torch.nn.parallel.DistributedDataParallel solve the issue?


Thank you for your question! To minimize memory usage, you can reduce the batch size to 1. In the genes_to_perturb="all" case, you can also edit the code to clear GPU memory more frequently (currently it does so every 1000 cells). If a batch size of 1 fits, you can parallelize by data across any number of GPUs, since this is inference only. If you are still memory-limited at a batch size of 1, you may consider distributing the model itself so that more space is freed for data and computation.
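For illustration, a minimal sketch of the more frequent cache clearing. The actual loop and variable names inside geneformer's in_silico_perturber module differ; this only shows tightening the cleanup interval from the default 1000 cells, with cells and perturb_fn standing in for the real dataset and per-cell forward passes:

import torch

def perturb_with_cleanup(cells, perturb_fn, clear_every=100):
    """Run perturb_fn on each cell, emptying the CUDA cache every clear_every cells."""
    for i, cell in enumerate(cells):
        perturb_fn(cell)                  # stand-in for the per-cell forward passes
        if (i + 1) % clear_every == 0:
            torch.cuda.empty_cache()      # return cached blocks to the CUDA driver

And since this is inference only, data parallelism can be as simple as one process per GPU, each pinned to a single device and given its own shard of cells. The sketch below assumes a hypothetical wrapper script, run_isp_shard.py, that builds an InSilicoPerturber for the cell range passed on its command line:

import os
import subprocess

# Launch one worker per GPU, restricted to that device via CUDA_VISIBLE_DEVICES.
# run_isp_shard.py is an assumed wrapper, not part of the Geneformer package.
NUM_GPUS = 8
procs = []
for gpu in range(NUM_GPUS):
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    procs.append(subprocess.Popen(
        ["python", "run_isp_shard.py",
         "--shard", str(gpu), "--num-shards", str(NUM_GPUS)],
        env=env,
    ))
for p in procs:
    p.wait()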

ctheodoris changed discussion status to closed
