How to reproduce the results of overexpressing OSKM genes?

#333
by weilangchan - opened

I noticed a possibly related response at https://huggingface.co/ctheodoris/Geneformer/discussions/101. However, the corresponding code seems to have changed since then.

Thank you for your question. Please try the approach from the prior discussion. If you run into any issues feel free to reopen this discussion.

ctheodoris changed discussion status to closed

Here is the problem: how can I define cell_states_to_model?
image.png

In the example, the cell_states_to_model is defined as follows:
cell_states_to_model = {"state_key": "disease",
                        "start_state": "dcm",
                        "goal_state": "nf",
                        "alt_states": ["hcm"]}

Thank you for following up! It would help if you posted exactly what you are providing as input so we can help you troubleshoot. In the meantime, please follow the instructions in the documentation; if you do not have an alternate state, you can provide an empty list for "alt_states".

I am trying to reproduce your Extended Data Fig. 2c. Here is my input:

image.png

The dataset has three features: 'input_ids', 'length' and 'cell_type'.

Thank you for your question. The state key is the feature that defines the states you intend to model, which is cell_type in this case. The start state is fibroblast and the goal state is iPSCs. You can provide an empty list for the alternate states.

https://geneformer.readthedocs.io/en/latest/geneformer.in_silico_perturber.html
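As a minimal sketch of that mapping: the dictionary below follows the layout of the dcm/nf example earlier in this thread. The label strings "fibroblast" and "iPSC" are assumptions, so replace them with the exact values stored in your cell_type column. The two helper functions are hypothetical and not part of Geneformer; they only build the dict and check that the requested states actually occur in the labels.

```python
def build_cell_states_to_model(state_key, start_state, goal_state, alt_states=None):
    """Return a dict in the layout shown in the example earlier in this thread."""
    return {
        "state_key": state_key,
        "start_state": start_state,
        "goal_state": goal_state,
        "alt_states": list(alt_states or []),  # empty list when there is no alternate state
    }


def missing_states(cell_states_to_model, labels):
    """List any requested states that never occur among the given labels."""
    wanted = [cell_states_to_model["start_state"], cell_states_to_model["goal_state"]]
    wanted += cell_states_to_model["alt_states"]
    present = set(labels)
    return [state for state in wanted if state not in present]


# Assumed label strings; use the exact values from your "cell_type" column.
cell_states_to_model = build_cell_states_to_model(
    state_key="cell_type",
    start_state="fibroblast",
    goal_state="iPSC",
)

# Stand-in for the dataset's "cell_type" column; pass the real column in practice.
toy_labels = ["fibroblast", "fibroblast", "iPSC"]

print(cell_states_to_model)
print("missing states:", missing_states(cell_states_to_model, toy_labels))
```

If the second line prints a non-empty list, the state names do not match the labels in the dataset, which is worth ruling out before running the perturbation.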

After running isp.perturb_data, I obtained a dict as follows:

image.png

How can I reproduce Extended Data Fig. 2c from this?

Also, is the "random" condition obtained by overexpressing 4 other, randomly chosen genes?

Please follow the example to process the data with InSilicoPerturberStats:

https://huggingface.co/ctheodoris/Geneformer/blob/main/examples/in_silico_perturbation.ipynb

Regarding random genes, you can select random genes from the token dictionary to overexpress and then compare the outputs.
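A rough sketch of that random selection, under a few assumptions: the token dictionary maps Ensembl gene IDs to integer tokens and also contains special tokens such as "<pad>" and "<mask>", which are skipped here by their leading "<". The function sample_random_genes is a hypothetical helper, not a Geneformer function, and the toy dictionary uses made-up placeholder IDs in place of the real token dictionary file.

```python
import random

# Ensembl IDs for POU5F1, SOX2, KLF4 and MYC; verify them against your own annotation.
OSKM = {
    "ENSG00000204531",
    "ENSG00000181449",
    "ENSG00000136826",
    "ENSG00000136997",
}


def sample_random_genes(token_dictionary, n, exclude=(), seed=None):
    """Pick n distinct gene keys, skipping special tokens and excluded genes."""
    candidates = sorted(
        gene for gene in token_dictionary
        if not gene.startswith("<") and gene not in exclude
    )
    if n > len(candidates):
        raise ValueError("not enough candidate genes to sample from")
    # A seeded generator keeps the random control set reproducible.
    return random.Random(seed).sample(candidates, n)


# Toy stand-in for the real token dictionary (placeholder IDs, not real genes).
toy_token_dictionary = {"<pad>": 0, "<mask>": 1}
for gene in sorted(OSKM):
    toy_token_dictionary[gene] = len(toy_token_dictionary)
for i in range(20):
    toy_token_dictionary[f"ENSG_TOY_{i:03d}"] = len(toy_token_dictionary)

random_genes = sample_random_genes(toy_token_dictionary, n=4, exclude=OSKM, seed=0)
print(random_genes)
```

Repeating the draw with several seeds gives multiple random 4-gene sets, each of which can be overexpressed the same way as OSKM so the outputs can be compared.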

After running the following code, I only get a "Shift_to_goal_end" value. What is wrong?

image.png

image.png
