A bug when loading the new gene embeddings extractor

#284
by iLOVE2D - opened

Hi, I found a bug when I tried to import the current gene embeddings extractor:

TypeError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 from geneformer_new import EmbExtractor

File /gpfs/gibbs/pi/zhao/tl688/Geneformer/geneformer_new/__init__.py:4
      2 from . import pretrainer
      3 from . import collator_for_classification
----> 4 from . import in_silico_perturber
      5 from . import in_silico_perturber_stats
      6 from .tokenizer import TranscriptomeTokenizer

File /gpfs/gibbs/pi/zhao/tl688/Geneformer/geneformer_new/in_silico_perturber.py:40
     37 from datasets import Dataset
     38 from tqdm.auto import trange
---> 40 from . import perturber_utils as pu
     41 from .emb_extractor import get_embs
     42 from .tokenizer import TOKEN_DICTIONARY_FILE

File /gpfs/gibbs/pi/zhao/tl688/Geneformer/geneformer_new/perturber_utils.py:278
    270     return torch.stack(output_batch_list_padded)
    273 # removes perturbed indices
    274 # need to handle the various cases where a set of genes is overexpressed
    275 def remove_perturbed_indices_set(
    276     emb,
    277     perturb_type: str,
--> 278     indices_to_perturb: list[list],
    279     tokens_to_perturb: list[list],
    280     original_lengths: list[int],
    281     input_ids=None,
    282 ):
    283     if perturb_type == "overexpress":
    284         num_perturbed = len(tokens_to_perturb)

TypeError: 'type' object is not subscriptable

Could you please help me address it? Thanks.

Thank you for your interest in Geneformer! Python versions before 3.9 do not allow subscripting the built-in collection types in annotations (e.g. list[list]); they require the capitalized generics from the typing module instead (e.g. List[list]). Support for the built-in types was added in 3.9: https://docs.python.org/release/3.9.0/whatsnew/3.9.html#type-hinting-generics-in-standard-collections. We recommend updating to Python 3.9+ to avoid this issue.
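For anyone who cannot upgrade right away, here is a minimal sketch of the difference. It is not Geneformer code: count_perturbed is a hypothetical stand-in that only mirrors the annotated parameters from the traceback, and SUPPORTS_BUILTIN_GENERICS is a name made up for this example. The typing.List spelling works on both older and newer Python versions.

```python
import sys
from typing import List

# Built-in generics such as list[int] in annotations need Python 3.9+.
# On older interpreters, evaluating list[list] raises:
#   TypeError: 'type' object is not subscriptable
SUPPORTS_BUILTIN_GENERICS = sys.version_info >= (3, 9)


# Hypothetical stand-in for the signature shown in the traceback.
# The body is illustrative only and is not the library's implementation.
def count_perturbed(
    indices_to_perturb: List[list],
    tokens_to_perturb: List[list],
    original_lengths: List[int],
) -> int:
    # typing.List[...] is accepted before and after 3.9.
    return len(tokens_to_perturb)


if __name__ == "__main__":
    print("Python version:", sys.version_info[:3])
    print("Built-in generics supported:", SUPPORTS_BUILTIN_GENERICS)
    print(count_perturbed([[0, 1]], [[5, 6]], [10]))
```

Another option on Python 3.7 and 3.8 is to put "from __future__ import annotations" at the top of the affected module, which defers evaluation of annotations so that list[list] in a function signature no longer raises. Either workaround means patching a local copy of the package, so upgrading to 3.9+ as recommended above is likely the simpler route.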

ctheodoris changed discussion status to closed
