ESM Protein Localization Model

Model description

The ESM Protein Localization Model is a deep learning model trained on protein sequences to predict their subcellular localization. The model uses contextualized protein sequence embeddings produced by the \href{https://huggingface.co/facebook/esm2_t12_35M_UR50D}{Meta ESM-2 model}.
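
As a rough illustration of what "contextualized protein sequence embeddings" can look like in practice, the sketch below loads the linked checkpoint with the Transformers library and mean-pools the per-residue hidden states into one vector per sequence. The pooling strategy and the helper name `embed_sequences` are assumptions made for this example; the card does not state how the embeddings are pooled or passed to the classifier.

```python
MODEL_NAME = "facebook/esm2_t12_35M_UR50D"  # checkpoint linked in this card


def embed_sequences(sequences):
    """Return one mean-pooled embedding per protein sequence.

    Hypothetical helper: mean pooling is an assumption, not a detail
    confirmed by the model card.
    """
    # Imported lazily so the module can be loaded without the heavy deps.
    import torch
    from transformers import AutoTokenizer, EsmModel

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = EsmModel.from_pretrained(MODEL_NAME)
    model.eval()

    # Pad to the longest sequence in the batch.
    batch = tokenizer(sequences, return_tensors="pt", padding=True)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)

    # Average over real tokens only; padding positions have mask 0.
    # Special tokens added by the tokenizer are included in the average.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)


if __name__ == "__main__":
    # Downloads the checkpoint on first use.
    vectors = embed_sequences(["MKTAYIAKQRQISFVKSHFSRQ", "MSDNELQ"])
    print(vectors.shape)
```

The resulting vectors could then be fed to any small classifier head over the five localization categories.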

The model was trained and fine-tuned on a dataset of 11,224 protein sequences, labeled as belonging to one of five localization categories: cytoplasmic, mitochondrial, nuclear, other, or secreted.

Intended uses & limitations

The ESM Protein Localization Model is intended to be used for predicting the subcellular localization of novel protein sequences. It should not be used as a diagnostic tool for medical purposes.

The model was trained on a limited dataset of 11,224 sequences, so its performance may be weaker on certain types of proteins or subcellular localizations.

Training data

The model was trained on a dataset of 11,224 protein sequences labeled as belonging to one of five subcellular localization categories: cytoplasmic, mitochondrial, nuclear, other, or secreted.
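
The five categories can be represented as a simple label mapping, and raw classifier scores can be turned into a predicted category with a softmax. The sketch below is illustrative only: the index order of the labels is an assumption (the card lists them alphabetically but does not say how they are encoded), and `decode` is a hypothetical helper.

```python
import math

# Assumed index order; the card does not specify the actual encoding.
LABELS = ["cytoplasmic", "mitochondrial", "nuclear", "other", "secreted"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


if __name__ == "__main__":
    label, prob = decode([0.1, 2.0, 0.3, -1.0, 0.5])
    print(label, round(prob, 3))
```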

Training procedure

The model was trained using the Transformers library from Hugging Face. The training data was split into a training set and a validation set, and the model was fine-tuned on the training set.
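
One way to produce such a training/validation split is sketched below. The card does not give the split ratio or say whether the split was stratified, so the 80/20 ratio, the per-class stratification, and the function name `stratified_split` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split example indices into train/validation sets, class by class.

    Hypothetical helper: ratio and stratification are assumptions.
    """
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train_idx, val_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Keep at least one validation example per class.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


if __name__ == "__main__":
    toy_labels = ["nuclear"] * 10 + ["secreted"] * 5
    train, val = stratified_split(toy_labels)
    print(len(train), len(val))
```

Stratifying keeps the class proportions similar in both sets, which matters when some localization categories are much rarer than others.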

Evaluation results

The model was evaluated using cross-validation and achieved an average F1 score of 0.88 on the test set.
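
For reference, a multi-class F1 score can be computed per category and then averaged. The sketch below uses macro averaging (an unweighted mean over classes); the card does not say which averaging scheme produced the reported 0.88, so treat the choice here as an assumption.

```python
def f1_per_class(y_true, y_pred, labels):
    """Compute the F1 score of each label from true and predicted labels."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


if __name__ == "__main__":
    y_true = ["nuclear", "nuclear", "secreted", "secreted"]
    y_pred = ["nuclear", "secreted", "secreted", "secreted"]
    print(macro_f1(y_true, y_pred, ["nuclear", "secreted"]))
```

Under cross-validation, a score like this would be computed on each held-out fold and then averaged across folds.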

Limitations and bias

The model has been trained on a limited dataset, and its performance may be limited on certain types of proteins or subcellular localizations. The dataset used for training and evaluation may also contain inherent biases or limitations.

Conclusion

The ESM Protein Localization Model is a deep learning model trained on protein sequences for predicting subcellular localization. It performed well in initial evaluations, but its performance may be limited on certain types of proteins or subcellular localizations. For further details and implementation specifics, please refer to the \href{https://github.com/ritakurban/protein-localizer}{GitHub repository}.
