AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization

Text Classification

protein language model

AmelieSchreiber committed on Aug 5, 2023

Commit f16bcc1 • 1 Parent(s): 1321090

Create README.md

Files changed (1)

README.md +44 -0

README.md ADDED

	@@ -0,0 +1,44 @@

+---
+license: mit
+language:
+- en
+library_name: transformers
+tags:
+- esm
+- esm2
+- protein language model
+- biology
+---
+To use the model, try running:
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+# Initialize the tokenizer and model
+model_path_directory = "AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization"
+tokenizer = AutoTokenizer.from_pretrained(model_path_directory)
+model = AutoModelForSequenceClassification.from_pretrained(model_path_directory)
+# Define a function to predict the category of a protein sequence
+def predict_category(sequence):
+    # Tokenize the sequence and convert it to tensor format
+    inputs = tokenizer(sequence, return_tensors="pt", truncation=True, max_length=512, padding="max_length")
+    # Make prediction
+    with torch.no_grad():
+        logits = model(**inputs).logits
+    # Determine the category with the highest score
+    predicted_class = torch.argmax(logits, dim=1).item()
+    # Return the category: 0 for cytosolic, 1 for membrane
+    return "cytosolic" if predicted_class == 0 else "membrane"
+# Example sequence
+new_protein_sequence = "MTQRAGAAMLPSALLLLCVPGCLTVSGPSTVMGAVGESLSVQCRYEEKYKTFNKYWCRQPCLPIWHEMVETGGSEGVVRSDQVIITDHPGDLTFTVTLENLTADDAGKYRCGIATILQEDGLSGFLPDPFFQVQVLVSSASSTENSVKTPASPTRPSQCQGSLPSSTCFLLLPLLKVPLLLSILGAILWVNRPWRTPWTES"
+# Predict the category
+category = predict_category(new_protein_sequence)
+print(f"The predicted category for the sequence is: {category}")
+```
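
The README snippet returns only the arg-max label. As a minimal sketch of how a confidence score could be reported alongside it, the logits can be passed through a softmax. This is an illustration rather than part of the commit: `label_with_confidence` is a hypothetical helper, the label order (0 for cytosolic, 1 for membrane) is taken from the README comment, and the logit values are made up. It uses plain Python so it runs without downloading the model.

```python
import math

# Label order taken from the README comment: 0 = cytosolic, 1 = membrane.
LABELS = ("cytosolic", "membrane")


def label_with_confidence(logits, labels=LABELS):
    """Hypothetical helper: map raw class logits to (label, softmax probability)."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class (the first one wins on an exact tie).
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration; with the model from the README one would
# pass something like model(**inputs).logits[0].tolist() instead.
label, confidence = label_with_confidence([0.3, 1.7])
print(f"{label} ({confidence:.2f})")
```

A softmax probability from a fine-tuned classifier is not necessarily well calibrated, so it is best read as a relative score rather than a true likelihood.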