Update README.md
README.md

A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks

[[Huggingface](https://huggingface.co/ChatterjeeLab/PTM-Mamba)] [[Github](https://github.com/programmablebio/ptm-mamba)] [[Paper](https://www.biorxiv.org/content/10.1101/2024.02.28.581983v1)]

<img src="https://cdn-uploads.huggingface.co/production/uploads/6430c79620265810703d3986/7QdA6MZ6OTmNHuwyDqFnN.png" width="300" height="300">

> Figure generated by DALL-E 3 with the prompt "A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks".
## Install Environment

### Docker

Setting up an environment for Mamba can be a pain; alternatively, we suggest using a Docker container.

#### Run the container in interactive and detached mode, and mount the project directory to the container workspace.
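A minimal sketch of what this step can look like; the base image and container name below are placeholder choices, not the project's documented setup:

```
# Sketch only: run a CUDA-enabled container in interactive + detached mode
# and mount the current project directory into its workspace.
# The image tag and container name are placeholders, not the repo's documented choices.
docker run -it -d \
  --gpus all \
  --name ptm-mamba-dev \
  -v "$(pwd)":/workspace \
  nvcr.io/nvidia/pytorch:23.10-py3

# Attach a shell to the running container.
docker exec -it ptm-mamba-dev bash

# The install steps in the README also build the tokenizer's Rust trie in editable mode:
pip install -e protein_lm/tokenizer/rust_trie
```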
## Data

We collect protein sequences and their PTM annotations from UniProt/Swiss-Prot. The PTM annotations are represented as tokens and are used to replace the corresponding amino acids. The data can be downloaded from [here](https://drive.google.com/file/d/151KUp79tgBxphoIky1-ohyuvzIS1gtNS/view?usp=drive_link). Please place the data in `protein_lm/dataset/`.
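A quick way to fetch the file from the command line, assuming the `gdown` package is installed (the file ID is taken from the Google Drive link above):

```
# Sketch only: download the dataset into protein_lm/dataset/ using gdown (pip install gdown).
mkdir -p protein_lm/dataset
cd protein_lm/dataset
gdown 151KUp79tgBxphoIky1-ohyuvzIS1gtNS   # file ID from the Drive link above; keeps the original filename
cd -
```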
## Configs

The training and testing configs are in `protein_lm/configs`. We provide a basic training config at `protein_lm/configs/train/base.yaml`.
## Training
```
python ./protein_lm/modeling/scripts/train.py +train=base
```

The command will use the config in `protein_lm/configs/train/base.yaml`.
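The `+train=base` syntax suggests Hydra-style config composition; assuming that, individual values can usually be overridden at launch time. The keys below are placeholders, not confirmed fields of `base.yaml`:

```
# Sketch only: override individual config values at launch time.
# `train.lr` and `train.max_steps` are hypothetical keys; check base.yaml for the real field names.
python ./protein_lm/modeling/scripts/train.py +train=base train.lr=1e-4 train.max_steps=10000
```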
##### Multi-GPU Training
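As a generic sketch only, assuming the training script supports a standard distributed launcher (this is not the project's documented multi-GPU command):

```
# Assumption: the entry point can be launched per-process with torchrun.
# Generic sketch, not the repo's documented multi-GPU workflow.
torchrun --nproc_per_node=4 ./protein_lm/modeling/scripts/train.py +train=base
```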
This project is based on the following codebases. Please give them a star if you find them useful.

- [OpenBioML/protein-lm-scaling (github.com)](https://github.com/OpenBioML/protein-lm-scaling)
- [state-spaces/mamba (github.com)](https://github.com/state-spaces/mamba)
## Citation
Please cite our paper if you enjoy our code :)

```
@article{Peng2024.02.28.581983,
  author = {Zhangzhi Peng and Benjamin Schussheim and Pranam Chatterjee},
  title = {PTM-Mamba: A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks},
  elocation-id = {2024.02.28.581983},
  year = {2024},
  doi = {10.1101/2024.02.28.581983},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2024/02/29/2024.02.28.581983},
  eprint = {https://www.biorxiv.org/content/early/2024/02/29/2024.02.28.581983.full.pdf},
  journal = {bioRxiv}
}
```