monsoon-nlp committed
Commit 83d48d0
Parent: 4272c6b

Update README.md

Files changed (1)
  1. README.md +13 -15
README.md CHANGED
@@ -3,31 +3,29 @@ base_model: state-spaces/mamba-130m-hf
 tags:
 - generated_from_trainer
 model-index:
-- name: trainer
+- name: monsoon-nlp/mamba130-proteinpretrain-quinoa
   results: []
+datasets:
+- monsoon-nlp/greenbeing-proteins
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# mamba130-proteinpretrain-quinoa
 
-# trainer
+Full-model finetuning of Mamba-130M-HF on the "research" split (quinoa
+protein sequences) of the GreenBeing-Proteins dataset.
 
-This model is a fine-tuned version of [state-spaces/mamba-130m-hf](https://huggingface.co/state-spaces/mamba-130m-hf) on an unknown dataset.
+Due to the limits of a V100 GPU, training ran for 510 steps with a batch size of 3, covering ~5% of the research split.
 
-## Model description
+Requires the GitHub main branch of Transformers (Mamba is not included in releases).
 
-More information needed
+Considering training on natural language + proteins, or new "biotokens".
 
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
+More details TBD
 
 ## Training procedure
 
+Notebook: https://colab.research.google.com/drive/1W1rB6rRt8krHZSVYQ_TjbnD9OwzFQeGL
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -45,4 +43,4 @@
 - Transformers 4.40.0.dev0
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
-- Tokenizers 0.15.2
+- Tokenizers 0.15.2
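
For the "Requires the GitHub main branch of Transformers" note in the new card, here is a minimal loading sketch. It assumes a Transformers build that includes Mamba support and that protein sequences are fed in as plain text; the example prompt and input format are assumptions, since the card does not specify how sequences should be presented to the tokenizer.

```python
# Minimal sketch: load the checkpoint for generation.
# Install Transformers from GitHub main, as the card requires:
#   pip install git+https://github.com/huggingface/transformers.git
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "monsoon-nlp/mamba130-proteinpretrain-quinoa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# "MSKGEELFT" is a hypothetical amino-acid prompt; the real input
# format depends on how sequences were tokenized during finetuning.
inputs = tokenizer("MSKGEELFT", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```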
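And a sketch of the training setup the card describes: full-model training of state-spaces/mamba-130m-hf for 510 steps at batch size 3 on the "research" split of monsoon-nlp/greenbeing-proteins. The "sequence" column name, the tokenization settings, and the remaining `TrainingArguments` are assumptions; the linked Colab notebook is the authoritative procedure.

```python
# Sketch of the finetuning run: 510 steps x batch size 3 (~5% of the
# research split), per the card. Everything not stated there is assumed.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "state-spaces/mamba-130m-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding in the collator

research = load_dataset("monsoon-nlp/greenbeing-proteins", split="research")
# The "sequence" column name is an assumption about the dataset schema.
tokenized = research.map(
    lambda batch: tokenizer(batch["sequence"], truncation=True, max_length=512),
    batched=True,
    remove_columns=research.column_names,
)

args = TrainingArguments(
    output_dir="trainer",
    max_steps=510,                  # as stated on the card
    per_device_train_batch_size=3,  # limited by V100 memory
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Hyperparameters not shown in the truncated list above fall back to `TrainingArguments` defaults here; the notebook supersedes this sketch wherever they differ.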