Transformers · Safetensors · Inference Endpoints
mjbuehler committed · Commit 84da3c0 · verified · 1 Parent(s): b8e8200

Update README.md

Files changed (1)
  1. README.md +27 -19
README.md CHANGED
@@ -21,33 +21,26 @@ Raw format of training data (in Llama 3.1 chat template format):
  <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nDominant secondary structure of < V V F D V V F D V V F D V V F D V V F D V V F D V V F D V V F D ><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nUNSTRUCTURED<|eot_id|>
  ```

- ## Sample inference
+ Here is a visual representation of what the model predicts:
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/xOwcZGs6Q1NRV7ln5oP-m.png)
+
+ ## How to load the model

  ```
- base_model_name = "lamm-mit/BioinspiredLlama-3-1-8B-128k"
-
- # Fine-tuned model name
- FT_model = "lamm-mit/BioinspiredLlama-3-1-8B-128k-dominant-protein-SS-structure"
-
- bnb_config4bit = BitsAndBytesConfig(
-     load_in_4bit=True,
-     bnb_4bit_quant_type="nf4",
-     bnb_4bit_compute_dtype=torch.bfloat16,
-     bnb_4bit_use_double_quant=True,
-     use_nested_quant=False,
- )
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
  model = AutoModelForCausalLM.from_pretrained(
-     base_model_name,
+     'lamm-mit/BioinspiredLlama-3-1-8B-128k-dominant-protein-SS-structure',
      trust_remote_code=True,
      device_map="auto",
-     quantization_config=bnb_config4bit,
      torch_dtype=torch.bfloat16,
  )
- model = PeftModel.from_pretrained(model, FT_model)

- tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+ tokenizer = AutoTokenizer.from_pretrained('lamm-mit/BioinspiredLlama-3-1-8B-128k-dominant-protein-SS-structure')
  ```
+
  ## Example

  Inference function for convenience:
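The inference function itself sits outside this hunk's context lines. As a minimal sketch of what a call could look like, assuming the `model` and `tokenizer` objects loaded above and treating the example sequence and generation settings as illustrative rather than taken from the README, the Llama 3.1 chat format shown at the top can be reproduced with `tokenizer.apply_chat_template`:

```python
import torch

# Sketch only: `model` and `tokenizer` are the objects loaded above; the
# sequence and generation settings here are illustrative assumptions.
sequence = "V V F D V V F D V V F D V V F D V V F D V V F D V V F D V V F D"
messages = [
    {"role": "user", "content": f"Dominant secondary structure of < {sequence} >"},
]

# apply_chat_template reproduces the Llama 3.1 format shown in the training data
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=16, do_sample=False)

# Decode only the newly generated tokens, i.e. the predicted label
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
# e.g. UNSTRUCTURED
```

Greedy decoding is used here because the expected output is a single class label such as UNSTRUCTURED; the repository's own convenience function may use different settings.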
@@ -111,4 +104,19 @@ A visualization of the protein, to check:

  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/aO-0BbS8Sp_dV796w-Hm0.png)

- As predicted, this protein (PDB ID 6N7P, https://www.rcsb.org/structure/6N7P) is primarily alpha-helical.
+ As predicted, this protein (PDB ID 6N7P, https://www.rcsb.org/structure/6N7P) is primarily alpha-helical.
+
+ ## Notes
+
+ This model has been trained on sequences shorter than 128 amino acids.
+
+ ## Reference
+
+ ```bibtex
+ @article{Buehler_2024,
+   title={Fine-tuning LLMs for protein feature predictions},
+   author={Markus J. Buehler},
+   journal={},
+   year={2024}
+ }
+ ```
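Since the added Notes section states the model was trained on sequences shorter than 128 amino acids, a small hypothetical helper (not part of the repository) can build the spaced `< ... >` prompt form seen in the training data and reject out-of-range inputs:

```python
# Hypothetical helper, not part of the repository: formats a raw amino-acid
# string into the spaced "< ... >" form used in the training data and enforces
# the sub-128-residue range noted above.
def format_sequence(seq: str, max_len: int = 128) -> str:
    seq = seq.replace(" ", "").upper()
    if len(seq) >= max_len:
        raise ValueError(
            f"Model was trained on sequences shorter than {max_len} residues; got {len(seq)}."
        )
    return "< " + " ".join(seq) + " >"

print(format_sequence("VVFDVVFDVVFDVVFD"))
# < V V F D V V F D V V F D V V F D >
```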