mjbuehler committed on
Commit 6d38791 · verified · 1 Parent(s): b9dbe5d

Update README.md

Files changed (1)
  1. README.md +3 -9
README.md CHANGED
@@ -72,12 +72,11 @@ This makes the model particularly suitable for **AI-for-science**, **graph-nativ
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
-#token= 'hf_...'
+token= 'hf_...'
 
 # ------------------------------------------------------------------------------
 # Configuration
 # ------------------------------------------------------------------------------
-
 MODEL_NAME = "lamm-mit/Graph-Preflexor-8b_12292025"
 PROMPT = "Give me a short introduction to materiomics."
 MAX_NEW_TOKENS = 32_768
@@ -86,25 +85,21 @@ THINK_END_TOKEN_ID = 151668 # </think>
 # ------------------------------------------------------------------------------
 # Model & Tokenizer Loading
 # ------------------------------------------------------------------------------
-
 tokenizer = AutoTokenizer.from_pretrained(
     MODEL_NAME,
     token=token,
 )
-
 model = AutoModelForCausalLM.from_pretrained(
     MODEL_NAME,
     torch_dtype="auto",
     device_map="auto",
     token=token,
 )
-
 model.eval()
 
 # ------------------------------------------------------------------------------
 # Prompt Construction
 # ------------------------------------------------------------------------------
-
 messages = [
     {"role": "user", "content": PROMPT}
 ]
@@ -124,7 +119,6 @@ model_inputs = tokenizer(
 # ------------------------------------------------------------------------------
 # Generation
 # ------------------------------------------------------------------------------
-
 gen_config = GenerationConfig(
     max_new_tokens=MAX_NEW_TOKENS,
     do_sample=True, # sample
@@ -143,7 +137,6 @@ output_ids = generated[0, model_inputs.input_ids.shape[1]:].tolist()
 # ------------------------------------------------------------------------------
 # Thinking / Content Parsing
 # ------------------------------------------------------------------------------
-
 def split_thinking(output_ids, tokenizer, think_end_id):
     """
     Split generated tokens into (thinking, final_content) based on </think>.
@@ -176,7 +169,6 @@ thinking, content = split_thinking(
 # ------------------------------------------------------------------------------
 # Output
 # ------------------------------------------------------------------------------
-
 print("\n" + "=" * 80)
 print("THINKING")
 print("=" * 80)
@@ -253,6 +245,8 @@ FINAL OUTPUT
 Materiomics is an interdisciplinary field that merges materials science with omics methodologies—such as genomics, proteomics, and metabolomics—to systematically analyze, design, and predict the properties of materials at atomic and molecular scales. At its core, materiomics leverages high-throughput experimental techniques and advanced computational models to generate vast datasets on material composition, structure, processing conditions, and resulting properties. These data are then used to build predictive models that can forecast material behavior under various stimuli, enabling the rational design of novel materials with tailored functionalities. Key phenomena underpinning materiomics include self-assembly processes where molecules spontaneously form ordered structures, phase transitions that dictate stability and transformation under thermal or mechanical stress, and defect engineering that manipulates imperfections to enhance properties like strength or conductivity. By drawing inspiration from biological systems—where complex materials like proteins and cell membranes emerge from simple building blocks—materiomics adopts data-driven, systems-level approaches to accelerate discovery. This field is pivotal in advancing nanotechnology, sustainable materials, and AI-driven R&D, offering a scalable framework to move beyond traditional trial-and-error methods, thereby revolutionizing industries from electronics to energy storage.
 ```
 
+![image](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/iTrRAeKbE1GNQeA6Bugu5.png)
+
 # References and Citation
 
 This model was trained based on the ideas presented in the below referenced papers.
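
The diff shows only the signature and first docstring line of `split_thinking`; its body falls outside the changed hunks. The sketch below is one plausible way such a split could work, not the README's actual implementation. It assumes the reasoning trace ends at the last occurrence of the `</think>` token ID (151668 in the README) and that everything after it is the final answer. `split_thinking_sketch` and `ToyTokenizer` are hypothetical names introduced here so the example runs without downloading a model.

```python
def split_thinking_sketch(output_ids, tokenizer, think_end_id):
    """Split token IDs into (thinking, final_content) at the last think_end_id.

    Assumption: if think_end_id is absent, the whole output is final content.
    """
    try:
        # Index just past the last occurrence of the end-of-thinking token.
        split_at = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        split_at = 0
    thinking = tokenizer.decode(output_ids[:split_at], skip_special_tokens=True).strip()
    content = tokenizer.decode(output_ids[split_at:], skip_special_tokens=True).strip()
    return thinking, content


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer, for demonstration only."""

    vocab = {1: "graph", 2: "reasoning", 3: "answer", 151668: "</think>"}
    special_ids = {151668}

    def decode(self, ids, skip_special_tokens=True):
        # Join the vocabulary strings, optionally dropping special tokens.
        return " ".join(
            self.vocab[i]
            for i in ids
            if not (skip_special_tokens and i in self.special_ids)
        )


if __name__ == "__main__":
    thinking, content = split_thinking_sketch([1, 2, 151668, 3], ToyTokenizer(), 151668)
    print("THINKING:", thinking)
    print("FINAL OUTPUT:", content)
```

With a real tokenizer loaded through `AutoTokenizer.from_pretrained`, the same function could be called on the `output_ids` list produced in the README example, since `decode` accepts a list of token IDs and a `skip_special_tokens` flag.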