nhyha committed
Commit a3d0cb6
1 Parent(s): 7745268

Update README.md

Files changed (1)
  1. README.md +41 -37
README.md CHANGED
@@ -108,30 +108,29 @@ model-index:
- ## Introduction
-
- N3N_gemma-2-9b-it_20241029_1532 is a 10.2 billion parameter open-source model built upon Gemma2-9B-Instruct through additional training. What sets this model apart is its fine-tuning process using a high-quality dataset derived from 1.6 million arXiv papers.
-
- - **High-quality Dataset**: The model has been fine-tuned using a comprehensive dataset compiled from 1.6 million arXiv papers, ensuring robust performance across various real-world applications.
-
- - **Superior Reasoning Capabilities**: The model demonstrates exceptional performance in mathematical reasoning and complex problem-solving tasks, outperforming comparable models in these areas.
-
- This model represents our commitment to advancing language model capabilities through meticulous dataset preparation and continuous model enhancement.
-
- ---
-
- # nhyha/N3N_gemma-2-9b-it_20241029_1532
-
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/gemma-2-9b-it
-
- This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
-
- **Achieved #1 Ranking for 9B and 12B LLMs on November 8, 2024.**
@@ -180,25 +179,25 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- gemma-2-9b-it_24184_20241029_1532_3232_cosine_3_50True_8_645e-05
-
- ## Training hyperparameters
-
- The following hyperparameters were used during training:
- - seed: 3407
- - warmup_steps: 50
- - total_train_batch_size: 512
- - total_eval_batch_size: 64
- - learning_rate: 5e-05
- - optimizer: adamw_8bit
- - lr_scheduler_type: cosine
- - num_epochs: 3
- - r: 32
- - lora_alpha: 32
- - rs_lora: True
- - weight_decay: 0.01
@@ -218,15 +217,20 @@ The following hyperparameters were used during training:
- ## Contact
- If you are interested in customized LLMs for business applications powered by Jikji Labs' advanced infrastructure, we’d love to hear from you! Whether you have feedback, suggestions, or just want to explore collaboration opportunities, we are here to help. Please visit [our website](https://www.n3n.ai/) for more details. Jikji Labs specializes in large-scale data processing and tailored model training solutions to meet your business needs. We value your insights as we strive for continuous improvement and innovation. Your partnership and input are what help drive our mission forward!

- ## Collaborations
  We are actively seeking support and investment to further our development of robust language models, with a focus on building high-quality and specialized datasets to cater to a wide range of applications. Our expertise in dataset generation enables us to create models that are precise and adaptable to specific business requirements. If you are excited by the opportunity to collaborate and navigate future challenges with us, please visit [our website](https://www.n3n.ai/) for more information.

  ## Acknowledgement
- Many thanks to [google](https://huggingface.co/google) for providing such a valuable model to the Open-Source community.
+ # N3N_gemma-2-9b-it_20241029_1532
+
+ ## Model Overview
+ - **Base Model**: unsloth/gemma-2-9b-it
+ - **License**: apache-2.0
+ - **Parameters**: 10.2B
+ - **Language**: English
+ - **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth) + Huggingface TRL
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+ > **Achievement**: #1 Ranking for 9B and 12B LLMs (November 8, 2024)
+
+ ## Introduction
+ N3N_gemma-2-9b-it_20241029_1532 is a 10.2B parameter open-source model built upon Gemma2-9B-Instruct through additional training. What sets this model apart is its fine-tuning process using a high-quality dataset derived from 1.6 million arXiv papers.
+
+ ### Key Features
+ - **High-quality Dataset**: The model has been fine-tuned using a comprehensive dataset compiled from 1.6 million arXiv papers, ensuring robust performance across various real-world applications.
+ - **Superior Reasoning**: The model demonstrates exceptional performance in mathematical reasoning and complex problem-solving tasks, outperforming comparable models in these areas.
+
+ This model represents our commitment to advancing language model capabilities through meticulous dataset preparation and continuous model enhancement.
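*(Editor's note: the Model Overview above names unsloth/gemma-2-9b-it as the base model and Unsloth + Huggingface TRL as the training framework. As a quick orientation, the sketch below shows one way the fine-tuned checkpoint could be loaded for inference through Unsloth's FastLanguageModel wrapper. The `max_seq_length`, 4-bit loading flag, and the example prompt are illustrative assumptions; the card's own transformers-based usage example elsewhere in the README remains the reference.)*

```python
# Hedged sketch: loading the fine-tuned model through Unsloth for inference.
# Assumptions (not taken from the card): max_seq_length=4096, 4-bit loading,
# and the example prompt below.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nhyha/N3N_gemma-2-9b-it_20241029_1532",
    max_seq_length=4096,   # assumed context length for the session
    load_in_4bit=True,     # optional memory saving; full precision also works
)
FastLanguageModel.for_inference(model)  # switch Unsloth into inference mode

messages = [{"role": "user", "content": "Summarize the idea of LoRA fine-tuning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(input_ids=inputs, max_new_tokens=256)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```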
+ ## Training Details
+ ### Hyperparameters
+ ```python
+ {
+ "seed": 3407,
+ "warmup_steps": 50,
+ "total_train_batch_size": 512,
+ "total_eval_batch_size": 64,
+ "learning_rate": 5e-05,
+ "optimizer": "adamw_8bit",
+ "lr_scheduler_type": "cosine",
+ "num_epochs": 3,
+ "r": 32,
+ "lora_alpha": 32,
+ "rs_lora": True,
+ "weight_decay": 0.01
+ }
+ ```
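*(Editor's note: to make the hyperparameter block above more concrete, here is a minimal sketch of how those values might be wired into an Unsloth + TRL fine-tuning run. The stand-in dataset, `max_seq_length`, `target_modules`, output directory, and the per-device batch size / gradient-accumulation split (chosen so that 8 × 8 × 8 GPUs = 512, matching the listed total) are assumptions rather than values from the card, and exact SFTTrainer argument names vary across TRL versions.)*

```python
# Hedged sketch of an Unsloth + TRL run using the hyperparameters listed above.
# Only the named hyperparameters come from the card; everything else is an
# illustrative assumption.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Stand-in dataset: the arXiv-derived training data is not published.
dataset = Dataset.from_dict({"text": ["This is a placeholder training example."]})

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b-it",  # base model from the Model Overview
    max_seq_length=4096,                 # assumed; not stated in the card
)

# LoRA settings from the block above: r=32, lora_alpha=32, rank-stabilized LoRA.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    use_rslora=True,                     # "rs_lora": True in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # typical Unsloth set
    random_state=3407,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        seed=3407,
        warmup_steps=50,
        per_device_train_batch_size=8,   # 8 x 8 grad-accum x 8 GPUs = 512 total
        gradient_accumulation_steps=8,   # (illustrative split of the 512 total)
        per_device_eval_batch_size=8,    # eval total of 64 on the same 8 GPUs
        learning_rate=5e-05,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        num_train_epochs=3,
        weight_decay=0.01,
        output_dir="outputs",            # illustrative
    ),
)
trainer.train()
```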
+ ## Business & Collaboration
+ ### Contact
+ Are you looking for customized LLMs tailored to your business needs? Jikji Labs offers advanced infrastructure, including 8x H100 GPU clusters, for optimal model training and deployment. Our expertise spans:
+
+ - Large-scale data processing
+ - High-performance GPU computing
+ - Custom model development and training
+
+ We welcome collaborations and are always eager to hear your feedback or discuss potential partnerships. Visit [our website](https://www.n3n.ai/) to learn how our infrastructure and expertise can drive your AI initiatives forward.
+
+ ### Collaborations
  We are actively seeking support and investment to further our development of robust language models, with a focus on building high-quality and specialized datasets to cater to a wide range of applications. Our expertise in dataset generation enables us to create models that are precise and adaptable to specific business requirements. If you are excited by the opportunity to collaborate and navigate future challenges with us, please visit [our website](https://www.n3n.ai/) for more information.

  ## Acknowledgement
+ Special thanks to [google](https://huggingface.co/google) for providing the base model to the Open-Source community.