nhyha committed · Commit 1fd0c6c · verified · 1 Parent(s): ea5298b

Update README.md

Files changed (1):
  1. README.md +83 -5
README.md CHANGED
@@ -108,9 +108,23 @@ model-index:
 
 ---
 
-# Uploaded model
-
-- **Developed by:** nhyha
 - **License:** apache-2.0
 - **Finetuned from model:** unsloth/gemma-2-9b-it
 
@@ -118,10 +132,74 @@ This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslot
 
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
-https://www.n3n.ai/
 
-I will fill in the following content as TBD.
-(TBD)

 
 ---
 
 
+
+ ## Introduction
+
+ N3N_gemma-2-9b-it_20241029_1532 is a 10.2-billion-parameter open-source model built upon Gemma2-9B-Instruct through additional training. What sets this model apart is its fine-tuning on a high-quality dataset derived from 1.6 million arXiv papers.
+
+ - **High-quality Dataset**: The model has been fine-tuned on a comprehensive dataset compiled from 1.6 million arXiv papers, ensuring robust performance across various real-world applications.
+
+ - **Superior Reasoning Capabilities**: The model demonstrates strong performance on mathematical reasoning and complex problem-solving tasks, outperforming comparable models in these areas.
+
+ This model represents our commitment to advancing language-model capabilities through meticulous dataset preparation and continuous model enhancement.
+
+ ---
+
+
+ # nhyha/N3N_gemma-2-9b-it_20241029_1532
+
 - **License:** apache-2.0
 - **Finetuned from model:** unsloth/gemma-2-9b-it
 
 This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth).
 
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
+ **Ranked #1 among 7B and 12B LLMs as of 08.11.2024.**
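Since the card notes that this model was trained with Unsloth, it can likely also be loaded through Unsloth's `FastLanguageModel` for faster inference. The sketch below is an editorial illustration rather than a recipe published with this card; in particular, `max_seq_length=4096` and `load_in_4bit=True` are assumed placeholder values.

```python
# Hypothetical alternative load path via Unsloth (not part of the original card);
# max_seq_length and load_in_4bit are illustrative assumptions.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nhyha/N3N_gemma-2-9b-it_20241029_1532",
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth into its faster inference mode
```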
+
+ ## Quickstart
+
+ Here is a code snippet showing how to load the tokenizer and model and how to generate a response with `apply_chat_template`.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ device = "cuda"  # the device to load the model onto
+ 
+ model = AutoModelForCausalLM.from_pretrained(
+     "nhyha/N3N_gemma-2-9b-it_20241029_1532",
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("nhyha/N3N_gemma-2-9b-it_20241029_1532")
+ 
+ prompt = "Give me a short introduction to large language models."
+ messages = [
+     {"role": "system", "content": "You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ # Build the chat-formatted prompt string from the message list.
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(device)
+ 
+ generated_ids = model.generate(
+     model_inputs.input_ids,
+     max_new_tokens=512
+ )
+ # Keep only the newly generated tokens, dropping the echoed prompt.
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+ 
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
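The snippet above leaves the decoded reply in `response` without printing it. A minimal follow-up, assuming `model`, `tokenizer`, `model_inputs`, and `response` from the snippet are still in scope, prints the reply and shows an optional streamed variant using Transformers' `TextStreamer`. One caveat: the stock Gemma-2 chat template does not accept a separate `system` role, so if this checkpoint inherits that template, fold the system instruction into the user message instead.

```python
# Follow-up to the Quickstart snippet above; reuses model, tokenizer, model_inputs.
from transformers import TextStreamer

print(response)  # the decoded reply produced by the snippet above

# Optional: regenerate the reply while streaming tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    streamer=streamer,
)
```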
+
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+
+ | Metric              | Value |
+ |---------------------|------:|
+ | Avg.                | 32.02 |
+ | IFEval (0-shot)     | 67.52 |
+ | BBH (3-shot)        | 40.99 |
+ | MATH Lvl 5 (4-shot) | 20.47 |
+ | GPQA (0-shot)       | 12.08 |
+ | MuSR (0-shot)       | 16.39 |
+ | MMLU-PRO (5-shot)   | 34.69 |
+
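As a quick sanity check, the reported average is simply the arithmetic mean of the six benchmark scores listed above; the small snippet below reproduces it.

```python
# Reproduce the reported leaderboard average from the six benchmark scores.
scores = {
    "IFEval (0-shot)": 67.52,
    "BBH (3-shot)": 40.99,
    "MATH Lvl 5 (4-shot)": 20.47,
    "GPQA (0-shot)": 12.08,
    "MuSR (0-shot)": 16.39,
    "MMLU-PRO (5-shot)": 34.69,
}
print(round(sum(scores.values()) / len(scores), 2))  # 32.02
```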
 
 
+ ## Contact
+ If you are interested in customized LLMs for business applications powered by Jikji Labs' cutting-edge infrastructure, please get in touch with us via our website. Our infrastructure is built to support large-scale data processing and model training, enabling solutions tailored to your business needs. We are also grateful for your feedback and suggestions as we strive to improve and innovate continuously.
 
+ ## Collaborations
+ We are also seeking support and investment as we continue to develop robust language models, with a strong emphasis on creating high-quality, specialized datasets for a diverse range of purposes and requirements. Our expertise in dataset generation enables us to build models that are more accurate and better tailored to specific business needs. If you are interested in collaborating on these challenges, please reach out to us through our website (https://www.n3n.ai/).
 
+ ## Acknowledgement
+ Many thanks to [google](https://huggingface.co/google) for providing such a valuable model to the open-source community.