Triangle104 committed
Commit 7261f83
1 Parent(s): 8363534

Update README.md

Files changed (1):
  1. README.md +57 -0

README.md CHANGED
@@ -17,6 +17,63 @@ library_name: transformers
  This model was converted to GGUF format from [`nvidia/OpenMath2-Llama3.1-8B`](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) for more details on the model.

+ ---
+ Model details:
+ -
+ OpenMath2-Llama3.1-8B is obtained by fine-tuning Llama3.1-8B-Base on the OpenMathInstruct-2 dataset.
+
+ The model outperforms Llama3.1-8B-Instruct on all the popular math benchmarks we evaluate on, most notably by 15.9% on MATH.
+
+ How to use the models?
+ -
+ Our models are trained with the same "chat format" as the Llama3.1-Instruct models (the same system/user/assistant tokens). Please note that these models have not been instruction-tuned on general data and thus might not provide good answers outside of the math domain.
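
To see what that chat format looks like concretely, the tokenizer's own chat template can be rendered; this short sketch is an editor's addition (the example question is made up), not part of the original card:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nvidia/OpenMath2-Llama3.1-8B")

# A made-up one-turn conversation; apply_chat_template wraps each turn in the
# Llama 3.1 header/end-of-turn tokens mentioned above.
messages = [{"role": "user", "content": "What is 2 + 2?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # prints the raw prompt with the system/user/assistant markers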
+
+ We recommend using the instructions in our repo to run inference with these models, but here is an example of how to do it through the Transformers API:
+
+ import transformers
+ import torch
+
+ model_id = "nvidia/OpenMath2-Llama3.1-8B"
+
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ messages = [
+     {
+         "role": "user",
+         "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\n"
+                    "What is the minimum value of $a^2+6a-7$?",
+     },
+ ]
+
+ outputs = pipeline(
+     messages,
+     max_new_tokens=4096,
+ )
+ print(outputs[0]["generated_text"][-1]["content"])
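
Because the prompt asks for the final answer inside \boxed{}, it can be pulled out with a small regex; this post-processing step is an editor's sketch, not from the original card, and assumes the model actually followed the formatting instruction:

import re

generated = outputs[0]["generated_text"][-1]["content"]
match = re.search(r"\\boxed\{(.+?)\}", generated)
if match:
    print("Final answer:", match.group(1))  # the sample problem's minimum is -16
else:
    print("No \\boxed{...} answer found; inspect the full generation.")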
+
+ Reproducing our results
+ -
+ We provide all instructions to fully reproduce our results.
+
+ Citation
+ -
+ If you find our work useful, please consider citing us!
+
+ @article{toshniwal2024openmath2,
+   title   = {OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data},
+   author  = {Shubham Toshniwal and Wei Du and Ivan Moshkov and Branislav Kisacanin and Alexan Ayrapetyan and Igor Gitman},
+   year    = {2024},
+   journal = {arXiv preprint arXiv:2410.01560}
+ }
+
+ Terms of use
+ -
+ By accessing this model, you are agreeing to the Llama 3.1 terms and conditions of the license, acceptable use policy, and Meta's privacy policy.
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
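
Beyond the brew-installed CLI, the converted GGUF file can also be loaded from Python via the llama-cpp-python bindings; the sketch below is an editor's illustration, not part of the original instructions, and the model path is a hypothetical placeholder for the quantized file in this repo:

from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path: substitute the actual quantized GGUF filename from this repo.
llm = Llama(model_path="./openmath2-llama3.1-8b.gguf", n_ctx=4096)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\nWhat is the minimum value of $a^2+6a-7$?"}],
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])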