Runming Yang committed on
Commit c347fbc
1 Parent(s): e015a93

Upload model files

README.md CHANGED
@@ -1,3 +1,78 @@
- ---
- license: llama3.2
- ---
+ ---
+ license: llama3.2
+ datasets:
+ - BAAI/Infinity-Instruct
+ base_model:
+ - meta-llama/Llama-3.2-1B-Instruct
+ ---
+
+ ## Model Overview
+
+ This model is a fine-tuned version of **[Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)** trained with the **[LLM-Neo](https://arxiv.org/abs/2411.06839)** method. Usage is identical to the original Llama-3.2-1B-Instruct model; a minimal sketch follows.
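+
+ A minimal usage sketch, assuming the standard `transformers` chat workflow; the model path below is a placeholder, not the actual repo id:
+
+ ```python
+ # Minimal usage sketch (assumes the standard transformers chat workflow).
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "path/to/Llama-3.2-1B-Instruct-Neo"  # placeholder path, not the real repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
+
+ # Llama-3.2-Instruct models ship a chat template in the tokenizer config.
+ messages = [{"role": "user", "content": "Explain knowledge distillation in one sentence."}]
+ inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+
+ outputs = model.generate(inputs, max_new_tokens=128)
+ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```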
+
+ ## Training Details
+
+ Training follows the **LLM-Neo** method. The data is a mixed sample drawn from **[BAAI/Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct)** (the `0625` and `7M` subsets), totaling 10k instruction samples. The teacher model for knowledge distillation (KD) is **[Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)**. Hyperparameters are listed below; a schematic of what the KD ratio controls follows the list:
+
+ - **Learning Rate**: 1e-4
+ - **Epochs**: 1
+ - **KD Ratio**: 0.9
+ - **Rank**: 128
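+
+ As a rough illustration of the **KD Ratio**, here is a schematic loss blend; this is an assumption based on the LLM-Neo paper's description, not the authors' actual training code:
+
+ ```python
+ # Schematic sketch: blending distillation and task losses with a KD ratio.
+ import torch
+ import torch.nn.functional as F
+
+ def blended_loss(student_logits, teacher_logits, labels, kd_ratio=0.9, temperature=1.0):
+     # Next-token cross-entropy against the ground-truth labels.
+     ce = F.cross_entropy(
+         student_logits.view(-1, student_logits.size(-1)),
+         labels.view(-1),
+         ignore_index=-100,
+     )
+     # KL divergence from the teacher's token distribution to the student's.
+     kd = F.kl_div(
+         F.log_softmax(student_logits / temperature, dim=-1),
+         F.softmax(teacher_logits / temperature, dim=-1),
+         reduction="batchmean",
+     ) * temperature**2
+     # kd_ratio = 0.9 weights the distillation term heavily, per the card.
+     return kd_ratio * kd + (1.0 - kd_ratio) * ce
+ ```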
+
+ ## Model Performance Evaluation
+
+ <img src="https://raw.githubusercontent.com/Rummyyang/Rummyyang.github.io/refs/heads/main/img/radar_chart_neo_llama3.2_larger_text-1120-1-1.png" alt="Neo_radar" width="600">
+
+ Evaluation is split into two parts: results from the **[lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)** and **[math-evaluation-harness](https://github.com/ZubinGou/math-evaluation-harness)** frameworks.
+
+ > **Note**: Results depend on the specific benchmark versions and the testing hardware/software configuration,
+ > so the reported metrics should be read as relative performance within a given setup.
+
+ ### Part 1: lm-evaluation-harness results
+
+ The model was evaluated on several widely used benchmarks covering reasoning, commonsense, mathematics, and language understanding. Below is a comparison between **Llama-3.2-1B-Instruct** and the current model:
+
+ | Dataset       | Llama-3.2-1B-Instruct | Llama-3.2-1B-Instruct-Neo |
+ |---------------|-----------------------|---------------------------|
+ | ARC Challenge | 36.09                 | 36.43                     |
+ | ARC Easy      | 68.52                 | 67.51                     |
+ | CEval         | 39.45                 | 39.67                     |
+ | CMMLU         | 35.62                 | 36.48                     |
+ | MMLU          | 45.91                 | 46.27                     |
+ | HellaSwag     | 45.07                 | 45.84                     |
+ | OpenBookQA    | 24.40                 | 25.40                     |
+ | PIQA          | 73.88                 | 74.32                     |
+ | Winogrande    | 59.27                 | 61.17                     |
+
+ The Neo model improves on **Llama-3.2-1B-Instruct** on eight of the nine tasks, most notably in reasoning (**Winogrande**, +1.90) and commonsense (**OpenBookQA**, **PIQA**), with a slight regression on **ARC Easy**.
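+
+ For reference, a hedged sketch of how such a run might be reproduced with lm-evaluation-harness's Python API (v0.4.x); the task names, metric key, and model path are assumptions and may differ from the authors' exact setup:
+
+ ```python
+ # Hypothetical reproduction sketch using lm-evaluation-harness (v0.4.x API).
+ import lm_eval
+
+ results = lm_eval.simple_evaluate(
+     model="hf",
+     model_args="pretrained=path/to/Llama-3.2-1B-Instruct-Neo",  # placeholder path
+     tasks=["arc_challenge", "arc_easy", "hellaswag", "openbookqa", "piqa", "winogrande"],
+     batch_size="auto",
+ )
+ for task, metrics in results["results"].items():
+     print(task, metrics.get("acc,none"))  # metric key assumed from v0.4 output format
+ ```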
+
+ ---
+
+ ### Part 2: math-evaluation-harness results
+
+ This part evaluates the model specifically on mathematical reasoning, focusing on its ability to handle multi-step math problems.
+
+ | Dataset      | Llama-3.2-1B-Instruct | Llama-3.2-1B-Instruct-Neo |
+ |--------------|-----------------------|---------------------------|
+ | GSM8K        | 35.00                 | 39.30                     |
+ | Minerva Math | 14.80                 | 22.80                     |
+ | SVAMP        | 50.40                 | 54.50                     |
+ | ASDiv        | 67.40                 | 71.20                     |
+ | MAWPS        | 83.50                 | 85.60                     |
+ | TabMWP       | 41.90                 | 35.40                     |
+ | MathQA       | 44.20                 | 48.30                     |
+ | MMLU-STEM    | 37.90                 | 38.90                     |
+
+ The math evaluation shows substantial gains on multi-step problems, most notably on **Minerva Math** (+8.00) and **GSM8K** (+4.30), alongside a regression on **TabMWP** (-6.50).
+
+ ---
+
+ ### Summary
+
+ - **Strengths**: Notable improvements over **Llama-3.2-1B-Instruct** across most benchmarks, particularly in reasoning and mathematical problem solving.
+ - **Future Directions**: Address the regression on tabular math reasoning (**TabMWP**) and continue improving general language and mathematical capability.
+
config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:63a5518e13fcb8c3d6874e4384fdcaccdf13f93691e4e11efa111ca66984eb0a
+ size 872
generation_config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:46650402223e517e09ac32797ba8cff47cf4cfea248aed800a76a0c50ba4e92d
+ size 184
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5516da399f3d61e832ef8ec884018c78cf10de495b072161c19387ffd20efc78
+ size 1997648472
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:15726f8ace29adffc550ea0fdccdb96bc5a53b8270aaf01d9762149865dacb50
+ size 473997096
model.safetensors.index.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a2807004134bcb4f726e56cd6a9a60f499e9a047116729f3572397d968970623
+ size 12003
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b1835caa5b4d70acaa210fa222b0036f1882f9525c4660fd4810fb3e1e40ff8
+ size 325
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6b9e4e7fb171f92fd137b777cc2714bf87d11576700a1dcd7a399e7bbe39537b
+ size 17209920
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:50536ab56a629f13c0227a1658c5c040cde997239f94dc6d9df3db2128e5ade0
+ size 54616