---
library_name: transformers
license: mit
datasets:
- eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
base_model: Qwen/Qwen2.5-0.5B
pipeline_tag: text-generation
---

# Qwen2.5-0.5B Fine-Tuned on GSM8K with DeepSeek Augmentation

## Model Overview 🚀

This model is a **fine-tuned version of Qwen2.5-0.5B**, trained for **mathematical reasoning tasks** on the **GSM8K dataset** with additional **Chain-of-Thought (CoT) reasoning augmentation** from **DeepSeek-V3**.

The model was fine-tuned to generate detailed **step-by-step solutions** to grade school math problems, with improved **logical reasoning and interpretability**.

### 🔹 **Key Features**

- **Base Model:** `Qwen/Qwen2.5-0.5B`
- **Fine-Tuned On:** `eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`
- **Optimized for:** **Mathematical problem-solving & step-by-step reasoning**
- **Fine-tuned with:** **LoRA (Low-Rank Adaptation)** for parameter-efficient training
- **Chain-of-Thought (CoT):** Generates clear and structured reasoning for each problem
- **Inference-ready:** Available on the 🤗 Hugging Face Hub

---

## **Model Details 📜**

### **📝 Description**

- **Developed by:** [Your Name or Organization]
- **Funded by:** [Optional: Mention if funded]
- **Shared by:** Hugging Face Hub
- **Model Type:** Causal Language Model (**Text Generation**)
- **Languages:** English (`en`)
- **License:** MIT License
- **Fine-tuned from:** `Qwen/Qwen2.5-0.5B`

### 📂 **Model Repository**

- **Hugging Face Model Page:** 👉 [Fine-tuned Qwen2.5-0.5B](https://huggingface.co/your-repo-id)

---

## **📥 How to Load & Use This Model**

You can load this model with 🤗 `transformers` as follows:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repo ID (replace with the actual HF repo)
model_name = "your-repo-id"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move the model to GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example inference
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
inputs = tokenizer(question, return_tensors="pt").to(device)
output = model.generate(**inputs, max_length=200)

# Decode and print the response
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

---

## **🔬 Training Details**

### **🗄️ Training Data**

The model was fine-tuned on the **GSM8K dataset**, specifically the augmented version:
🔹 [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)

This dataset contains:

- **8K training samples** (`train` split)
- **1K testing samples** (`test` split)
- Features: `"question"`, `"answer"`, and `"cot"` (Chain-of-Thought)

### **⚙️ Training Procedure**

- **Preprocessing:** Each question was formatted with a prompt template that encourages step-by-step reasoning.
- **Training Framework:** Used `transformers`, `trl`, and `unsloth` for efficient fine-tuning.
- **Fine-Tuning Strategy:** **LoRA (Low-Rank Adaptation)** (see the sketch after this list)
  - Applied to the **query and value projection layers** (`q_proj`, `v_proj`)
  - **LoRA hyperparameters:** `r=8`, `lora_alpha=16`, `lora_dropout=0.1`
- **Optimization:**
  - **Mixed Precision Training** (`fp16`)
  - **Batch Size:** 16
  - **Gradient Accumulation:** 1
  - **Learning Rate:** 2e-4
- **Training Time:** Approx. **10,446 seconds (~3 hours)**
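For reference, the snippet below is a minimal sketch of how this setup could be reproduced with `datasets`, `peft`, and the standard `transformers` `Trainer`, using the hyperparameters listed above. The actual run used `trl`/`unsloth`; the prompt template, the `format_example` helper, the `output_dir`, and the epoch count (chosen to roughly match the training log below) are illustrative assumptions rather than the exact training code.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Qwen/Qwen2.5-0.5B"
dataset_id = "eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1"

tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure a padding token exists
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA on the query/value projections, matching the hyperparameters above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# The dataset exposes "question", "answer", and "cot" columns (8K train / 1K test)
dataset = load_dataset(dataset_id)

def format_example(example):
    # Hypothetical prompt template: question, CoT reasoning, then the final answer
    text = (
        f"Question: {example['question']}\n"
        f"Reasoning: {example['cot']}\n"
        f"Answer: {example['answer']}"
    )
    return tokenizer(text, truncation=True, max_length=512)

train_data = dataset["train"].map(
    format_example, remove_columns=dataset["train"].column_names
)

# Causal-LM collator: pads each batch and derives labels from input_ids
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-gsm8k-lora",  # illustrative output path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    num_train_epochs=10,                   # assumption based on the log below
    fp16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    data_collator=collator,
)
trainer.train()
```

In practice, `trl`'s `SFTTrainer` (or `unsloth`'s optimized wrappers) can stand in for the plain `Trainer` shown here while keeping the same LoRA configuration.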
---

## **📊 Performance & Evaluation**

### **✅ Training Performance**

| Step | Loss   | Grad Norm | Learning Rate | Epoch  |
|------|--------|-----------|---------------|--------|
| 10   | 2.1319 | 3.656     | 2e-4          | 0.0107 |
| 1000 | 0.2013 | 0.328     | 2.3e-7        | 9.98   |
| 9340 | 0.2048 | 0.341     | 2.1e-8        | 9.99   |

### **🧪 Testing & Expected Results**

The model was evaluated on the **1K test samples** and showed **strong accuracy in multi-step problem-solving**.

Example expected response:

```text
To solve the problem, we first find the clips sold in May:
Clips in May = 48 / 2 = 24
Next, we find the total:
Total Clips = 48 + 24 = 72
#### Answer: 72
```

---

## **🛑 Bias, Risks, and Limitations**

### ⚠️ **Potential Risks**

- May **hallucinate** incorrect reasoning steps if prompts are unclear.
- Could struggle with **complex mathematical problems** outside its training data.
- **Limited generalization** to non-math reasoning tasks.

### 🎯 **Recommendations**

- If using this model for **critical applications**, verify outputs with human review.
- For **better performance**, fine-tune on **larger datasets** with real-world numerical reasoning.

---

## **🌎 Environmental Impact**

**Estimated Carbon Emissions:**

- **Hardware Used:** NVIDIA A100 GPU
- **Training Time:** ~3 hours
- **Estimated CO₂ Emitted:** ~5.6 kg CO₂eq (using the [ML Impact Calculator](https://mlco2.github.io/impact#compute))

---

## **📚 Citation**

If you use this model in your research, please cite it as:

```bibtex
@misc{Upcoming,
  title={Upcoming},
  author={Yiqiao},
  year={2025}
}
```