Fine-Tuned Qwen 2.5-Coder-1.5B is a causal language model fine-tuned to generate contextually relevant responses. The base model, Qwen/Qwen2.5-Coder-1.5B, is a Transformer-based architecture with 1.5 billion parameters.

The model was fine-tuned on a custom dataset named subset5, consisting of prompt-response pairs tokenized with a maximum sequence length of 128 tokens. The dataset combines diverse mathematical problems and solutions with general prompt-response pairs to improve performance on mathematical reasoning tasks. During preprocessing, inputs were padded and truncated to the maximum length, and labels were aligned with the inputs for causal language modeling.

Training used a learning rate of 2e-5, a batch size of 1, gradient accumulation over 32 steps, and 3 epochs, with the AdamW optimizer and a weight decay of 0.01. Training was performed on CPU without CUDA.
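As a rough illustration of that setup, the configuration below reproduces the stated hyperparameters with the Hugging Face Trainer. The tiny in-memory dataset, output directory, and prompt/response formatting are assumptions for the sake of a self-contained sketch; the actual subset5 data and training script are not published here.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE = "Qwen/Qwen2.5-Coder-1.5B"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# Stand-in for the subset5 prompt-response pairs (illustrative only).
pairs = Dataset.from_dict({
    "prompt": ["What is 12 * 7?"],
    "response": ["12 * 7 = 84."],
})

def tokenize(example):
    # Join prompt and response, then pad/truncate to 128 tokens as described above.
    text = example["prompt"] + "\n" + example["response"]
    return tokenizer(text, max_length=128, padding="max_length", truncation=True)

tokenized = pairs.map(tokenize, remove_columns=pairs.column_names)

# For causal LM, this collator copies input_ids into labels and masks padding.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="qwen2.5-coder-1.5b-subset5",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    num_train_epochs=3,
    weight_decay=0.01,  # AdamW is the Trainer's default optimizer
    use_cpu=True,       # the card states training ran on CPU without CUDA
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```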

The model can be used for tasks such as answering questions, completing sentences, solving mathematical problems, or generating free-form responses. To use it, load the model and tokenizer with the Hugging Face Transformers library, tokenize your input prompt, and generate text with the model's generate method, as in the sketch below. Example input-output pairs show the model producing concise, informative answers, including accurate solutions to mathematical problems.

The model should not be used to produce harmful, malicious, or unethical content, and users are responsible for complying with applicable laws and ethical standards.
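A minimal generation sketch following the steps described above; the model path is a placeholder and should point at the repo ID or local directory where the fine-tuned weights are stored:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: replace with the fine-tuned checkpoint's repo ID or local folder.
MODEL_PATH = "path/to/fine-tuned-qwen2.5-coder-1.5b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

prompt = "Solve: what is 15% of 240?"
inputs = tokenizer(prompt, return_tensors="pt")

# Deterministic decoding; adjust max_new_tokens and sampling settings as needed.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```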
