# Model Card for Abhijith71/Abhijith

## Model Summary

This model is a fine-tuned version of deepseek-ai/deepseek-coder-6.7b-instruct using Parameter-Efficient Fine-Tuning (PEFT). It is optimized for text generation tasks, particularly code generation and instruction-based responses, and is designed to assist in generating high-quality code snippets, explanations, and other natural language responses related to programming.
## Model Details

- Developed by: Abhijith71
- Funded by: Self-funded
- Shared by: Abhijith71
- Model type: Causal Language Model (CLM)
- Language(s): English
- License: Apache 2.0
- Fine-tuned from: deepseek-ai/deepseek-coder-6.7b-instruct
## Model Sources

- Repository: https://huggingface.co/Abhijith71/Abhijith
- Paper: N/A
- Demo: N/A
## Uses

### Direct Use
- Code generation
- Instruction-based responses
- General text completion
### Downstream Use
- AI-assisted programming
- Documentation generation
- Code debugging assistance
### Out-of-Scope Use
- Not optimized for open-ended conversation beyond code-related tasks
- Not suitable for real-time chatbot applications without further tuning
## Bias, Risks, and Limitations
- The model may generate biased or inaccurate responses, reflecting limitations of its training data.
- Outputs should be verified before use in production.
- Knowledge is limited to the fine-tuning dataset's scope; the model has no real-time updating capability.
### Recommendations
- Users should review generated code for accuracy and security.
- Additional fine-tuning may be required for specialized use cases.
## How to Use

### Loading the Model

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Abhijith71/Abhijith"
base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"

# Load the tokenizer and base model, then attach the PEFT adapter from this repository
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, model_name)

# Generate a response to a code-generation prompt
inputs = tokenizer("Generate a Python function for factorial", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)  # cap new tokens so the answer is not cut off at the default limit
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
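For faster standalone inference, the adapter can optionally be merged into the base model's weights. A minimal sketch, assuming a LoRA-style adapter (the output directory name is illustrative):

```python
# Continues from the loading snippet above.
# merge_and_unload() folds the adapter weights into the base model and returns a
# plain transformers model, so peft is no longer needed at inference time.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("deepseek-coder-abhijith-merged")  # illustrative path
tokenizer.save_pretrained("deepseek-coder-abhijith-merged")
```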
## Training Details

### Training Data
- Fine-tuned on a dataset containing programming-related instructions and code snippets.
### Training Procedure

- Preprocessing: Tokenization and formatting of instruction-based prompts.
- Training Regime: Mixed-precision training (bf16) using PEFT for efficient fine-tuning (see the illustrative sketch after this list).
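The exact hyperparameters and dataset are not published in this card; the sketch below only illustrates the general bf16 + PEFT setup, assuming a LoRA adapter. The rank, target modules, batch size, epoch count, and `train_dataset` are placeholders, not the values actually used.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Hypothetical LoRA configuration; the actual adapter settings are not published.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="./peft-finetune",
    bf16=True,                      # mixed-precision training, as noted above
    per_device_train_batch_size=4,  # assumed value
    num_train_epochs=3,             # assumed value
)

# train_dataset: a tokenized instruction/code dataset with input_ids and labels (not shown here)
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()
```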
## Evaluation

### Testing Data, Factors & Metrics

- Testing Data: Sampled from programming-related sources
- Metrics:
  - Perplexity (PPL), estimated as in the sketch below
  - Code quality assessment
  - Instruction-following accuracy
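Perplexity can be estimated from the model's average next-token cross-entropy loss on held-out text. A minimal sketch, reusing the `tokenizer` and `model` loaded in the usage example above (the sample text is illustrative, since the evaluation set is not published):

```python
import math
import torch

# Illustrative held-out sample; the actual evaluation data is not published.
text = "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # next-token cross-entropy loss over the sequence.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"Perplexity: {math.exp(loss.item()):.2f}")
```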
### Results
- Model generates coherent and useful code snippets for various prompts.
- Some limitations exist in edge cases and complex multi-step reasoning.
## Environmental Impact
- Hardware: NVIDIA A100 GPUs
- Training Duration: Approx. 10-20 hours
- Cloud Provider: AWS
- Carbon Emissions: Estimated using ML Impact Calculator
## Technical Specifications

### Model Architecture

- Based on deepseek-ai/deepseek-coder-6.7b-instruct
- Uses a PEFT adapter for fine-tuning (the adapter configuration can be inspected as shown below)
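The adapter's configuration (adapter type, rank, target modules, base model) can be inspected directly from the repository without downloading the model weights; a minimal sketch:

```python
from peft import PeftConfig

# Fetches only the adapter configuration, not the weights.
config = PeftConfig.from_pretrained("Abhijith71/Abhijith")
print(config.peft_type)                # adapter type, e.g. LORA
print(config.base_model_name_or_path)  # deepseek-ai/deepseek-coder-6.7b-instruct
```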
### Compute Infrastructure
- Hardware: 4x A100 GPUs
- Software: PEFT 0.14.0, Transformers 4.34+
## Citation

```bibtex
@misc{abhijith2024,
  author    = {Abhijith71},
  title     = {Fine-tuned DeepSeek Coder Model},
  year      = {2024},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub},
  url       = {https://huggingface.co/Abhijith71/Abhijith}
}
```
## Contact
- Author: Abhijith71
- Hugging Face Profile: https://huggingface.co/Abhijith71
## Framework Versions
- PEFT: 0.14.0
- Transformers: 4.34+
- Torch: 2.0+
This model card provides an overview of the fine-tuned model, detailing its purpose, usage, and limitations. Users should review generated outputs and consider fine-tuning for specific applications.