Model Card for raj9305/llama_raj
This model is designed for instruction-following, reasoning, and general conversational AI, with an emphasis on structured responses and knowledge retention.
Model Details
Model Description
Model card metadata (YAML front matter):

```yaml
tags:
  - conversational-ai
  - instruction-following
  - reasoning
  - fastai
license: MIT
datasets:
  - open-thoughts/OpenThoughts-114k
  - ServiceNow-AI/R1-Distill-SFT
language:
  - en
metrics:
  - character
base_model:
  - mistralai/Mistral-Small-24B-Instruct-2501
  - deepseek-ai/Janus-Pro-7B
new_version: deepseek-ai/Janus-Pro-7B
library_name: fastai
```
Model Card: Instruction-Following AI Model
Overview
This model is designed for instruction-following, reasoning, and general conversational AI, with an emphasis on structured responses and knowledge retention. It is based on a combination of Mistral-Small-24B-Instruct-2501 and Janus-Pro-7B, utilizing the FastAI library for training and optimization.
Model Details
- License: MIT (Permissive for commercial and personal use).
- Language: English (Primary).
- Base Models:
- mistralai/Mistral-Small-24B-Instruct-2501 (Optimized for instruction-following and reasoning).
- deepseek-ai/Janus-Pro-7B (Versatile for multi-turn conversations and coding).
- Latest Version: Updated to deepseek-ai/Janus-Pro-7B for enhanced conversational abilities and structured output.
Training Data
This model has been trained on a mixture of high-quality datasets:
- OpenThoughts-114k – A dataset focused on open-ended reasoning and thought generation.
- ServiceNow-AI/R1-Distill-SFT – A structured fine-tuning dataset designed to improve instruction-based responses.
Evaluation Metrics
The model is evaluated based on character-level metrics, ensuring precise and coherent responses across different prompts.
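As a rough illustration only (this is not the project's actual evaluation code), a character-level similarity score between a model response and a reference answer can be computed with Python's standard `difflib`; the function name and the example strings below are assumptions for the sketch:

```python
from difflib import SequenceMatcher

def char_similarity(response: str, reference: str) -> float:
    """Return a character-level similarity ratio in [0, 1].

    Uses SequenceMatcher, which scores 2 * matches / total characters.
    """
    return SequenceMatcher(None, response, reference).ratio()

# Near-identical strings score close to 1.0
score = char_similarity("The capital of France is Paris.",
                        "The capital of France is Paris")
```

A real evaluation pipeline would aggregate such scores over a benchmark set; this sketch only shows what "character-level" comparison means for a single pair.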
Intended Use
This model is ideal for:
- Conversational AI – Engaging in human-like interactions.
- Instruction-Following – Responding accurately to queries and directives.
- Reasoning Tasks – Providing structured and logical responses.
- Text Generation – Assisting in content creation, dialogue systems, and chatbot applications.
Limitations & Considerations
- The model primarily supports English and may not perform as well in other languages.
- Responses depend on training data and may require fine-tuning for domain-specific applications.
- While designed for structured outputs, it may sometimes generate unexpected or biased responses.
Future Improvements
- Integration with additional datasets for broader topic coverage.
- Fine-tuning for multi-modal capabilities (e.g., vision-language tasks).
- Enhancing long-term memory retention for more consistent responses across extended conversations.
📢 How to Use
You can load the model with Hugging Face Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/Janus-Pro-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a response (prompt and generation parameters are illustrative)
inputs = tokenizer("Explain overfitting in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For FastAI users, assuming the model was exported with FastAI's `Learner.export`:

```python
from fastai.text.all import *

# Load an exported learner; "model.pkl" is the path produced by learn.export()
learn = load_learner("model.pkl")
```
For more details, check the Hugging Face model page. 🚀
- Developed by: [Raj9305]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [causal-lm]
- Language(s) (NLP): [English]
- License: [MIT]
- Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Personal Assistant & Productivity
- Users: Individual consumers, professionals, and students.
- Use Case: The model can serve as a virtual assistant, helping with tasks such as answering questions, setting reminders, managing schedules, and providing on-demand information (e.g., news, weather).
- Effect on Users: It improves personal productivity and daily task management, giving users instant support in everyday activities.
Direct Use
The model can be employed for entertainment, engaging users in text-based roleplay, interactive storytelling, or casual conversation. It can be customized to simulate different personalities and themes, such as fantasy, sci-fi, or comedy.
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
Training Details
Training Data
[More Information Needed]
Training Procedure
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
[More Information Needed]
Model tree for raj9305/llama_raj
- Base model: deepseek-ai/Janus-Pro-7B