Model Card for raj9305/llama_raj
This model is designed for instruction-following, reasoning, and general conversational AI, with an emphasis on structured responses and knowledge retention.
Model Details
Model Description
Model card metadata (YAML front matter):

```yaml
tags:
  - conversational-ai
  - instruction-following
  - reasoning
  - fastai
license: MIT
datasets:
  - open-thoughts/OpenThoughts-114k
  - ServiceNow-AI/R1-Distill-SFT
language:
  - en
metrics:
  - character
base_model:
  - mistralai/Mistral-Small-24B-Instruct-2501
  - deepseek-ai/Janus-Pro-7B
new_version: deepseek-ai/Janus-Pro-7B
library_name: fastai
```
Model Card: Instruction-Following AI Model
Overview
This model is designed for instruction-following, reasoning, and general conversational AI, with an emphasis on structured responses and knowledge retention. It is based on a combination of Mistral-Small-24B-Instruct-2501 and Janus-Pro-7B, utilizing the FastAI library for training and optimization.
Model Details
- License: MIT (Permissive for commercial and personal use).
- Language: English (Primary).
- Base Models:
- mistralai/Mistral-Small-24B-Instruct-2501 (Optimized for instruction-following and reasoning).
- deepseek-ai/Janus-Pro-7B (Versatile for multi-turn conversations and coding).
- Latest Version: Updated to deepseek-ai/Janus-Pro-7B for enhanced conversational abilities and structured output.
Training Data
This model has been trained on a mixture of high-quality datasets:
- OpenThoughts-114k – A dataset focused on open-ended reasoning and thought generation.
- ServiceNow-AI/R1-Distill-SFT – A structured fine-tuning dataset designed to improve instruction-based responses.
Evaluation Metrics
The model is evaluated based on character-level metrics, ensuring precise and coherent responses across different prompts.
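As a rough illustration only (this is not the project's actual evaluation code), a character-level similarity score between a model response and a reference answer can be computed with Python's standard `difflib`; the function name and the example strings below are assumptions for the sketch:

```python
from difflib import SequenceMatcher

def char_similarity(response: str, reference: str) -> float:
    """Return a character-level similarity ratio in [0, 1].

    Uses SequenceMatcher, which scores 2 * matches / total characters.
    """
    return SequenceMatcher(None, response, reference).ratio()

# Near-identical strings score close to 1.0
score = char_similarity("The capital of France is Paris.",
                        "The capital of France is Paris")
```

A real evaluation pipeline would aggregate such scores over a benchmark set; this sketch only shows what "character-level" comparison means for a single pair.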
Intended Use
This model is ideal for:
- Conversational AI – Engaging in human-like interactions.
- Instruction-Following – Responding accurately to queries and directives.
- Reasoning Tasks – Providing structured and logical responses.
- Text Generation – Assisting in content creation, dialogue systems, and chatbot applications.
Limitations & Considerations
- The model primarily supports English and may not perform as well in other languages.
- Responses depend on training data and may require fine-tuning for domain-specific applications.
- While designed for structured outputs, it may sometimes generate unexpected or biased responses.
Future Improvements
- Integration with additional datasets for broader topic coverage.
- Fine-tuning for multi-modal capabilities (e.g., vision-language tasks).
- Enhancing long-term memory retention for more consistent responses across extended conversations.
📢 How to Use
You can load the model with Hugging Face Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/Janus-Pro-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a response (prompt and generation parameters are illustrative)
inputs = tokenizer("Explain overfitting in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For FastAI users, assuming the model was exported with FastAI's `Learner.export`:

```python
from fastai.text.all import *

# Load an exported learner; "model.pkl" is the path produced by learn.export()
learn = load_learner("model.pkl")
```
For more details, check the Hugging Face model page. 🚀
- Developed by: [Raj9305]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [causal-lm]
- Language(s) (NLP): [English]
- License: [MIT]
- Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Personal Assistant & Productivity
- Users: Individual consumers, professionals, and students.
- Use Case: The model can serve as a virtual assistant, helping with tasks such as answering questions, setting reminders, managing schedules, and providing on-demand information (e.g., news, weather).
- Effect on Users: It improves personal productivity and daily task management, giving users instant support in everyday activities.
Direct Use
The model can be employed for entertainment, engaging users in text-based roleplay, interactive storytelling, or casual conversation. It can be customized to simulate different personalities and themes, such as fantasy, sci-fi, or comedy.
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
Training Details
Training Data
[More Information Needed]
Training Procedure
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
[More Information Needed]
Model tree for raj9305/llama_raj
- Base model: deepseek-ai/Janus-Pro-7B