
This is the Weyaxi Einstein model, fine-tuned for 1 epoch to convergence on the Internal Knowledge Map dataset. It had a tendency to overfit pretty quickly, but I think I got it. This one should work right; it just seems to love formatting things properly, which is always nice.

Test outputs up soon.


Introduction to the Unique Dataset

The Internal Knowledge Map Dataset is designed to change how language models comprehend and generate text. Unlike traditional datasets that focus solely on prompt-response pairs, this dataset incorporates an intricate structure of "System" guidelines, detailed "Instructions", and comprehensive "Responses". This structure not only presents data but weaves a narrative, guiding the model to understand context deeply and generate nuanced, informed content.
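For illustration, a single record might look something like the sketch below. The exact schema is an assumption on my part, inferred from the "system"/"instruction" keys referenced in the training snippet further down this card:

```python
# Hypothetical record shape for the Internal Knowledge Map dataset.
# Field names mirror the "system" / "instruction" keys used in the
# training snippet below; the exact schema is an assumption.
record = {
    "system": (
        "Task Overview and Guidelines: exploration of interconnected "
        "prompt/response clusters, analysis of Core Interactions and "
        "utilization of Supportive Nodes."
    ),
    "instruction": (
        "Discuss the integration of smart materials such as Shape Memory "
        "Alloys (SMAs) into fashion technology."
    ),
    "response": (
        "A detailed answer grounded in the systemic context above."
    ),
}
```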

Phased Training Methodology

Leveraging the multi-faceted nature of the dataset, I've pioneered a phased training methodology that sequentially concentrates on different components of the dataset, namely the "System" and "Instruction" sections. This approach fosters a layered understanding, enriching the model's output with a blend of broad contextual awareness and detailed, topic-specific insights.

Phase 1: System Focus

In the first phase, the model immerses itself in the "System" part of the dataset. Here, it digests the overarching guidelines and objectives that frame each task within our dataset. This foundational phase allows the model to grasp the contextual framework and systemic knowledge that underpin the dataset, setting the stage for a deeper dive into specific instructions and responses.

Example "System" Focus:

- Task Overview and Guidelines
- Exploration of interconnected prompt/response clusters
- Analysis of Core Interactions and Utilization of Supportive Nodes
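As a minimal sketch of what Phase 1 could look like in code (the dataset and tokenizer repo IDs are placeholders I've assumed, not confirmed by this card), the run keys only the "system" string:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder repo IDs for illustration; swap in the actual dataset
# and base-model repos you are using.
dataset = load_dataset("Severian/Internal-Knowledge-Map", split="train")
tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v4-7B")

def tokenize_system(example):
    # Phase 1: train on the "system" field only.
    return tokenizer(example["system"], truncation=True, max_length=4096)

phase1 = dataset.map(tokenize_system, remove_columns=dataset.column_names)
```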

Phase 2: Instruction Focus

Building upon the foundational understanding established in Phase 1, the model then shifts its focus to the "Instructions" component. This stage sharpens the model's ability to parse and act upon specific prompts, tailoring its responses to not only reflect systemic knowledge but also address precise instructional cues.

Example "Instruction" Focus:

Core Interaction: Understanding and responding to specific prompts, such as the integration of smart materials like Shape Memory Alloys (SMAs) into fashion technology.
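Continuing the Phase 1 sketch above, Phase 2 only swaps the keyed field:

```python
def tokenize_instruction(example):
    # Phase 2: switch the keyed field from "system" to "instruction".
    return tokenizer(example["instruction"], truncation=True, max_length=4096)

phase2 = dataset.map(tokenize_instruction, remove_columns=dataset.column_names)
```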

Impact of Our Training Approach

This new training methodology yields a model that showcases a remarkable ability to generate coherent, logical, and deeply informed responses. By training the model to first understand the "System" and then delve into "Instructions", we ensure that it retains a broad contextual understanding while homing in on specific details, a capability that sets a new standard in language model training.

Applying Our Dataset

I encourage you to explore the Internal Knowledge Map Dataset for your model training endeavors. Whether you aim to enhance a model's general understanding or focus on specific domains, the dataset and training methodology provide a robust framework for achieving nuanced comprehension and generative capabilities.

(Or, if your environment can handle it, key both strings at once, as sketched below. I'm not sure which approach is optimal: the separate training or the dual training.)

```
key: str = "system", key2: str = "instruction"
```
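For the dual-training variant, one simple option (my assumption, not the author's confirmed method) is to concatenate both keyed strings into a single sequence; the separator is an arbitrary choice:

```python
def tokenize_both(example):
    # Dual-training variant: key "system" and "instruction" at once by
    # joining them into one training string. Continues the sketch above.
    text = example["system"] + "\n\n" + example["instruction"]
    return tokenizer(text, truncation=True, max_length=4096)

dual = dataset.map(tokenize_both, remove_columns=dataset.column_names)
```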

Training hyperparameters:

```
batch_size=1-4
epochs=2-5
r=8
lora_alpha=32
lora_dropout=0.001
max_seq_length=4096
lr=1e-7
```
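Expressed as a PEFT-style LoRA configuration, for reference (a sketch: only the r/lora_alpha/lora_dropout values come from the list above, and target_modules is an assumed choice for a Mistral-style 7B model):

```python
from peft import LoraConfig

# r, lora_alpha, and lora_dropout come from the hyperparameter list
# above; target_modules is an assumption, not specified in this card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.001,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```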
Model size: 7.11B params · Tensor type: FP16 · Format: Safetensors