---
license: mit
language:
  - en
base_model:
  - meta-llama/Llama-3.1-70B-Instruct
---

Cakrawala-70B

Model Description

Cakrawala-70B is a fine-tuned variant of meta-llama/Llama-3.1-70B-Instruct, optimized for generating rich roleplaying conversations and character interactions. It was fine-tuned with QLoRA (Quantized Low-Rank Adaptation) to adapt the 70B base model to this specialized use case efficiently.
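
The model can be used like any other Llama-3.1-Instruct checkpoint. Below is a minimal inference sketch using the Transformers chat template; the repository ID and the character prompt are placeholders rather than values taken from this card.

```python
# Minimal roleplay inference sketch; repo ID and prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "passthepizza/Cakrawala-70B"  # hypothetical repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Mira, a wry tavern keeper. Stay in character."},
    {"role": "user", "content": "*pushes the door open, shaking rain from my cloak* Rough night out there."},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```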

Intended Use

Primary Use Case

Cakrawala-70B is designed specifically for generating high-quality roleplaying conversations with the following key characteristics:

  • Rich, descriptive character interactions
  • Consistent character voice and emotional development
  • Show-don't-tell emotional states
  • Clear separation between character perspectives
  • Structured turn-taking in conversations
  • Detailed physical descriptions and environmental awareness

Target Audience

  • Game developers creating interactive narratives
  • Writers seeking AI assistance in character development
  • RPG platforms and applications
  • Interactive fiction developers
  • Educational platforms teaching creative writing or character development

Training Data

Dataset Composition

  • Total examples: 5,867 conversation pairs
  • Format: JSON Lines (.jsonl)
  • Structure: Conversations field containing alternating messages between participants
  • Validation split: 5% of total data
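
The loading and splitting described above can be reproduced with a few lines of standard Python. This is a hedged sketch: the file name is a placeholder and the conversations field name is assumed from the structure notes above.

```python
# Sketch of loading the JSON Lines data and carving out the 5% validation split.
# The file name is a placeholder; "conversations" is the assumed message field.
import json
import random

with open("cakrawala_roleplay.jsonl", "r", encoding="utf-8") as f:  # hypothetical path
    examples = [json.loads(line) for line in f]

random.seed(42)
random.shuffle(examples)

val_size = max(1, int(0.05 * len(examples)))  # 5% validation split
val_set, train_set = examples[:val_size], examples[val_size:]
print(f"train: {len(train_set)}  val: {len(val_set)}")
```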

Data Characteristics

Each training example consists of:

  1. Character establishment prompts
  2. Multi-turn conversations (12-13 turns minimum)
  3. Rich descriptive elements including:
    • Physical actions
    • Facial expressions
    • Tone indicators
    • Environmental details
    • Character reactions
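
For illustration, a single record with this structure might look like the sketch below. The dialogue is invented, not drawn from the training set; the role and content field names follow the Data Processing notes.

```python
# Hypothetical record illustrating the structure described above; the text is
# invented for illustration and is not taken from the training data.
example = {
    "conversations": [
        {"role": "system", "content": "Character establishment prompt: you are ..."},
        {"role": "user", "content": "*leans against the doorway, arms crossed* So you came back after all."},
        {"role": "assistant", "content": "*her eyes narrow, though a smile tugs at her lips* I never said I wouldn't."},
        # ... continues for at least 12-13 turns, mixing physical actions,
        # facial expressions, tone indicators, environmental details, and reactions.
    ]
}
```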

Data Processing

  • Messages are structured with distinct role and content fields
  • Loss is computed only on completion tokens (train_on_inputs: false); input (prompt) tokens are masked out of the loss, as sketched below
  • Sequence length is set to 2048 tokens
  • Sample packing is enabled for efficient training
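
A minimal sketch of what that completion-only masking looks like in practice, assuming the usual Hugging Face convention that label -100 is ignored by the cross-entropy loss; this illustrates the idea rather than reproducing the exact training code.

```python
# Completion-only loss masking (the effect of train_on_inputs: false):
# prompt tokens get label -100 so the loss ignores them and only the
# completion tokens contribute to training.
def build_labels(prompt_ids, completion_ids, max_len=2048):
    input_ids = (prompt_ids + completion_ids)[:max_len]
    labels = ([-100] * len(prompt_ids) + completion_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}

# Toy token IDs:
batch = build_labels(prompt_ids=[1, 15, 27, 9], completion_ids=[88, 42, 7, 2])
assert batch["labels"][:4] == [-100, -100, -100, -100]
```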

Training Details

Base Model

  • Architecture: meta-llama/Llama-3.1-70B-Instruct
  • Model Type: LlamaForCausalLM
  • Tokenizer: AutoTokenizer

Fine-tuning Approach

  • Method: QLoRA (Quantized Low-Rank Adaptation)
  • Quantization: 4-bit precision
  • Sequence Length: 2048 tokens
  • Training Duration: 3 epochs
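
Loading the base model in 4-bit for QLoRA is typically done with bitsandbytes, roughly as below. Only the 4-bit precision comes from this card; the NF4 quantization type, double quantization, and bf16 compute dtype are common defaults assumed for the sketch.

```python
# 4-bit base-model loading for QLoRA via bitsandbytes. Only "4-bit precision"
# is stated on this card; the remaining settings are assumed defaults.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumed
    bnb_4bit_use_double_quant=True,         # assumed
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```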

LoRA Configuration

  • Rank (r): 32
  • Alpha: 64
  • Dropout: 0.1
  • Target Modules:
    • Query Projection (q_proj)
    • Key Projection (k_proj)
    • Value Projection (v_proj)
    • Output Projection (o_proj)
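
The LoRA hyperparameters above map directly onto a peft LoraConfig; a sketch assuming the peft library is used:

```python
# The LoRA hyperparameters above expressed as a peft LoraConfig.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Attach the adapters to the 4-bit base model from the quantization sketch above:
# model = get_peft_model(base_model, lora_config)
```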

Training Parameters

  • Gradient Accumulation Steps: 16
  • Micro Batch Size: 4
  • Learning Rate: 0.0003
  • Optimizer: AdamW
  • Scheduler: Cosine
  • Mixed Precision: BF16 & FP16 with TF32 support
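
For reference, these hyperparameters correspond roughly to the following transformers TrainingArguments; values not listed on this card (output directory, warmup, logging) are placeholders or left at their defaults.

```python
# Training hyperparameters above expressed as transformers TrainingArguments.
# The output directory is a placeholder; unlisted settings keep their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cakrawala-70b-qlora",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=4,     # micro batch size
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    bf16=True,
    tf32=True,
)
```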

Performance Characteristics

Limitations

Content Limitations:

  • Training data size (5,867 examples) may limit variety in some scenarios
  • The model is specialized for roleplaying conversations and may not generalize well to other tasks

Additional Information

Special Tokens:

  • Pad Token: <|end_of_text|>
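
When loading the tokenizer for training or batched inference, the pad token can be set explicitly to match this card:

```python
# Set the pad token to match the card: <|end_of_text|>.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")
tokenizer.pad_token = "<|end_of_text|>"
```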

Infrastructure:

  • Supports an 8 x H100 NVL configuration
  • Utilizes 128 vCPUs and 1,509 GB of RAM