---
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.1-70B-Instruct
---

# Cakrawala-70B

## Model Description
Cakrawala-70B is a fine-tuned variant of Llama-3.1-70B-Instruct, optimized specifically for generating rich roleplaying conversations and character interactions. It was fine-tuned with QLoRA (Quantized Low-Rank Adaptation) to adapt the 70B base model efficiently to this specialized use case.
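As a minimal sketch, loading and prompting the model with Hugging Face Transformers might look like the following; the Hub repository ID, generation settings, and roleplay prompt are illustrative assumptions, not values from this card.

```python
# Minimal inference sketch. The Hub repository ID, generation settings, and the
# example roleplay prompt below are placeholders, not values from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Cakrawala-70B"  # hypothetical Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are Kara, a wary smuggler in a rain-soaked port city. "
            "Stay in character and describe actions, expressions, and surroundings."
        ),
    },
    {
        "role": "user",
        "content": "*slides a sealed envelope across the table* Can you get this past the blockade?",
    },
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```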
## Intended Use
### Primary Use Case
Cakrawala-70B is designed specifically for generating high-quality roleplaying conversations with the following key characteristics:
- Rich, descriptive character interactions
- Consistent character voice and emotional development
- Show-don't-tell emotional states
- Clear separation between character perspectives
- Structured turn-taking in conversations
- Detailed physical descriptions and environmental awareness
### Target Audience
- Game developers creating interactive narratives
- Writers seeking AI assistance in character development
- RPG platforms and applications
- Interactive fiction developers
- Educational platforms teaching creative writing or character development
## Training Data
### Dataset Composition
- Total examples: 5,867 conversation pairs
- Format: JSON Lines (.jsonl)
- Structure: Conversations field containing alternating messages between participants
- Validation split: 5% of total data
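For illustration, a single .jsonl record is expected to look roughly like the sketch below; the exact field and role names are assumptions inferred from the description, not a dump of the actual dataset.

```python
import json

# Hypothetical record layout inferred from this card; field and role names are
# assumptions, not a dump of the actual dataset.
record = {
    "conversations": [
        {"role": "system", "content": "Character establishment prompt ..."},
        {"role": "user", "content": "*leans forward, eyes narrowing* What brings you here?"},
        {"role": "assistant", "content": "*smiles faintly, glancing at the rain* Business, mostly."},
        # ... alternating turns continue for roughly 12-13+ turns
    ]
}

# Each line of the .jsonl file would hold one such record.
print(json.dumps(record, ensure_ascii=False))
```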
### Data Characteristics
Each training example consists of:
- Character establishment prompts
- Multi-turn conversations (at least 12-13 turns)
- Rich descriptive elements, including:
  - Physical actions
  - Facial expressions
  - Tone indicators
  - Environmental details
  - Character reactions
### Data Processing
- Messages are structured with distinct role and content fields
- Training loss is computed only on completion tokens (train_on_inputs: false)
- Prompt (input) tokens are excluded from the loss calculation
- Sequence length is set to 2048 tokens
- Sample packing is enabled for efficient training
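To make the completion-only objective concrete, here is a hedged sketch of the usual label-masking technique (prompt labels set to -100 so the loss ignores them); it illustrates the idea rather than reproducing the actual training code.

```python
# Sketch of completion-only loss masking: prompt labels are set to -100, which
# cross-entropy ignores, so only completion tokens contribute to the loss.
from transformers import AutoTokenizer

# Gated repo: requires accepting the Llama 3.1 license / HF authentication.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")

prompt = "User: Describe the tavern.\nAssistant:"
completion = " The tavern is dim, smoky, and loud with laughter."

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
completion_ids = tokenizer(completion, add_special_tokens=False)["input_ids"]

input_ids = prompt_ids + completion_ids
labels = [-100] * len(prompt_ids) + completion_ids  # loss only on the completion

assert len(input_ids) == len(labels)
```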
## Training Details
### Base Model
- Architecture: meta-llama/Llama-3.1-70B-Instruct
- Model Type: LlamaForCausalLM
- Tokenizer: AutoTokenizer
### Fine-tuning Approach
- Method: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit precision
- Sequence Length: 2048 tokens
- Training Duration: 3 epochs
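A typical QLoRA-style 4-bit loading setup with bitsandbytes is sketched below; the NF4 and double-quantization settings are common defaults, not values confirmed by this card.

```python
# Common QLoRA-style 4-bit base-model loading (settings are typical defaults,
# not confirmed values from this card).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```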
### LoRA Configuration
- Rank (r): 32
- Alpha: 64
- Dropout: 0.1
- Target Modules:
  - Query Projection (q_proj)
  - Key Projection (k_proj)
  - Value Projection (v_proj)
  - Output Projection (o_proj)
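The same adapter settings expressed as a PEFT LoraConfig, as a sketch (the task_type is an assumption):

```python
# LoRA adapter configuration matching the values listed above.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",  # assumption: standard causal-LM fine-tuning
)

# model = get_peft_model(base_model, lora_config)  # wraps the 4-bit base model above
```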
### Training Parameters
- Gradient Accumulation Steps: 16
- Micro Batch Size: 4
- Learning Rate: 0.0003
- Optimizer: AdamW
- Scheduler: Cosine
- Mixed Precision: BF16 & FP16 with TF32 support
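Expressed as Hugging Face TrainingArguments, the listed hyperparameters would look roughly like the sketch below; the output directory, warmup, and logging settings are placeholders.

```python
# Rough equivalent of the listed hyperparameters as transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./cakrawala-70b-qlora",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=4,       # micro batch size
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    bf16=True,
    tf32=True,
    logging_steps=10,                    # placeholder
)
```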
## Performance Characteristics
### Limitations
Content Limitations:
- Training data size (5,867 examples) may limit variety in some scenarios
- Specialized for roleplaying conversations; it may not generalize well to other tasks
## Additional Information
Special Tokens:
- Pad Token: `<|end_of_text|>`
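If the pad token needs to be set explicitly when loading the tokenizer, a minimal sketch (repository ID is a placeholder):

```python
# Set the pad token explicitly if the tokenizer does not already define one.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/Cakrawala-70B")  # placeholder ID
if tokenizer.pad_token is None:
    tokenizer.pad_token = "<|end_of_text|>"
```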
Infrastructure:
- Supports an 8 x H100 NVL GPU configuration
- Utilizes 128 vCPUs and 1509 GB of RAM