---
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.1-70B-Instruct
---

# Cakrawala-70B

## Model Description

Cakrawala-70B is a fine-tuned variant of the Llama-3.1-70B-Instruct model, specifically optimized for generating rich roleplaying conversations and character interactions. The model uses QLoRA (Quantized Low-Rank Adaptation) fine-tuning techniques to efficiently adapt the large language model for this specialized use case.

## Intended Use

### Primary Use Case
Cakrawala-70B is designed specifically for generating high-quality roleplaying conversations with the following key characteristics:
- Rich, descriptive character interactions
- Consistent character voice and emotional development
- Show-don't-tell emotional states
- Clear separation between character perspectives
- Structured turn-taking in conversations
- Detailed physical descriptions and environmental awareness

### Target Audience
- Game developers creating interactive narratives
- Writers seeking AI assistance in character development
- RPG platforms and applications
- Interactive fiction developers
- Educational platforms teaching creative writing or character development

## Training Data

### Dataset Composition
- Total examples: 5,867 conversation pairs
- Format: JSON Lines (.jsonl)
- Structure: Conversations field containing alternating messages between participants
- Validation split: 5% of total data

### Data Characteristics
Each training example consists of:
1. Character establishment prompts
2. Multi-turn conversations (12-13 turns minimum)
3. Rich descriptive elements including:
   - Physical actions
   - Facial expressions
   - Tone indicators
   - Environmental details
   - Character reactions
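
To make the record layout concrete, the sketch below loads and inspects the `.jsonl` file described above. The file path is a placeholder, and the `conversations`/`role`/`content` field names are taken from this card rather than a published schema.

```python
import json

# Hypothetical file name; substitute the actual dataset path.
DATASET_PATH = "cakrawala_conversations.jsonl"

with open(DATASET_PATH, encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(records)} conversations")

# Peek at the first record: each message carries a role and its content.
for message in records[0]["conversations"]:
    print(message["role"], "->", message["content"][:60], "...")
```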

### Data Processing
- Messages are structured with distinct role and content fields
- Training focuses exclusively on completion tokens (`train_on_inputs: false`)
- Loss on input (prompt) tokens is excluded from the objective (see the masking sketch after this list)
- Sequence length is set to 2048 tokens
- Sample packing is enabled for efficient training
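
A minimal sketch of the completion-only objective (`train_on_inputs: false`): prompt tokens are masked with the ignore index so they contribute nothing to the loss. The helper name and the `-100` ignore index (the usual PyTorch/Transformers convention) are assumptions, not the exact training code.

```python
IGNORE_INDEX = -100  # labels with this value are skipped by the cross-entropy loss

def build_labels(prompt_ids: list[int], completion_ids: list[int]):
    """Concatenate prompt and completion, masking the prompt so that only
    completion tokens contribute to the loss (train_on_inputs: false)."""
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels

# Example: a 5-token prompt followed by a 3-token completion.
input_ids, labels = build_labels([1, 2, 3, 4, 5], [6, 7, 8])
assert labels[:5] == [IGNORE_INDEX] * 5 and labels[5:] == [6, 7, 8]
```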

## Training Details

### Base Model
- Architecture: meta-llama/Llama-3.1-70B-Instruct
- Model Type: LlamaForCausalLM
- Tokenizer: AutoTokenizer

### Fine-tuning Approach
- Method: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit precision
- Sequence Length: 2048 tokens
- Training Duration: 3 epochs
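
As a rough sketch, loading the base model in 4-bit for QLoRA fine-tuning with `transformers` and `bitsandbytes` could look like the snippet below. The NF4 quantization type, double quantization, and bfloat16 compute dtype are common QLoRA defaults assumed here; the card only states 4-bit precision.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE_MODEL = "meta-llama/Llama-3.1-70B-Instruct"

# 4-bit quantization config; NF4 + double quantization are assumed QLoRA defaults.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",  # shard the 70B model across available GPUs
)
```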

### LoRA Configuration
- Rank (r): 32
- Alpha: 64
- Dropout: 0.1
- Target Modules:
  - Query Projection (q_proj)
  - Key Projection (k_proj)
  - Value Projection (v_proj)
  - Output Projection (o_proj)
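
These adapter settings map naturally onto a `peft` `LoraConfig`; a minimal sketch, assuming the `peft` library and a causal-LM task type (the `bias` setting is an assumption not stated on this card):

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,                 # LoRA rank
    lora_alpha=64,        # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",          # assumption; not stated on this card
    task_type="CAUSAL_LM",
)

# `model` is the 4-bit quantized base model from the previous sketch.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```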

### Training Parameters
- Gradient Accumulation Steps: 16
- Micro Batch Size: 4
- Learning Rate: 0.0003
- Optimizer: AdamW
- Scheduler: Cosine
- Mixed Precision: BF16 & FP16 with TF32 support
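
For reference, these hyperparameters correspond roughly to the following Hugging Face `TrainingArguments`; the output directory and logging cadence are placeholders, and the bf16/tf32 flags reflect the mixed-precision setup listed above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cakrawala-70b-qlora",   # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=4,      # micro batch size
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,
    tf32=True,
    logging_steps=10,                   # placeholder
)
```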

## Performance Characteristics

## Limitations

Content Limitations:
- The relatively small training set (5,867 examples) may limit variety in some scenarios
- The model is specialized for roleplaying conversations and may not generalize well to other tasks

## Additional Information

Special Tokens:
- Pad Token: `<|end_of_text|>`
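
If you load the tokenizer yourself, the pad token can be set to match this card; a small sketch using the base model's tokenizer:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")
tokenizer.pad_token = "<|end_of_text|>"  # pad token listed on this card
```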

Infrastructure:
- Supports an 8 x NVIDIA H100 NVL configuration
- Utilizes 128 vCPUs and 1,509 GB of RAM