---
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.1-70B-Instruct
---

# Cakrawala-70B

## Model Description
Cakrawala-70B is a fine-tuned variant of Llama-3.1-70B-Instruct, optimized specifically for generating rich roleplaying conversations and character interactions. It was fine-tuned with QLoRA (Quantized Low-Rank Adaptation) to adapt the 70B base model efficiently to this specialized use case.
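As a minimal sketch, loading and prompting the model with Hugging Face Transformers might look like the following; the Hub repository ID, generation settings, and roleplay prompt are illustrative assumptions, not values from this card.

```python
# Minimal inference sketch. The Hub repository ID, generation settings, and the
# example roleplay prompt below are placeholders, not values from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Cakrawala-70B"  # hypothetical Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are Kara, a wary smuggler in a rain-soaked port city. "
            "Stay in character and describe actions, expressions, and surroundings."
        ),
    },
    {
        "role": "user",
        "content": "*slides a sealed envelope across the table* Can you get this past the blockade?",
    },
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```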
## Intended Use
### Primary Use Case
Cakrawala-70B is designed specifically for generating high-quality roleplaying conversations with the following key characteristics:
- Rich, descriptive character interactions
- Consistent character voice and emotional development
- Show-don't-tell emotional states
- Clear separation between character perspectives
- Structured turn-taking in conversations
- Detailed physical descriptions and environmental awareness
### Target Audience
- Game developers creating interactive narratives
- Writers seeking AI assistance in character development
- RPG platforms and applications
- Interactive fiction developers
- Educational platforms teaching creative writing or character development
## Training Data
### Dataset Composition
- Total examples: 5,867 conversation pairs
- Format: JSON Lines (.jsonl)
- Structure: Conversations field containing alternating messages between participants
- Validation split: 5% of total data
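For illustration, a single .jsonl record is expected to look roughly like the sketch below; the exact field and role names are assumptions inferred from the description, not a dump of the actual dataset.

```python
import json

# Hypothetical record layout inferred from this card; field and role names are
# assumptions, not a dump of the actual dataset.
record = {
    "conversations": [
        {"role": "system", "content": "Character establishment prompt ..."},
        {"role": "user", "content": "*leans forward, eyes narrowing* What brings you here?"},
        {"role": "assistant", "content": "*smiles faintly, glancing at the rain* Business, mostly."},
        # ... alternating turns continue for roughly 12-13+ turns
    ]
}

# Each line of the .jsonl file would hold one such record.
print(json.dumps(record, ensure_ascii=False))
```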
### Data Characteristics
Each training example consists of:
- Character establishment prompts
- Multi-turn conversations (at least 12-13 turns)
- Rich descriptive elements, including:
  - Physical actions
  - Facial expressions
  - Tone indicators
  - Environmental details
  - Character reactions
### Data Processing
- Messages are structured with distinct role and content fields
- Training loss is computed only on completion tokens (train_on_inputs: false)
- Prompt (input) tokens are excluded from the loss calculation
- Sequence length is set to 2048 tokens
- Sample packing is enabled for efficient training
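To make the completion-only objective concrete, here is a hedged sketch of the usual label-masking technique (prompt labels set to -100 so the loss ignores them); it illustrates the idea rather than reproducing the actual training code.

```python
# Sketch of completion-only loss masking: prompt labels are set to -100, which
# cross-entropy ignores, so only completion tokens contribute to the loss.
from transformers import AutoTokenizer

# Gated repo: requires accepting the Llama 3.1 license / HF authentication.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")

prompt = "User: Describe the tavern.\nAssistant:"
completion = " The tavern is dim, smoky, and loud with laughter."

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
completion_ids = tokenizer(completion, add_special_tokens=False)["input_ids"]

input_ids = prompt_ids + completion_ids
labels = [-100] * len(prompt_ids) + completion_ids  # loss only on the completion

assert len(input_ids) == len(labels)
```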
## Training Details
### Base Model
- Architecture: meta-llama/Llama-3.1-70B-Instruct
- Model Type: LlamaForCausalLM
- Tokenizer: AutoTokenizer
### Fine-tuning Approach
- Method: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit precision
- Sequence Length: 2048 tokens
- Training Duration: 3 epochs
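A typical QLoRA-style 4-bit loading setup with bitsandbytes is sketched below; the NF4 and double-quantization settings are common defaults, not values confirmed by this card.

```python
# Common QLoRA-style 4-bit base-model loading (settings are typical defaults,
# not confirmed values from this card).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```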
### LoRA Configuration
- Rank (r): 32
- Alpha: 64
- Dropout: 0.1
- Target Modules:
  - Query Projection (q_proj)
  - Key Projection (k_proj)
  - Value Projection (v_proj)
  - Output Projection (o_proj)
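The same adapter settings expressed as a PEFT LoraConfig, as a sketch (the task_type is an assumption):

```python
# LoRA adapter configuration matching the values listed above.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",  # assumption: standard causal-LM fine-tuning
)

# model = get_peft_model(base_model, lora_config)  # wraps the 4-bit base model above
```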
### Training Parameters
- Gradient Accumulation Steps: 16
- Micro Batch Size: 4
- Learning Rate: 0.0003
- Optimizer: AdamW
- Scheduler: Cosine
- Mixed Precision: BF16 & FP16 with TF32 support
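Expressed as Hugging Face TrainingArguments, the listed hyperparameters would look roughly like the sketch below; the output directory, warmup, and logging settings are placeholders.

```python
# Rough equivalent of the listed hyperparameters as transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./cakrawala-70b-qlora",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=4,       # micro batch size
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    bf16=True,
    tf32=True,
    logging_steps=10,                    # placeholder
)
```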
## Performance Characteristics
### Limitations
Content Limitations:
- Training data size (5,867 examples) may limit variety in some scenarios
- Specialized for roleplaying conversations; it may not generalize well to other tasks
## Additional Information
Special Tokens:
- Pad Token: `<|end_of_text|>`
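If the pad token needs to be set explicitly when loading the tokenizer, a minimal sketch (repository ID is a placeholder):

```python
# Set the pad token explicitly if the tokenizer does not already define one.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/Cakrawala-70B")  # placeholder ID
if tokenizer.pad_token is None:
    tokenizer.pad_token = "<|end_of_text|>"
```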
Infrastructure:
- Supports an 8 x H100 NVL GPU configuration
- Utilizes 128 vCPUs and 1509 GB of RAM