passthepizza committed
Commit efb3463
Parent: 1efaa0a

Create README.md

Files changed (1)
  1. README.md +105 -0
README.md ADDED
@@ -0,0 +1,105 @@

---
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.1-70B-Instruct
---

# Cakrawala-70B

## Model Description

Cakrawala-70B is a fine-tuned variant of the Llama-3.1-70B-Instruct model, optimized for generating rich roleplaying conversations and character interactions. It was adapted with QLoRA (Quantized Low-Rank Adaptation), which trains low-rank adapters on top of a quantized base model, making it practical to specialize a 70B model for this use case.
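
The snippet below is a minimal loading-and-generation sketch using the standard `transformers` chat-template API. The repo id, the character prompt, and the sampling settings are illustrative placeholders rather than values prescribed by this card.

```python
# Sketch: load Cakrawala-70B and generate a single roleplay turn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "passthepizza/Cakrawala-70B"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, in line with the training precision noted below
    device_map="auto",           # shard the 70B weights across available GPUs
)

messages = [
    {"role": "system", "content": "You are Kira, a wry starship engineer. Stay in character."},
    {"role": "user", "content": "*steps into the engine room* Kira, the reactor is rattling again."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=300, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```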

## Intended Use

### Primary Use Case
Cakrawala-70B is designed for generating high-quality roleplaying conversations with the following key characteristics:
- Rich, descriptive character interactions
- Consistent character voice and emotional development
- Emotional states conveyed by showing rather than telling
- Clear separation between character perspectives
- Structured turn-taking in conversations
- Detailed physical descriptions and environmental awareness

### Target Audience
- Game developers creating interactive narratives
- Writers seeking AI assistance with character development
- RPG platforms and applications
- Interactive fiction developers
- Educational platforms teaching creative writing or character development

## Training Data

### Dataset Composition
- Total examples: 5,867 conversation pairs
- Format: JSON Lines (.jsonl)
- Structure: each record holds a `conversations` field containing alternating messages between participants (see the loading sketch below)
- Validation split: 5% of the total data
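
As a rough sketch of how a dataset in this shape can be loaded and split, assuming the Hugging Face `datasets` library and a hypothetical local file name:

```python
# Sketch: load the JSONL conversations and carve out a 5% validation split.
from datasets import load_dataset

dataset = load_dataset("json", data_files="cakrawala_conversations.jsonl", split="train")
splits = dataset.train_test_split(test_size=0.05, seed=42)  # 5% held out for validation

train_ds, val_ds = splits["train"], splits["test"]
print(len(train_ds), len(val_ds))  # roughly 5,574 / 293 for 5,867 examples
```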

### Data Characteristics
Each training example consists of:
1. Character establishment prompts
2. Multi-turn conversations (12-13 turns minimum)
3. Rich descriptive elements including:
   - Physical actions
   - Facial expressions
   - Tone indicators
   - Environmental details
   - Character reactions
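
To make that structure concrete, a single record might look roughly like the following. The role names, field layout, and the dialogue itself are illustrative assumptions; only the `conversations` field and the descriptive style are taken from this card.

```python
# Illustrative record shape (not an actual example from the dataset).
example = {
    "conversations": [
        {"role": "system",
         "content": "Roleplay as Mara, a guarded apothecary in a rain-soaked port town."},
        {"role": "user",
         "content": "*pushes the door open, dripping wet* I need something for a fever. Quickly."},
        {"role": "assistant",
         "content": "Mara glances up from her mortar, eyes narrowing. \"Quickly costs extra,\" "
                    "she says, voice flat, though her hands are already reaching for the willow bark."},
        # ... continues for a minimum of 12-13 turns
    ]
}
```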

### Data Processing
- Messages are structured with distinct `role` and `content` fields
- Training focuses exclusively on completion tokens (`train_on_inputs: false`)
- Loss on input (prompt) tokens is excluded from the training objective
- Sequence length is set to 2048 tokens
- Sample packing is enabled for efficient training
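
A minimal sketch of what completion-only training means for the labels, following the common convention of masking prompt tokens with -100 so they are ignored by the cross-entropy loss (the helper function and its arguments are illustrative, not part of the actual training code):

```python
# Sketch: only the completion contributes to the loss; prompt tokens are masked.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")

def build_example(prompt: str, completion: str, max_len: int = 2048):
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    completion_ids = tokenizer(completion, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + completion_ids)[:max_len]
    # -100 is the ignore index for PyTorch cross-entropy, so prompt tokens add no loss.
    labels = ([-100] * len(prompt_ids) + completion_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```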

## Training Details

### Base Model
- Architecture: meta-llama/Llama-3.1-70B-Instruct
- Model Type: LlamaForCausalLM
- Tokenizer: AutoTokenizer

### Fine-tuning Approach
- Method: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit precision
- Sequence Length: 2048 tokens
- Training Duration: 3 epochs
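
For orientation, this is roughly what a 4-bit QLoRA base-model setup looks like with `transformers` and `bitsandbytes`. The NF4 quantization type and double quantization are common QLoRA defaults and are assumptions here, not settings stated in this card.

```python
# Sketch: load the base model in 4-bit before attaching LoRA adapters (QLoRA).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit precision, as listed above
    bnb_4bit_quant_type="nf4",              # assumption: typical QLoRA default
    bnb_4bit_use_double_quant=True,         # assumption: typical QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the BF16 mixed precision below
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```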

### LoRA Configuration
- Rank (r): 32
- Alpha: 64
- Dropout: 0.1
- Target Modules:
  - Query Projection (q_proj)
  - Key Projection (k_proj)
  - Value Projection (v_proj)
  - Output Projection (o_proj)
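
Expressed with the `peft` library, the adapter settings above correspond to roughly the following sketch (`task_type` and the commented wrapping call are the usual way such a config is applied and are not spelled out in this card):

```python
# Sketch: LoRA adapter configuration matching the values listed above.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,                 # rank
    lora_alpha=64,        # alpha
    lora_dropout=0.1,     # dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# base_model is the 4-bit model from the previous sketch
# peft_model = get_peft_model(base_model, lora_config)
```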

### Training Parameters
- Gradient Accumulation Steps: 16
- Micro Batch Size: 4
- Learning Rate: 0.0003 (3e-4)
- Optimizer: AdamW
- Scheduler: Cosine
- Mixed Precision: BF16/FP16, with TF32 enabled
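
In Hugging Face `TrainingArguments` terms, the schedule above translates to roughly the sketch below. The output directory and logging settings are illustrative, and the effective batch size assumes the 8-GPU setup listed under Infrastructure.

```python
# Sketch: optimizer and schedule settings mirroring the values listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cakrawala-70b-qlora",   # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=4,      # micro batch size
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    bf16=True,
    tf32=True,
    logging_steps=10,                   # illustrative
)
# Effective batch size: 4 (micro) * 16 (accumulation) * 8 (GPUs) = 512 sequences per step
```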

## Performance Characteristics

## Limitations

Content Limitations:
- The training data size (5,867 examples) may limit variety in some scenarios
- The model is specialized for roleplaying conversations and may not generalize well to other tasks

## Additional Information

Special Tokens:
- Pad Token: `<|end_of_text|>`

Infrastructure:
- 8 x H100 NVL GPU configuration
- 128 vCPUs and 1,509 GB RAM