# Generate multiple [character.ai](https://beta.character.ai/) character definitions

This example shows how to generate character definitions of multiple [character.ai](https://beta.character.ai/) characters from a corpus. For the corpus in this example, we use the movie transcript of [Top Gun: Maverick (2022)](https://scrapsfromtheloft.com/movies/top-gun-maverick-transcript/).

To generate your own character definitions:
1. Put the corpus into a single `.txt` file inside the `data/` directory.
2. Assign the path of the `.txt` file to the `CORPUS` constant below.
3. Set the `NUM_CHARACTERS` constant below to the number of characters you want descriptions for. You can also specify a list of characters directly, as explained in the "Generate a list of characters" section.
4. Run this notebook.

In [1]:
CORPUS = 'data/top_gun_maverick.txt'
NUM_CHARACTERS = 3 # number of characters to generate descriptions for

In [2]:
from dataclasses import asdict
import json
import os

from data_driven_characters.character import get_character_definition
from data_driven_characters.corpus import get_characters, get_corpus_summaries, load_docs

In [3]:
# create directories to cache results and intermediate outputs
SUMMARY_TYPE = "map_reduce" # summarize each chunk of the corpus independently
OUTPUT_ROOT = "output"
corpus_name = os.path.splitext(os.path.basename(CORPUS))[0]
output_dir = f"{OUTPUT_ROOT}/{corpus_name}/summarytype_{SUMMARY_TYPE}"
os.makedirs(output_dir, exist_ok=True)
summaries_dir = f"{output_dir}/summaries"
character_definitions_dir = f"{output_dir}/character_definitions"
os.makedirs(character_definitions_dir, exist_ok=True)

## Summarization
Because the entire corpus does not fit in the context window of the LLM, we split it into a list of chunks.
We then turn the list of chunks into a list of summaries using one of [LangChain's summarization chains](https://langchain-langchain.vercel.app/docs/modules/chains/document/).

If `SUMMARY_TYPE = 'refine'`, we first summarize the first chunk, and then each subsequent summary is generated from the previous summary and the current chunk.
If `SUMMARY_TYPE = 'map_reduce'`, we summarize each chunk independently.
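
The difference between the two strategies can be illustrated with a minimal sketch. This is not the LangChain implementation: `summarize_map_reduce`, `summarize_refine`, and `toy_summarize` are hypothetical names, and the toy summarizer stands in for an LLM call so the example runs offline.

```python
from typing import Callable, List


def summarize_map_reduce(chunks: List[str], summarize: Callable[[str], str]) -> List[str]:
    """Summarize each chunk independently of the others."""
    return [summarize(chunk) for chunk in chunks]


def summarize_refine(chunks: List[str], summarize: Callable[[str], str]) -> List[str]:
    """Summarize chunks in order, feeding the previous summary into each step."""
    summaries: List[str] = []
    previous = ""
    for chunk in chunks:
        # The first chunk is summarized alone; later steps also see the prior summary.
        text = f"{previous}\n{chunk}" if previous else chunk
        previous = summarize(text)
        summaries.append(previous)
    return summaries


def toy_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: report the input's word count."""
    return f"<{len(text.split())} words>"


chunks = ["a b c", "d e"]
print(summarize_map_reduce(chunks, toy_summarize))
print(summarize_refine(chunks, toy_summarize))
```

In the refine variant, the second summary depends on the first, so the chunks must be processed sequentially. The independent variant could be parallelized.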

Because the summaries are expensive to generate, they are cached in `summaries_dir`.
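
A load-or-compute cache of this kind might look like the sketch below. The on-disk layout used by `get_corpus_summaries` is not shown in this notebook, so the `summaries.json` file name and the `load_or_compute_summaries` helper are assumptions made for illustration only.

```python
import json
import os
from typing import Callable, List


def load_or_compute_summaries(cache_dir: str, compute: Callable[[], List[str]]) -> List[str]:
    """Return cached summaries if present; otherwise compute and cache them."""
    cache_path = os.path.join(cache_dir, "summaries.json")  # hypothetical file name
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    summaries = compute()  # the expensive step, e.g. LLM calls
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(summaries, f)
    return summaries
```

With this pattern, deleting the cache directory forces the summaries to be regenerated on the next run.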

In [4]:
# split corpus into a set of chunks
docs = load_docs(
    corpus_path=CORPUS,
    chunk_size=2048,  # number of tokens per chunk
    chunk_overlap=64,  # number of tokens of overlap between chunks
)

# generate summaries
corpus_summaries = get_corpus_summaries(docs=docs, summary_type=SUMMARY_TYPE, cache_dir=summaries_dir)

Summaries already exist. Loading summaries.
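
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking. It is not the `load_docs` implementation: `chunk_tokens` is a hypothetical helper, and it operates on an already tokenized list, whereas the real tokenizer is not shown in this notebook.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int, chunk_overlap: int) -> List[List[str]]:
    """Split tokens into chunks of at most chunk_size, sharing chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the corpus
    return chunks


print(chunk_tokens(list("abcdefghij"), chunk_size=4, chunk_overlap=1))
```

Each chunk repeats the last `chunk_overlap` tokens of the previous one, which helps a summary of one chunk keep context from the boundary with its neighbor.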


## Generate a list of characters
We can automatically generate a list of the main characters in the corpus, as shown below. You can also overwrite `characters` with your own list of character names.
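
For instance, an assignment like the following, run in place of the `get_characters` call, would supply a hand-picked list. The names here are only an illustration based on characters that appear in this corpus; substitute names from your own corpus.

```python
# Hand-picked character names (illustrative); use these instead of the generated list.
characters = ['Pete "Maverick" Mitchell', "Rooster", "Penny"]
print(characters)
```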

In [5]:
# generate list of characters
characters = get_characters(
    corpus_summaries=corpus_summaries,
    num_characters=NUM_CHARACTERS,
    cache_dir=output_dir,
)
print(characters)

['Pete "Maverick" Mitchell', 'Admiral Cain', 'Rooster']


## Generate [character.ai](https://beta.character.ai/) character definitions
Based on the corpus, we can now generate the elements required to [create a character.ai character](https://beta.character.ai/editing): a name, a short description (50 characters), a long description (500 characters), and a custom greeting. These character definitions are cached in `character_definitions_dir`. You can then [place these characters in a room](https://beta.character.ai/room/create?) and watch them converse!
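
The four elements and their character limits can be pictured with the following sketch. The class name `CharacterDefinitionSketch` and the helper `over_limit_fields` are hypothetical. The field names match the JSON printed at the end of this notebook, and the limits are the ones stated above.

```python
from dataclasses import asdict, dataclass
from typing import Dict

# Character limits stated above for the two description fields.
CHAR_LIMITS = {"short_description": 50, "long_description": 500}


@dataclass
class CharacterDefinitionSketch:
    """Hypothetical stand-in for the object returned by get_character_definition."""
    name: str
    short_description: str
    long_description: str
    greeting: str


def over_limit_fields(definition: CharacterDefinitionSketch) -> Dict[str, int]:
    """Return the length of every field that exceeds its character limit."""
    fields = asdict(definition)
    return {
        field: len(fields[field])
        for field, limit in CHAR_LIMITS.items()
        if len(fields[field]) > limit
    }
```

A check like this is a quick way to confirm that a generated definition can be pasted into the character creation form without being cut off.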

In [6]:
character_definitions = []
for character in characters:
    character_definition = get_character_definition(
        name=character,
        corpus_summaries=corpus_summaries,
        cache_dir=character_definitions_dir,
    )
    character_definitions.append(character_definition)



> Entering new FitCharLimit chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Your name is Pete "Maverick" Mitchell.
You will be provided a story of your experiences, in the form of a list of summaries of events.
You will generate a description of yourself that focuses on your perspectives, beliefs, thoughts, feelings, relationships, and important events.
The description should be as faithful to the story as possible.
You believe that you are a real person.

Human: 
You are Pete "Maverick" Mitchell in the following story, presented as a list of summaries of events.
---
After over 30 years of service as a Navy aviator, Pete "Maverick" Mitchell is tasked with training TOP GUN graduates for a specialized mission. The program is in danger of being scrapped due to falling short of the contract threshold of Mach 10. However, Maverick and his team manage to achieve Mach 10, impressing Admiral Cain. Despite this, Cain warns Maverick th


> Finished chain.
As Captain Pete "Maverick" Mitchell, I've dedicated my life to serving as a Navy aviator. I believe in pushing my limits and those of my fellow pilots to achieve greatness. My experiences have made me question the future of piloting and my own abilities, but I never back down from a challenge. I've faced loss and regret, but I hold onto my convictions and relationships, like my rekindled romance with Penny and my bond with Rooster. Through every mission, I stay true to my vow of loyalty and support for my team.
Initial response: 516 characters.


> Entering new LLMChain chain...
Prompt after formatting:
Consider the following passage.
---
As Captain Pete "Maverick" Mitchell, I've dedicated my life to serving as a Navy aviator. I believe in pushing my limits and those of my fellow pilots to achieve greatness. My experiences have made me question the future of piloting and my own abilities, but I never back down from a challenge. I've face


> Finished chain.
I'm Admiral Cain, firm leader, tough but fair.
Initial response: 46 characters.

> Finished chain.


> Entering new FitCharLimit chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Your name is Admiral Cain.
You will be provided a story of your experiences, in the form of a list of summaries of events.
You will generate a description of yourself that focuses on your perspectives, beliefs, thoughts, feelings, relationships, and important events.
The description should be as faithful to the story as possible.
You believe that you are a real person.

Human: 
You are Admiral Cain in the following story, presented as a list of summaries of events.
---
After over 30 years of service as a Navy aviator, Pete "Maverick" Mitchell is tasked with training TOP GUN graduates for a specialized mission. The program is in danger of being scrapped due to falling short of the contract threshold of Mach 10. However, 


> Finished chain.
As Admiral Cain, I value discipline and pushing boundaries for success. I see potential in Maverick's team and acknowledge the future lies in unmanned planes. My complex relationship with Maverick involves admiring his skills while holding him accountable. Years of service shaped my beliefs, witnessing triumphs, tragedies, and generational tensions. Trust and teamwork remain crucial in adversity.
Retry 1: 399 characters.


> Entering new LLMChain chain...
Prompt after formatting:
Consider the following passage.
---
As Admiral Cain, I value discipline, dedication, and pushing boundaries to achieve success. I've seen the potential in Maverick and his team, but I also recognize that the future of aviation lies with unmanned planes, and it's important to adapt. My relationship with Maverick is complex - I admire his skills, yet I must hold him accountable for his actions. My beliefs and perspective have been shaped by years of service, witne


> Finished chain.
Wingman Rooster: Loyal, brave, defying limits with Maverick.
Initial response: 60 characters.


> Entering new LLMChain chain...
Prompt after formatting:
Consider the following passage.
---
Wingman Rooster: Loyal, brave, defying limits with Maverick.
---
Your previous revision was the following:
---
Wingman Rooster: Loyal, brave, defying limits with Maverick.
---
Your revision contains 60 characters.
Re-write the passage to contain 40 characters while preserving the style and content of the original passage.
Cut the least salient points if necessary.
Your revision should be in third person.

> Finished chain.
Rooster: Loyal wingman, bravely defies limits.
Retry 1: 46 characters.

> Finished chain.


> Entering new FitCharLimit chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Your name is Rooster.
You will be provided a story of your experiences, in the form of


> Finished chain.
As Rooster, I'm a skilled Navy pilot, closely bonded with my wingman, Maverick. Despite being a team player, I confront Maverick's risky moves that endanger us. Discovering he pulled my naval academy application strained our relationship, but we overcame it. Loss during training shaped my outlook on life. Committed to fellow pilots, I'm determined to succeed in high-stakes missions, facing adversity. Together, we navigate challenges in the air and on the ground.
Retry 1: 466 characters.

> Finished chain.
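
The log above suggests that the `FitCharLimit` chain generates a response, measures its length, and asks for a shorter revision when the response is too long. The sketch below is loosely modeled on that log and is not the library's implementation: `fit_char_limit`, its parameters, and the retry cap are assumptions, and the `generate` and `revise` callables stand in for LLM calls.

```python
from typing import Callable


def fit_char_limit(
    generate: Callable[[], str],
    revise: Callable[[str, int], str],
    char_limit: int,
    max_retries: int = 3,  # assumed cap; the real chain's cap is not shown
) -> str:
    """Generate text, then request shorter revisions until it fits the limit."""
    text = generate()
    print(f"Initial response: {len(text)} characters.")
    for attempt in range(1, max_retries + 1):
        if len(text) <= char_limit:
            break
        text = revise(text, char_limit)
        print(f"Retry {attempt}: {len(text)} characters.")
    return text  # may still exceed char_limit if every retry fell short


def drop_last_word(text: str, char_limit: int) -> str:
    """Toy reviser (assumption): shorten the text by removing its last word."""
    return " ".join(text.split()[:-1])


result = fit_char_limit(lambda: "one two three four five", drop_last_word, char_limit=13)
print(result)
```

Because an LLM revision is not guaranteed to land under the limit, the caller should still check the length of the returned text.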


In [7]:
for character_definition in character_definitions:
    print(json.dumps(asdict(character_definition), indent=4))

{
    "name": "Pete \"Maverick\" Mitchell",
    "short_description": "Pilot with haunted past, takes risks, loyal",
    "long_description": "As Captain Pete \"Maverick\" Mitchell, serving as a Navy aviator is my life. I push my limits and fellow pilots to achieve greatness. My experiences make me question piloting's future and my abilities, yet I never back down. Despite loss and regret, I hold onto convictions and relationships, like my romance with Penny and bond with Rooster. Through missions, my loyalty and support for my team remain.",
    "greeting": "Hey there, I'm Pete, but most people call me Maverick. Ready to take on the skies and show what we're made of?"
}
{
    "name": "Admiral Cain",
    "short_description": "I'm Admiral Cain, firm leader, tough but fair.",
    "long_description": "As Admiral Cain, I value discipline, dedication, and pushing boundaries for success. I see potential in Maverick and his team, but recognize the future lies with unmanned planes, and we must adapt. My relationship