---
language:
- en
license: mit
library_name: transformers
datasets:
- julep-ai/samantha_finetune_dataset_03
pipeline_tag: text-generation
---

# Samantha

## Technical notes

This model is trained on a specialized dataset (`julep-ai/samantha_finetune_dataset_03`) and uses sentinel tokens to mark where each part of a conversation starts and ends.

### Usage
See the [`chat.py`](https://huggingface.co/julep-ai/samantha-33b/blob/main/chat.py) file in this repo for a usage example.

### Concepts
- Each conversation consists of n "sections".
- Each section is one of:
  + `me`: the model
  + `person`: the speaker
  + `situation`: relevant background information that sets the context of the conversation
  + `thought`: thoughts generated by the model, for example to work through intermediate steps
  + `information`: external information added to the context by the system running the model
- The `me` and `person` sections can optionally include a name, like `me (Samantha)` or `person (Dmitry)`.
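
As an illustration of the list above, here is a minimal sketch that splits a section label into its type and optional name. The helper `parse_header` and the `SectionHeader` type are hypothetical names introduced for this example and are not part of this repo. The check that only `me` and `person` may carry a name is an assumption based on the last bullet.

```python
import re
from typing import NamedTuple, Optional

# Section types listed in this card.
SECTION_TYPES = {"me", "person", "situation", "thought", "information"}
# Assumption: only these two types may carry a name.
NAMED_TYPES = {"me", "person"}


class SectionHeader(NamedTuple):
    kind: str
    name: Optional[str] = None


# Matches "kind" or "kind (Name)".
_HEADER_RE = re.compile(r"^(\w+)(?: \((.+)\))?$")


def parse_header(label: str) -> SectionHeader:
    """Split a label such as 'person (Dmitry)' into its type and optional name."""
    match = _HEADER_RE.match(label.strip())
    if match is None:
        raise ValueError(f"malformed section label: {label!r}")
    kind, name = match.groups()
    if kind not in SECTION_TYPES:
        raise ValueError(f"unknown section type: {kind!r}")
    if name is not None and kind not in NAMED_TYPES:
        raise ValueError(f"section type {kind!r} does not take a name")
    return SectionHeader(kind, name)
```

For example, `parse_header("person (Dmitry)")` yields a header with `kind="person"` and `name="Dmitry"`, while `parse_header("situation")` leaves `name` as `None`.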

### Sentinel Tokens
- The `<|im_start|>` token marks the start of a "section".
- The `<|im_end|>` token marks the end of a "section".
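
To show how the two tokens delimit sections, the following sketch splits a raw transcript back into `(label, content)` pairs. It assumes the layout used in the example in this card: the label sits on the same line as `<|im_start|>` and the content starts on the next line. `split_sections` is a hypothetical helper, not a function from this repo.

```python
import re
from typing import List, Tuple

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

# One closed section: start token, label line, content (may span lines), end token.
_SECTION_RE = re.compile(
    re.escape(IM_START) + r"([^\n]*)\n(.*?)" + re.escape(IM_END),
    re.DOTALL,
)


def split_sections(text: str) -> List[Tuple[str, str]]:
    """Return (label, content) for every closed section found in `text`."""
    return [(label.strip(), content) for label, content in _SECTION_RE.findall(text)]
```

A section that has been opened but not yet closed with `<|im_end|>`, such as the one the model is expected to complete, is skipped by this pattern.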

## Example

```
<|im_start|>situation
I am talking to Diwank. I want to ask him about his food preferences.<|im_end|>
<|im_start|>person (Diwank)
Hey Samantha! What do you want to talk about?<|im_end|>
<|im_start|>me (Samantha)
```
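
A prompt like the one above could be assembled as sketched below. This is a minimal illustration, not the code in `chat.py`: `build_prompt` and `extract_reply` are hypothetical helpers, and the newline after the final open label, as well as cutting the output at the first `<|im_end|>`, are assumptions.

```python
from typing import Iterable, Tuple

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_prompt(
    sections: Iterable[Tuple[str, str]],
    reply_label: str = "me (Samantha)",
) -> str:
    """Serialize closed sections, then open one final section for the model."""
    parts = [f"{IM_START}{label}\n{content}{IM_END}\n" for label, content in sections]
    # Leave the last section open so the model writes its content.
    parts.append(f"{IM_START}{reply_label}\n")
    return "".join(parts)


def extract_reply(generated: str) -> str:
    """Keep only the generated text that precedes the first end token."""
    return generated.split(IM_END, 1)[0].strip()


prompt = build_prompt(
    [
        (
            "situation",
            "I am talking to Diwank. I want to ask him about his food preferences.",
        ),
        ("person (Diwank)", "Hey Samantha! What do you want to talk about?"),
    ]
)
```

The resulting `prompt` string can be passed to the tokenizer and model; applying `extract_reply` to the newly generated text discards anything produced after the model closes its own section.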