TRL documentation

Data Utilities

You are viewing version v0.11.2. A newer version, v0.13.0, is available.

trl.is_conversational

( example: Dict ) → bool

Parameters

  • example (Dict[str, Any]) — A single data entry of a dataset. The example can have different keys depending on the dataset format.

Returns

bool

True if the data is in a conversational format, False otherwise.

Check if the example is in a conversational format.

Examples:

>>> from trl import is_conversational
>>> example = {"prompt": [{"role": "user", "content": "What color is the sky?"}]}
>>> is_conversational(example)
True
>>> example = {"prompt": "The sky is"}
>>> is_conversational(example)
False
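
The check can be pictured with a minimal sketch. This is an illustrative re-implementation, not the library's code: the name `is_conversational_sketch`, the `MESSAGE_KEYS` list (borrowed from the message-bearing keys listed on this page), and the rule "a non-empty list whose first item has `role` and `content`" are all assumptions.

```python
from typing import Any, Dict

# Assumption: the keys that may hold a list of messages, per the formats on this page.
MESSAGE_KEYS = ["prompt", "chosen", "rejected", "completion", "messages"]


def is_conversational_sketch(example: Dict[str, Any]) -> bool:
    """Hypothetical stand-in for trl.is_conversational."""
    for key in MESSAGE_KEYS:
        value = example.get(key)
        # Conversational: a non-empty list whose first item is a message dict.
        if isinstance(value, list) and value:
            first = value[0]
            if isinstance(first, dict) and "role" in first and "content" in first:
                return True
    return False


print(is_conversational_sketch({"prompt": [{"role": "user", "content": "What color is the sky?"}]}))
print(is_conversational_sketch({"prompt": "The sky is"}))
```

Under these assumptions, a plain-string prompt is treated as non-conversational because it is not a list of message dictionaries.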

trl.apply_chat_template

( example: Dict tokenizer: PreTrainedTokenizer )

Apply a chat template to a conversational example.

For more details, see maybe_apply_chat_template().

trl.maybe_apply_chat_template

( example: Dict tokenizer: PreTrainedTokenizer ) → Dict[str, str]

Parameters

  • example (Dict[str, List[Dict[str, str]]]) — Dictionary representing a single data entry of a conversational dataset. Each data entry can have different keys depending on the dataset format. The supported dataset formats are:

    • Language modeling dataset: "messages".
    • Prompt-only dataset: "prompt".
    • Prompt-completion dataset: "prompt" and "completion".
    • Preference dataset: "prompt", "chosen", and "rejected".
    • Preference dataset with implicit prompt: "chosen" and "rejected".
    • Unpaired preference dataset: "prompt", "completion", and "label".

    For keys "messages", "prompt", "chosen", "rejected", and "completion", the values are lists of messages, where each message is a dictionary with keys "role" and "content".

  • tokenizer (PreTrainedTokenizer) — The tokenizer to apply the chat template with.

Returns

Dict[str, str]

The formatted example with the chat template applied.

If the example is in a conversational format, apply a chat template to it.

Note: This function does not alter the keys, except for the language modeling dataset format, where "messages" is replaced by "text".

Example:

>>> from transformers import AutoTokenizer
>>> from trl import apply_chat_template
>>> tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")
>>> example = {
...     "prompt": [{"role": "user", "content": "What color is the sky?"}],
...     "completion": [{"role": "assistant", "content": "It is blue."}]
... }
>>> apply_chat_template(example, tokenizer)
{'prompt': '<|user|>\nWhat color is the sky?<|end|>\n<|assistant|>\n', 'completion': 'It is blue.<|end|>\n<|endoftext|>'}
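
To make the key handling concrete without downloading a tokenizer, here is a minimal sketch that swaps the real chat template for a hypothetical `toy_render` function. The names `toy_render` and `apply_chat_template_sketch`, the toy markup, and the slicing used to derive the completion are assumptions for illustration; the library's implementation may differ.

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def toy_render(messages: List[Message], add_generation_prompt: bool = False) -> str:
    """Hypothetical stand-in for a tokenizer's chat template (not a TRL API)."""
    text = "".join(f"<|{m['role']}|>\n{m['content']}<|end|>\n" for m in messages)
    if add_generation_prompt:
        text += "<|assistant|>\n"  # cue the model to answer
    return text


def apply_chat_template_sketch(example: Dict[str, Any]) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    if "messages" in example:
        # Language modeling format: "messages" is replaced by "text".
        out["text"] = toy_render(example["messages"])
    if "prompt" in example:
        out["prompt"] = toy_render(example["prompt"], add_generation_prompt=True)
    for key in ("chosen", "rejected", "completion"):
        if key not in example:
            continue
        if "prompt" in example:
            # Assumption: render prompt + answer together, keep the part after the prompt.
            full = toy_render(example["prompt"] + example[key])
            out[key] = full[len(out["prompt"]):]
        else:
            out[key] = toy_render(example[key])
    if "label" in example:
        out["label"] = example["label"]  # passed through untouched
    return out


example = {
    "prompt": [{"role": "user", "content": "What color is the sky?"}],
    "completion": [{"role": "assistant", "content": "It is blue."}],
}
print(apply_chat_template_sketch(example))
```

With this toy template the output keeps the "prompt" and "completion" keys, and only a "messages" key would be renamed to "text". The exact strings depend on the tokenizer's own template.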

trl.extract_prompt

( example: Dict )

Extracts the shared prompt from a preference data example, where the prompt is implicit within both the chosen and rejected completions.

For more details, see maybe_extract_prompt().

trl.maybe_extract_prompt

( example: Dict ) → Dict[str, List]

Parameters

  • example (Dict[str, List]) — A dictionary representing a single data entry in the preference dataset. It must contain the keys "chosen" and "rejected", where each value is a list.

Returns

Dict[str, List]

A dictionary containing:

  • "prompt": The longest common prefix between the “chosen” and “rejected” completions.
  • "chosen": The remainder of the “chosen” completion, with the prompt removed.
  • "rejected": The remainder of the “rejected” completion, with the prompt removed.

Extracts the shared prompt from a preference data example, where the prompt is implicit within both the chosen and rejected completions.

If the example already contains a "prompt" key, the function returns the example as is. Otherwise, it identifies the longest common sequence (prefix) of conversation turns between the “chosen” and “rejected” completions and extracts this as the prompt. It then removes this prompt from the respective “chosen” and “rejected” completions.

Examples:

>>> from trl import extract_prompt
>>> example = {
...     "chosen": [
...         {"role": "user", "content": "What color is the sky?"},
...         {"role": "assistant", "content": "It is blue."}
...     ],
...     "rejected": [
...         {"role": "user", "content": "What color is the sky?"},
...         {"role": "assistant", "content": "It is green."}
...     ]
... }
>>> extract_prompt(example)
{'prompt': [{'role': 'user', 'content': 'What color is the sky?'}],
 'chosen': [{'role': 'assistant', 'content': 'It is blue.'}],
 'rejected': [{'role': 'assistant', 'content': 'It is green.'}]}

Or, with the map method of datasets.Dataset:

>>> from trl import extract_prompt
>>> from datasets import Dataset
>>> dataset_dict = {
...     "chosen": [
...         [
...             {"role": "user", "content": "What color is the sky?"},
...             {"role": "assistant", "content": "It is blue."},
...         ],
...         [
...             {"role": "user", "content": "Where is the sun?"},
...             {"role": "assistant", "content": "In the sky."},
...         ],
...     ],
...     "rejected": [
...         [
...             {"role": "user", "content": "What color is the sky?"},
...             {"role": "assistant", "content": "It is green."},
...         ],
...         [
...             {"role": "user", "content": "Where is the sun?"},
...             {"role": "assistant", "content": "In the sea."},
...         ],
...     ],
... }
>>> dataset = Dataset.from_dict(dataset_dict)
>>> dataset = dataset.map(extract_prompt)
>>> dataset[0]
{'prompt': [{'role': 'user', 'content': 'What color is the sky?'}],
 'chosen': [{'role': 'assistant', 'content': 'It is blue.'}],
 'rejected': [{'role': 'assistant', 'content': 'It is green.'}]}
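
The longest-common-prefix logic described above can be sketched in a few lines of plain Python. The function names `extract_prompt_sketch` and `maybe_extract_prompt_sketch` are hypothetical, and this is an illustration of the idea rather than the library's implementation.

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def extract_prompt_sketch(example: Dict[str, List[Message]]) -> Dict[str, Any]:
    chosen, rejected = example["chosen"], example["rejected"]
    split = 0
    # Count the leading turns that are identical in both conversations.
    for chosen_turn, rejected_turn in zip(chosen, rejected):
        if chosen_turn != rejected_turn:
            break
        split += 1
    return {
        "prompt": chosen[:split],    # shared prefix
        "chosen": chosen[split:],    # remainder of the chosen conversation
        "rejected": rejected[split:],  # remainder of the rejected conversation
    }


def maybe_extract_prompt_sketch(example: Dict[str, Any]) -> Dict[str, Any]:
    # An explicit prompt means there is nothing to extract.
    if "prompt" in example:
        return example
    return extract_prompt_sketch(example)


example = {
    "chosen": [
        {"role": "user", "content": "What color is the sky?"},
        {"role": "assistant", "content": "It is blue."},
    ],
    "rejected": [
        {"role": "user", "content": "What color is the sky?"},
        {"role": "assistant", "content": "It is green."},
    ],
}
print(maybe_extract_prompt_sketch(example))
```

Comparing whole message dictionaries means a turn only counts as shared when both its role and its content match.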

trl.unpair_preference_dataset

( dataset: DatasetType num_proc: Optional[int] = None ) → Dataset

Parameters

  • dataset (Dataset or DatasetDict) — Preference dataset to unpair. The dataset must have columns "chosen", "rejected" and optionally "prompt".
  • num_proc (Optional[int], optional, defaults to None) — Number of processes to use for processing the dataset.

Returns

Dataset

The unpaired preference dataset.

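Unpair a preference dataset: each row holding a "chosen" and a "rejected" completion becomes two rows, one per completion, with a boolean "label" column that is True for the chosen completion and False for the rejected one.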

Example:

>>> from datasets import Dataset
>>> from trl import unpair_preference_dataset
>>> dataset_dict = {
...     "prompt": ["The sky is", "The sun is"],
...     "chosen": [" blue.", " in the sky."],
...     "rejected": [" green.", " in the sea."]
... }
>>> dataset = Dataset.from_dict(dataset_dict)
>>> dataset = unpair_preference_dataset(dataset)
>>> dataset
Dataset({
    features: ['prompt', 'completion', 'label'],
    num_rows: 4
})
>>> dataset[0]
{'prompt': 'The sky is', 'completion': ' blue.', 'label': True}
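
As a rough illustration of what unpairing does to each row, the sketch below works on a plain list of dictionaries instead of a `Dataset`. The helper name `unpair_rows_sketch` is hypothetical, and the order of the output rows is an assumption; the library may order them differently.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def unpair_rows_sketch(rows: List[Row]) -> List[Row]:
    unpaired: List[Row] = []
    for row in rows:
        # One paired row yields two unpaired rows: chosen -> True, rejected -> False.
        for key, label in (("chosen", True), ("rejected", False)):
            new_row: Row = {"completion": row[key], "label": label}
            if "prompt" in row:  # "prompt" is optional in the paired format
                new_row = {"prompt": row["prompt"], **new_row}
            unpaired.append(new_row)
    return unpaired


rows = [
    {"prompt": "The sky is", "chosen": " blue.", "rejected": " green."},
    {"prompt": "The sun is", "chosen": " in the sky.", "rejected": " in the sea."},
]
unpaired_rows = unpair_rows_sketch(rows)
print(len(unpaired_rows))
print(unpaired_rows[0])
```

Two paired rows become four unpaired rows, which matches the `num_rows: 4` shown in the example above.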

trl.maybe_unpair_preference_dataset

( dataset: DatasetType num_proc: Optional[int] = None ) → Dataset or DatasetDict

Parameters

  • dataset (Dataset or DatasetDict) — Preference dataset to unpair. The dataset must have columns "chosen", "rejected" and optionally "prompt".
  • num_proc (Optional[int], optional, defaults to None) — Number of processes to use for processing the dataset.

Returns

Dataset or DatasetDict

The unpaired preference dataset if it was paired, otherwise the original dataset.

Unpair a preference dataset if it is paired.

Example:

>>> from datasets import Dataset
>>> from trl import maybe_unpair_preference_dataset
>>> dataset_dict = {
...     "prompt": ["The sky is", "The sun is"],
...     "chosen": [" blue.", " in the sky."],
...     "rejected": [" green.", " in the sea."]
... }
>>> dataset = Dataset.from_dict(dataset_dict)
>>> dataset = maybe_unpair_preference_dataset(dataset)
>>> dataset
Dataset({
    features: ['prompt', 'completion', 'label'],
    num_rows: 4
})
>>> dataset[0]
{'prompt': 'The sky is', 'completion': ' blue.', 'label': True}
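
The "if it is paired" condition can be pictured as a simple check on column names. This is a sketch under the assumption that a dataset counts as paired when it has both a "chosen" and a "rejected" column; the helper name `is_paired_sketch` is hypothetical and not part of TRL.

```python
from typing import Iterable


def is_paired_sketch(column_names: Iterable[str]) -> bool:
    # Assumption: "paired" means both preference columns are present.
    names = set(column_names)
    return "chosen" in names and "rejected" in names


print(is_paired_sketch(["prompt", "chosen", "rejected"]))   # paired: would be unpaired
print(is_paired_sketch(["prompt", "completion", "label"]))  # already unpaired: returned as is
```

Under this assumption, running the function on an already unpaired dataset leaves it unchanged, so the call is safe to apply unconditionally in a preprocessing pipeline.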