---
library_name: transformers
tags: []
---

# Model Card for Model ID

This is the [Llama-2-7b-chat tokenizer](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), modified to support tool use and function calling.

This tokenizer differs from the original in only two ways:
1. This chat template supports the "tool" role, whereas the original chat template supported only the "system", "assistant", and "user" roles.
2. The original Llama chat template required strictly alternating turns ("user", "assistant", "user", "assistant", ...) and raised an exception otherwise. This chat template has no such requirement.

The chat template of this tokenizer looks like this:
```jinja
{% if messages[0]['role'] == 'system' %}
    {% set loop_messages = messages[1:] %}
    {% set system_message = '<<SYS>>\n' + messages[0]['content'].strip() + '\n<</SYS>>\n\n' %}
{% else %}
    {% set loop_messages = messages %}
    {% set system_message = '' %}
{% endif %}

{% for message in loop_messages %}

    {% if loop.index0 == 0 %}
        {% set content = system_message + message['content'] %}
    {% else %}
        {% set content = message['content'] %}
    {% endif %}

    {% if message['role'] == 'user' %}
        {{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}
    {% elif message['role'] == 'assistant' %}
        {{ ' '  + content.strip() + ' ' + eos_token }}
    {% elif message['role'] == 'tool' %}
        {{ ' '  + content.strip() + ' ' + eos_token }}
    {% endif %}
{% endfor %}
```
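
To make the control flow easier to follow, here is a minimal plain-Python sketch that mirrors the template's logic. The function name `render_chat` is hypothetical and not part of this repository, and the default `bos_token`/`eos_token` values (`<s>` and `</s>`) are assumed to match the Llama 2 tokenizer. It ignores any whitespace that the pretty-printed layout above might introduce, so treat it as an illustration rather than a byte-exact reference.

```python
def render_chat(messages, bos_token="<s>", eos_token="</s>"):
    """Illustrative re-implementation of the chat template's logic."""
    # A leading system message is wrapped in <<SYS>> tags and removed
    # from the list of messages that get rendered as turns.
    # (The `messages and` guard is an addition for the empty-list case.)
    if messages and messages[0]["role"] == "system":
        system_message = "<<SYS>>\n" + messages[0]["content"].strip() + "\n<</SYS>>\n\n"
        loop_messages = messages[1:]
    else:
        system_message = ""
        loop_messages = messages

    parts = []
    for index, message in enumerate(loop_messages):
        content = message["content"]
        # The system prompt is prepended to the first rendered message.
        if index == 0:
            content = system_message + content

        if message["role"] == "user":
            parts.append(bos_token + "[INST] " + content.strip() + " [/INST]")
        elif message["role"] in ("assistant", "tool"):
            # "assistant" and "tool" turns are formatted identically.
            parts.append(" " + content.strip() + " " + eos_token)
    return "".join(parts)


conversation = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "Calling calculator."},
    {"role": "tool", "content": "4"},
    {"role": "assistant", "content": "The answer is 4."},
]
rendered = render_chat(conversation)
```

Note that the sequence assistant, tool, assistant is accepted here, since nothing in the loop checks the order of roles.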

The old Llama chat template (which we no longer use) looked like this:
```jinja
{% if messages[0]['role'] == 'system' %}
  {% set loop_messages = messages[1:] %}
  {% set system_message = messages[0]['content'] %}
{% else %}
  {% set loop_messages = messages %}
  {% set system_message = false %}
{% endif %}

{% for message in loop_messages %}

  {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
    {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}
  {% if loop.index0 == 0 and system_message != false %}
    {% set content = '<<SYS>>\n' + system_message + '\n<</SYS>>\n\n' + message['content'] %}
  {% else %}
    {% set content = message['content'] %}
  {% endif %}

  {% if message['role'] == 'user' %}
    {{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}
  {% elif message['role'] == 'assistant' %}
    {{ ' '  + content.strip() + ' ' + eos_token }}
  {% endif %}
{% endfor %}
```
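
The alternation check in the old template can be sketched in plain Python to show why tool-calling conversations were rejected. The helper name `violates_alternation` is hypothetical; it only mirrors the `(message['role'] == 'user') != (loop.index0 % 2 == 0)` condition above and is not part of any library.

```python
def violates_alternation(messages):
    """Return True if the old template's role check would raise."""
    # The old template skips a leading system message before checking.
    if messages and messages[0]["role"] == "system":
        loop_messages = messages[1:]
    else:
        loop_messages = messages

    for index, message in enumerate(loop_messages):
        # Even positions must be "user"; odd positions must not be "user".
        if (message["role"] == "user") != (index % 2 == 0):
            return True
    return False


tool_conversation = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "Calling calculator."},
    {"role": "tool", "content": "4"},
    {"role": "assistant", "content": "The answer is 4."},
]
plain_conversation = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Bye"},
]

tool_rejected = violates_alternation(tool_conversation)
plain_rejected = violates_alternation(plain_conversation)
```

Here the "tool" message sits at an even position where the old template expects "user", so the check fails. Even when a "tool" message happens to fall at an odd position and passes the check, the old template has no branch for that role and would not render it.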