#
# SPDX-FileCopyrightText: Hadad <hadad@linuxmail.org>
# SPDX-License-Identifier: Apache-2.0
#

# Define function to keep only the first <think> tag at the beginning of the text
def reasoning_tag_start(text: str) -> str:
    """
    This function ensures that the reasoning text contains at most one opening <think> tag, located at the very beginning.
    It is common for streamed or concatenated reasoning texts to accumulate multiple <think> tags due to incremental
    appends or repeated insertions. This function cleans the text by removing all occurrences of the <think> tag
    throughout the entire string, then checks if the original text started with a <think> tag. If it did, it reinserts
    a single <think> tag at the start to preserve the intended opening marker.

    The purpose of this function is to normalize the reasoning text so that it has a clean, unambiguous opening tag,
    which is critical for consistent parsing, rendering, or further processing downstream. By preventing multiple
    opening tags, it avoids confusion and formatting errors in the final output.

    Steps:
    1. Remove all <think> tags from the entire text to eliminate duplicates.
    2. Check if the original text began with <think>.
    3. If yes, prepend a single <think> tag to the cleaned text.
    4. If no, return the cleaned text without any opening tag.

    Parameters:
    - text (str): The reasoning text which may contain multiple or misplaced <think> tags.

    Returns:
    - str: The reasoning text normalized to have at most one <think> tag at the start.
    """

    # Remove all <think> tags from the text
    reasoning_mode = text.replace("<think>", "")  # Strip all <think> tags throughout the text
    # Check if the original text started with <think> and reinsert one if so
    if text.startswith("<think>"):  # Reinsert a single <think> tag at the beginning
        return "<think>" + reasoning_mode  # Return the cleaned text with one <think> tag at the start
    else:
        return reasoning_mode  # Return the cleaned text without any <think> tags

# Define function to keep only the last </think> tag at the end of the text
def reasoning_tag_stop(text: str) -> str:
    """
    This function ensures that the reasoning text contains at most one closing </think> tag, located at the very end.
    Similar to the opening tag, streamed or concatenated reasoning texts might accumulate multiple closing </think> tags,
    which can cause parsing or display issues. This function removes all closing </think> tags from the text and then
    checks if the original text ended with a closing tag. If it did, it appends a single closing </think> tag at the end,
    preserving the intended closing marker.

    This normalization is important to maintain a clean and consistent structure in the reasoning text, ensuring that
    the closing tag is unambiguous and properly positioned for downstream consumers or renderers.

    Steps:
    1. Remove all </think> tags from the entire text to eliminate duplicates.
    2. Check if the original text ended with </think>.
    3. If yes, append a single </think> tag to the cleaned text.
    4. If no, return the cleaned text without any closing tag.

    Parameters:
    - text (str): The reasoning text which may contain multiple or misplaced </think> tags.

    Returns:
    - str: The reasoning text normalized to have at most one </think> tag at the end.
    """

    # Remove all </think> tags from the text
    reasoning_mode = text.replace("</think>", "")  # Strip all </think> tags throughout the text
    # Check if the original text ended with </think> and reinsert one if so
    if text.endswith("</think>"):  # Reinsert a single </think> tag at the end
        return reasoning_mode + "</think>"  # Return the cleaned text with one </think> tag at the end
    else:
        return reasoning_mode  # Return the cleaned text without any </think> tags
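
# Illustrative sketch (not part of the original module): reasoning_tag_start and
# reasoning_tag_stop follow the same pattern, namely strip every copy of a tag and
# restore one copy only if the original text had it at the relevant edge.
# The helper below expresses that shared pattern in one place. Its name and its
# parameters (tag, at_start) are hypothetical and exist only for illustration.
def sketch_keep_edge_tag(text: str, tag: str, at_start: bool) -> str:
    # Remove every occurrence of the tag, wherever it appears
    cleaned = text.replace(tag, "")
    if at_start:
        # Restore a single tag at the start only if the original began with it
        return tag + cleaned if text.startswith(tag) else cleaned
    # Restore a single tag at the end only if the original ended with it
    return cleaned + tag if text.endswith(tag) else cleaned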

# Define function to ensure text starts with exactly one <think> tag
def reasoning_tag_open(text: str) -> str:
    """
    This function guarantees that the reasoning text starts with exactly one opening <think> tag.
    It first strips any leading whitespace to accurately detect whether the tag is already present.
    If the tag is missing, it inserts a <think> tag followed by a newline at the very beginning of the text.
    If the tag is present, it calls reasoning_tag_start to remove any duplicate tags and ensure only one opening tag remains.

    This function is essential for preparing reasoning text before streaming or output, as it enforces a consistent
    and clean opening tag structure. The newline after the tag improves readability and formatting when displayed.

    Steps:
    1. Strip leading whitespace from the text.
    2. Check if the text starts with <think>.
    3. If not, prepend "<think>\n" to the text.
    4. If yes, clean duplicates using reasoning_tag_start.
    5. Return the normalized text.

    Parameters:
    - text (str): The reasoning text to be normalized.

    Returns:
    - str: The reasoning text with exactly one <think> tag at the start.
    """

    # Remove leading whitespace for accurate tag checking
    stripped = text.lstrip()  # Eliminate spaces or newlines from the start
    # If tag is missing, insert it, else clean up any duplicates
    if not stripped.startswith("<think>"):  # Check if <think> tag is absent at the beginning
        text = "<think>\n" + text  # Add <think> tag followed by a newline at the start
    else:
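        # Pass the stripped text: reasoning_tag_start restores the tag only when its
        # input starts with <think>, so leading whitespace would make it drop the tag
        text = reasoning_tag_start(stripped)  # Remove duplicates if the tag is already present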
    return text  # Return text with one valid <think> tag at the start

# Define function to ensure text ends with exactly one </think> tag
def reasoning_tag_close(text: str) -> str:
    """
    This function guarantees that the reasoning text ends with exactly one closing </think> tag.
    It first strips any trailing whitespace to accurately detect whether the tag is already present.
    If the tag is missing, it appends a newline, the closing </think> tag, and two additional newlines to the end of the text.
    If the tag is present, it calls reasoning_tag_stop to remove any duplicate closing tags and ensure only one remains.

    This function is crucial for finalizing reasoning text before output or further processing, ensuring the closing tag
    is properly placed and that the text formatting remains clean and readable. The added newlines after the closing tag
    provide spacing for separation from subsequent content.

    Steps:
    1. Strip trailing whitespace from the text.
    2. Check if the text ends with </think>.
    3. If not, append "\n</think>\n\n" to the text.
    4. If yes, clean duplicates using reasoning_tag_stop.
    5. Return the normalized text.

    Parameters:
    - text (str): The reasoning text to be normalized.

    Returns:
    - str: The reasoning text with exactly one </think> tag at the end.
    """

    # Remove trailing whitespace for accurate tag checking
    stripped = text.rstrip()  # Eliminate spaces or newlines from the end
    # If tag is missing, append it, else clean up any duplicates
    if not stripped.endswith("</think>"):  # Check if </think> tag is absent at the end
        text = text.rstrip() + "\n</think>\n\n"  # Append </think> tag with spacing
    else:
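        # Pass the stripped text: reasoning_tag_stop restores the tag only when its
        # input ends with </think>, so trailing whitespace would make it drop the tag
        text = reasoning_tag_stop(stripped)  # Remove duplicates if the tag is already present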
    return text  # Return text with one valid </think> tag at the end
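
# Illustrative sketch (not part of the original module): the helpers above may be
# combined to wrap a reasoning string in a single <think> ... </think> pair.
# The hypothetical function below shows the same idea in one self-contained step
# so it can be tried in isolation. Its name, and the choice to drop every stray
# tag inside the body and trim surrounding whitespace, are assumptions.
def sketch_wrap_reasoning(text: str) -> str:
    # Remove every opening and closing tag, wherever it appears
    body = text.replace("<think>", "").replace("</think>", "")
    # Trim surrounding whitespace so the tags sit directly around the content
    body = body.strip()
    # Re-wrap with one opening and one closing tag, using the same spacing
    # the helpers above use when they add a missing tag
    return "<think>\n" + body + "\n</think>\n\n"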