#
# SPDX-FileCopyrightText: Hadad <hadad@linuxmail.org>
# SPDX-License-Identifier: Apache-2.0
#

# Define function to keep only the first <think> tag at the beginning of the text
def reasoning_tag_start(text: str) -> str:
    """
    This function ensures that the reasoning text contains at most one opening <think> tag, located at the very
    beginning.

    It is common for streamed or concatenated reasoning texts to accumulate multiple <think> tags due to incremental
    appends or repeated insertions. This function cleans the text by removing all occurrences of the <think> tag
    throughout the entire string, then checks if the original text started with a <think> tag. If it did, it reinserts
    a single <think> tag at the start to preserve the intended opening marker.

    The purpose of this function is to normalize the reasoning text so that it has a clean, unambiguous opening tag,
    which is critical for consistent parsing, rendering, or further processing downstream. By preventing multiple
    opening tags, it avoids confusion and formatting errors in the final output.

    Steps:
    1. Remove all <think> tags from the entire text to eliminate duplicates.
    2. Check if the original text began with <think>.
    3. If yes, prepend a single <think> tag to the cleaned text.
    4. If no, return the cleaned text without any opening tag.

    Parameters:
    - text (str): The reasoning text which may contain multiple or misplaced <think> tags.

    Returns:
    - str: The reasoning text normalized to have at most one <think> tag at the start.
    """
    # Remove all <think> tags from the text
    reasoning_mode = text.replace("<think>", "")  # Strip all <think> tags throughout the text
    # Check if the original text started with <think> and reinsert one if so
    if text.startswith("<think>"):  # Reinsert a single <think> tag at the beginning
        return "<think>" + reasoning_mode  # Return the cleaned text with one <think> tag at the start
    else:
        return reasoning_mode  # Return the cleaned text without any <think> tags


# Define function to keep only the last </think> tag at the end of the text
def reasoning_tag_stop(text: str) -> str:
    """
    This function ensures that the reasoning text contains at most one closing </think> tag, located at the very end.

    Similar to the opening tag, streamed or concatenated reasoning texts might accumulate multiple closing </think>
    tags, which can cause parsing or display issues. This function removes all closing </think> tags from the text and
    then checks if the original text ended with a closing tag. If it did, it appends a single closing </think> tag at
    the end, preserving the intended closing marker.

    This normalization is important to maintain a clean and consistent structure in the reasoning text, ensuring that
    the closing tag is unambiguous and properly positioned for downstream consumers or renderers.

    Steps:
    1. Remove all </think> tags from the entire text to eliminate duplicates.
    2. Check if the original text ended with </think>.
    3. If yes, append a single </think> tag to the cleaned text.
    4. If no, return the cleaned text without any closing tag.

    Parameters:
    - text (str): The reasoning text which may contain multiple or misplaced </think> tags.

    Returns:
    - str: The reasoning text normalized to have at most one </think> tag at the end.
    """
    # Remove all </think> tags from the text
    reasoning_mode = text.replace("</think>", "")  # Strip all </think> tags throughout the text
    # Check if the original text ended with </think> and reinsert one if so
    if text.endswith("</think>"):  # Reinsert a single </think> tag at the end
        return reasoning_mode + "</think>"  # Return the cleaned text with one </think> tag at the end
    else:
        return reasoning_mode  # Return the cleaned text without any </think> tags


# Define function to ensure text starts with exactly one <think> tag
def reasoning_tag_open(text: str) -> str:
    """
    This function guarantees that the reasoning text starts with exactly one opening <think> tag.

    It first strips any leading whitespace to accurately detect whether the tag is already present.
    If the tag is missing, it inserts a <think> tag followed by a newline at the very beginning of the text.
    If the tag is present, it calls reasoning_tag_start on the stripped text to remove any duplicate tags and ensure
    only one opening tag remains.

    This function is essential for preparing reasoning text before streaming or output, as it enforces a consistent
    and clean opening tag structure. The newline after the tag improves readability and formatting when displayed.

    Steps:
    1. Strip leading whitespace from the text.
    2. Check if the stripped text starts with <think>.
    3. If not, prepend "<think>" and a newline to the text.
    4. If yes, clean duplicates using reasoning_tag_start.
    5. Return the normalized text.

    Parameters:
    - text (str): The reasoning text to be normalized.

    Returns:
    - str: The reasoning text with exactly one <think> tag at the start.
    """
    # Remove leading whitespace for accurate tag checking
    stripped = text.lstrip()  # Eliminate spaces or newlines from the start
    # If tag is missing, insert it, else clean up any duplicates
    if not stripped.startswith("<think>"):  # Check if <think> tag is absent at the beginning
        text = "<think>\n" + text  # Add <think> tag followed by a newline at the start
    else:
        # Pass the stripped text: reasoning_tag_start only reinserts the tag when its
        # input literally begins with <think>, so leading whitespace would drop the tag
        text = reasoning_tag_start(stripped)
    return text  # Return text with one valid <think> tag at the start


# Define function to ensure text ends with exactly one </think> tag
def reasoning_tag_close(text: str) -> str:
    """
    This function guarantees that the reasoning text ends with exactly one closing </think> tag.

    It first strips any trailing whitespace to accurately detect whether the tag is already present.
    If the tag is missing, it appends a newline, the closing </think> tag, and two additional newlines to the end of
    the stripped text. If the tag is present, it calls reasoning_tag_stop on the stripped text to remove any duplicate
    closing tags and ensure only one remains.

    This function is crucial for finalizing reasoning text before output or further processing, ensuring the closing
    tag is properly placed and that the text formatting remains clean and readable. The added newlines after the
    closing tag provide spacing for separation from subsequent content.

    Steps:
    1. Strip trailing whitespace from the text.
    2. Check if the stripped text ends with </think>.
    3. If not, append a newline, "</think>", and two newlines to the stripped text.
    4. If yes, clean duplicates using reasoning_tag_stop.
    5. Return the normalized text.

    Parameters:
    - text (str): The reasoning text to be normalized.

    Returns:
    - str: The reasoning text with exactly one </think> tag at the end.
    """
    # Remove trailing whitespace for accurate tag checking
    stripped = text.rstrip()  # Eliminate spaces or newlines from the end
    # If tag is missing, append it, else clean up any duplicates
    if not stripped.endswith("</think>"):  # Check if </think> tag is absent at the end
        text = stripped + "\n</think>\n\n"  # Append </think> tag with spacing
    else:
        # Pass the stripped text: reasoning_tag_stop only reinserts the tag when its
        # input literally ends with </think>, so trailing whitespace would drop the tag
        text = reasoning_tag_stop(stripped)
    return text  # Return text with one valid </think> tag at the end
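

# Illustrative sketch only, not part of the original module: a stricter one-pass variant
# of the normalization described above. The names `_THINK_TAG` and `normalize_reasoning`
# are hypothetical and exist purely for this example. It assumes the tags are always
# spelled exactly "<think>" and "</think>", and unlike the helpers above it also trims
# surrounding whitespace and always wraps the result in a fresh pair of tags.
import re

_THINK_TAG = re.compile(r"</?think>")  # Matches both the opening and the closing tag


def normalize_reasoning(text: str) -> str:
    """Return the text wrapped in exactly one <think> ... </think> pair (sketch)."""
    body = _THINK_TAG.sub("", text).strip()  # Drop every tag, then trim both edges
    # Reuse the spacing the helpers above apply when they add a missing tag
    return "<think>\n" + body + "\n</think>\n\n"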