MisDetectorV1 / prompt_test.py

You-shen

Update prompt_test.py

94ec5c6 verified 10 months ago


	from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
	from langchain_core.pydantic_v1 import BaseModel, Field
	from langchain_core.output_parsers import PydanticOutputParser
	from langchain_core.messages import (
	    AIMessage,
	    BaseMessage,
	    HumanMessage,
	    SystemMessage,
	    ToolMessage,
	)
	import uuid
	from typing import Dict, List, Optional

	# rag_template = """
	# You are an assistant specialized in detecting fake news. \
	# Your task is to analyze a given Claim and provide a judgment along with an explanation based on your knowledge and the provided Context. \
	# Use only the relevant information from the context to support your judgment.\
	# Instructions:
	# 1. The answer is in Chinese\
	# 2. Use only the context information that directly pertains to the claim.\
	# 3. Ensure that dates and years in the claim match those in the context.\
	# 4. Clearly state whether the claim is true or false, and provide a detailed explanation that supports your judgment.\
	# 5. Base your judgment on the evidence provided in the context. Do not accept the claim at face value; critically evaluate it against the context.\
	# 6. Do not mention yourself in the response.
	# Claim: {question}
	# Context: {context}
	# Provide your judgment and explanation below:
	# Judgment:
	# Explanation:
	# """
	rag_template="""
	You are an assistant specializing in verifying the authenticity of news. \
	Your task is to analyze a given Claim and provide a judgment along with an explanation based on your knowledge and the provided Context. \
	Then assign an appropriate label based on the Context provided. \
	After determining the correct label, provide a detailed explanation for your judgment.

	Rumor Detection refers to classifications made without sufficient factual evidence.
	Potential labels: Possibly True, Possibly False, Likely False, Likely True, Neutral.
	Fact-Checking refers to classifications made with sufficient factual evidence.
	Potential labels: Completely False, Mostly False, Partially False, Partially True, Mostly True, Mixed Truth, Completely True.
	Other refers to unqualified statements. This category cannot appear on its own; apply it in addition to a Rumor Detection or Fact-Checking label when necessary.
	Potential labels: ambiguous, incomplete, opinion-based, refers to future states.
	Ensure that your analysis is grounded in the Context and follows these guidelines:

	Instructions:
	1. Your answer should be in Chinese.
	2. Use only the information from the Context that directly relates to the Claim.
	3. Ensure that dates, years, and events in the Claim match those in the Context.
	4. Assess the given Claim within its Context and determine whether factual evidence is present.
	5. Classify the Claim into one of two categories: Rumor Detection or Fact-Checking.
	6. If the statement is ambiguous, incomplete, opinion-based, or refers to future states, add the corresponding label alongside the main label.
	7. Do not mention yourself in the response.
	Claim: {question}
	Context: {context}


	Format:
	类别: [谣言检测 / 事实核查]
	标签: [选择正确的标签]
	解释:

	"""

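# A minimal sketch of how a reply in the "Format" requested by rag_template
# might be parsed. Assumptions (not in the original file): the helper name
# parse_verdict, the Verdict tuple and the sample reply are hypothetical; the
# model is assumed to echo the three field names 类别 (category), 标签 (label)
# and 解释 (explanation), each followed by a half-width or full-width colon.
import re
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    category: str
    label: str
    explanation: str


_VERDICT_RE = re.compile(
    r"类别\s*[:：]\s*(?P<category>.+?)\s*"
    r"标签\s*[:：]\s*(?P<label>.+?)\s*"
    r"解释\s*[:：]\s*(?P<explanation>.*)",
    re.DOTALL,
)


def parse_verdict(reply: str) -> Optional[Verdict]:
    """Return the three fields of a reply, or None if the format is not followed."""
    match = _VERDICT_RE.search(reply)
    if match is None:
        return None
    return Verdict(
        match["category"].strip(),
        match["label"].strip(),
        match["explanation"].strip(),
    )


# Hypothetical reply, used only to exercise the parser.
sample_verdict_reply = "类别: 事实核查\n标签: Mostly False\n解释: 上下文中的日期与声明不符。"
verdict = parse_verdict(sample_verdict_reply)
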
	template = """
	You are an assistant for fake news detection tasks.
	Assess the claim and generate a judgment with an explanation based on what you know.
	Don't mention yourself.
	Answer in Chinese.
	Claim: {question} \n Answer:
	"""

	front_template = """
	The Keywords_List is a list such as [Keyword(keyword='...'), Keyword(keyword='...'), Keyword(keyword='...'), ...]. \
	There may be one or more keywords. \
	If a keyword is not in English, translate it into English and use the English form in all subsequent steps. \
	If a keyword consists of more than one word, link the individual words with a "+", and do not link more than 3 words.

	Then replace the "nouns" in the URLs with the keywords. Each keyword must be in English and no more than 3 words long.
	Each replaced URL points to an HTML page. Visit these pages, extract the specific content related to the Claim, and integrate the information you accessed.

	Return the exact information you extracted from those pages.
	Keywords_List:{keywords}\n Claim: {question} \n URLs: {urls}
	"""

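# A minimal sketch of the keyword-to-URL substitution that front_template
# asks the model to perform, written out as plain code for illustration.
# Assumptions (not in the original file): the helper names, the literal
# "nouns" placeholder and the example URL are hypothetical; keywords are
# assumed to be already translated into English.
from typing import List


def keyword_to_query(keyword: str, max_words: int = 3) -> str:
    """Join at most max_words words of a keyword with "+"."""
    words = keyword.split()
    return "+".join(words[:max_words])


def fill_urls(urls: List[str], keywords: List[str]) -> List[str]:
    """Replace the "nouns" placeholder in every URL with every keyword."""
    return [
        url.replace("nouns", keyword_to_query(keyword))
        for url in urls
        for keyword in keywords
    ]


example_urls = fill_urls(
    ["https://example.com/search?q=nouns"],
    ["Harbin Institute of Technology", "Putin"],
)
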
	class Keyword(BaseModel):
	    """Information about a keyword."""
	    keyword: str = Field(..., description="The keyword extracted from the sentence.")

	class Keywords_List(BaseModel):
	    """Identifying information about all keywords in a text."""
	    Keywords_List: List[Keyword]

	ext_parser = PydanticOutputParser(pydantic_object=Keywords_List)

	# Prompt
	ext_prompt = ChatPromptTemplate.from_messages(
	    [
	        (
	            "system",
	            "The provided sentences are some news, and you need to extract keywords from the news so that you can determine the authenticity of the news by searching for these keywords. Wrap the output in `json` tags.\n{format_instructions}",
	        ),
	        ("human", "{query}"),
	    ]
	).partial(format_instructions=ext_parser.get_format_instructions())
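
# A minimal sketch of pulling the keyword list out of a reply to ext_prompt
# by hand. Assumptions (not in the original file): "json tags" is read here as
# a Markdown-style triple-backtick json block, and the helper name
# extract_keywords and the sample reply are hypothetical. The key name mirrors
# the Keywords_List model above.
import json
import re
from typing import List

FENCE = "`" * 3  # triple backtick, built programmatically
_JSON_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"json\s*(.*?)\s*" + re.escape(FENCE), re.DOTALL
)


def extract_keywords(reply: str) -> List[str]:
    """Return the keyword strings found in the first json block of a reply."""
    match = _JSON_BLOCK_RE.search(reply)
    if match is None:
        return []
    payload = json.loads(match.group(1))
    return [item["keyword"] for item in payload.get("Keywords_List", [])]


# Hypothetical reply, used only to exercise the helper.
sample_keyword_reply = (
    FENCE + "json\n"
    + '{"Keywords_List": [{"keyword": "Putin"}, {"keyword": "Harbin Institute of Technology"}]}'
    + "\n" + FENCE
)
keywords = extract_keywords(sample_keyword_reply)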

	trans_template="""
	The question is a list such as [SubQuery(sub_query='...'), SubQuery(sub_query='...'), ...]. \
	There may be one or more sub_query entries. \
	You need to extract the content from each sub_query. \
	Return only the extracted content, without any introduction.
	Claim: {question}
	"""

	adaptive_template="""
	The Claim is a stack of news sub_claims. \
	Go through each sub_claim in turn. You only need to judge, not search.

	If you can confidently judge the correctness of a sub_claim based on your knowledge, mark it as "certain". \
	If you cannot confidently judge the correctness of a sub_claim, or it concerns something that has not happened yet, mark it as "uncertain".

	If all sub_claims in the Claim are marked as "certain", return NONE without any introduction.
	Otherwise, return only the sub_claims marked as "uncertain"; do not return the sub_claims marked as "certain".

	Claim: {question}
	"""

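# A minimal sketch of how a reply to adaptive_template might be consumed.
# Assumptions (not in the original file): the helper name
# uncertain_sub_claims is hypothetical, and the model is assumed to return
# either the bare word NONE or one uncertain sub_claim per line.
import re
from typing import List

_LIST_MARKER_RE = re.compile(r"^\s*(?:[-*]|\d+[.)])\s+")


def uncertain_sub_claims(reply: str) -> List[str]:
    """Return the sub_claims that still need evidence; empty if the reply is NONE."""
    text = reply.strip()
    if text.upper() == "NONE":
        return []
    claims = []
    for line in text.splitlines():
        # Drop a leading "- ", "* " or "1. " marker, then surrounding whitespace.
        cleaned = _LIST_MARKER_RE.sub("", line).strip()
        if cleaned:
            claims.append(cleaned)
    return claims
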
	class SubClaim(BaseModel):
	    """Assess the truthfulness of a sub_claim in a news article."""
	    sub_claim: str = Field(..., description="A specific sub-claim from the news article.")

	examples = []
	news_article = "In 2024, Putin visited the Harbin Institute of Technology and gave a speech in the HIT Auditorium."
	queries = [
	    SubClaim(sub_claim="In 2024, Putin visited the Harbin Institute of Technology."),
	    SubClaim(sub_claim="Putin gave a speech in the HIT Auditorium."),
	]
	examples.append({"input": news_article, "tool_calls": queries})

	news_article = "In a historic move, the United Nations has imposed new sanctions on a country accused of human rights abuses. The sanctions include travel bans, asset freezes, and restrictions on trade. This decision was made after extensive investigations and reports from human rights organizations."
	queries = [
	    SubClaim(sub_claim="The United Nations has imposed new sanctions on a country accused of human rights abuses."),
	    SubClaim(sub_claim="The sanctions include travel bans."),
	    SubClaim(sub_claim="The sanctions include asset freezes."),
	    SubClaim(sub_claim="The sanctions include restrictions on trade."),
	    SubClaim(sub_claim="The decision was made after extensive investigations and reports from human rights organizations."),
	]
	examples.append({"input": news_article, "tool_calls": queries})

	news_article = "A recent study claims that eating chocolate can improve memory function. The study was conducted by a team of scientists at a major university."
	queries = [
	    SubClaim(sub_claim="A recent study claims that eating chocolate can improve memory function."),
	    SubClaim(sub_claim="The study was conducted by a team of scientists at a major university."),
	]
	examples.append({"input": news_article, "tool_calls": queries})

	news_article = "Scientists have discovered a new exoplanet that is potentially habitable. The planet, named Proxima b, is located in the habitable zone of its star, Proxima Centauri, and has conditions that could support liquid water. The discovery was made using the latest data from the European Southern Observatory."
	queries = [
	    SubClaim(sub_claim="Scientists have discovered a new exoplanet that is potentially habitable."),
	    SubClaim(sub_claim="The planet, named Proxima b, is located in the habitable zone of its star, Proxima Centauri."),
	    SubClaim(sub_claim="Proxima b has conditions that could support liquid water."),
	    SubClaim(sub_claim="The discovery was made using the latest data from the European Southern Observatory."),
	]
	examples.append({"input": news_article, "tool_calls": queries})

	news_article = "The government has announced a new health initiative aimed at reducing childhood obesity by 20% over the next five years. The initiative includes increased funding for physical education programs in schools, public awareness campaigns, and subsidies for healthy school lunches. Critics argue that the plan does not address the root causes of obesity."
	queries = [
	    SubClaim(sub_claim="The government has announced a new health initiative aimed at reducing childhood obesity by 20% over the next five years."),
	    SubClaim(sub_claim="The initiative includes increased funding for physical education programs in schools."),
	    SubClaim(sub_claim="The initiative includes public awareness campaigns."),
	    SubClaim(sub_claim="The initiative includes subsidies for healthy school lunches."),
	    SubClaim(sub_claim="Critics argue that the plan does not address the root causes of obesity."),
	]
	examples.append({"input": news_article, "tool_calls": queries})

	news_article = "In 2024, a new technology was developed that can fully charge an electric car in just 10 minutes. This breakthrough was achieved by a team of scientists at MIT. The technology is expected to revolutionize the electric vehicle industry."
	queries = [
	    SubClaim(sub_claim="In 2024, a new technology was developed that can fully charge an electric car in just 10 minutes."),
	    SubClaim(sub_claim="This breakthrough was achieved by a team of scientists at MIT."),
	    SubClaim(sub_claim="The technology is expected to revolutionize the electric vehicle industry."),
	]
	examples.append({"input": news_article, "tool_calls": queries})


	def tool_example_to_messages(example: Dict) -> List[BaseMessage]:
	    """Convert one few-shot example into a human / AI tool-call / tool-reply message sequence."""
	    messages: List[BaseMessage] = [HumanMessage(content=example["input"])]
	    # Encode every expected SubClaim as an OpenAI-style function call.
	    openai_tool_calls = []
	    for tool_call in example["tool_calls"]:
	        openai_tool_calls.append(
	            {
	                "id": str(uuid.uuid4()),
	                "type": "function",
	                "function": {
	                    "name": tool_call.__class__.__name__,
	                    "arguments": tool_call.json(),
	                },
	            }
	        )
	    messages.append(
	        AIMessage(content="", additional_kwargs={"tool_calls": openai_tool_calls})
	    )
	    # Use the example's own tool outputs if present, else one generic confirmation per call.
	    tool_outputs = example.get("tool_outputs") or [
	        "This is an example of a correct usage of this tool. Make sure to continue using the tool this way."
	    ] * len(openai_tool_calls)
	    # Pair each output with its call through the shared tool_call_id.
	    for output, tool_call in zip(tool_outputs, openai_tool_calls):
	        messages.append(ToolMessage(content=output, tool_call_id=tool_call["id"]))
	    return messages
	example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]
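
# A minimal sketch of the message sequence that tool_example_to_messages
# builds, mirrored with plain dicts so it runs without langchain installed.
# Assumptions (not in the original file): the function name sketch_messages
# and the dict layout are hypothetical stand-ins for HumanMessage, AIMessage
# and ToolMessage; only the ordering and the id pairing are meant to carry over.
import json
import uuid
from typing import Dict, List


def sketch_messages(article: str, sub_claims: List[str]) -> List[Dict]:
    """One human turn, one AI turn holding every tool call, one tool turn per call."""
    tool_calls = [
        {
            "id": str(uuid.uuid4()),
            "type": "function",
            "function": {
                "name": "SubClaim",
                "arguments": json.dumps({"sub_claim": claim}),
            },
        }
        for claim in sub_claims
    ]
    messages: List[Dict] = [
        {"role": "human", "content": article},
        {"role": "ai", "content": "", "tool_calls": tool_calls},
    ]
    # Each tool reply is tied back to its call through the shared id.
    messages += [
        {"role": "tool", "content": "ok", "tool_call_id": call["id"]}
        for call in tool_calls
    ]
    return messages


# An article with N sub_claims yields 2 + N messages.
demo_messages = sketch_messages(
    "In 2024, Putin visited the Harbin Institute of Technology and gave a speech in the HIT Auditorium.",
    [
        "In 2024, Putin visited the Harbin Institute of Technology.",
        "Putin gave a speech in the HIT Auditorium.",
    ],
)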

	system = """
	You are an expert at verifying the truthfulness of news articles. \
	You have access to a comprehensive database of verified information and fact-checking resources. \
	Your task is to decompose a given news article into its component sentences and sub_claims. \
	For each sentence or sub_claim, assess its truthfulness based on available evidence. Provide a detailed explanation and cite sources if possible.
	Guidelines:
	1. Decompose the news article into the most specific sub_claims possible, each focusing on a single fact or idea.
	2. Choose sub_claims so that the correctness of the news article can be derived from the correctness of each sub_claim.
	3. If there are acronyms, technical terms, or unfamiliar words, keep them as they are; do not rephrase them.
	4. Use clear and concise language for your explanations.
	"""

	prompt = ChatPromptTemplate.from_messages(
	    [
	        ("system", system),
	        MessagesPlaceholder("examples", optional=True),
	        ("human", "{claim}"),
	    ]
	)

	css1="""
	.gradio-container {
	background-color: #FAF3E0; /* Warm, light beige background for a soft appearance */
	border: 3px solid #B5651D; /* Muted brown border for a subtle contrast */
	border-radius: 12px;
	padding: 25px;
	box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
	font-family: 'Arial', sans-serif;
	}

	.gradio-header {
	color: #4A4A4A; /* Neutral dark grey for the header text */
	font-size: 36px;
	font-weight: bold;
	text-align: center;
	margin-bottom: 20px;
	}

	.gradio-input {
	width: 98%;
	padding: 12px;
	border: 10px solid #B5651D; /* Matching the container border color */
	background-color: #FFFFFF; /* White background for clarity */
	border-radius: 8px;
	margin: 9px;
	font-size: 16px;
	}

	.gradio-output {
	width: 98%;
	padding: 12px;
	border: 10px solid #B5651D; /* Matching the container border color */
	background-color: #FFFFFF; /* White background for consistency */
	border-radius: 8px;
	font-size: 9px;
	margin: 4.5px;
	}

	.gradio-button {
	background-color: #FF3333; /* Button background color (red) */
	color: #FF8800;
	padding: 12px 24px;
	border: none;
	border-radius: 8px;
	cursor: pointer;
	font-size: 18px;
	font-weight: bold;
	transition: background-color 0.3s ease;
	}
	"""

	css2="""
	.gradio-container {
	background-color: #666666;
	border: 5px solid #000000;
	border-radius: 12px;
	padding: 25px;
	box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
	font-family: 'Arial', sans-serif;
	}

	.gradio-header {
	color: #333333;
	font-size: 36px;
	font-weight: bold;
	text-align: center;
	margin-bottom: 20px;
	}

	.gradio-input {
	width: 98%;
	padding: 12px;
	border: 10px solid #000000; /* Input box border color */
	background-color: #ffffff;
	border-radius: 8px;
	margin: 9px;
	font-size: 16px;
	}

	.gradio-output {
	width: 98%;
	padding: 12px;
	border: 10px solid #000000; /* Output box border color */
	background-color: #ffffff;
	border-radius: 8px;
	font-size: 16px;
	margin: 4.5px;
	}

	.gradio-button {
	background-color: #FF3333; /* Button background color (red) */
	color: #FF8800;
	padding: 12px 24px;
	border: none;
	border-radius: 8px;
	cursor: pointer;
	font-size: 18px;
	font-weight: bold;
	transition: background-color 0.3s ease;
	}

	.gradio-button:hover {
	background-color: #ff8c00;
	}

	.gradio-button:focus {
	outline: none;
	box-shadow: 0 0 5px 2px rgba(255, 165, 0, 0.5);
	}
	"""
	custom_css = """
	#custom_chatbot .h-16 {
	display: none;
	}
	"""