{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Minerva: AI Guardian for Scam Protection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook implements a multi-agent system that analyzes images (screenshots) to identify scam attempts and provide personalized scam-prevention advice. It uses [AutoGen](https://github.com/microsoft/autogen/) to orchestrate several specialized agents that work together.\n", "\n", "Benefits:\n", "- Automates the identification of suspicious scam patterns.\n", "- Prevents financial loss.\n", "- Saves time: early scam detection reduces the number of claims filed by end users." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install Dependencies" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install -q autogen-agentchat~=0.2 pillow pytesseract pyyaml" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "flaml.automl is not available. 
Please install flaml[automl] to enable AutoML functionalities.\n" ] } ], "source": [ "import autogen\n", "\n", "from IPython.display import Image as IPImage\n", "from IPython.display import display" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "import os\n", "from dotenv import load_dotenv, find_dotenv\n", "\n", "load_dotenv(find_dotenv())\n", "\n", "config_list = [\n", "    {\n", "        \"model\": \"gpt-4o-mini\",\n", "        \"api_key\": os.getenv(\"OPENAI_API_KEY\")\n", "    }\n", "]\n", "\n", "llm_config = {\n", "    \"config_list\": config_list,\n", "    \"timeout\": 120,\n", "}" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "import yaml\n", "\n", "with open('config/agents.yaml', 'r') as file:\n", "    config = yaml.safe_load(file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Agent Creation" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "from tools import Tools\n", "\n", "def create_agents():\n", "    tools = Tools()\n", "\n", "    ocr_agent = autogen.AssistantAgent(\n", "        name=\"OCR_Specialist\",\n", "        system_message=config['ocr_agent']['assignment'],\n", "        llm_config=llm_config\n", "    )\n", "\n", "    url_checker_agent = autogen.AssistantAgent(\n", "        name=\"URL_Checker\",\n", "        system_message=config['url_checker_agent']['assignment'],\n", "        llm_config=llm_config\n", "    )\n", "\n", "    content_agent = autogen.AssistantAgent(\n", "        name=\"Content_Analyst\",\n", "        system_message=config['content_agent']['assignment'],\n", "        llm_config=llm_config\n", "    )\n", "\n", "    decision_agent = autogen.AssistantAgent(\n", "        name=\"Decision_Maker\",\n", "        system_message=config['decision_agent']['assignment'],\n", "        llm_config=llm_config\n", "    )\n", "\n", "    summary_agent = autogen.AssistantAgent(\n", "        name=\"Summary_Agent\",\n", "        description=\"Generates a summary of the findings\",\n", "        system_message=config['summary_agent']['assignment'],\n", "        llm_config=llm_config\n", "    )\n", "\n", "    user_proxy = autogen.UserProxyAgent(\n", "        name=\"user_proxy\",\n", "        # Guard against messages whose content is None (e.g. tool calls)\n", "        is_termination_msg=lambda msg: \"ANALYSIS_COMPLETE\" in (msg.get(\"content\") or \"\"),\n", "        human_input_mode=\"NEVER\",\n", "        max_consecutive_auto_reply=10,\n", "    )\n", "\n", "    # The LLM-facing agent proposes the tool call; user_proxy executes it\n", "    @user_proxy.register_for_execution()\n", "    @ocr_agent.register_for_llm(description=\"Extracts text from an image path\")\n", "    def ocr(image_path: str) -> str:\n", "        return tools.ocr(image_path)\n", "\n", "    @user_proxy.register_for_execution()\n", "    @url_checker_agent.register_for_llm(description=\"Checks if a URL is safe\")\n", "    def is_url_safe(url: str) -> str:\n", "        return tools.is_url_safe(url)\n", "\n", "    return ocr_agent, url_checker_agent, content_agent, decision_agent, summary_agent, user_proxy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Workflow" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "class ScamDetectionWorkflow:\n", "    def __init__(self):\n", "        self.ocr_agent, self.url_checker_agent, self.content_agent, self.decision_agent, self.summary_agent, self.user_proxy = create_agents()\n", "\n", "    def analyze(self, image_path):\n", "        \"\"\"Coordinate the multi-agent analysis.\"\"\"\n", "\n", "        groupchat = autogen.GroupChat(\n", "            agents=[self.ocr_agent, self.url_checker_agent, self.content_agent, self.decision_agent, self.summary_agent, self.user_proxy],\n", "            messages=[],\n", "            max_round=15,\n", "        )\n", "        manager = autogen.GroupChatManager(groupchat=groupchat)\n", "\n", "        messages = self.user_proxy.initiate_chat(\n", "            manager,\n", "            message=f\"\"\"\n", "            1. OCR Agent: Extract text from this image: {image_path}\n", "            2. Extract any URL from the text and check if it is safe\n", "            3. Content Agent: Evaluate the messaging and claims\n", "            4. Decision Maker: Synthesize all analyses and make final determination\n", "            5. Summarize the findings\"\"\",\n", "        )\n", "\n", "        return messages" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "image_path = \"./samples/02.giftcard.message.scam.png\"\n", "#image_path = \"./samples/74.customer.service.twitter.scam.png\"\n", "#image_path = \"./samples/42.amazon.email.scam.png\"\n", "#image_path = \"./samples/22.giveaway.email.scam.png\"\n", "#image_path = \"./samples/32.multifactor.email.scam.png\"\n", "#image_path = \"./samples/39.paypal.email.scam.png\"\n", "display(IPImage(filename=image_path))" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\n", " 1. OCR Agent: Extract text from this image: ./samples/02.giftcard.message.scam.png\n", " 2. Extract any URL from the text and check if it is safe\n", " 2. Content Agent: Evaluate the messaging and claims\n", " 3. Decision Maker: Synthesize all analyses and make final determination\n", " 4. 
Summarize the findings\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: OCR_Specialist\n", "\u001b[0m\n", "\u001b[33mOCR_Specialist\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Suggested tool call (call_nImrxy4iboAGXWRO6F3HlXco): ocr *****\u001b[0m\n", "Arguments: \n", "{\"image_path\":\"./samples/02.giftcard.message.scam.png\"}\n", "\u001b[32m********************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: user_proxy\n", "\u001b[0m\n", "\u001b[35m\n", ">>>>>>>> EXECUTING FUNCTION ocr...\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Response from calling tool (call_nImrxy4iboAGXWRO6F3HlXco) *****\u001b[0m\n", "Congratulations!\n", "You've won a $1,000\n", "Walmart gift card. Go\n", "\n", "http://bit.ly/123456\n", "tp claim now.\n", "\f\n", "\u001b[32m**********************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: OCR_Specialist\n", "\u001b[0m\n", "\u001b[33mOCR_Specialist\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Suggested tool call (call_38Y8iIQ4rrgZCLEHmHziBXZa): check_url_safety *****\u001b[0m\n", "Arguments: \n", "{\"url\": \"http://bit.ly/123456\"}\n", "\u001b[32m*********************************************************************************\u001b[0m\n", "\u001b[32m***** Suggested tool call (call_RpsQS2ep2q9ahungdLSb5Xp2): evaluate_messaging *****\u001b[0m\n", "Arguments: \n", "{\"text\": \"Congratulations! You've won a $1,000 Walmart gift card. 
Go http://bit.ly/123456 tp claim now.\"}\n", "\u001b[32m***********************************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: user_proxy\n", "\u001b[0m\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Response from calling tool (call_38Y8iIQ4rrgZCLEHmHziBXZa) *****\u001b[0m\n", "Error: Function check_url_safety not found.\n", "\u001b[32m**********************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Response from calling tool (call_RpsQS2ep2q9ahungdLSb5Xp2) *****\u001b[0m\n", "Error: Function evaluate_messaging not found.\n", "\u001b[32m**********************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: OCR_Specialist\n", "\u001b[0m\n", "\u001b[33mOCR_Specialist\u001b[0m (to chat_manager):\n", "\n", "It seems that there was an error attempting to check the URL for safety and evaluate the messaging. However, I can summarize the findings based on the extracted text.\n", "\n", "### Extracted Text:\n", "- **Message**: \"Congratulations! You've won a $1,000 Walmart gift card. Go to http://bit.ly/123456 to claim now.\"\n", "\n", "### Findings:\n", "1. **Claim Analysis**: The message makes a claim of winning a significant prize ($1,000 Walmart gift card), which is a common tactic used in scams to lure individuals into providing personal information or engaging in fraudulent activities.\n", " \n", "2. 
**URL Safety**: The URL provided is a shortened link (bit.ly), which can often be used to mask the true destination. Shortened URLs can lead to unsafe sites, making it essential to check the link before clicking.\n", "\n", "### Conclusion:\n", "Based on the extracted text and the nature of the claims, the messaging appears to be a scam, typical of phishing attempts. The claim of winning a large gift card is insincere and aims to deceive recipients. \n", "\n", "If you need to proceed with URL safety checks or further evaluations, please inform me!\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: URL_Checker\n", "\u001b[0m\n", "\u001b[33mURL_Checker\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Suggested tool call (call_tA5lFFHPQVNLNdYtB0Ugvunr): is_url_safe *****\u001b[0m\n", "Arguments: \n", "{\"url\":\"http://bit.ly/123456\"}\n", "\u001b[32m****************************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: user_proxy\n", "\u001b[0m\n", "\u001b[35m\n", ">>>>>>>> EXECUTING FUNCTION is_url_safe...\u001b[0m\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[33muser_proxy\u001b[0m (to chat_manager):\n", "\n", "\u001b[32m***** Response from calling tool (call_tA5lFFHPQVNLNdYtB0Ugvunr) *****\u001b[0m\n", "Error: Error checking URL safety: 400 Client Error: Bad Request for url: https://safebrowsing.googleapis.com/v4/threatMatches:find?key=[REDACTED]\n", "\u001b[32m**********************************************************************\u001b[0m\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: OCR_Specialist\n", "\u001b[0m\n", "\u001b[33mOCR_Specialist\u001b[0m (to chat_manager):\n", "\n", "It seems there was an issue 
while attempting to verify the safety of the URL provided (http://bit.ly/123456). \n", "\n", "### Summary of Findings:\n", "1. **Message Content**:\n", " - The extracted message suggests that the recipient has won a $1,000 Walmart gift card, which is a likely scam intended to lure individuals into divulging personal information or clicking on potentially harmful links.\n", "\n", "2. **URL Safety**:\n", " - The URL is a shortened link which hides the actual destination, a common tactic used in scams. Without confirming its safety through a reliable service, caution should be exercised.\n", "\n", "### Conclusion:\n", "The messaging appears to be fraudulent, designed to deceive individuals into believing they have won a prize. Any engagement with such messages, including clicking on links, is not advisable. \n", "\n", "If you need additional assistance or have other questions, feel free to ask!\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: URL_Checker\n", "\u001b[0m\n", "\u001b[33mURL_Checker\u001b[0m (to chat_manager):\n", "\n", "### Summary of Findings:\n", "\n", "1. **Message Content**:\n", " - The extracted message states: \"Congratulations! You've won a $1,000 Walmart gift card. Go to http://bit.ly/123456 to claim now.\"\n", " - This message is a classic scam tactic aiming to entice recipients by claiming they have won a significant prize. Such claims are often associated with phishing attempts designed to acquire personal information.\n", "\n", "2. **URL Safety**:\n", " - The provided URL is a shortened link (bit.ly), which can mask the actual destination. This makes it hard to assess the link’s safety without further checks. Shortened URLs are often used in scams, and clicking on them can lead to unsafe websites.\n", "\n", "### Conclusion:\n", "The messaging appears to be fraudulent and aimed at deceiving individuals. 
It is highly recommended not to engage with such messages or click on any links provided.\n", "\n", "If you require further analysis or have other inquiries, please let me know!\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: Content_Analyst\n", "\u001b[0m\n", "\u001b[33mContent_Analyst\u001b[0m (to chat_manager):\n", "\n", "### Final Summary of Findings:\n", "\n", "1. **Message Content**:\n", " - The message claims, \"Congratulations! You've won a $1,000 Walmart gift card. Go to http://bit.ly/123456 to claim now.\"\n", " - This is a common scam tactic that tempts recipients with the lure of a significant prize, often leading to phishing attempts aimed at stealing personal information.\n", "\n", "2. **URL Safety**:\n", " - The URL is a shortened link (bit.ly), which conceals its true destination. It is important to approach such links with caution, as they are frequently utilized in scams and can direct users to potentially harmful sites.\n", "\n", "3. **Urgency and Pressure Tactics**:\n", " - The phrase \"claim now\" creates a sense of urgency, a typical pressure tactic used in scams to prompt immediate action without thoughtful consideration.\n", "\n", "4. **Inconsistencies**:\n", " - Legitimate communications from companies like Walmart typically do not deliver prizes via unsolicited messages. There’s no prior contact or context provided indicating that the recipient has entered any contests with Walmart.\n", "\n", "### Conclusion:\n", "The evidence strongly suggests that this message is a scam. It is advisable not to click any links or provide personal information in response to such offers. 
If further assistance is needed or you have more queries, please let me know!\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: Decision_Maker\n", "\u001b[0m\n", "\u001b[33mDecision_Maker\u001b[0m (to chat_manager):\n", "\n", "### Final Determination on Scam Probability: High\n", "\n", "### Detailed Explanation of the Decision:\n", "\n", "1. **Identifiable Scam Characteristics**:\n", " - The message employs common elements of scams, such as congratulatory language for a large prize that the recipient has not entered to win. Scams frequently use similar tactics to generate excitement and encourage immediate action.\n", " \n", "2. **Use of Shortened URLs**:\n", " - The presence of a shortened URL (http://bit.ly/123456) is particularly concerning, as such links can obscure the destination website. Many scams utilize shortened URLs because they can lead to phishing websites without showing the actual link, making it harder for recipients to evaluate the safety of the link before clicking.\n", "\n", "3. **Urgency**:\n", " - The phrasing \"claim now\" creates a sense of urgency and pressures recipients to act quickly, a common tactic employed by scammers to bypass critical thinking, prompting them to take immediate action without verifying the legitimacy of the offer.\n", "\n", "4. **Lack of Context and Prior Engagement**:\n", " - There is no prior context that suggests the recipient would have any connection to the reward being offered. Legitimate promotional offers typically have a clear and contextually relevant background, often requiring prior engagement with the company.\n", "\n", "5. **Trustworthiness of the Source**:\n", " - Major retailers like Walmart do not conduct their promotions through unsolicited messages promising prizes without prior engagement. 
This adds to the legitimacy concerns surrounding the communication.\n", "\n", "Given the combination of these factors, the probability of this being a scam is assessed to be high. Recipients are highly advised to ignore such messages, refrain from clicking on any links, and avoid providing any personal information in response.\n", "\n", "If further clarity or assistance is needed, please feel free to reach out!\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: Summary_Agent\n", "\u001b[0m\n", "\u001b[33mSummary_Agent\u001b[0m (to chat_manager):\n", "\n", "This analysis confirms that the message claiming you won a $1,000 Walmart gift card is a high-probability scam. Key indicators include the use of congratulatory language for a prize you did not enter to win, a suspicious shortened URL (http://bit.ly/123456), and high-pressure tactics urging immediate action. Legitimate communications from companies like Walmart typically do not come unsolicited, meaning this message lacks context and authenticity. **Recommendation:** Do not click any links or provide personal information in response to such messages. Remain cautious and report any suspicious communications. \n", "\n", "ANALYSIS_COMPLETE\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[32m\n", "Next speaker: user_proxy\n", "\u001b[0m\n" ] } ], "source": [ "workflow = ScamDetectionWorkflow()\n", "results = workflow.analyze(image_path)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'content': 'This analysis confirms that the message claiming you won a $1,000 '\n", " 'Walmart gift card is a high-probability scam. 
Key indicators '\n", " 'include the use of congratulatory language for a prize you did '\n", " 'not enter to win, a suspicious shortened URL '\n", " '(http://bit.ly/123456), and high-pressure tactics urging '\n", " 'immediate action. Legitimate communications from companies like '\n", " 'Walmart typically do not come unsolicited, meaning this message '\n", " 'lacks context and authenticity. **Recommendation:** Do not click '\n", " 'any links or provide personal information in response to such '\n", " 'messages. Remain cautious and report any suspicious '\n", " 'communications. \\n'\n", " '\\n'\n", " 'ANALYSIS_COMPLETE',\n", " 'name': 'Summary_Agent',\n", " 'role': 'user'}\n" ] } ], "source": [ "import pprint\n", "\n", "pprint.pprint(results.chat_history[-1])" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "\n", "with open('results.json', 'w') as json_file:\n", " json.dump(results.__dict__, json_file, indent=4)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'This analysis confirms that the message claiming you won a $1,000 Walmart gift card is a high-probability scam. Key indicators include the use of congratulatory language for a prize you did not enter to win, a suspicious shortened URL (http://bit.ly/123456), and high-pressure tactics urging immediate action. Legitimate communications from companies like Walmart typically do not come unsolicited, meaning this message lacks context and authenticity. **Recommendation:** Do not click any links or provide personal information in response to such messages. Remain cautious and report any suspicious communications. 
\\n\\nANALYSIS_COMPLETE'" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results.summary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 2 }