omni_bot / swarms /tools /README.md
Zack Zitting Bradshaw
Upload folder using huggingface_hub
4962437

A newer version of the Gradio SDK is available: 4.36.1

Upgrade

Here are 20 tools the individual worker swarm nodes can use:

  1. Write File Tool: Create a new file and write content to it.
  2. Read File Tool: Open and read the content of an existing file.
  3. Copy File Tool: Duplicate a file.
  4. Delete File Tool: Remove a file.
  5. Rename File Tool: Rename a file.
  6. Web Search Tool: Use a web search engine (like Google or DuckDuckGo) to find information.
  7. API Call Tool: Make requests to APIs.
  8. Process CSV Tool: Load a CSV file and perform operations on it using pandas.
  9. Create Directory Tool: Create a new directory.
  10. List Directory Tool: List all the files in a directory.
  11. Install Package Tool: Install Python packages using pip.
  12. Code Compilation Tool: Compile and run code in different languages.
  13. System Command Tool: Execute system commands.
  14. Image Processing Tool: Perform operations on images (resizing, cropping, etc.).
  15. PDF Processing Tool: Read, write, and manipulate PDF files.
  16. Text Processing Tool: Perform text processing operations like tokenization, stemming, etc.
  17. Email Sending Tool: Send emails.
  18. Database Query Tool: Execute SQL queries on a database.
  19. Data Scraping Tool: Scrape data from web pages.
  20. Version Control Tool: Perform Git operations.

The architecture for these tools involves creating a base Tool class that can be extended for each specific tool. The base Tool class would define common properties and methods that all tools would use.

The pseudocode for each tool would follow a similar structure:

Class ToolNameTool extends Tool:
    Define properties specific to the tool

    Method run: 
        Perform the specific action of the tool
        Return the result

Here's an example of how you might define the WriteFileTool:

import os
from langchain.tools import BaseTool

class WriteFileTool(BaseTool):
    name = "write_file"
    description = "Create a new file and write content to it."

    def __init__(self, root_dir: str):
        self.root_dir = root_dir

    def _run(self, file_name: str, content: str) -> str:
        """Creates a new file and writes the content."""
        try:
            with open(os.path.join(self.root_dir, file_name), 'w') as f:
                f.write(content)
            return f"Successfully wrote to {file_name}"
        except Exception as e:
            return f"Error: {e}"

This tool takes the name of the file and the content to be written as parameters, writes the content to the file in the specified directory, and returns a success message. In case of any error, it returns the error message. You would follow a similar process to create the other tools.

For completing browser-based tasks, you can use web automation tools. These tools allow you to interact with browsers as if a human user was interacting with it. Here are 20 tasks that individual worker swarm nodes can handle:

  1. Open Browser Tool: Open a web browser.
  2. Close Browser Tool: Close the web browser.
  3. Navigate To URL Tool: Navigate to a specific URL.
  4. Fill Form Tool: Fill in a web form with provided data.
  5. Submit Form Tool: Submit a filled form.
  6. Click Button Tool: Click a button on a webpage.
  7. Hover Over Element Tool: Hover over a specific element on a webpage.
  8. Scroll Page Tool: Scroll up or down a webpage.
  9. Navigate Back Tool: Navigate back to the previous page.
  10. Navigate Forward Tool: Navigate forward to the next page.
  11. Refresh Page Tool: Refresh the current page.
  12. Switch Tab Tool: Switch between tabs in a browser.
  13. Capture Screenshot Tool: Capture a screenshot of the current page.
  14. Download File Tool: Download a file from a webpage.
  15. Send Email Tool: Send an email using a web-based email service.
  16. Login Tool: Log in to a website using provided credentials.
  17. Search Website Tool: Perform a search on a website.
  18. Extract Text Tool: Extract text from a webpage.
  19. Extract Image Tool: Extract image(s) from a webpage.
  20. Browser Session Management Tool: Handle creation, usage, and deletion of browser sessions.

You would typically use a library like Selenium, Puppeteer, or Playwright to automate these tasks. Here's an example of how you might define the FillFormTool using Selenium in Python:

from selenium import webdriver
from langchain.tools import BaseTool

class FillFormTool(BaseTool):
    name = "fill_form"
    description = "Fill in a web form with provided data."

    def _run(self, field_dict: dict) -> str:
        """Fills a web form with the data in field_dict."""
        try:
            driver = webdriver.Firefox()
            
            for field_name, field_value in field_dict.items():
                element = driver.find_element_by_name(field_name)
                element.send_keys(field_value)

            return "Form filled successfully."
        except Exception as e:
            return f"Error: {e}"

In this tool, field_dict is a dictionary where the keys are the names of the form fields and the values are the data to be filled in each field. The tool finds each field in the form and fills it with the provided data.

Please note that in a real scenario, you would need to handle the browser driver session more carefully (like closing the driver when it's not needed anymore), and also handle waiting for the page to load and exceptions more thoroughly. This is a simplified example for illustrative purposes.