Project Setup & Usage Guide

Overview

This project consists of two main components:

  1. Web Scraper
    Built using Python, Selenium, and BeautifulSoup to scrape IBM SkillsBuild course data.

  2. Search Engine
    Uses Elasticsearch to index and search the scraped course data.


Requirements

Make sure the following are installed before running the project:

  • Python 3.8+
  • Selenium
  • BeautifulSoup4
  • Pandas
  • Elasticsearch
  • tqdm
  • json (part of the Python standard library; no separate installation needed)
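
Before running anything, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the project itself. The import names are assumptions: the BeautifulSoup4 package is imported as `bs4`, and the Elasticsearch entry is taken to mean the Python client, imported as `elasticsearch`. The helper name `find_missing` is hypothetical.

```python
import importlib.util

# Import names are assumptions: the pip package "beautifulsoup4" is imported
# as "bs4", and the Elasticsearch Python client as "elasticsearch".
REQUIRED_MODULES = ["selenium", "bs4", "pandas", "elasticsearch", "tqdm", "json"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This check only covers the Python packages. The Elasticsearch server itself must be installed and started separately.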

Project Structure

Only the following folders are required for the project to function correctly:

  • Single Threaded Scraper
  • Search Engine

All other folders are supporting material and are not required to run the system.


Running the Web Scraper

  1. Navigate to the Single Threaded Scraper folder.
  2. Run the scraper:
     python integrated_scraper.py

This will scrape course data and prepare it for indexing.
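
To illustrate what "preparing for indexing" can look like, here is a hedged sketch that converts scraped course records into the newline-delimited JSON body expected by the Elasticsearch bulk API. It is not taken from integrated_scraper.py. The function name `to_bulk_lines`, the index name `courses`, and the field names `title` and `url` are all assumptions made for illustration.

```python
import json


def to_bulk_lines(courses, index_name="courses"):
    """Build an Elasticsearch bulk request body (NDJSON) from course dicts.

    Each document becomes two lines: an action line naming the target
    index, followed by the document source. The body ends with a newline,
    as the bulk API requires. `index_name` is a hypothetical default.
    """
    lines = []
    for course in courses:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(course, ensure_ascii=False))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical records; the real scraper output may use other fields.
    sample = [
        {"title": "Example Course A", "url": "https://example.com/a"},
        {"title": "Example Course B", "url": "https://example.com/b"},
    ]
    print(to_bulk_lines(sample), end="")
```

Keeping the conversion separate from the scraping step makes it possible to inspect the prepared data before anything is sent to Elasticsearch.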


Running the Search Engine

  1. Start Elasticsearch.
  2. Navigate to the Search Engine folder.
  3. Run the indexing script:
     python index_courses.py
  4. Run the search script:
     python search_courses.py

To change the search query, edit it inside search_courses.py.
Search results are exported as CSV and JSON files.
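
The sketch below shows one way a query body and the two export formats could be put together. It is an illustration under stated assumptions, not the contents of search_courses.py: the helper names, the searched fields `title` and `description`, and the sample response are hypothetical. The `multi_match` query and the `hits.hits[]._source` response layout are standard Elasticsearch conventions. Only the standard library is used here, although the project may rely on Pandas for export.

```python
import csv
import io
import json


def build_query(text, fields=("title", "description"), size=10):
    """Return a search body matching `text` against several fields.

    The field names are assumptions about the indexed documents.
    """
    return {
        "size": size,
        "query": {"multi_match": {"query": text, "fields": list(fields)}},
    }


def flatten_hits(response):
    """Turn a search response dict into a list of flat row dicts."""
    rows = []
    for hit in response.get("hits", {}).get("hits", []):
        row = dict(hit.get("_source", {}))
        row["score"] = hit.get("_score")
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text with a header of sorted column names."""
    if not rows:
        return ""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_json(rows):
    """Serialize rows as pretty-printed JSON text."""
    return json.dumps(rows, indent=2, ensure_ascii=False)
```

To produce files, write the returned strings to paths of your choice, for example `open("results.csv", "w", newline="", encoding="utf-8").write(rows_to_csv(rows))`.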
