# SHL Assessment Retrieval System
## Overview
The **SHL Assessment Retrieval System** is a web application for querying the SHL product catalog and retrieving relevant assessments. It uses a Retrieval-Augmented Generation (RAG) pipeline to return test assessments that match the context of each query. The frontend is built with Streamlit, and ChromaDB handles storage and retrieval of the embedded data.
## Features
- **Data Scraping**: Automatically scrapes assessment data from the SHL product catalog.
- **Data Processing**: Preprocesses and chunks the scraped data for efficient querying.
- **Embedding Model**: Uses a `SentenceTransformer` model (from the Sentence Transformers library) to embed queries and documents.
- **Diverse Query Results**: Returns relevant results that span a range of assessments rather than near-identical matches.
- **User-Friendly Interface**: Built with Streamlit for an interactive user experience.
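
The feature list does not say how result diversity is achieved. One common approach is maximal marginal relevance (MMR), which re-ranks candidates so that each new pick is relevant to the query but dissimilar to what has already been selected. The sketch below is an illustration under that assumption, not the project's actual code: the function names `cosine` and `mmr`, the `lambda_` weight, and the toy 2-D vectors are all hypothetical, and real embeddings would come from the `SentenceTransformer` model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query_vec, doc_vecs, k=3, lambda_=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance.

    lambda_ = 1.0 ranks purely by relevance; lower values penalise
    candidates that resemble documents already selected.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy (hypothetical) embeddings: documents 0 and 1 are near-duplicates,
# document 2 points in a different direction.
docs = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]
query = [1.0, 0.1]

diverse = mmr(query, docs, k=2, lambda_=0.5)
by_relevance = mmr(query, docs, k=2, lambda_=1.0)
print(diverse, by_relevance)
```

With `lambda_=1.0` the two near-duplicate documents are returned together; with `lambda_=0.5` the second slot goes to the dissimilar document instead.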
## Technologies Used
- Python
- Streamlit
- Pandas
- Sentence Transformers
- ChromaDB
- BeautifulSoup (for web scraping)
- Requests
## Installation
### Prerequisites
Python 3.7 or higher is required; recent releases of Streamlit and ChromaDB may need a newer version, so check the versions pinned in `requirements.txt`. You can download Python from [python.org](https://www.python.org/downloads/).
### Clone the Repository
```bash
git clone https://github.com/yourusername/shl-assessment-retrieval.git
cd shl-assessment-retrieval
```
### Install Dependencies
Install the required packages with pip. Creating a virtual environment first is recommended.
```bash
# Create a virtual environment (optional)
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
# Install dependencies
pip install -r requirements.txt
```
## Usage
### Scraping Data
Before querying the assessments, you need to scrape the data from the SHL product catalog. You can do this by running the `shl_scraper.py` script:
```bash
python shl_scraper.py
```
This will create a CSV file named `shl_products.csv` containing the scraped assessment data.
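
Before the scraped rows can be embedded, they are chunked. A minimal sketch of one way to do that is shown here, using overlapping word windows. The column names `name` and `description`, the helper names, and the chunk sizes are assumptions made for illustration; the actual columns in `shl_products.csv` and the chunking logic in `rag.py` may differ.

```python
import csv
import io

def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping windows of at most `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

def rows_to_chunks(csv_file, text_column="description", id_column="name",
                   size=50, overlap=10):
    """Yield (row_id, chunk_index, chunk_text) for each row of an open CSV file."""
    for row in csv.DictReader(csv_file):
        pieces = chunk_words(row[text_column], size=size, overlap=overlap)
        for i, piece in enumerate(pieces):
            yield row[id_column], i, piece

# In-memory stand-in for the scraped file; the columns are hypothetical.
sample = io.StringIO(
    "name,description\n"
    "Demo Test,one two three four five six seven eight\n"
)
chunks = list(rows_to_chunks(sample, size=5, overlap=2))
for row_id, index, text in chunks:
    print(row_id, index, text)
```

To run this against the real file, replace `sample` with a handle from `open("shl_products.csv", newline="", encoding="utf-8")` and adjust the column names to match the scraper's output.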
### Running the Streamlit App
Once the data is scraped, you can run the Streamlit app:
```bash
streamlit run app.py
```
Open your web browser and navigate to `http://localhost:8501` to access the application.
### Querying Assessments
- Enter your query in the input box and click the "Submit" button.
- The application will display relevant assessments based on your query.
## Code Structure
```
shl-assessment-retrieval/
│
├── app.py             # Streamlit application for querying assessments
├── rag.py             # RAG model implementation for data processing and querying
├── shl_scraper.py     # Web scraper for fetching assessment data
├── evaluate.py        # Evaluation script for assessing model performance
├── requirements.txt   # List of dependencies
└── README.md          # Project documentation
```