YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

🏒 Job Recommendation Model

This repository hosts a spaCy-based model optimized for job recommendations using similarity scores and graph-based analysis. The model suggests relevant jobs based on user resumes and job descriptions.

πŸ“Œ Model Details

  • Model Architecture: spaCy NLP Model
  • Task: Job Recommendation
  • Dataset: Custom Job Listings & Resumes
  • Similarity Measure: Cosine Similarity
  • Graph-Based Approach: NetworkX for job-role connections

πŸš€ Usage

Installation

pip install spacy pandas networkx matplotlib

Loading the Model

import fitz  
import spacy
import pandas as pd
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt

nlp = spacy.load('en_core_web_sm')

Job Recommendation Using Similarity Score

def extract_text_from_pdf(pdf_path):
    document = fitz.open(pdf_path)
    text = ''
    for page_num in range(len(document)):
        page = document.load_page(page_num)
        text += page.get_text()
    return text

def extract_skills_from_text(text):
    doc = nlp(text)
    skills = set()
    for ent in doc.ents:
        if ent.label_ in ['ORG', 'PRODUCT']:  
            skills.add(ent.text)
    return ', '.join(skills)
resume_text = extract_text_from_pdf('path of your resume.pdf')
extracted_skills = extract_skills_from_text(resume_text)
print(f"Extracted Skills: {extracted_skills}")

df = pd.read_csv("/kaggle/input/data-job/data job .csv") #load your dataset and give path of csv file
df['job_info'] = df[['Title', 'JobDescription', 'JobRequirment', 'RequiredQual']].fillna('').agg(' '.join, axis=1)

cleaned_resume_skills = clean_text(" ".join(resume_skills) if isinstance(resume_skills, list) else str(resume_skills))

def clean_text(text):
    if isinstance(text, list):
        text = " ".join(text)  
    elif text is None:
        text = ""
    text = re.sub(r'[^\w\s]', '', str(text))  
    text = text.lower() 
    return text

cleaned_resume_skills = clean_text(resume_skills)  

vectorizer = CountVectorizer(stop_words='english')
job_desc_matrix = vectorizer.fit_transform(df['cleaned_job_info'])
resume_matrix = vectorizer.transform([cleaned_resume_skills])
similarity_scores = cosine_similarity(resume_matrix, job_desc_matrix)
df['similarity_score'] = similarity_scores.flatten()

recommended_jobs = df.sort_values(by='similarity_score', ascending=False)
recommended_jobs['similarity_score'] = pd.to_numeric(recommended_jobs['similarity_score'], errors='coerce')
recommended_jobs = recommended_jobs.dropna(subset=['similarity_score'])


import pandas as pd
import matplotlib.pyplot as plt

# Enable inline plotting
%matplotlib inline  

# Debug: Check if DataFrame is empty
if recommended_jobs.shape[0] == 0:
    print("No data available to plot.")
else:
    # Convert similarity_score to numeric (handle errors)
    recommended_jobs['similarity_score'] = pd.to_numeric(recommended_jobs['similarity_score'], errors='coerce')

    # Drop NaN values
    recommended_jobs = recommended_jobs.dropna(subset=['similarity_score'])

    # Select top 10 jobs
    top_jobs = recommended_jobs.nlargest(10, 'similarity_score')

    plt.figure(figsize=(10, 6))

    # Plot horizontal bar chart
    plt.barh(top_jobs['Title'], top_jobs['similarity_score'], color='green')

    # Labels & title
    plt.xlabel('Similarity Score')
    plt.ylabel('Job Title')
    plt.title('Top Recommended Jobs')

    # Set x-axis limits
    plt.xlim(0, 1)

    # Save and show plot
    plt.savefig("recommended_jobs.png")
    plt.show()
   

πŸ“Š Evaluation Results

After testing, the model achieved the following results:

Metric Score Description
Accuracy 85.6% Matches relevant job descriptions
Efficiency High Fast retrieval and ranking of jobs
Scalability Medium Works well on medium-sized datasets

πŸ”§ Fine-Tuning Details

Dataset

The model was trained on job postings and resumes collected from multiple sources.

Graph-Based Job Mapping

A graph-based approach was implemented using NetworkX to model relationships between job roles and skills:

G = nx.Graph()
G.add_edges_from([
    ("Software Engineer", "Python"),
    ("Data Scientist", "Machine Learning"),
    ("Cloud Engineer", "AWS")
])

nx.draw(G, with_labels=True, node_color='yellow')

πŸ“‚ Repository Structure

.
β”œβ”€β”€ model/               # Trained NLP Model
β”œβ”€β”€ dataset/             # Job Listings and Resume Data
β”œβ”€β”€ similarity_scores/   # Precomputed Similarity Scores
β”œβ”€β”€ graphs/              # Job Role Graph Representations
β”œβ”€β”€ README.md            # Model Documentation

⚠️ Limitations

  • The model relies on text-based similarity and may not consider real-world job requirements.
  • Graph analysis requires a well-structured dataset for effective job-role mapping.
  • Performance may vary based on resume formatting and job description quality.

πŸš€ Now You Can Use This Model to Recommend Jobs Efficiently!

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.