Saandraahh commited on
Commit
4b3a33f
·
1 Parent(s): 84d4394

Implemented clustering

Browse files
Files changed (45) hide show
  1. Supabase/.temp/cli-latest +1 -1
  2. Supabase/functions/otp/index.ts +8 -5
  3. backend/api.py +12 -4
  4. backend/check_clusters_after_run.py +23 -0
  5. backend/check_db_clustering.py +33 -0
  6. backend/check_job_data.py +26 -0
  7. backend/debug_profile.json +90 -0
  8. backend/debug_score.py +55 -0
  9. backend/docs/efficiency_guide.md +41 -0
  10. backend/final_verify.py +38 -0
  11. backend/fix_profile_embeddings_trigger.sql +56 -0
  12. backend/generate_realistic_resumes.py +183 -0
  13. backend/inspect_columns.py +30 -0
  14. backend/inspect_schema.py +28 -0
  15. backend/inspect_schema_fixed.py +34 -0
  16. backend/out_cmd.txt +20 -0
  17. backend/realistic_synthetic_resumes.json +0 -0
  18. backend/remove_triggers_for_profile_embeddings.sql +19 -0
  19. backend/repair_system_mismatches.sql +104 -0
  20. backend/requirements.txt +1 -0
  21. backend/script_output.txt +0 -0
  22. backend/src/embeddings/benchmark_bge.py +55 -0
  23. backend/src/embeddings/evaluate_quality.py +197 -0
  24. backend/src/embeddings/job_embed.py +1 -1
  25. backend/src/embeddings/match_benchmark_granular.py +228 -0
  26. backend/src/embeddings/profile_entities_bench.py +115 -0
  27. backend/src/matching/similarity.py +40 -17
  28. backend/src/services/clustering_service.py +148 -0
  29. backend/src/services/test_clustering.py +21 -0
  30. backend/src/services/verify_labels.py +41 -0
  31. backend/supabase_ingest.py +9 -4
  32. backend/test_ingest_output.txt +0 -0
  33. debug_log.txt +15 -0
  34. entity_benchmark_scaled_results.txt +12 -0
  35. experimental_results.tex +53 -0
  36. match_benchmark_results.json +22 -0
  37. matching_analysis_report.md +0 -0
  38. quality_metrics_adversarial.json +6 -0
  39. schema_dump.txt +22 -0
  40. src/components/Admin/AdminLayout.jsx +36 -34
  41. src/components/Admin/TalentClusters.jsx +496 -0
  42. src/components/JobListings.jsx +36 -36
  43. src/pages/Admindashboard.jsx +13 -10
  44. src/pages/ApplicantProfile.jsx +17 -6
  45. system_architecture.txt +66 -0
Supabase/.temp/cli-latest CHANGED
@@ -1 +1 @@
1
- v2.67.1
 
1
+ v2.75.0
Supabase/functions/otp/index.ts CHANGED
@@ -6,7 +6,7 @@ const corsHeaders = {
6
  'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
7
  };
8
 
9
- serve(async (req) => {
10
  if (req.method === 'OPTIONS') {
11
  return new Response('ok', { headers: corsHeaders });
12
  }
@@ -35,6 +35,7 @@ serve(async (req) => {
35
  // ACTION: SEND SMS (VIA TWILIO)
36
  // ==========================================
37
  if (action === 'send') {
 
38
  const { data: profile } = await supabaseAdmin
39
  .from('profiles')
40
  .select('phone')
@@ -88,10 +89,10 @@ serve(async (req) => {
88
  console.error("Twilio Error:", errorText);
89
  throw new Error("Failed to send SMS. Check server logs.");
90
  }
91
- // --- TWILIO LOGIC ENDS HERE ---
92
 
93
  return new Response(
94
- JSON.stringify({ message: "OTP sent successfully" }),
95
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 200 }
96
  );
97
  }
@@ -100,6 +101,7 @@ serve(async (req) => {
100
  // ACTION: VERIFY
101
  // ==========================================
102
  if (action === 'verify') {
 
103
  if (!userCode) throw new Error("Missing OTP code");
104
 
105
  const { data: profile } = await supabaseAdmin.from('profiles').select('phone').eq('id', user.id).single();
@@ -119,16 +121,17 @@ serve(async (req) => {
119
  // Success
120
  await supabaseAdmin.from('profiles').update({ is_phone_verified: true }).eq('id', user.id);
121
  await supabaseAdmin.from('otp_verifications').delete().eq('phone', phone);
 
122
 
123
  return new Response(
124
- JSON.stringify({ message: "Phone verified successfully!" }),
125
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 200 }
126
  );
127
  }
128
 
129
  return new Response(JSON.stringify({ error: "Invalid Action" }), { status: 400, headers: corsHeaders });
130
 
131
- } catch (error) {
132
  return new Response(
133
  JSON.stringify({ error: error.message }),
134
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 400 }
 
6
  'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
7
  };
8
 
9
+ serve(async (req: Request) => {
10
  if (req.method === 'OPTIONS') {
11
  return new Response('ok', { headers: corsHeaders });
12
  }
 
35
  // ACTION: SEND SMS (VIA TWILIO)
36
  // ==========================================
37
  if (action === 'send') {
38
+ /** // Logic commented out to disable phone verification
39
  const { data: profile } = await supabaseAdmin
40
  .from('profiles')
41
  .select('phone')
 
89
  console.error("Twilio Error:", errorText);
90
  throw new Error("Failed to send SMS. Check server logs.");
91
  }
92
+ **/
93
 
94
  return new Response(
95
+ JSON.stringify({ message: "OTP sent successfully (Verification disabled)" }),
96
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 200 }
97
  );
98
  }
 
101
  // ACTION: VERIFY
102
  // ==========================================
103
  if (action === 'verify') {
104
+ /** // Logic commented out to disable phone verification
105
  if (!userCode) throw new Error("Missing OTP code");
106
 
107
  const { data: profile } = await supabaseAdmin.from('profiles').select('phone').eq('id', user.id).single();
 
121
  // Success
122
  await supabaseAdmin.from('profiles').update({ is_phone_verified: true }).eq('id', user.id);
123
  await supabaseAdmin.from('otp_verifications').delete().eq('phone', phone);
124
+ **/
125
 
126
  return new Response(
127
+ JSON.stringify({ message: "Phone verified successfully! (Verification disabled)" }),
128
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 200 }
129
  );
130
  }
131
 
132
  return new Response(JSON.stringify({ error: "Invalid Action" }), { status: 400, headers: corsHeaders });
133
 
134
+ } catch (error: any) {
135
  return new Response(
136
  JSON.stringify({ error: error.message }),
137
  { headers: { ...corsHeaders, 'Content-Type': 'application/json' }, status: 400 }
backend/api.py CHANGED
@@ -262,18 +262,26 @@ async def perform_candidate_analysis(candidate_id: str, job_id: str, force_refre
262
 
263
  # 6. Persist to Database
264
  try:
 
265
  data_to_save = {
266
  "ai_summary": ai_insights.get("summary"),
267
  "ai_insights": {
268
  "weaknesses": ai_insights.get("weaknesses") or [],
269
  "missing_skills": missing,
270
- "score_breakdown": semantic_result.get("breakdown")
271
  },
272
- "AI_score": ai_insights.get("score") or 0,
273
- "semantic_score": semantic_result.get("total_score")
 
 
 
 
 
 
 
274
  }
275
  client.table("applications").update(data_to_save).eq("user_id", candidate_id).eq("job_id", job_id).execute()
276
- print(f"💾 Persisted AI analysis for candidate {candidate_id}")
277
  except Exception as db_err:
278
  print(f"⚠️ Failed to persist AI analysis: {db_err}")
279
 
 
262
 
263
  # 6. Persist to Database
264
  try:
265
+ breakdown = semantic_result.get("breakdown") or {}
266
  data_to_save = {
267
  "ai_summary": ai_insights.get("summary"),
268
  "ai_insights": {
269
  "weaknesses": ai_insights.get("weaknesses") or [],
270
  "missing_skills": missing,
271
+ "score_breakdown": breakdown
272
  },
273
+ "AI_score": int(ai_insights.get("score") or 0),
274
+ "match_score": int(semantic_result.get("total_score") or 0),
275
+ # Granular Scores mapping to table columns
276
+ "skills_match": int(breakdown.get("skills", 0)),
277
+ "technical_skills_match": int(breakdown.get("technical_skills", 0)),
278
+ "work_experience_match": int(breakdown.get("experience", 0)),
279
+ "education_match": int(breakdown.get("education", 0)),
280
+ "certifications_match": int(breakdown.get("certifications", 0)),
281
+ "project_match": int(breakdown.get("projects", 0))
282
  }
283
  client.table("applications").update(data_to_save).eq("user_id", candidate_id).eq("job_id", job_id).execute()
284
+ print(f"💾 Persisted AI analysis and granular scores for candidate {candidate_id}")
285
  except Exception as db_err:
286
  print(f"⚠️ Failed to persist AI analysis: {db_err}")
287
 
backend/check_clusters_after_run.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ from supabase import create_client
4
+ from dotenv import load_dotenv
5
+
6
+ load_dotenv()
7
+
8
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
9
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
10
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
11
+
12
+ async def check_clusters():
13
+ res = client.table("profiles").select("id, cluster_label").limit(10).execute()
14
+ if not res.data:
15
+ print("No profiles found")
16
+ return
17
+
18
+ print("Sample Cluster Labels:")
19
+ for row in res.data:
20
+ print(f" - ID: {row['id']} | Label: {row['cluster_label']}")
21
+
22
+ if __name__ == "__main__":
23
+ asyncio.run(check_clusters())
backend/check_db_clustering.py ADDED
@@ -0,0 +1,33 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ from supabase import create_client, Client
3
+ from dotenv import load_dotenv
4
+
5
+ load_dotenv()
6
+
7
+ url = os.environ.get("SUPABASE_URL")
8
+ key = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
9
+ client: Client = create_client(url, key)
10
+
11
+ def check_clustering_status():
12
+ print("Checking profiles table for cluster labels...")
13
+ resp = client.table("profiles").select("id, cluster_label").limit(20).execute()
14
+ data = resp.data
15
+
16
+ if not data:
17
+ print("No profiles found.")
18
+ return
19
+
20
+ # Count how many have labels
21
+ labeled = [d for d in data if d.get("cluster_label")]
22
+ print(f"Sample size: {len(data)}")
23
+ print(f"Profiles with cluster_label: {len(labeled)}")
24
+
25
+ if labeled:
26
+ print("Sample labels:")
27
+ for d in labeled[:5]:
28
+ print(f" - {d['id']}: {d['cluster_label']}")
29
+ else:
30
+ print("No profiles have cluster labels in this sample.")
31
+
32
+ if __name__ == "__main__":
33
+ check_clustering_status()
backend/check_job_data.py ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ from supabase import create_client
4
+ from dotenv import load_dotenv
5
+
6
+ load_dotenv()
7
+
8
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
9
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
10
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
11
+
12
+ async def check_job():
13
+ job_id = "45bcca29-4e12-45bf-97d4-0b77ff55472f"
14
+ res = client.table("job_embeddings").select("*").eq("job_id", job_id).execute()
15
+ if not res.data:
16
+ print("Job not found in job_embeddings")
17
+ return
18
+
19
+ data = res.data[0]
20
+ print(f"Data for Job {job_id}:")
21
+ for k, v in data.items():
22
+ if k in ['job_id', 'created_at', 'updated_at']: continue
23
+ print(f" - {k}: {'POPULATED' if v else 'NULL'}")
24
+
25
+ if __name__ == "__main__":
26
+ asyncio.run(check_job())
backend/debug_profile.json ADDED
@@ -0,0 +1,90 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "id": "a29ba56a-0d5b-4bc9-9a15-e314f6447260",
3
+ "updated_at": "2026-01-30T06:38:28.185688+00:00",
4
+ "full_name": null,
5
+ "role": "applicant",
6
+ "company_id": null,
7
+ "avatar_url": null,
8
+ "resume_url": "a29ba56a-0d5b-4bc9-9a15-e314f6447260/1769755098632_resume_ey.pdf",
9
+ "location": null,
10
+ "headline": "Final-year Computer Science student hands-on experience Machine Learning. Skilled React, Python, Flask, Supabase (PostgreSQL). Built ATS-style resume screening tools stock price prediction apps.",
11
+ "summary": null,
12
+ "skills": [
13
+ "Artificial Intelligence",
14
+ "Machine Learning",
15
+ "Communication",
16
+ "Team Work",
17
+ "Problem Solving",
18
+ "Conflict resolution"
19
+ ],
20
+ "work_experience": [
21
+ {
22
+ "role": "AI/ML Intern",
23
+ "years": "June 2025",
24
+ "company": "ICT Academy Kerala",
25
+ "duration": "1 month",
26
+ "description": "Completed 1-month internship focused Artificial Intelligence Machine Learning. CreatedandassessedMLmodelsonreal-worlddatasets;improvedvalidationaccuracyafterfeatureengineering hyperparameter tuning."
27
+ },
28
+ {
29
+ "role": "Django Intern",
30
+ "years": "Sept 2023",
31
+ "company": "Neo Green Labs",
32
+ "duration": null,
33
+ "description": "Delivered Django API endpoints (CRUD) connected relational database. Supported REST API implementation production use cases."
34
+ }
35
+ ],
36
+ "education": [
37
+ {
38
+ "year": "Nov 2022 Ongoing",
39
+ "course": "B.Tech Computer Science",
40
+ "institution": "APJ Abdul Technological University"
41
+ },
42
+ {
43
+ "year": "Jun 2020 Mar 2022",
44
+ "course": "Higher Secondary Education",
45
+ "institution": "Carmel College Engineering Technology"
46
+ }
47
+ ],
48
+ "phone": "+91 8921173593",
49
+ "current_position": null,
50
+ "address": null,
51
+ "linkedin": null,
52
+ "github": null,
53
+ "portfolio": null,
54
+ "experience_years": null,
55
+ "certifications": "ICT Academy Kerala (2025), The Joy Of Computing Using Python (Elite Rank), NPTEL (2025)",
56
+ "technical_skills": "Python, Java, C, SQL, React, Flask, Supabase, PostgreSQL, Django, XGBoost, LSTM",
57
+ "languages": null,
58
+ "professional_references": null,
59
+ "desired_salary": null,
60
+ "industry_experience": null,
61
+ "career_goals": null,
62
+ "willing_to_relocate": false,
63
+ "available_remote": false,
64
+ "processed": true,
65
+ "file_hash": "58406de4a011cd48192fe9e8a8e93e0255263632344bec40629deb639b54e847",
66
+ "projects": [
67
+ {
68
+ "title": "CV Ordering And Numbering Application",
69
+ "description": "Implemented automated CV filtering, ranking, clustering using job-specific criteria skill similarity algorithms. Built role-based access control system administrators, recruiters, applicants ensure secure streamlined workflows. Added PDF Excel report generation, Systematized email notifications, ATS-compatible resume for- matting. Architected platform emphasis scalability, data privacy, user-centric design improve recruiter efficiency candidate experience.",
70
+ "technologies_used": [
71
+ "React",
72
+ "Vite",
73
+ "Supabase"
74
+ ]
75
+ },
76
+ {
77
+ "title": "Stock Price Prediction System",
78
+ "description": "Built React + Flask web app forecast stock prices using historical OHLCV data. TrainedandbenchmarkedXGBoostandLSTMmodels;servedpredictionsthroughRESTAPIsanddisplayed trends. Created responsive UI entering stock symbols comparing predicted vs. actual price trends; trained models Google Colab delivered Matplotlib plots Flask backend.",
79
+ "technologies_used": [
80
+ "React",
81
+ "Flask",
82
+ "Python"
83
+ ]
84
+ }
85
+ ],
86
+ "email": null,
87
+ "is_phone_verified": false,
88
+ "ai_score": 0,
89
+ "cluster_label": null
90
+ }
backend/debug_score.py ADDED
@@ -0,0 +1,55 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ import json
4
+ from supabase import create_client
5
+ from dotenv import load_dotenv
6
+ from src.matching.similarity import calculate_granular_match_score
7
+
8
+ load_dotenv()
9
+
10
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
11
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
12
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
13
+
14
+ async def run_test():
15
+ res = client.table("applications").select("user_id, job_id").limit(1).execute()
16
+ if not res.data:
17
+ print("No apps found")
18
+ return
19
+
20
+ c_id = res.data[0]["user_id"]
21
+ j_id = res.data[0]["job_id"]
22
+
23
+ # Raw fetch
24
+ p_emb = client.table("profile_embeddings").select("*").eq("id", c_id).execute().data[0]
25
+ j_emb = client.table("job_embeddings").select("*").eq("job_id", j_id).execute().data[0]
26
+
27
+ log = []
28
+ log.append(f"Testing {c_id} against {j_id}")
29
+
30
+ def get_len(v):
31
+ if v is None: return "None"
32
+ if isinstance(v, str):
33
+ try:
34
+ # Approximate len by comma count
35
+ return v.count(',') + 1
36
+ except: return "StringError"
37
+ return len(v)
38
+
39
+ log.append("\n--- Profile Lengths ---")
40
+ for k in ['skills', 'technical_skills', 'experience', 'certifications']:
41
+ log.append(f"{k}: {get_len(p_emb.get(k))}")
42
+
43
+ log.append("\n--- Job Lengths ---")
44
+ for k in ['skills', 'technical_skills', 'work_experience', 'certifications']:
45
+ log.append(f"{k}: {get_len(j_emb.get(k))}")
46
+
47
+ result = await calculate_granular_match_score(client, c_id, j_id)
48
+ log.append(f"\nResult: {json.dumps(result)}")
49
+
50
+ with open("debug_log.txt", "w") as f:
51
+ f.write("\n".join(log))
52
+ print("Logged to debug_log.txt")
53
+
54
+ if __name__ == "__main__":
55
+ asyncio.run(run_test())
backend/docs/efficiency_guide.md ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # BGE-M3 Efficiency Guide
2
+
3
+ This guide explains how to measure and optimize the efficiency of the BAAI/bge-m3 model used in the IRIS project.
4
+
5
+ ## 1. Key Metrics
6
+
7
+ ### Performance (Infrastructure)
8
+ - **Latency**: Time taken to generate an embedding for a single text. Critical for real-time search.
9
+ - **Throughput**: Number of documents processed per second. Important for batch processing (e.g., initial profile indexing).
10
+ - **VRAM/RAM Usage**: Memory footprint of the model. BGE-M3 is ~2.2GB in FP32.
11
+
12
+ ### Retrieval Quality (Accuracy)
13
+ - **Precision@K**: The proportion of relevant candidates in the top K results.
14
+ * *Example*: If you return 10 candidates and 3 are actually qualified, Precision@10 = 30%.
15
+ - **Recall@K** (Correlation to User's "callback"): The proportion of total relevant candidates that were successfully captured in the top K.
16
+ * *Example*: If there are 5 qualified candidates in the database and your search finds 4 of them in the top 10, Recall@10 = 80%.
17
+ - **MRR (Mean Reciprocal Rank)**: Evaluates how high the first relevant candidate is ranked.
18
+ * *Formula*: $1 / Rank$. If the best candidate is at position #1, score is 1.0. If at #2, score is 0.5.
19
+ - **NDCG (Normalized Discounted Cumulative Gain)**: Measures the overall quality of the ranking order, giving more weight to highly relevant results at the very top.
20
+
21
+ ## 2. BGE-M3 Specific Features
22
+
23
+ BGE-M3 is a "multi-function" model. You can measure efficiency across three modes:
24
+ 1. **Dense Retrieval**: Standard 1024d vectors. Fast and semantic.
25
+ 2. **Sparse Retrieval (Lexical)**: Similar to BM25 but learned. More efficient for exact keyword matching.
26
+ 3. **Multi-Vector (ColBERT style)**: Most accurate but highest storage and latency cost.
27
+
28
+ ## 3. Optimization Techniques
29
+
30
+ ### Precision Tuning
31
+ - **FP16**: Use `model.half()` if on GPU to double speed and halve memory with negligible accuracy loss.
32
+ - **Quantization**: Int8 or GGUF formats can reduce memory usage by 4x.
33
+
34
+ ### Batching
35
+ Using optimal batch sizes (e.g., 16-32) significantly improves throughput compared to single-sentence processing.
36
+
37
+ ## 4. Measuring Quality in IRIS
38
+ To measure quality, create a "Golden Dataset" of (Job Description, Relevant Profiles) and calculate Hit Rate:
39
+ 1. Fetch top 10 profiles for a job.
40
+ 2. Check if the "ideal" candidate is in that list.
41
+ 3. Average this over 50 test cases.
backend/final_verify.py ADDED
@@ -0,0 +1,38 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ import json
4
+ from supabase import create_client
5
+ from dotenv import load_dotenv
6
+ from api import perform_candidate_analysis
7
+
8
+ load_dotenv()
9
+
10
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
11
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
12
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
13
+
14
+ async def verify():
15
+ res = client.table("applications").select("user_id, job_id").limit(1).execute()
16
+ if not res.data:
17
+ print("No apps found")
18
+ return
19
+
20
+ c_id = res.data[0]["user_id"]
21
+ j_id = res.data[0]["job_id"]
22
+
23
+ print(f"Triggering fresh analysis for {c_id} / {j_id}")
24
+ await perform_candidate_analysis(c_id, j_id, force_refresh=True)
25
+
26
+ print("\nChecking resulting record in DB:")
27
+ final_res = client.table("applications") \
28
+ .select("match_score, skills_match, technical_skills_match, work_experience_match, education_match, certifications_match, project_match") \
29
+ .eq("user_id", c_id).eq("job_id", j_id) \
30
+ .execute()
31
+
32
+ if final_res.data:
33
+ print(json.dumps(final_res.data[0], indent=2))
34
+ else:
35
+ print("Record not found after update")
36
+
37
+ if __name__ == "__main__":
38
+ asyncio.run(verify())
backend/fix_profile_embeddings_trigger.sql ADDED
@@ -0,0 +1,56 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ -- fix_profile_embeddings_trigger.sql
2
+ -- Run this in your Supabase SQL Editor to fully resolve the "j_emb" error!
3
+
4
+ -- 1. Redefine the function used by the trigger that refreshes recommendations
5
+ -- The error "record j_emb has no field experience" was likely deeply cached in this logic
6
+ CREATE OR REPLACE FUNCTION public.trg_refresh_recommendations_for_user()
7
+ RETURNS trigger
8
+ LANGUAGE plpgsql
9
+ AS $function$
10
+ DECLARE
11
+ j_id uuid;
12
+ match_res json;
13
+ BEGIN
14
+ -- First clear out old recommendations for this user
15
+ DELETE FROM public.job_recommendations WHERE user_id = NEW.id;
16
+
17
+ -- Iterate through all existing job embeddings
18
+ FOR j_id IN SELECT job_id FROM public.job_embeddings LOOP
19
+
20
+ -- Call the fixed match_profile_job function
21
+ match_res := public.match_profile_job(NEW.id, j_id);
22
+
23
+ -- Only insert if there's an actual match > 0
24
+ IF (match_res->>'match_score')::int > 0 THEN
25
+ INSERT INTO public.job_recommendations (
26
+ user_id, job_id, match_score, skills_match, technical_skills_match,
27
+ work_experience_match, education_match, certifications_match, project_match
28
+ ) VALUES (
29
+ NEW.id, j_id,
30
+ (match_res->>'match_score')::int,
31
+ (match_res->>'skills_match')::int,
32
+ (match_res->>'technical_skills_match')::int,
33
+ (match_res->>'work_experience_match')::int,
34
+ (match_res->>'education_match')::int,
35
+ (match_res->>'certifications_match')::int,
36
+ (match_res->>'project_match')::int
37
+ );
38
+ END IF;
39
+
40
+ END LOOP;
41
+
42
+ RETURN NEW;
43
+ END;
44
+ $function$;
45
+
46
+ -- 2. Drop the redundant webhook trigger since you only need the recommendation refresh
47
+ -- Having both might cause race conditions or unnecessary webhooks
48
+ DROP TRIGGER IF EXISTS on_profile_embedding_upsert ON public.profile_embeddings;
49
+
50
+ -- 3. Ensure the embedding refresh trigger is properly attached
51
+ DROP TRIGGER IF EXISTS on_profile_embedding_change ON public.profile_embeddings;
52
+
53
+ CREATE TRIGGER on_profile_embedding_change
54
+ AFTER INSERT OR UPDATE ON public.profile_embeddings
55
+ FOR EACH ROW
56
+ EXECUTE FUNCTION trg_refresh_recommendations_for_user();
backend/generate_realistic_resumes.py ADDED
@@ -0,0 +1,183 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ import random
3
+ import uuid
4
+ import datetime
5
+ try:
6
+ from faker import Faker
7
+ except ImportError:
8
+ print("Faker not found. Please install it with: pip install Faker")
9
+ exit(1)
10
+
11
+ fake = Faker()
12
+
13
+ # ---------------------------------------------------------
14
+ # CONSTANTS & DICTIONARIES
15
+ # ---------------------------------------------------------
16
+
17
+ SOFT_SKILLS = [
18
+ "Communication", "Teamwork", "Adaptability", "Analytical Thinking", "Problem Solving",
19
+ "Leadership", "Time Management", "Critical Thinking", "Empathy", "Conflict Resolution",
20
+ "Creativity", "Attention to Detail", "Work Ethic", "Interpersonal Skills", "Emotional Intelligence"
21
+ ]
22
+
23
+ TECH_SKILLS = [
24
+ "Python", "Java", "C++", "C#", "JavaScript", "TypeScript", "React", "Angular", "Vue.js",
25
+ "Node.js", "Express", "Django", "Flask", "Spring Boot", "SQL", "PostgreSQL", "MySQL",
26
+ "MongoDB", "AWS", "Azure", "GCP", "Docker", "Kubernetes", "Git", "TensorFlow", "PyTorch",
27
+ "Pandas", "NumPy", "Scikit-learn", "HTML", "CSS", "Bash", "Linux"
28
+ ]
29
+
30
+ ROLES = [
31
+ "Software Engineer", "Frontend Developer", "Backend Developer", "Full Stack Developer",
32
+ "Data Scientist", "Machine Learning Engineer", "DevOps Engineer", "Cloud Architect",
33
+ "System Administrator", "Database Administrator", "QA Engineer", "Product Manager"
34
+ ]
35
+
36
+ DEGREES = [
37
+ "B.Tech in Computer Science and Engineering",
38
+ "B.S. in Computer Science",
39
+ "M.S. in Software Engineering",
40
+ "B.A. in Information Technology",
41
+ "M.Tech in Data Science",
42
+ "B.S. in Electrical Engineering",
43
+ "Bootcamp Graduate in Web Development"
44
+ ]
45
+
46
+ CERTIFICATIONS = [
47
+ "AWS Certified Solutions Architect", "Google Cloud Professional Data Engineer",
48
+ "Certified Kubernetes Administrator (CKA)", "Cisco Certified Network Associate (CCNA)",
49
+ "Microsoft Certified: Azure Administrator Associate", "CompTIA Security+",
50
+ "Deep Learning Specialization (Coursera)", "Oracle Certified Professional Java SE Programmer"
51
+ ]
52
+
53
+ # ---------------------------------------------------------
54
+ # GENERATION LOGIC
55
+ # ---------------------------------------------------------
56
+
57
+ def generate_education():
58
+ edu_list = []
59
+ # Always a bachelor/masters
60
+ year_start = random.randint(2015, 2022)
61
+ course = random.choice(DEGREES)
62
+ institution = fake.company() + " University"
63
+ year = f"{year_start} - {year_start + 4}"
64
+
65
+ edu_list.append({
66
+ "course": course,
67
+ "institution": institution,
68
+ "year": year
69
+ })
70
+
71
+ # Sometimes high school
72
+ if random.random() > 0.5:
73
+ edu_list.append({
74
+ "course": "Higher Secondary Education",
75
+ "institution": f"{fake.city()} High School",
76
+ "year": f"{year_start - 2} - {year_start}"
77
+ })
78
+
79
+ return edu_list
80
+
81
+ def generate_work_experience(role):
82
+ exp_list = []
83
+ num_jobs = random.randint(1, 3)
84
+ current_year = 2026
85
+
86
+ for _ in range(num_jobs):
87
+ start_year = current_year - random.randint(1, 3)
88
+ duration = f"{fake.month_name()[:3]} {start_year} - " + (f"{fake.month_name()[:3]} {current_year}" if current_year < 2026 else "Present")
89
+
90
+ # Descriptions with actual tech context
91
+ action = random.choice(["Developed", "Maintained", "Architected", "Optimized", "Spearheaded", "Collaborated on"])
92
+ project = random.choice(["a scalable microservices architecture", "a responsive web application", "a high-throughput data pipeline", "an internal dashboard", "a machine learning model"])
93
+ impact = random.choice(["reducing latency by 30%.", "increasing user engagement by 15%.", "saving $10k annually.", "improving deployment speed."])
94
+
95
+ description = f"{action} {project} {impact}. Worked within an Agile framework to deliver features on schedule."
96
+
97
+ exp_list.append({
98
+ "role": role if random.random() > 0.3 else random.choice(ROLES),
99
+ "company": fake.company(),
100
+ "years": duration,
101
+ "description": description
102
+ })
103
+ current_year = start_year - 1
104
+
105
+ return exp_list
106
+
107
+ def generate_projects(tech_pool):
108
+ proj_list = []
109
+ num_proj = random.randint(1, 3)
110
+
111
+ for _ in range(num_proj):
112
+ p_tech = random.sample(tech_pool, k=min(len(tech_pool), random.randint(2, 4)))
113
+ desc = f"Built a {fake.bs()} platform using {', '.join(p_tech)}. Implemented {fake.catch_phrase().lower()} to solve real-world industry challenges."
114
+
115
+ proj_list.append({
116
+ "tech_stack": p_tech,
117
+ "description": desc
118
+ })
119
+
120
+ return proj_list
121
+
122
+ def build_candidate():
123
+ user_id = str(uuid.uuid4())
124
+ role = random.choice(ROLES)
125
+
126
+ # 1. SOFT SKILLS (LIST)
127
+ cand_soft_skills = random.sample(SOFT_SKILLS, k=random.randint(3, 6))
128
+
129
+ # 2. TECH SKILLS (COMMA STRING LIKE IN DEBUG_PAYLOAD)
130
+ cand_tech_list = random.sample(TECH_SKILLS, k=random.randint(6, 12))
131
+ cand_tech_string = ", ".join(cand_tech_list)
132
+
133
+ # 3. CERTIFICATIONS (COMMA STRING)
134
+ cand_certs = ", ".join(random.sample(CERTIFICATIONS, k=random.randint(0, 2)))
135
+
136
+ # 4. EDUCATION
137
+ edu = generate_education()
138
+
139
+ # 5. EXPERIENCE
140
+ exp = generate_work_experience(role)
141
+
142
+ # 6. PROJECTS
143
+ proj = generate_projects(cand_tech_list)
144
+
145
+ # 7. SUMMARY
146
+ summary = f"{role} with {random.randint(1, 10)} years of experience. Proficient in {cand_tech_list[0]}, {cand_tech_list[1]}, and {cand_tech_list[2]}. Known for {cand_soft_skills[0].lower()} and {cand_soft_skills[1].lower()}. Dedicated to {fake.catch_phrase().lower()}."
147
+
148
+ payload = {
149
+ "id": user_id,
150
+ "resume_url": f"{user_id}/resume.pdf",
151
+ "file_hash": fake.sha256(),
152
+ "processed": True,
153
+ "updated_at": "now()",
154
+ "full_name": fake.name(),
155
+ "summary": summary,
156
+ "phone": fake.phone_number(),
157
+ "email": fake.email(),
158
+ "skills": cand_soft_skills, # Note: Soft skills as List
159
+ "technical_skills": cand_tech_string, # Note: Tech skills as string representation (matching actual IRIS DB ingest logic)
160
+ "education": edu,
161
+ "work_experience": exp,
162
+ "projects": proj,
163
+ "certifications": cand_certs if cand_certs else None
164
+ }
165
+
166
+ return payload
167
+
168
+
169
+ def generate_dataset(num_records=250):
170
+ print(f"🚀 Generating highly realistic dataset with {num_records} candidates...")
171
+ candidates = []
172
+ for _ in range(num_records):
173
+ candidates.append(build_candidate())
174
+
175
+ file_name = "realistic_synthetic_resumes.json"
176
+ with open(file_name, "w", encoding="utf-8") as f:
177
+ json.dump(candidates, f, indent=4)
178
+
179
+ print(f"✅ Successfully wrote {num_records} real-format JSON objects to '{file_name}'!")
180
+
181
+
182
+ if __name__ == "__main__":
183
+ generate_dataset(250)
backend/inspect_columns.py ADDED
@@ -0,0 +1,30 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ from supabase import create_client
4
+ from dotenv import load_dotenv
5
+
6
+ load_dotenv()
7
+
8
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
9
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
10
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
11
+
12
+ async def inspect():
13
+ print("--- Profile Embeddings Columns ---")
14
+ p_res = client.table("profile_embeddings").select("*").limit(1).execute()
15
+ if p_res.data:
16
+ for k in sorted(p_res.data[0].keys()):
17
+ print(f" - {k}")
18
+ else:
19
+ print("No profile embeddings found")
20
+
21
+ print("\n--- Job Embeddings Columns ---")
22
+ j_res = client.table("job_embeddings").select("*").limit(1).execute()
23
+ if j_res.data:
24
+ for k in sorted(j_res.data[0].keys()):
25
+ print(f" - {k}")
26
+ else:
27
+ print("No job embeddings found")
28
+
29
+ if __name__ == "__main__":
30
+ asyncio.run(inspect())
backend/inspect_schema.py ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ from supabase import create_client
4
+ from dotenv import load_dotenv
5
+
6
+ load_dotenv()
7
+
8
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
9
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
10
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
11
+
12
+ async def inspect():
13
+ print("--- Profile Embeddings Sample ---")
14
+ p_res = client.table("profile_embeddings").select("*").limit(1).execute()
15
+ if p_res.data:
16
+ print(", ".join(p_res.data[0].keys()))
17
+ else:
18
+ print("No profile embeddings found")
19
+
20
+ print("\n--- Job Embeddings Sample ---")
21
+ j_res = client.table("job_embeddings").select("*").limit(1).execute()
22
+ if j_res.data:
23
+ print(", ".join(j_res.data[0].keys()))
24
+ else:
25
+ print("No job embeddings found")
26
+
27
+ if __name__ == "__main__":
28
+ asyncio.run(inspect())
backend/inspect_schema_fixed.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import os
3
+ import json
4
+ from supabase import create_client
5
+ from dotenv import load_dotenv
6
+
7
+ load_dotenv()
8
+
9
+ SUPABASE_URL = os.environ.get("SUPABASE_URL")
10
+ SUPABASE_KEY = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
11
+ client = create_client(SUPABASE_URL, SUPABASE_KEY)
12
+
13
+ async def inspect():
14
+ with open("schema_dump.txt", "w") as f:
15
+ f.write("--- Profile Embeddings ---\n")
16
+ p_res = client.table("profile_embeddings").select("*").limit(1).execute()
17
+ if p_res.data:
18
+ cols = sorted(p_res.data[0].keys())
19
+ for c in cols:
20
+ f.write(f"- {c}\n")
21
+ else:
22
+ f.write("No data\n")
23
+
24
+ f.write("\n--- Job Embeddings ---\n")
25
+ j_res = client.table("job_embeddings").select("*").limit(1).execute()
26
+ if j_res.data:
27
+ cols = sorted(j_res.data[0].keys())
28
+ for c in cols:
29
+ f.write(f"- {c}\n")
30
+ else:
31
+ f.write("No data\n")
32
+
33
+ if __name__ == "__main__":
34
+ asyncio.run(inspect())
backend/out_cmd.txt ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Traceback (most recent call last):
2
+ File "C:\Users\sandr\IRIS2026\IRIS_FULL\backend\recalculate_scores.py", line 90, in <module>
3
+ asyncio.run(main())
4
+ ~~~~~~~~~~~^^^^^^^^
5
+ File "C:\Users\sandr\AppData\Local\Programs\Python\Python313\Lib\asyncio\runners.py", line 195, in run
6
+ return runner.run(main)
7
+ ~~~~~~~~~~^^^^^^
8
+ File "C:\Users\sandr\AppData\Local\Programs\Python\Python313\Lib\asyncio\runners.py", line 118, in run
9
+ return self._loop.run_until_complete(task)
10
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
11
+ File "C:\Users\sandr\AppData\Local\Programs\Python\Python313\Lib\asyncio\base_events.py", line 725, in run_until_complete
12
+ return future.result()
13
+ ~~~~~~~~~~~~~^^
14
+ File "C:\Users\sandr\IRIS2026\IRIS_FULL\backend\recalculate_scores.py", line 23, in main
15
+ print("\U0001f50d Fetching all applications from Supabase...")
16
+ ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17
+ File "C:\Users\sandr\AppData\Local\Programs\Python\Python313\Lib\encodings\cp1252.py", line 19, in encode
18
+ return codecs.charmap_encode(input,self.errors,encoding_table)[0]
19
+ ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20
+ UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f50d' in position 0: character maps to <undefined>
backend/realistic_synthetic_resumes.json ADDED
The diff for this file is too large to render. See raw diff
 
backend/remove_triggers_for_profile_embeddings.sql ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ -- remove_triggers_for_profile_embeddings.sql
2
+ -- Run this in your Supabase SQL Editor to completely disable the triggers
3
+ -- causing the "embedding generation failed" error.
4
+
5
+ -- 1. Drop the trigger that refreshes recommendations
6
+ DROP TRIGGER IF EXISTS on_profile_embedding_change ON public.profile_embeddings;
7
+
8
+ -- 2. Drop the redundant webhook trigger
9
+ DROP TRIGGER IF EXISTS on_profile_embedding_upsert ON public.profile_embeddings;
10
+
11
+ -- 3. Drop the function that refreshes recommendations
12
+ DROP FUNCTION IF EXISTS public.trg_refresh_recommendations_for_user CASCADE;
13
+
14
+ -- 4. Drop the function for the webhook trigger
15
+ DROP FUNCTION IF EXISTS public.trg_on_profile_embedding_update CASCADE;
16
+
17
+ -- Now the Python upsert:
18
+ -- client.table("profile_embeddings").upsert(payload).execute()
19
+ -- will run purely as a database insert without any hidden functions interrupting it.
backend/repair_system_mismatches.sql ADDED
@@ -0,0 +1,104 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ -- repair_system_mismatches.sql
2
+ -- Run this in Supabase SQL Editor to resolve the "j_emb" error and restore automatic matching.
3
+
4
+ -- 1. FIX THE MATCHING FUNCTION (The "j_emb" bug fix)
5
+ -- This function is used by triggers and the RPC.
6
+ -- We ENSURE it uses 'work_experience' for jobs and 'experience' for profiles.
7
+ CREATE OR REPLACE FUNCTION public.match_profile_job(p_id uuid, j_id uuid)
8
+ RETURNS json
9
+ LANGUAGE plpgsql
10
+ AS $function$
11
+ DECLARE
12
+ p_rec record;
13
+ j_rec record; -- Consistency check: Job record MUST use its real columns
14
+ s_sim float := 0; t_sim float := 0; exp_sim float := 0;
15
+ edu_sim float := 0; cert_sim float := 0; proj_sim float := 0;
16
+ s_score int := 0; t_score int := 0; e_score int := 0;
17
+ ed_score int := 0; c_score int := 0; p_score int := 0;
18
+ BEGIN
19
+ -- Fetch Profile Embeddings
20
+ SELECT * INTO p_rec FROM public.profile_embeddings WHERE id = p_id;
21
+ IF NOT FOUND THEN RETURN json_build_object('error', 'Profile embeddings not found'); END IF;
22
+
23
+ -- Fetch Job Embeddings
24
+ SELECT * INTO j_rec FROM public.job_embeddings WHERE job_id = j_id;
25
+ IF NOT FOUND THEN RETURN json_build_object('error', 'Job embeddings not found'); END IF;
26
+
27
+ -- Similarity with Cosine Distance (<=>)
28
+ IF p_rec.skills IS NOT NULL AND j_rec.skills IS NOT NULL THEN
29
+ s_sim := coalesce(nullif(1 - (p_rec.skills <=> j_rec.skills), 'NaN'), 0);
30
+ END IF;
31
+
32
+ IF p_rec.technical_skills IS NOT NULL AND j_rec.technical_skills IS NOT NULL THEN
33
+ t_sim := coalesce(nullif(1 - (p_rec.technical_skills <=> j_rec.technical_skills), 'NaN'), 0);
34
+ END IF;
35
+
36
+ -- FIX: Profile column is 'experience', Job column is 'work_experience'
37
+ IF p_rec.experience IS NOT NULL AND j_rec.work_experience IS NOT NULL THEN
38
+ exp_sim := coalesce(nullif(1 - (p_rec.experience <=> j_rec.work_experience), 'NaN'), 0);
39
+ END IF;
40
+
41
+ IF p_rec.education IS NOT NULL AND j_rec.education IS NOT NULL THEN
42
+ edu_sim := coalesce(nullif(1 - (p_rec.education <=> j_rec.education), 'NaN'), 0);
43
+ END IF;
44
+
45
+ IF p_rec.certifications IS NOT NULL THEN
46
+ cert_sim := coalesce(nullif(1 - (p_rec.certifications <=> coalesce(j_rec.technical_skills, j_rec.skills)), 'NaN'), 0); -- NOTE(review): job_embeddings also stores a 'certifications' vector (see job_embed.py in this commit); confirm that matching candidate certs against job technical_skills rather than j_rec.certifications is intentional
47
+ END IF;
48
+
49
+ IF p_rec.projects IS NOT NULL AND j_rec.technical_skills IS NOT NULL THEN
50
+ proj_sim := coalesce(nullif(1 - (p_rec.projects <=> j_rec.technical_skills), 'NaN'), 0);
51
+ END IF;
52
+
53
+ -- Scaling to 0-100
54
+ s_score := (greatest(0, least(1, s_sim)) * 100)::int;
55
+ t_score := (greatest(0, least(1, t_sim)) * 100)::int;
56
+ e_score := (greatest(0, least(1, exp_sim)) * 100)::int;
57
+ ed_score := (greatest(0, least(1, edu_sim)) * 100)::int;
58
+ c_score := (greatest(0, least(1, cert_sim)) * 100)::int;
59
+ p_score := (greatest(0, least(1, proj_sim)) * 100)::int;
60
+
61
+ RETURN json_build_object(
62
+ 'match_score', ((t_score * 0.35) + (e_score * 0.20) + (p_score * 0.15) + (s_score * 0.10) + (ed_score * 0.10) + (c_score * 0.10))::int,
63
+ 'skills_match', s_score,
64
+ 'technical_skills_match', t_score,
65
+ 'work_experience_match', e_score,
66
+ 'education_match', ed_score,
67
+ 'certifications_match', c_score,
68
+ 'project_match', p_score
69
+ );
70
+ END;
71
+ $function$;
72
+
73
+ -- 2. CREATE THE JOB RECOMMENDATIONS RPC (Ranked Jobs)
74
+ -- We drop it first because changing the return schema requires it in Postgres.
75
+ DROP FUNCTION IF EXISTS public.get_job_recommendations(uuid, int);
76
+
77
+ CREATE OR REPLACE FUNCTION public.get_job_recommendations(p_user_id uuid, p_limit int DEFAULT 10)
78
+ RETURNS json
79
+ LANGUAGE plpgsql
80
+ AS $function$
81
+ DECLARE
82
+ results_json JSON;
83
+ BEGIN
84
+ SELECT json_agg(r) INTO results_json
85
+ FROM (
86
+ SELECT
87
+ j.id,
88
+ j.title,
89
+ j.location,
90
+ j.job_type,
91
+ j.salary_range,
92
+ c.name as company_name,
93
+ c.logo_url as company_logo,
94
+ (match_profile_job(p_user_id, j.id)->>'match_score')::int as match_score
95
+ FROM public.jobs j
96
+ JOIN public.companies c ON j.company_id = c.id
97
+ WHERE j.status = 'Active'
98
+ ORDER BY match_score DESC
99
+ LIMIT p_limit
100
+ ) r;
101
+
102
+ RETURN coalesce(results_json, '[]'::json);
103
+ END;
104
+ $function$;
backend/requirements.txt CHANGED
@@ -26,3 +26,4 @@ fastapi>=0.109.0
26
  uvicorn>=0.27.0
27
  python-multipart>=0.0.9
28
  google-genai>=0.2.0
 
 
26
  uvicorn>=0.27.0
27
  python-multipart>=0.0.9
28
  google-genai>=0.2.0
29
+ scikit-learn>=1.3.0
backend/script_output.txt ADDED
File without changes
backend/src/embeddings/benchmark_bge.py ADDED
@@ -0,0 +1,55 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import time
2
+ import torch
3
+ import numpy as np
4
+ from sentence_transformers import SentenceTransformer
5
+ import psutil
6
+ import os
7
+
8
+ def benchmark_bge():
9
+ print("🚀 Starting BGE-M3 Efficiency Benchmark...")
10
+
11
+ device = "cuda" if torch.cuda.is_available() else "cpu"
12
+ print(f"💻 Device: {device}")
13
+
14
+ print("📥 Loading BAAI/bge-m3...")
15
+ start_load = time.time()
16
+ model = SentenceTransformer('BAAI/bge-m3', device=device)
17
+ print(f"⏱️ Load Time: {time.time() - start_load:.2f}s")
18
+
19
+ process = psutil.Process(os.getpid())
20
+ mem_info = process.memory_info()
21
+ print(f"📊 Memory Usage (RAM): {mem_info.rss / 1024 / 1024:.2f} MB")
22
+
23
+ sentences = [
24
+ "The quick brown fox jumps over the lazy dog.",
25
+ "Artificial intelligence is transforming the recruitment industry.",
26
+ "Candidate has 5 years of experience in Python and FastAPI.",
27
+ "Looking for a Senior Software Engineer with cloud expertise."
28
+ ] * 25 # 100 sentences
29
+
30
+ batch_sizes = [1, 4, 8, 16, 32]
31
+
32
+ print("\n--- Latency vs Batch Size ---")
33
+ print(f"{'Batch Size':<12} | {'Time (s)':<10} | {'Sec/Sent':<10} | {'Throughput (sent/s)':<20}")
34
+ print("-" * 65)
35
+
36
+ for bs in batch_sizes:
37
+ start_time = time.time()
38
+ # Warmup
39
+ model.encode(sentences[:bs], batch_size=bs, show_progress_bar=False)
40
+
41
+ # Actual benchmark
42
+ start_time = time.time()
43
+ model.encode(sentences, batch_size=bs, show_progress_bar=False)
44
+ end_time = time.time()
45
+
46
+ total_time = end_time - start_time
47
+ sec_per_sent = total_time / len(sentences)
48
+ throughput = len(sentences) / total_time
49
+
50
+ print(f"{bs:<12} | {total_time:<10.3f} | {sec_per_sent:<10.4f} | {throughput:<20.2f}")
51
+
52
+ print("\n✅ Benchmark Complete.")
53
+
54
+ if __name__ == "__main__":
55
+ benchmark_bge()
backend/src/embeddings/evaluate_quality.py ADDED
@@ -0,0 +1,197 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import os
3
+ import time
4
+ import json
5
+ import random
6
+ import numpy as np
7
+
8
+ # Set encoding for Windows terminals
9
+ if sys.platform == "win32":
10
+ import io
11
+ sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
12
+
13
+ # Add backend to path
14
+ sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))
15
+
16
+ from backend.src.embeddings.local_embedder import generate_embedding
17
+
18
+ def cosine_similarity(v1, v2):
19
+ return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
20
+
21
+ def inject_noise(text, is_skill=False):
22
+ """Simulates real-world messy resumes with abbreviations, typos, and lowercasing."""
23
+ if random.random() < 0.3: # 30% chance to leave perfectly clean
24
+ return text
25
+
26
+ abbreviations = {
27
+ "Python": "Py", "PostgreSQL": "Postgres", "JavaScript": "JS",
28
+ "React": "ReactJS", "Machine Learning": "ML", "Amazon Web Services": "AWS",
29
+ "Kubernetes": "K8s", "TypeScript": "TS", "User Experience": "UX"
30
+ }
31
+
32
+ if is_skill and text in abbreviations and random.random() > 0.5:
33
+ return abbreviations[text]
34
+
35
+ # Randomly lowercase everything (common in lazy resumes)
36
+ if random.random() > 0.7:
37
+ text = text.lower()
38
+
39
+ return text
40
+
41
+ def generate_adversarial_dataset():
42
+ """Generates 200 candidates with intentional distractors and noise."""
43
+ print("Building N=200 Adversarial Candidate Pool...")
44
+
45
+ domains = [
46
+ ("Frontend_React", ["React", "JavaScript", "Tailwind", "CSS", "TypeScript"]),
47
+ ("Frontend_Angular", ["Angular", "JavaScript", "SCSS", "HTML", "TypeScript"]),
48
+ ("Backend_Python", ["Python", "FastAPI", "PostgreSQL", "Docker", "Linux"]),
49
+ ("Backend_Java", ["Java", "Spring Boot", "MySQL", "Kafka", "Kubernetes"]),
50
+ ("Data_Science", ["Python", "Pandas", "PyTorch", "SQL", "Machine Learning"]),
51
+ ("Data_Engineer", ["Spark", "Airflow", "Python", "SQL", "AWS"]),
52
+ ("DevOps", ["Kubernetes", "Docker", "Terraform", "CI/CD", "AWS"]),
53
+ ("Mobile_iOS", ["Swift", "Objective-C", "iOS", "XCode", "CoreData"]),
54
+ ("Mobile_Android", ["Kotlin", "Java", "Android Studio", "Jetpack", "Firebase"]),
55
+ ("Cybersecurity", ["Network Security", "Penetration Testing", "Firewalls", "Linux", "Python"])
56
+ ]
57
+ levels = ["Junior", "Mid-Level", "Senior", "Lead"]
58
+
59
+ candidates = []
60
+ golden_dataset = []
61
+
62
+ cand_counter = 1
63
+
64
+ # Generate 40 Queries (10 domains x 4 levels)
65
+ for domain_name, base_skills in domains:
66
+ for level in levels:
67
+ # 1. The Target Candidate (Golden)
68
+ target_id = f"cand_{cand_counter}_TARGET_{level}_{domain_name}"
69
+ target_skills = [inject_noise(s, True) for s in base_skills]
70
+ candidates.append({
71
+ "id": target_id,
72
+ "headline": f"{level} {domain_name.replace('_', ' ')} Engineer",
73
+ "summary": inject_noise(f"Experienced {level} professional in {domain_name}. Passionate about building scalable architectures."),
74
+ "skills": target_skills,
75
+ "experience": [inject_noise(f"Built systems using {target_skills[0]} and {target_skills[1]}.")]
76
+ })
77
+ cand_counter += 1
78
+
79
+ # The Query (Clean, formal HR language)
80
+ query = f"Hiring a {level} professional in {domain_name.replace('_', ' ')}. Must have strong experience with {base_skills[0]}, {base_skills[1]}, and {base_skills[2]}."
81
+ golden_dataset.append({"query": query, "relevant_id": target_id})
82
+
83
+ # 2. Seniority Distractor (Wrong level, perfect skills)
84
+ distractor_level = "Senior" if level == "Junior" else "Junior"
85
+ candidates.append({
86
+ "id": f"cand_{cand_counter}_DISTRACTOR_LEVEL_{domain_name}",
87
+ "headline": f"{distractor_level} {domain_name.replace('_', ' ')} Engineer",
88
+ "summary": f"A {distractor_level} developer specializing in {domain_name}.",
89
+ "skills": base_skills, # Same exact skills to confuse the model
90
+ "experience": [f"Worked extensively with {base_skills[0]}."]
91
+ })
92
+ cand_counter += 1
93
+
94
+ # 3. Skill Distractor (Right level, missing core skill, has similar skill)
95
+ altered_skills = base_skills.copy()
96
+ altered_skills[0] = "C++" # Replace core skill with something irrelevant
97
+ candidates.append({
98
+ "id": f"cand_{cand_counter}_DISTRACTOR_SKILL_{domain_name}",
99
+ "headline": f"{level} Software Engineer",
100
+ "summary": f"Focuses on {altered_skills[0]} and backend architecture.",
101
+ "skills": altered_skills,
102
+ "experience": [f"Maintained legacy {altered_skills[0]} codebases."]
103
+ })
104
+ cand_counter += 1
105
+
106
+ # 4 & 5. Random Noise Candidates (Fill out the 200)
107
+ for _ in range(2):
108
+ rand_domain = random.choice(domains)
109
+ candidates.append({
110
+ "id": f"cand_{cand_counter}_RANDOM",
111
+ "headline": f"{random.choice(levels)} {rand_domain[0]} Dev",
112
+ "summary": "Looking for new opportunities. Hobbies: hiking, dog walking, photography.",
113
+ "skills": [inject_noise(s, True) for s in rand_domain[1]],
114
+ "experience": ["General software development tasks."]
115
+ })
116
+ cand_counter += 1
117
+
118
+ return candidates, golden_dataset
119
+
120
+ def evaluate_adversarial():
121
+ print("🚀 Starting Adversarial Robustness Evaluation...")
122
+
123
+ candidates, golden_dataset = generate_adversarial_dataset()
124
+
125
+ print(f"📊 Dataset: {len(golden_dataset)} Queries | {len(candidates)} Candidates")
126
+ print("⚠️ Warning: Embedding 200 profiles on CPU will take time. Please wait...\n")
127
+
128
+ # 1. Embed Candidates (Flattening)
129
+ candidate_embeddings = []
130
+ start_time = time.time()
131
+
132
+ for i, c in enumerate(candidates):
133
+ rich_text = f"Headline: {c['headline']}. Summary: {c['summary']} Skills: {', '.join(c['skills'])}. Experience: {' '.join(c['experience'])}"
134
+ candidate_embeddings.append({
135
+ "id": c["id"],
136
+ "vec": generate_embedding(rich_text)
137
+ })
138
+ if (i+1) % 20 == 0:
139
+ print(f" -> Embedded {i+1}/200 candidates...")
140
+
141
+ print(f"✅ Embedding complete in {time.time() - start_time:.2f} seconds.\n")
142
+
143
+ # 2. Evaluate Queries
144
+ mrr_total = 0
145
+ hits_at_1 = 0
146
+ hits_at_3 = 0
147
+ hits_at_5 = 0
148
+
149
+ for item in golden_dataset:
150
+ query_vec = generate_embedding(item["query"])
151
+ target_id = item["relevant_id"]
152
+
153
+ scores = [(c_emb["id"], cosine_similarity(query_vec, c_emb["vec"])) for c_emb in candidate_embeddings]
154
+ scores.sort(key=lambda x: x[1], reverse=True)
155
+
156
+ rank = -1
157
+ for idx, (cid, sim) in enumerate(scores):
158
+ if cid == target_id:
159
+ rank = idx + 1
160
+ break
161
+
162
+ if rank != -1:
163
+ mrr_total += (1.0 / rank)
164
+ if rank == 1: hits_at_1 += 1
165
+ if rank <= 3: hits_at_3 += 1
166
+ if rank <= 5: hits_at_5 += 1
167
+
168
+ # 3. Final Aggregation
169
+ num_queries = len(golden_dataset)
170
+ final_mrr = mrr_total / num_queries
171
+ recall_1 = hits_at_1 / num_queries
172
+ recall_3 = hits_at_3 / num_queries
173
+ recall_5 = hits_at_5 / num_queries
174
+
175
+ print("="*45)
176
+ print("🛡️ ADVERSARIAL RETRIEVAL METRICS (N=200)")
177
+ print("="*45)
178
+ print(f"MRR (Mean Reciprocal Rank): {final_mrr:.4f}")
179
+ print("-" * 45)
180
+ print(f"Recall@1 (R@1): {recall_1*100:.1f}%")
181
+ print(f"Recall@3 (R@3): {recall_3*100:.1f}%")
182
+ print(f"Recall@5 (R@5): {recall_5*100:.1f}%")
183
+ print("="*45)
184
+
185
+ # Save to JSON for the guide/paper
186
+ with open("quality_metrics_adversarial.json", "w") as f:
187
+ json.dump({
188
+ "dataset": "N=200 Adversarial (Noise + Distractors)",
189
+ "mrr": final_mrr,
190
+ "recall_1": recall_1,
191
+ "recall_3": recall_3
192
+ }, f, indent=4)
193
+
194
+ print("📄 Results securely saved to 'quality_metrics_adversarial.json'")
195
+
196
+ if __name__ == "__main__":
197
+ evaluate_adversarial()
backend/src/embeddings/job_embed.py CHANGED
@@ -92,7 +92,7 @@ def safe_generate_and_store_job_embeddings(client, job_id: str) -> None:
92
  "skills": generate_list_embedding(skills),
93
  "technical_skills": generate_list_embedding(technical_skills),
94
  "tools": generate_list_embedding(tools),
95
- "experience": generate_embedding(experience),
96
  "education": generate_embedding(education),
97
  "certifications": generate_list_embedding(certifications),
98
  "updated_at": "now()"
 
92
  "skills": generate_list_embedding(skills),
93
  "technical_skills": generate_list_embedding(technical_skills),
94
  "tools": generate_list_embedding(tools),
95
+ "work_experience": generate_embedding(experience),
96
  "education": generate_embedding(education),
97
  "certifications": generate_list_embedding(certifications),
98
  "updated_at": "now()"
backend/src/embeddings/match_benchmark_granular.py ADDED
@@ -0,0 +1,228 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import os
3
+ import time
4
+ import json
5
+ import random
6
+ import numpy as np
7
+ import torch
8
+ from sentence_transformers import SentenceTransformer
9
+
10
+ # Set encoding for Windows terminals
11
+ # NOTE: the io.TextIOWrapper re-wrap of sys.stdout is intentionally disabled below; it breaks when stdout is redirected to a log file during background runs
12
+ # if sys.platform == "win32":
13
+ # import io
14
+ # sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
15
+
16
+ # Add backend to path
17
+ sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))
18
+
19
+ # ---------------------------------------------------------------------
20
+ # UTILS & NOISE SIMULATION
21
+ # ---------------------------------------------------------------------
22
+
23
+ def cosine_similarity(v1, v2):
24
+ if v1 is None or v2 is None: return 0.0
25
+ norm1 = np.linalg.norm(v1)
26
+ norm2 = np.linalg.norm(v2)
27
+ if norm1 == 0 or norm2 == 0: return 0.0
28
+ return np.dot(v1, v2) / (norm1 * norm2)
29
+
30
+ def jaccard_similarity(list1, list2):
31
+ s1 = set([str(x).lower().strip() for x in list1])
32
+ s2 = set([str(x).lower().strip() for x in list2])
33
+ if not s1 or not s2: return 0.0
34
+ return len(s1.intersection(s2)) / len(s1.union(s2))
35
+
36
+ def inject_real_world_noise(text, is_skill=False):
37
+ """Simulates typos, abbreviations, and informal language."""
38
+ if random.random() < 0.2: return text # 20% keep clean
39
+
40
+ abbrev = {
41
+ "Python": "Py", "PostgreSQL": "Postgres", "JavaScript": "JS",
42
+ "React": "ReactJS", "Machine Learning": "ML", "Kubernetes": "K8s",
43
+ "TypeScript": "TS", "Amazon Web Services": "AWS", "Google Cloud": "GCP"
44
+ }
45
+
46
+ # Apply abbreviation
47
+ if is_skill and text in abbrev and random.random() > 0.4:
48
+ return abbrev[text]
49
+
50
+ # Inject "Messy" Resume fillers
51
+ fillers = ["Highly skilled in", "Practical knowledge of", "Working with", "Extensive experience in"]
52
+ if random.random() > 0.7 and not is_skill:
53
+ text = f"{random.choice(fillers)} {text}"
54
+
55
+ # Random case noise
56
+ if random.random() > 0.8:
57
+ text = text.lower()
58
+
59
+ return text
60
+
61
+ # ---------------------------------------------------------------------
62
+ # DATASET GENERATION
63
+ # ---------------------------------------------------------------------
64
+
65
+ def generate_bench_dataset(num_candidates=100):
66
+ print(f"🛠️ Generating N={num_candidates} Real-World Synthetic Dataset...")
67
+
68
+ domains = [
69
+ ("Cloud_Architect", ["AWS", "Terraform", "Kubernetes", "Docker"], ["Solutions Associate", "AWS Architect"]),
70
+ ("Backend_Dev", ["Python", "FastAPI", "PostgreSQL", "Redis"], ["Python Cert", "FastAPI Expert"]),
71
+ ("Frontend_Dev", ["React", "TypeScript", "Tailwind", "Next.js"], ["Meta React Cert", "JS Expert"]),
72
+ ("Data_Science", ["Python", "PyTorch", "SQL", "Pandas"], ["TensorFlow Cert", "Data Pro"]),
73
+ ]
74
+
75
+ candidates = []
76
+ queries = [] # JDs
77
+
78
+ # We generate balanced pairs
79
+ for i in range(num_candidates):
80
+ domain_name, skills, certs = domains[i % len(domains)]
81
+ level = random.choice(["Junior", "Senior", "Lead"])
82
+
83
+ # 1. The Candidate Data
84
+ cand_id = f"cand_{i}_{domain_name}"
85
+ noisy_skills = [inject_real_world_noise(s, True) for s in skills]
86
+
87
+ candidates.append({
88
+ "id": cand_id,
89
+ "skills": noisy_skills,
90
+ "tech_skills": noisy_skills, # Project uses both
91
+ "experience": [f"Developed {domain_name} solutions at Tech {i}."],
92
+ "certifications": [certs[0]] if random.random() > 0.5 else [],
93
+ "full_text": f"{level} {domain_name}. Skills: {', '.join(noisy_skills)}"
94
+ })
95
+
96
+ # 2. The Matching Query (JD) - Formal Clean Version
97
+ jd_text = f"We are looking for a {level} {domain_name.replace('_', ' ')}. Must have expertise in {skills[0]}, {skills[1]}, and {skills[2]}."
98
+ queries.append({
99
+ "query": jd_text,
100
+ "relevant_id": cand_id,
101
+ "jd_structured": {
102
+ "skills": skills,
103
+ "tech_skills": skills,
104
+ "experience": [f"{level} {domain_name} experience."],
105
+ "certifications": certs
106
+ }
107
+ })
108
+
109
+ return candidates, queries
110
+
111
+ # ---------------------------------------------------------------------
112
+ # BENCHMARK RUNNER
113
+ # ---------------------------------------------------------------------
114
+
115
+ def run_benchmark():
116
+ device = "cuda" if torch.cuda.is_available() else "cpu"
117
+ print(f"🚀 Loading Models on {device}...", flush=True)
118
+
119
+ # Load Models
120
+ bert_model = SentenceTransformer('all-MiniLM-L6-v2', device=device)
121
+ bge_model = SentenceTransformer('BAAI/bge-m3', device=device)
122
+
123
+ candidates, queries = generate_bench_dataset(250)
124
+
125
+ # Save the synthetic dataset to a JSON file for inspection
126
+ with open("synthetic_dataset_adversarial.json", "w", encoding="utf-8") as f:
127
+ json.dump({"candidates": candidates, "queries": queries}, f, indent=4)
128
+ print(f"💾 Saved generated synthetic dataset to 'synthetic_dataset_adversarial.json'", flush=True)
129
+
130
+ # Pre-calculate Candidate Embeddings
131
+ print("🧠 Indexing Candidates...")
132
+ start_idx = time.time()
133
+ for i, c in enumerate(candidates):
134
+ # BERT Flattened
135
+ c["bert_vec"] = bert_model.encode(c["full_text"])
136
+ # BGE Flattened
137
+ c["bge_flat_vec"] = bge_model.encode(c["full_text"])
138
+ # BGE Granular (Project Method)
139
+ c["bge_granular"] = {
140
+ "skills": bge_model.encode(" ".join(c["skills"])),
141
+ "tech_skills": bge_model.encode(" ".join(c["tech_skills"])),
142
+ "experience": bge_model.encode(" ".join(c["experience"])),
143
+ "certs": bge_model.encode(" ".join(c["certifications"])) if c["certifications"] else np.zeros(1024)
144
+ }
145
+ if (i+1) % 50 == 0:
146
+ print(f" -> Indexed {i+1}/{len(candidates)} candidates...", flush=True)
147
+ print(f"✅ Indexed in {time.time() - start_idx:.2f}s")
148
+
149
+ # Evaluation Loops
150
+ methods = ["Jaccard_Baseline", "BERT_Flattened", "BGE_Flattened", "BGE_Granular_Weighted"]
151
+ results = {m: {"mrr": 0, "r1": 0, "r3": 0} for m in methods}
152
+
153
+ weights = {"skills": 0.35, "tech_skills": 0.35, "experience": 0.20, "certs": 0.10}
154
+
155
+ print("\nEvaluating Queries...")
156
+ for i, q in enumerate(queries):
157
+ target_id = q["relevant_id"]
158
+ jd_text = q["query"]
159
+ jd_s = q["jd_structured"]
160
+
161
+ # Embed Query
162
+ q_bert = bert_model.encode(jd_text)
163
+ q_bge_flat = bge_model.encode(jd_text)
164
+ q_bge_g = {
165
+ "skills": bge_model.encode(" ".join(jd_s["skills"])),
166
+ "tech_skills": bge_model.encode(" ".join(jd_s["tech_skills"])),
167
+ "experience": bge_model.encode(" ".join(jd_s["experience"])),
168
+ "certs": bge_model.encode(" ".join(jd_s["certifications"]))
169
+ }
170
+
171
+ if (i+1) % 25 == 0:
172
+ print(f" -> Evaluated {i+1}/{len(queries)} queries...", flush=True)
173
+
174
+ # Calculate scores for all candidates
175
+ cand_scores = []
176
+ for c in candidates:
177
+ # 1. Jaccard
178
+ jac = jaccard_similarity(jd_s["skills"], c["skills"])
179
+ # 2. BERT
180
+ ber = cosine_similarity(q_bert, c["bert_vec"])
181
+ # 3. BGE Flat
182
+ bgf = cosine_similarity(q_bge_flat, c["bge_flat_vec"])
183
+ # 4. BGE Granular Weighted
184
+ bgg = (
185
+ cosine_similarity(q_bge_g["skills"], c["bge_granular"]["skills"]) * weights["skills"] +
186
+ cosine_similarity(q_bge_g["tech_skills"], c["bge_granular"]["tech_skills"]) * weights["tech_skills"] +
187
+ cosine_similarity(q_bge_g["experience"], c["bge_granular"]["experience"]) * weights["experience"] +
188
+ cosine_similarity(q_bge_g["certs"], c["bge_granular"]["certs"]) * weights["certs"]
189
+ )
190
+
191
+ cand_scores.append({
192
+ "id": c["id"],
193
+ "Jaccard_Baseline": jac,
194
+ "BERT_Flattened": ber,
195
+ "BGE_Flattened": bgf,
196
+ "BGE_Granular_Weighted": bgg
197
+ })
198
+
199
+ # Rank and Calc Metrics
200
+ for m in methods:
201
+ sorted_cands = sorted(cand_scores, key=lambda x: x[m], reverse=True)
202
+ rank = next(i for i, x in enumerate(sorted_cands) if x["id"] == target_id) + 1
203
+
204
+ results[m]["mrr"] += (1.0 / rank)
205
+ if rank == 1: results[m]["r1"] += 1
206
+ if rank <= 3: results[m]["r3"] += 1
207
+
208
+ # Print Results Table
209
+ num_q = len(queries)
210
+ print("\n" + "="*65)
211
+ print(f"{'Method':<25} | {'MRR':<8} | {'Recall@1':<10} | {'Recall@3':<10}")
212
+ print("-" * 65)
213
+
214
+ for m in methods:
215
+ mrr = results[m]["mrr"] / num_q
216
+ r1 = (results[m]["r1"] / num_q) * 100
217
+ r3 = (results[m]["r3"] / num_q) * 100
218
+ print(f"{m:<25} | {mrr:.4f} | {r1:>8.1f}% | {r3:>8.1f}%", flush=True)
219
+ print("="*65, flush=True)
220
+
221
+ # Save to file
222
+ summary = {m: {"mrr": results[m]["mrr"]/num_q, "r1": results[m]["r1"]/num_q, "r3": results[m]["r3"]/num_q} for m in methods}
223
+ with open("match_benchmark_results.json", "w") as f:
224
+ json.dump(summary, f, indent=4)
225
+ print(f"\n📄 Results saved to 'match_benchmark_results.json'", flush=True)
226
+
227
+ if __name__ == "__main__":
228
+ run_benchmark()
backend/src/embeddings/profile_entities_bench.py ADDED
@@ -0,0 +1,115 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import os
3
+ import time
4
+ import numpy as np
5
+ import json
6
+
7
+ # Add backend to path
8
+ sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))
9
+
10
+ from backend.src.embeddings.local_embedder import generate_embedding, generate_list_embedding
11
+
12
+ def generate_structured_profiles(num_samples=50):
13
+ """Generates synthetic resumes split into specific entity fields."""
14
+ print(f"Generating {num_samples} structured synthetic profiles...")
15
+
16
+ domains = [
17
+ ("Frontend", ["React", "JavaScript", "Tailwind", "CSS", "HTML", "Redux", "TypeScript", "Jest"]),
18
+ ("Backend", ["Python", "FastAPI", "PostgreSQL", "Docker", "AWS", "Linux", "Redis", "Kafka"]),
19
+ ("Data Science", ["Python", "Pandas", "PyTorch", "SQL", "Machine Learning", "NLP", "TensorFlow", "R"]),
20
+ ("DevOps", ["Kubernetes", "Docker", "Terraform", "CI/CD", "Jenkins", "AWS", "Bash", "Ansible"]),
21
+ ("Mobile", ["Swift", "Kotlin", "React Native", "Flutter", "iOS", "Android", "Firebase", "SQLite"])
22
+ ]
23
+ levels = ["Junior", "Mid-Level", "Senior", "Lead", "Principal"]
24
+
25
+ profiles = []
26
+ for i in range(num_samples):
27
+ domain_name, domain_skills = domains[i % len(domains)]
28
+ level = levels[i % len(levels)]
29
+
30
+ # Randomize skills count slightly per profile (5 to 8 skills)
31
+ np.random.seed(i)
32
+ skills_subset = list(np.random.choice(domain_skills, size=np.random.randint(5, 9), replace=False))
33
+
34
+ profile = {
35
+ "profile_id": f"cand_{i+1}_{domain_name.lower()}",
36
+ "headline": f"{level} {domain_name} Engineer",
37
+ "summary": f"Dedicated {level} {domain_name} professional with a proven track record of building scalable systems and working in agile environments. Passionate about clean code and modern architectures.",
38
+ "skills": skills_subset,
39
+ "experience": [
40
+ f"{level} Engineer at TechCorp: Spearheaded the migration to cloud infrastructure and improved system performance by 40%.",
41
+ f"Software Developer at Startup Inc: Developed RESTful APIs and collaborated with the frontend team to deliver features.",
42
+ f"Intern at Legacy Systems: Assisted in maintaining codebases and writing unit tests."
43
+ ]
44
+ }
45
+ profiles.append(profile)
46
+ return profiles
47
+
48
+ def profile_entities_scaled():
49
+ num_samples = 50
50
+ profiles = generate_structured_profiles(num_samples)
51
+
52
+ print(f"\n🚀 Starting Entity-to-Embedding Efficiency Benchmark (N={num_samples})...")
53
+
54
+ # Tracking arrays
55
+ summary_times = []
56
+ headline_times = []
57
+ skills_times = []
58
+ exp_times = []
59
+ total_times = []
60
+
61
+ for i, p in enumerate(profiles):
62
+ start_total = time.time()
63
+
64
+ # 1. Profile Headline
65
+ start = time.time()
66
+ generate_embedding(p["headline"])
67
+ headline_times.append((time.time() - start) * 1000)
68
+
69
+ # 2. Profile Summary
70
+ start = time.time()
71
+ generate_embedding(p["summary"])
72
+ summary_times.append((time.time() - start) * 1000)
73
+
74
+ # 3. Profile Skills (Batch)
75
+ start = time.time()
76
+ generate_list_embedding(p["skills"])
77
+ skills_times.append((time.time() - start) * 1000)
78
+
79
+ # 4. Profile Experience (Batch)
80
+ start = time.time()
81
+ generate_list_embedding(p["experience"])
82
+ exp_times.append((time.time() - start) * 1000)
83
+
84
+ # Total
85
+ total_times.append((time.time() - start_total) * 1000)
86
+
87
+ if (i + 1) % 10 == 0:
88
+ print(f" -> Processed {i + 1}/{num_samples} profiles...")
89
+
90
+ # Calculate statistics
91
+ results = [
92
+ "IRIS Entity-to-Embedding Efficiency Results (Scaled)",
93
+ f"Total Profiles Evaluated: {num_samples}",
94
+ "-" * 60,
95
+ f"{'Entity Type':<15} | {'Mean Latency (ms)':<20} | {'Std Dev (ms)':<15}",
96
+ "-" * 60,
97
+ f"{'Headline':<15} | {np.mean(headline_times):<20.2f} | {np.std(headline_times):<15.2f}",
98
+ f"{'Summary':<15} | {np.mean(summary_times):<20.2f} | {np.std(summary_times):<15.2f}",
99
+ f"{'Skills (List)':<15} | {np.mean(skills_times):<20.2f} | {np.std(skills_times):<15.2f}",
100
+ f"{'Experience (List)':<15}| {np.mean(exp_times):<20.2f} | {np.std(exp_times):<15.2f}",
101
+ "-" * 60,
102
+ f"MEAN TOTAL PER PROFILE: {np.mean(total_times):.2f} ms",
103
+ f"Average Throughput: {1000 / np.mean(total_times):.3f} profiles/sec"
104
+ ]
105
+
106
+ output_text = "\n".join(results)
107
+ print("\n" + output_text)
108
+
109
+ with open("entity_benchmark_scaled_results.txt", "w") as f:
110
+ f.write(output_text)
111
+
112
+ print("\n📄 Results saved to 'entity_benchmark_scaled_results.txt'.")
113
+
114
+ if __name__ == "__main__":
115
+ profile_entities_scaled()
backend/src/matching/similarity.py CHANGED
@@ -3,13 +3,25 @@ import numpy as np
3
  from typing import Dict, Any, List
4
  from supabase import Client
5
 
6
- def cosine_similarity(v1: List[float], v2: List[float]) -> float:
7
- """Calculates cosine similarity between two vectors."""
8
- if not v1 or not v2 or len(v1) != len(v2):
 
 
 
 
 
 
 
 
 
 
 
 
9
  return 0.0
10
 
11
- a = np.array(v1)
12
- b = np.array(v2)
13
 
14
  # Check if vectors are zero vectors
15
  if np.all(a == 0) or np.all(b == 0):
@@ -51,27 +63,37 @@ async def calculate_granular_match_score(client: Client, candidate_id: str, job_
51
  print(f"❌ Database error in match score: {e}")
52
  return {"total_score": 0, "breakdown": {}, "error": str(e)}
53
 
54
- # 2. Define Weights
55
- # These could eventually be user-defined
56
  WEIGHTS = {
57
- "skills": 0.35,
58
  "technical_skills": 0.35,
59
  "experience": 0.20,
 
 
 
60
  "certifications": 0.10
61
  }
62
 
63
  # 3. Calculate Individual Similarities
64
  scores = {}
65
 
66
- # Skill matching
67
- scores["skills"] = cosine_similarity(profile_emb.get("skills"), job_emb.get("skills"))
68
  scores["technical_skills"] = cosine_similarity(profile_emb.get("technical_skills"), job_emb.get("technical_skills"))
69
 
70
- # Experience matching
71
- scores["experience"] = cosine_similarity(profile_emb.get("experience"), job_emb.get("experience"))
 
 
 
 
 
 
 
 
 
72
 
73
- # Certifications matching
74
- scores["certifications"] = cosine_similarity(profile_emb.get("certifications"), job_emb.get("certifications"))
 
75
 
76
  # 4. Calculate Weighted Total
77
  total_score = 0
@@ -79,12 +101,13 @@ async def calculate_granular_match_score(client: Client, candidate_id: str, job_
79
 
80
  for key, weight in WEIGHTS.items():
81
  if scores.get(key) is not None:
82
- total_score += scores[key] * weight
 
83
  available_weight += weight
84
 
85
- # Normalize if some fields were missing (though WEIGHTS sums to 1.0)
86
  if available_weight > 0:
87
- final_score = (total_score / available_weight) * 100
88
  else:
89
  final_score = 0
90
 
 
3
  from typing import Dict, Any, List
4
  from supabase import Client
5
 
6
+ def cosine_similarity(v1: Any, v2: Any) -> float:
7
+ """Calculates cosine similarity between two vectors, handling both lists and pgvector strings."""
8
+ def parse_vector(v):
9
+ if isinstance(v, str):
10
+ try:
11
+ # Remove brackets and split by comma
12
+ return [float(x.strip()) for x in v.strip('[]').split(',') if x.strip()]
13
+ except Exception:
14
+ return []
15
+ return v if isinstance(v, list) else []
16
+
17
+ vec1 = parse_vector(v1)
18
+ vec2 = parse_vector(v2)
19
+
20
+ if not vec1 or not vec2 or len(vec1) != len(vec2):
21
  return 0.0
22
 
23
+ a = np.array(vec1)
24
+ b = np.array(vec2)
25
 
26
  # Check if vectors are zero vectors
27
  if np.all(a == 0) or np.all(b == 0):
 
63
  print(f"❌ Database error in match score: {e}")
64
  return {"total_score": 0, "breakdown": {}, "error": str(e)}
65
 
66
+ # 2. Define Weights (Matching SQL function public.match_profile_job)
 
67
  WEIGHTS = {
 
68
  "technical_skills": 0.35,
69
  "experience": 0.20,
70
+ "projects": 0.15,
71
+ "skills": 0.10,
72
+ "education": 0.10,
73
  "certifications": 0.10
74
  }
75
 
76
  # 3. Calculate Individual Similarities
77
  scores = {}
78
 
79
+ # Technical Skills
 
80
  scores["technical_skills"] = cosine_similarity(profile_emb.get("technical_skills"), job_emb.get("technical_skills"))
81
 
82
+ # Experience
83
+ scores["experience"] = cosine_similarity(profile_emb.get("experience"), job_emb.get("work_experience"))
84
+
85
+ # Projects (Compare profile projects vs job technical skills)
86
+ scores["projects"] = cosine_similarity(profile_emb.get("projects"), job_emb.get("technical_skills"))
87
+
88
+ # Skills
89
+ scores["skills"] = cosine_similarity(profile_emb.get("skills"), job_emb.get("skills"))
90
+
91
+ # Education
92
+ scores["education"] = cosine_similarity(profile_emb.get("education"), job_emb.get("education"))
93
 
94
+ # Certifications (Compare profile certs vs job technical skills or skills)
95
+ job_target = job_emb.get("technical_skills") if job_emb.get("technical_skills") else job_emb.get("skills")
96
+ scores["certifications"] = cosine_similarity(profile_emb.get("certifications"), job_target)
97
 
98
  # 4. Calculate Weighted Total
99
  total_score = 0
 
101
 
102
  for key, weight in WEIGHTS.items():
103
  if scores.get(key) is not None:
104
+ # Scale to 100 like SQL
105
+ total_score += (scores[key] * 100) * weight
106
  available_weight += weight
107
 
108
+ # Normalize
109
  if available_weight > 0:
110
+ final_score = total_score / available_weight
111
  else:
112
  final_score = 0
113
 
backend/src/services/clustering_service.py ADDED
@@ -0,0 +1,148 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
import json
import os
import time
from collections import defaultdict
from typing import Any, Dict, List

import numpy as np
from sklearn.cluster import KMeans
from google import genai
import google.genai.types as types
from supabase import create_client, Client
from dotenv import load_dotenv

# Load environment variables
load_dotenv()


class ClusteringService:
    """Groups candidate profiles into 'talent pools'.

    Pipeline: fetch per-profile embeddings from Supabase, K-Means them into
    groups, ask Gemini for a short job-title label per group, then write the
    labels back to the `profiles` table.
    """

    def __init__(self):
        url = os.environ.get("SUPABASE_URL")
        # Prefer the service-role key (needed to update arbitrary rows);
        # fall back to the anon key for read-mostly local runs.
        key = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
        if not url or not key:
            # Fail fast with a clear message instead of an opaque client error.
            raise RuntimeError("SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY (or SUPABASE_KEY) must be set.")
        self.client: Client = create_client(url, key)
        self.gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

    def fetch_all_embeddings(self) -> List[Dict[str, Any]]:
        """Fetch IDs and a representative embedding for all profiles.

        `technical_skills` is used as the single clustering signal for now;
        other entity embeddings (headline, summary, ...) could be concatenated
        later for a richer representation.
        """
        print("🔍 Fetching profile embeddings...")
        resp = self.client.table("profile_embeddings").select("id, technical_skills").execute()
        return resp.data

    def perform_clustering(self, data: List[Dict[str, Any]], n_clusters: int = 5):
        """Run K-Means over the fetched embeddings.

        Rows with a missing or unparseable embedding are skipped with a
        warning. Returns a list of {"id": ..., "cluster": int} dicts, or an
        empty list when there is nothing usable to cluster.
        """
        if not data:
            print("⚠️ No data to cluster.")
            return []

        vectors: List[Any] = []
        ids: List[Any] = []
        for item in data:
            raw_vec = item.get("technical_skills")
            if not raw_vec:
                continue
            try:
                # Some versions of postgrest return pgvector columns as
                # strings like '[0.1, 0.2]'; parse those, pass lists through.
                vec = json.loads(raw_vec) if isinstance(raw_vec, str) else raw_vec
                vectors.append(vec)
                ids.append(item["id"])
            except Exception as e:
                print(f"⚠️ Failed to parse embedding for {item['id']}: {e}")

        if not vectors:
            # Every row lacked a usable embedding; KMeans on zero samples
            # would raise, so bail out cleanly instead.
            print("⚠️ No data to cluster.")
            return []

        # Never request more clusters than there are samples.
        n_clusters = min(n_clusters, len(vectors))

        print(f"🤖 Performing K-Means clustering (K={n_clusters})...")
        kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
        labels = kmeans.fit_predict(np.asarray(vectors, dtype=float))

        return [{"id": pid, "cluster": int(label)} for pid, label in zip(ids, labels)]

    def generate_labels_for_clusters(self, clustered_data: List[Dict[str, Any]]) -> Dict[int, str]:
        """Generate a human-readable job-title label per cluster using Gemini.

        Up to 5 member profiles per cluster are sampled to describe the group.
        Falls back to "Unknown Group" when the model call keeps failing.
        """
        cluster_groups: Dict[int, List[Any]] = defaultdict(list)
        for item in clustered_data:
            cluster_groups[item["cluster"]].append(item["id"])

        labels: Dict[int, str] = {}
        for cluster_id, user_ids in cluster_groups.items():
            # Fetch sample details for these users to describe the cluster
            sample_ids = user_ids[:5]
            profiles_resp = self.client.table("profiles").select("headline, technical_skills").in_("id", sample_ids).execute()

            sample_text = "\n".join([
                f"- {p.get('headline')} (Skills: {p.get('technical_skills')})"
                for p in profiles_resp.data
            ])

            prompt = f"""
            You are an expert HR Talent Acquisition Specialist.
            Analyze the following representative professional profiles from a talent pool and provide a perfect, professional job title that best encapsulates the entire group.

            CRITERIA:
            - Concise: Exactly 2-4 words.
            - Professional: Use industry-standard terminology (e.g., "Full Stack Engineer", "DevOps Architect").
            - Accurate: Reflect the common denominator in seniority and technical domain.
            - Formatting: Return ONLY the title string, no quotes, no extra text.

            REPRESENTATIVE PROFILES:
            {sample_text}

            PERFECT JOB TITLE:
            """

            max_retries = 3
            label = "Unknown Group"

            # Retry with exponential backoff (2s, 4s) to ride out transient API errors.
            for attempt in range(max_retries):
                try:
                    response = self.gemini_client.models.generate_content(
                        model="gemini-2.5-flash-lite",
                        contents=prompt,
                        config=types.GenerateContentConfig(temperature=0)  # deterministic labels
                    )
                    label = response.text.strip().replace('"', '')
                    break
                except Exception as e:
                    if attempt < max_retries - 1:
                        wait = 2 ** (attempt + 1)
                        print(f"⚠️ Labeling failed for Cluster {cluster_id}. Retrying in {wait}s... ({e})")
                        time.sleep(wait)
                    else:
                        print(f"❌ Labeling failed for Cluster {cluster_id} after {max_retries} attempts.")

            labels[cluster_id] = label
            print(f"✅ Cluster {cluster_id} Label: {label}")
            time.sleep(1)  # Small pause between clusters to be gentle on the API

        return labels

    def update_database_with_labels(self, clustered_data: List[Dict[str, Any]], cluster_labels: Dict[int, str]):
        """Write each profile's cluster label back to the `profiles` table."""
        print("💾 Updating database with cluster labels...")
        # NOTE(review): one UPDATE per profile (N+1 round trips). Acceptable
        # for small talent pools; switch to a bulk upsert if this grows.
        for item in clustered_data:
            label = cluster_labels[item["cluster"]]
            self.client.table("profiles").update({"cluster_label": label}).eq("id", item["id"]).execute()
        print("✨ Database successfully updated.")

    def run_clustering_pipeline(self, n_clusters: int = 5):
        """Orchestrate the full pipeline: fetch -> cluster -> label -> persist."""
        data = self.fetch_all_embeddings()
        clustered_results = self.perform_clustering(data, n_clusters)
        if not clustered_results:
            return

        labels = self.generate_labels_for_clusters(clustered_results)
        self.update_database_with_labels(clustered_results, labels)


if __name__ == "__main__":
    service = ClusteringService()
    service.run_clustering_pipeline(n_clusters=5)
backend/src/services/test_clustering.py ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
import sys
import os

# Make the backend package root importable when this file is run directly.
_this_dir = os.path.dirname(__file__)
_backend_root = os.path.abspath(os.path.join(_this_dir, '../../'))
sys.path.append(_backend_root)

from src.services.clustering_service import ClusteringService


def test_clustering_pipeline():
    """Smoke-test the end-to-end clustering pipeline against live services."""
    print("🚀 Starting Clustering Pipeline Test...")
    service = ClusteringService()

    try:
        # Run clustering with 5 clusters for more granular grouping
        service.run_clustering_pipeline(n_clusters=5)
        print("✅ Pipeline test completed successfully.")
    except Exception as e:
        print(f"❌ Pipeline test failed: {e}")


if __name__ == "__main__":
    test_clustering_pipeline()
backend/src/services/verify_labels.py ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
import sys
import os
from collections import Counter

from supabase import create_client, Client
from dotenv import load_dotenv

# Add backend/src to path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../')))

# Load environment variables
load_dotenv()


def verify_labels():
    """Print a sample of labeled profiles plus per-label candidate counts.

    Intended as a manual sanity check after a clustering run: shows the first
    15 profiles (name, original headline, assigned cluster label) and then the
    distinct labels with how many candidates each pool contains.
    """
    print("🔍 Fetching generated cluster labels from database...")
    url = os.environ.get("SUPABASE_URL")
    key = os.environ.get("SUPABASE_SERVICE_ROLE_KEY") or os.environ.get("SUPABASE_KEY")
    client: Client = create_client(url, key)

    resp = client.table("profiles").select("full_name, headline, cluster_label").not_.is_("cluster_label", "null").order("cluster_label").execute()

    if not resp or not hasattr(resp, 'data') or resp.data is None:
        print("⚠️ No cluster labels found or database error.")
        return

    print(f"\n{'Name':<25} | {'Original Headline':<35} | {'Cluster Label'}")
    print("-" * 85)
    for p in resp.data[:15]:  # Show first 15
        name = (p.get('full_name') or "Unknown")[:25]
        headline = (p.get('headline') or "N/A")[:35]
        label = p.get('cluster_label') or "Unknown"
        print(f"{name:<25} | {headline:<35} | {label}")

    # Count candidates per label in a single pass (Counter) instead of the
    # O(n^2) list.count-in-a-loop pattern; output order stays sorted by label.
    label_counts = Counter(p.get('cluster_label') for p in resp.data if p.get('cluster_label'))
    print("\n📦 Distinct Talent Pools (Clusters):")
    for idx, label in enumerate(sorted(label_counts), 1):
        print(f"{idx}. {label} ({label_counts[label]} candidates)")


if __name__ == "__main__":
    verify_labels()
backend/supabase_ingest.py CHANGED
@@ -57,7 +57,7 @@ if SUPABASE_URL and SUPABASE_KEY:
57
  else:
58
  print("⚠️ Warning: Supabase Credentials not found in environment. Only library functions will fail if called without a client.")
59
 
60
- ALLOWED_EXTENSIONS = {".pdf", ".docx"}
61
 
62
  # ---------------------------------------------------------------------
63
  # UTILS
@@ -212,15 +212,15 @@ def upsert_profile(client, payload: Dict[str, Any]):
212
  # UNIFIED PROCESSING FUNCTION (Called by API and Main)
213
  # ---------------------------------------------------------------------
214
 
215
- def process_resume(client, user_id: str, file_path: str, temp_dir: str = "data/resumes/raw") -> Dict[str, Any]:
216
  """
217
  Downloads, extracts, and upserts a resume.
218
  Used by both the API (real-time) and the main script (batch).
219
  """
220
  try:
221
  # 1. Download
222
- print(f"⬇️ Downloading {file_path}...")
223
- local_path = download_object(client, "resume", file_path, temp_dir)
224
 
225
  # 2. Extract
226
  print("🧠 Sending to Gemini...")
@@ -312,6 +312,11 @@ def main():
312
  except Exception as e:
313
  print(f" ⚠️ Embedding generation failed (non-critical): {e}")
314
 
 
 
 
 
 
315
  except Exception as e:
316
  print(f" ❌ Pipeline failed for this file: {e}")
317
 
 
57
  else:
58
  print("⚠️ Warning: Supabase Credentials not found in environment. Only library functions will fail if called without a client.")
59
 
60
+ ALLOWED_EXTENSIONS = {".pdf", ".docx", ".doc"}
61
 
62
  # ---------------------------------------------------------------------
63
  # UTILS
 
212
  # UNIFIED PROCESSING FUNCTION (Called by API and Main)
213
  # ---------------------------------------------------------------------
214
 
215
+ def process_resume(client, user_id: str, file_path: str, bucket: str = "resume", temp_dir: str = "data/resumes/raw") -> Dict[str, Any]:
216
  """
217
  Downloads, extracts, and upserts a resume.
218
  Used by both the API (real-time) and the main script (batch).
219
  """
220
  try:
221
  # 1. Download
222
+ print(f"⬇️ Downloading {file_path} from bucket '{bucket}'...")
223
+ local_path = download_object(client, bucket, file_path, temp_dir)
224
 
225
  # 2. Extract
226
  print("🧠 Sending to Gemini...")
 
312
  except Exception as e:
313
  print(f" ⚠️ Embedding generation failed (non-critical): {e}")
314
 
315
+ # 8. Cleanup
316
+ if os.path.exists(local_path):
317
+ os.remove(local_path)
318
+ print(" 🗑️ Cleaned up temporary file.")
319
+
320
  except Exception as e:
321
  print(f" ❌ Pipeline failed for this file: {e}")
322
 
backend/test_ingest_output.txt ADDED
File without changes
debug_log.txt ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Testing 896d6c15-2d98-4435-9869-0f11e4db48bd against 45bcca29-4e12-45bf-97d4-0b77ff55472f
2
+
3
+ --- Profile Lengths ---
4
+ skills: 1024
5
+ technical_skills: 1024
6
+ experience: 1024
7
+ certifications: 1024
8
+
9
+ --- Job Lengths ---
10
+ skills: 1024
11
+ technical_skills: 1024
12
+ work_experience: 1024
13
+ certifications: None
14
+
15
+ Result: {"total_score": 80.5, "breakdown": {"technical_skills": 95.8, "experience": 62.7, "projects": 93.3, "skills": 75.5, "education": 59.6, "certifications": 69.1}, "weights": {"technical_skills": 0.35, "experience": 0.2, "projects": 0.15, "skills": 0.1, "education": 0.1, "certifications": 0.1}}
entity_benchmark_scaled_results.txt ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ IRIS Entity-to-Embedding Efficiency Results (Scaled)
2
+ Total Profiles Evaluated: 50
3
+ ------------------------------------------------------------
4
+ Entity Type | Mean Latency (ms) | Std Dev (ms)
5
+ ------------------------------------------------------------
6
+ Headline | 965.78 | 2969.16
7
+ Summary | 785.70 | 141.60
8
+ Skills (List) | 780.01 | 160.76
9
+ Experience (List)| 1005.30 | 185.11
10
+ ------------------------------------------------------------
11
+ MEAN TOTAL PER PROFILE: 3536.80 ms
12
+ Average Throughput: 0.283 profiles/sec
experimental_results.tex ADDED
@@ -0,0 +1,53 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ \section{Experimental Results}
2
+ \label{sec:experimental_results}
3
+
4
+ In this section, we present the empirical evaluation of the IRIS system, focusing on two key dimensions: computational efficiency (latency and throughput) and retrieval accuracy.
5
+
6
+ \subsection{Computational Efficiency}
7
+ The efficiency of the entity extraction and embedding pipeline was evaluated using a dataset of 50 candidate profiles. The pipeline consists of extracting specific entities—Headline, Summary, Skills, and Experience—and generating their corresponding embeddings using the BGE-M3 model.
8
+
9
+ Table~\ref{tab:latency_results} summarizes the mean latency and standard deviation for each entity type.
10
+
11
+ \begin{table}[h]
12
+ \centering
13
+ \caption{Mean Latency and Standard Deviation per Entity Extraction (N=50)}
14
+ \label{tab:latency_results}
15
+ \begin{tabular}{lrr}
16
+ \hline
17
+ \textbf{Entity Type} & \textbf{Mean Latency (ms)} & \textbf{Std. Dev. (ms)} \\ \hline
18
+ Headline & 965.78 & 2969.16 \\
19
+ Summary & 785.70 & 141.60 \\
20
+ Skills (List) & 780.01 & 160.76 \\
21
+ Experience (List) & 1005.30 & 185.11 \\ \hline
22
+ \textbf{Total per Profile} & \textbf{3536.80} & -- \\ \hline
23
+ \end{tabular}
24
+ \end{table}
25
+
26
+ The average total processing time per profile is approximately 3.54 seconds, resulting in a throughput of \textbf{0.283 profiles per second}. While the Headline extraction shows high variance, possibly due to network latency or cold-start issues in the embedding service, the overall pipeline maintains a consistent performance suitable for near-real-time recruitment tasks.
27
+
28
+ \subsection{Retrieval Performance}
29
+ We compared the proposed IRIS matching methods against standard baselines using Mean Reciprocal Rank (MRR) and Recall@K ($R@k$). The evaluation included:
30
+ \begin{itemize}
31
+ \item \textbf{Jaccard Baseline}: A keyword-based overlap method.
32
+ \item \textbf{BERT Flattened}: Dense retrieval using BERT embeddings on concatenated profile text.
33
+ \item \textbf{BGE Flattened}: Dense retrieval using BGE-M3 embeddings on concatenated profile text.
34
+ \item \textbf{BGE Granular Weighted}: Our proposed method using weighted cosine similarity across specific entities.
35
+ \end{itemize}
36
+
37
+ Table~\ref{tab:retrieval_results} presents the results of this comparison.
38
+
39
+ \begin{table}[h]
40
+ \centering
41
+ \caption{Comparison of Retrieval Accuracy Metrics}
42
+ \label{tab:retrieval_results}
43
+ \begin{tabular}{lccc}
44
+ \hline
45
+ \textbf{Method} & \textbf{MRR} & \textbf{R@1} & \textbf{R@3} \\ \hline
46
+ Jaccard Baseline & 0.0755 & 0.016 & 0.048 \\
47
+ BERT Flattened & 0.1689 & 0.048 & \textbf{0.144} \\
48
+ BGE Flattened & \textbf{0.1726} & \textbf{0.048} & \textbf{0.144} \\
49
+ BGE Granular Weighted & 0.0730 & 0.012 & 0.044 \\ \hline
50
+ \end{tabular}
51
+ \end{table}
52
+ 
53
+ The results indicate that the \textbf{BGE Flattened} approach achieves the highest MRR (0.1726) and the best Recall@1/Recall@3. Notably, the granular weighted approach currently underperforms the flattened-embedding methods, suggesting that the aggregation logic or the weight distribution across individual entities requires further optimization.
match_benchmark_results.json ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "Jaccard_Baseline": {
3
+ "mrr": 0.07552527033230824,
4
+ "r1": 0.016,
5
+ "r3": 0.048
6
+ },
7
+ "BERT_Flattened": {
8
+ "mrr": 0.1688751043476369,
9
+ "r1": 0.048,
10
+ "r3": 0.144
11
+ },
12
+ "BGE_Flattened": {
13
+ "mrr": 0.17255959067443694,
14
+ "r1": 0.048,
15
+ "r3": 0.144
16
+ },
17
+ "BGE_Granular_Weighted": {
18
+ "mrr": 0.07297651022436405,
19
+ "r1": 0.012,
20
+ "r3": 0.044
21
+ }
22
+ }
matching_analysis_report.md ADDED
File without changes
quality_metrics_adversarial.json ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ {
2
+ "dataset": "N=200 Adversarial (Noise + Distractors)",
3
+ "mrr": 0.70625,
4
+ "recall_1": 0.525,
5
+ "recall_3": 0.775
6
+ }
schema_dump.txt ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ --- Profile Embeddings ---
2
+ - certifications
3
+ - created_at
4
+ - education
5
+ - experience
6
+ - headline
7
+ - id
8
+ - projects
9
+ - skills
10
+ - summary
11
+ - technical_skills
12
+ - updated_at
13
+
14
+ --- Job Embeddings ---
15
+ - created_at
16
+ - education
17
+ - job_id
18
+ - skills
19
+ - technical_skills
20
+ - tools
21
+ - updated_at
22
+ - work_experience
src/components/Admin/AdminLayout.jsx CHANGED
@@ -1,23 +1,24 @@
1
  import React from 'react';
2
  import { motion } from 'framer-motion';
3
- import { supabase } from '../../supabaseClient';
4
 
5
  // --- Icons ---
6
- const HomeIcon = () => ( <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M3 9l9-7 9 7v11a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"></path><polyline points="9 22 9 12 15 12 15 22"></polyline></svg> );
7
- const BriefcaseIcon = () => ( <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="7" width="20" height="14" rx="2" ry="2"></rect><path d="M16 21V5a2 2 0 0 0-2-2h-4a2 2 0 0 0-2 2v16"></path></svg> );
8
- const MessageSquareIcon = () => ( <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M21 15a2 2 0 0 1-2 2H7l-4 4V5a2 2 0 0 1 2-2h14a2 2 0 0 1 2 2z"></path></svg> );
9
  // ✅ UPDATED: Complete, robust Settings Icon (Gear)
10
- const SettingsIcon = () => (
11
  <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
12
  <path d="M12.22 2h-.44a2 2 0 0 0-2 2v.18a2 2 0 0 1-1 1.73l-.43.25a2 2 0 0 1-2 0l-.15-.08a2 2 0 0 0-2.73.73l-.22.38a2 2 0 0 0 .73 2.73l.15.1a2 2 0 0 1 1 1.72v.51a2 2 0 0 1-1 1.74l-.15.09a2 2 0 0 0-.73 2.73l.22.38a2 2 0 0 0 2.73.73l.15-.08a2 2 0 0 1 2 0l.43.25a2 2 0 0 1 1 1.73V20a2 2 0 0 0 2 2h.44a2 2 0 0 0 2-2v-.18a2 2 0 0 1 1-1.73l.43-.25a2 2 0 0 1 2 0l.15.08a2 2 0 0 0 2.73-.73l.22-.38a2 2 0 0 0-.73-2.73l-.15-.1a2 2 0 0 1-1-1.72v-.51a2 2 0 0 1 1-1.74l.15-.09a2 2 0 0 0 .73-2.73l-.22-.38a2 2 0 0 0-2.73-.73l-.15.08a2 2 0 0 1-2 0l-.43-.25a2 2 0 0 1-1-1.73V4a2 2 0 0 0-2-2z"></path>
13
  <circle cx="12" cy="12" r="3"></circle>
14
- </svg>
15
  );
16
- const BriefcasePlusIcon = () => ( <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="7" width="20" height="14" rx="2" ry="2"></rect><path d="M16 21V5a2 2 0 0 0-2-2h-4a2 2 0 0 0-2 2v16"></path><line x1="12" y1="11" x2="12" y2="17"></line><line x1="9" y1="14" x2="15" y2="14"></line></svg>);
17
- const LogoutIcon = () => ( <svg style={{ width: '20px', height: '20px', marginRight: '8px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M9 21H5a2 2 0 0 1-2-2V5a2 2 0 0 1 2-2h4"></path><polyline points="16 17 21 12 16 7"></polyline><line x1="21" y1="12" x2="9" y2="12"></line></svg> );
 
18
 
19
  export default function AdminLayout({ children, activeTab, setActiveTab, onNavigate }) {
20
-
21
  // Global Logout Handler
22
  const handleLogout = async () => {
23
  const { error } = await supabase.auth.signOut();
@@ -27,7 +28,7 @@ export default function AdminLayout({ children, activeTab, setActiveTab, onNavig
27
 
28
  return (
29
  <div style={{ height: '100vh', width: '100%', backgroundColor: '#020617', color: 'white', fontFamily: "'Montserrat', sans-serif", display: 'flex', position: 'relative', overflow: 'hidden' }}>
30
-
31
  {/* Background Effects */}
32
  <div style={{ position: 'fixed', top: 0, left: 0, right: 0, bottom: 0, zIndex: 0 }}>
33
  <div style={{ position: 'absolute', borderRadius: '50%', filter: 'blur(80px)', opacity: 0.3, width: '400px', height: '400px', backgroundColor: '#EF4444', top: '-50px', left: '-100px' }}></div>
@@ -37,14 +38,15 @@ export default function AdminLayout({ children, activeTab, setActiveTab, onNavig
37
  {/* Sidebar */}
38
  <aside style={{ width: '100px', padding: '2rem 0', display: 'flex', flexDirection: 'column', alignItems: 'center', zIndex: 10 }}>
39
  <div style={{ fontSize: '1.5rem', fontWeight: 'bold', color: '#EF4444', marginBottom: '2rem' }}>IRIS</div>
40
- <nav style={{
41
- display: 'flex', flexDirection: 'column', alignItems: 'center', gap: '1.5rem',
42
- backgroundColor: 'rgba(239, 68, 68, 0.05)', border: '1px solid rgba(239, 68, 68, 0.2)',
43
- borderRadius: '9999px', padding: '2rem 1rem'
44
  }}>
45
  <NavButton active={activeTab === 'dashboard'} onClick={() => setActiveTab('dashboard')} icon={<HomeIcon />} />
46
  <NavButton active={activeTab === 'job-management'} onClick={() => setActiveTab('job-management')} icon={<BriefcasePlusIcon />} />
47
  <NavButton active={activeTab === 'jobs'} onClick={() => setActiveTab('jobs')} icon={<BriefcaseIcon />} />
 
48
  <NavButton active={activeTab === 'messages'} onClick={() => setActiveTab('messages')} icon={<MessageSquareIcon />} />
49
  <NavButton active={activeTab === 'settings'} onClick={() => setActiveTab('settings')} icon={<SettingsIcon />} />
50
  </nav>
@@ -52,26 +54,26 @@ export default function AdminLayout({ children, activeTab, setActiveTab, onNavig
52
 
53
  {/* Main Content Area */}
54
  <div style={{ flex: 1, padding: '2rem', overflowY: 'auto', height: '100vh', boxSizing: 'border-box', position: 'relative', zIndex: 1 }}>
55
-
56
  {/* ✅ GLOBAL LOGOUT BUTTON - Updated Styles for Alignment */}
57
  <div style={{ position: 'absolute', top: '2rem', right: '2rem', zIndex: 50 }}>
58
- <motion.button
59
- onClick={handleLogout}
60
- whileHover={{ scale: 1.05 }}
61
- whileTap={{ scale: 0.95 }}
62
- style={{
63
- backgroundColor: '#EF4444',
64
- color: 'white',
65
- display: 'flex',
66
- alignItems: 'center',
67
  justifyContent: 'center',
68
- padding: '0.75rem 1.5rem',
69
- borderRadius: '0.5rem',
70
- fontWeight: 'bold',
71
- cursor: 'pointer',
72
- border: 'none',
73
  // Matches the visual weight of "Post New Job"
74
- minWidth: '160px'
75
  }}
76
  >
77
  <LogoutIcon /> Logout
@@ -86,10 +88,10 @@ export default function AdminLayout({ children, activeTab, setActiveTab, onNavig
86
 
87
  // Helper Component for Navigation Buttons
88
  const NavButton = ({ active, onClick, icon }) => (
89
- <motion.button
90
- whileHover={{ scale: 1.1 }}
91
- whileTap={{ scale: 0.9 }}
92
- onClick={onClick}
93
  style={{ background: 'none', border: 'none', color: active ? '#EF4444' : '#d1d5db', cursor: 'pointer' }}
94
  >
95
  {icon}
 
1
  import React from 'react';
2
  import { motion } from 'framer-motion';
3
+ import { supabase } from '../../supabaseClient';
4
 
5
  // --- Icons ---
6
+ const HomeIcon = () => (<svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M3 9l9-7 9 7v11a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"></path><polyline points="9 22 9 12 15 12 15 22"></polyline></svg>);
7
+ const BriefcaseIcon = () => (<svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="7" width="20" height="14" rx="2" ry="2"></rect><path d="M16 21V5a2 2 0 0 0-2-2h-4a2 2 0 0 0-2 2v16"></path></svg>);
8
+ const MessageSquareIcon = () => (<svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M21 15a2 2 0 0 1-2 2H7l-4 4V5a2 2 0 0 1 2-2h14a2 2 0 0 1 2 2z"></path></svg>);
9
  // ✅ UPDATED: Complete, robust Settings Icon (Gear)
10
+ const SettingsIcon = () => (
11
  <svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
12
  <path d="M12.22 2h-.44a2 2 0 0 0-2 2v.18a2 2 0 0 1-1 1.73l-.43.25a2 2 0 0 1-2 0l-.15-.08a2 2 0 0 0-2.73.73l-.22.38a2 2 0 0 0 .73 2.73l.15.1a2 2 0 0 1 1 1.72v.51a2 2 0 0 1-1 1.74l-.15.09a2 2 0 0 0-.73 2.73l.22.38a2 2 0 0 0 2.73.73l.15-.08a2 2 0 0 1 2 0l.43.25a2 2 0 0 1 1 1.73V20a2 2 0 0 0 2 2h.44a2 2 0 0 0 2-2v-.18a2 2 0 0 1 1-1.73l.43-.25a2 2 0 0 1 2 0l.15.08a2 2 0 0 0 2.73-.73l.22-.38a2 2 0 0 0-.73-2.73l-.15-.1a2 2 0 0 1-1-1.72v-.51a2 2 0 0 1 1-1.74l.15-.09a2 2 0 0 0 .73-2.73l-.22-.38a2 2 0 0 0-2.73-.73l-.15.08a2 2 0 0 1-2 0l-.43-.25a2 2 0 0 1-1-1.73V4a2 2 0 0 0-2-2z"></path>
13
  <circle cx="12" cy="12" r="3"></circle>
14
+ </svg>
15
  );
16
+ const BriefcasePlusIcon = () => (<svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="7" width="20" height="14" rx="2" ry="2"></rect><path d="M16 21V5a2 2 0 0 0-2-2h-4a2 2 0 0 0-2 2v16"></path><line x1="12" y1="11" x2="12" y2="17"></line><line x1="9" y1="14" x2="15" y2="14"></line></svg>);
17
+ const ClustersIcon = () => (<svg style={{ width: '24px', height: '24px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="12" cy="12" r="3" /><circle cx="4" cy="6" r="2" /><circle cx="20" cy="6" r="2" /><circle cx="4" cy="18" r="2" /><circle cx="20" cy="18" r="2" /><line x1="12" y1="9" x2="5" y2="7" /><line x1="12" y1="9" x2="19" y2="7" /><line x1="12" y1="15" x2="5" y2="17" /><line x1="12" y1="15" x2="19" y2="17" /></svg>);
18
+ const LogoutIcon = () => (<svg style={{ width: '20px', height: '20px', marginRight: '8px' }} viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M9 21H5a2 2 0 0 1-2-2V5a2 2 0 0 1 2-2h4"></path><polyline points="16 17 21 12 16 7"></polyline><line x1="21" y1="12" x2="9" y2="12"></line></svg>);
19
 
20
  export default function AdminLayout({ children, activeTab, setActiveTab, onNavigate }) {
21
+
22
  // Global Logout Handler
23
  const handleLogout = async () => {
24
  const { error } = await supabase.auth.signOut();
 
28
 
29
  return (
30
  <div style={{ height: '100vh', width: '100%', backgroundColor: '#020617', color: 'white', fontFamily: "'Montserrat', sans-serif", display: 'flex', position: 'relative', overflow: 'hidden' }}>
31
+
32
  {/* Background Effects */}
33
  <div style={{ position: 'fixed', top: 0, left: 0, right: 0, bottom: 0, zIndex: 0 }}>
34
  <div style={{ position: 'absolute', borderRadius: '50%', filter: 'blur(80px)', opacity: 0.3, width: '400px', height: '400px', backgroundColor: '#EF4444', top: '-50px', left: '-100px' }}></div>
 
38
  {/* Sidebar */}
39
  <aside style={{ width: '100px', padding: '2rem 0', display: 'flex', flexDirection: 'column', alignItems: 'center', zIndex: 10 }}>
40
  <div style={{ fontSize: '1.5rem', fontWeight: 'bold', color: '#EF4444', marginBottom: '2rem' }}>IRIS</div>
41
+ <nav style={{
42
+ display: 'flex', flexDirection: 'column', alignItems: 'center', gap: '1.5rem',
43
+ backgroundColor: 'rgba(239, 68, 68, 0.05)', border: '1px solid rgba(239, 68, 68, 0.2)',
44
+ borderRadius: '9999px', padding: '2rem 1rem'
45
  }}>
46
  <NavButton active={activeTab === 'dashboard'} onClick={() => setActiveTab('dashboard')} icon={<HomeIcon />} />
47
  <NavButton active={activeTab === 'job-management'} onClick={() => setActiveTab('job-management')} icon={<BriefcasePlusIcon />} />
48
  <NavButton active={activeTab === 'jobs'} onClick={() => setActiveTab('jobs')} icon={<BriefcaseIcon />} />
49
+ <NavButton active={activeTab === 'clusters'} onClick={() => setActiveTab('clusters')} icon={<ClustersIcon />} />
50
  <NavButton active={activeTab === 'messages'} onClick={() => setActiveTab('messages')} icon={<MessageSquareIcon />} />
51
  <NavButton active={activeTab === 'settings'} onClick={() => setActiveTab('settings')} icon={<SettingsIcon />} />
52
  </nav>
 
54
 
55
  {/* Main Content Area */}
56
  <div style={{ flex: 1, padding: '2rem', overflowY: 'auto', height: '100vh', boxSizing: 'border-box', position: 'relative', zIndex: 1 }}>
57
+
58
  {/* ✅ GLOBAL LOGOUT BUTTON - Updated Styles for Alignment */}
59
  <div style={{ position: 'absolute', top: '2rem', right: '2rem', zIndex: 50 }}>
60
+ <motion.button
61
+ onClick={handleLogout}
62
+ whileHover={{ scale: 1.05 }}
63
+ whileTap={{ scale: 0.95 }}
64
+ style={{
65
+ backgroundColor: '#EF4444',
66
+ color: 'white',
67
+ display: 'flex',
68
+ alignItems: 'center',
69
  justifyContent: 'center',
70
+ padding: '0.75rem 1.5rem',
71
+ borderRadius: '0.5rem',
72
+ fontWeight: 'bold',
73
+ cursor: 'pointer',
74
+ border: 'none',
75
  // Matches the visual weight of "Post New Job"
76
+ minWidth: '160px'
77
  }}
78
  >
79
  <LogoutIcon /> Logout
 
88
 
89
  // Helper Component for Navigation Buttons
90
  const NavButton = ({ active, onClick, icon }) => (
91
+ <motion.button
92
+ whileHover={{ scale: 1.1 }}
93
+ whileTap={{ scale: 0.9 }}
94
+ onClick={onClick}
95
  style={{ background: 'none', border: 'none', color: active ? '#EF4444' : '#d1d5db', cursor: 'pointer' }}
96
  >
97
  {icon}
src/components/Admin/TalentClusters.jsx ADDED
@@ -0,0 +1,496 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import React, { useState, useEffect } from 'react';
2
+ import { motion, AnimatePresence } from 'framer-motion';
3
+ import { supabase } from '../../supabaseClient';
4
+ import FullProfileOverlay from '../FullProfileOverlay';
5
+
6
+ // ─── Icons ───────────────────────────────────────────────────────────────────
7
+ const ClusterIcon = () => (
8
+ <svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
9
+ <circle cx="12" cy="12" r="3" /><circle cx="4" cy="6" r="3" /><circle cx="20" cy="6" r="3" />
10
+ <circle cx="4" cy="18" r="3" /><circle cx="20" cy="18" r="3" />
11
+ <line x1="12" y1="9" x2="4" y2="7" /><line x1="12" y1="9" x2="20" y2="7" />
12
+ <line x1="12" y1="15" x2="4" y2="17" /><line x1="12" y1="15" x2="20" y2="17" />
13
+ </svg>
14
+ );
15
+
16
+ const UsersIcon = () => (
17
+ <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
18
+ <path d="M17 21v-2a4 4 0 0 0-4-4H5a4 4 0 0 0-4 4v2" /><circle cx="9" cy="7" r="4" />
19
+ <path d="M23 21v-2a4 4 0 0 0-3-3.87" /><path d="M16 3.13a4 4 0 0 1 0 7.75" />
20
+ </svg>
21
+ );
22
+
23
+ const SearchIcon = () => (
24
+ <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
25
+ <circle cx="11" cy="11" r="8" /><line x1="21" y1="21" x2="16.65" y2="16.65" />
26
+ </svg>
27
+ );
28
+
29
+ const ChevronDown = ({ open }) => (
30
+ <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2.5"
31
+ style={{ transform: open ? 'rotate(180deg)' : 'rotate(0deg)', transition: 'transform 0.3s ease' }}>
32
+ <polyline points="6 9 12 15 18 9" />
33
+ </svg>
34
+ );
35
+
36
+ const XIcon = () => (
37
+ <svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
38
+ <line x1="18" y1="6" x2="6" y2="18" /><line x1="6" y1="6" x2="18" y2="18" />
39
+ </svg>
40
+ );
41
+
42
// ─── Cluster colour palette ───────────────────────────────────────────────────
// Each entry is one visual theme for a cluster card: `accent` is the solid
// colour, `glow` a translucent background fill, `border` a translucent edge.
// Order matters: clusters are assigned a theme by their sorted index.
const CLUSTER_COLORS = [
  { accent: '#EF4444', glow: 'rgba(239,68,68,0.15)', border: 'rgba(239,68,68,0.3)' },
  { accent: '#8B5CF6', glow: 'rgba(139,92,246,0.15)', border: 'rgba(139,92,246,0.3)' },
  { accent: '#06B6D4', glow: 'rgba(6,182,212,0.15)', border: 'rgba(6,182,212,0.3)' },
  { accent: '#10B981', glow: 'rgba(16,185,129,0.15)', border: 'rgba(16,185,129,0.3)' },
  { accent: '#F59E0B', glow: 'rgba(245,158,11,0.15)', border: 'rgba(245,158,11,0.3)' },
  { accent: '#EC4899', glow: 'rgba(236,72,153,0.15)', border: 'rgba(236,72,153,0.3)' },
];

// Maps an arbitrary non-negative cluster index onto the palette, wrapping
// around when there are more clusters than themes.
const getColor = (idx) => CLUSTER_COLORS[idx % CLUSTER_COLORS.length];
53
+
54
+ // ─── Profile Card ─────────────────────────────────────────────────────────────
55
+ const ProfileCard = ({ profile, accent, onView }) => {
56
+ const [hovered, setHovered] = useState(false);
57
+ const skills = Array.isArray(profile.technical_skills)
58
+ ? profile.technical_skills.slice(0, 4)
59
+ : typeof profile.technical_skills === 'string'
60
+ ? profile.technical_skills.split(',').slice(0, 4).map(s => s.trim())
61
+ : [];
62
+
63
+ return (
64
+ <motion.div
65
+ onMouseEnter={() => setHovered(true)}
66
+ onMouseLeave={() => setHovered(false)}
67
+ whileHover={{ y: -4, scale: 1.01 }}
68
+ onClick={() => onView(profile)}
69
+ style={{
70
+ backgroundColor: hovered ? 'rgba(255,255,255,0.06)' : 'rgba(255,255,255,0.03)',
71
+ border: `1px solid ${hovered ? accent : 'rgba(255,255,255,0.08)'}`,
72
+ borderRadius: '12px',
73
+ padding: '1rem',
74
+ cursor: 'pointer',
75
+ transition: 'border-color 0.2s',
76
+ boxShadow: hovered ? `0 4px 20px ${accent}30` : 'none',
77
+ }}
78
+ >
79
+ <div style={{ display: 'flex', alignItems: 'center', gap: '0.75rem', marginBottom: '0.6rem' }}>
80
+ <img
81
+ src={profile.avatar_url || `https://ui-avatars.com/api/?name=${encodeURIComponent(profile.full_name || 'User')}&background=random&size=48`}
82
+ alt={profile.full_name}
83
+ style={{ width: 40, height: 40, borderRadius: '50%', objectFit: 'cover', border: `2px solid ${accent}55` }}
84
+ />
85
+ <div>
86
+ <p style={{ fontWeight: '700', color: '#fff', fontSize: '0.9rem', marginBottom: 2 }}>{profile.full_name || 'Unknown'}</p>
87
+ <p style={{ fontSize: '0.75rem', color: '#94a3b8' }}>{profile.headline || profile.role || '—'}</p>
88
+ </div>
89
+ </div>
90
+
91
+ <p style={{ fontSize: '0.75rem', color: '#64748b', marginBottom: '0.5rem' }}>
92
+ {profile.experience_years ? `${profile.experience_years} yrs exp` : 'No experience listed'}
93
+ </p>
94
+
95
+ {skills.length > 0 && (
96
+ <div style={{ display: 'flex', flexWrap: 'wrap', gap: '0.3rem' }}>
97
+ {skills.map((s, i) => (
98
+ <span key={i} style={{
99
+ fontSize: '0.7rem', padding: '2px 8px', borderRadius: '4px',
100
+ backgroundColor: `${accent}20`, color: accent,
101
+ border: `1px solid ${accent}40`
102
+ }}>{s}</span>
103
+ ))}
104
+ </div>
105
+ )}
106
+ </motion.div>
107
+ );
108
+ };
109
+
110
+ // ─── Cluster Card ─────────────────────────────────────────────────────────────
111
+ const ClusterCard = ({ label, profiles, colorIdx, searchQuery, onViewProfile }) => {
112
+ const [expanded, setExpanded] = useState(true);
113
+ const color = getColor(colorIdx);
114
+
115
+ const filtered = profiles.filter(p => {
116
+ const q = searchQuery.toLowerCase();
117
+ return (
118
+ (p.full_name || '').toLowerCase().includes(q) ||
119
+ (p.headline || '').toLowerCase().includes(q) ||
120
+ (p.role || '').toLowerCase().includes(q)
121
+ );
122
+ });
123
+
124
+ if (searchQuery && filtered.length === 0) return null;
125
+
126
+ return (
127
+ <motion.div
128
+ initial={{ opacity: 0, y: 20 }}
129
+ animate={{ opacity: 1, y: 0 }}
130
+ style={{
131
+ backgroundColor: color.glow,
132
+ border: `1px solid ${color.border}`,
133
+ borderRadius: '16px',
134
+ overflow: 'hidden',
135
+ marginBottom: '1.5rem',
136
+ }}
137
+ >
138
+ {/* Header */}
139
+ <button
140
+ onClick={() => setExpanded(e => !e)}
141
+ style={{
142
+ width: '100%', background: 'none', border: 'none', cursor: 'pointer',
143
+ padding: '1.25rem 1.5rem',
144
+ display: 'flex', alignItems: 'center', justifyContent: 'space-between',
145
+ color: '#fff',
146
+ }}
147
+ >
148
+ <div style={{ display: 'flex', alignItems: 'center', gap: '0.75rem' }}>
149
+ <div style={{
150
+ width: 36, height: 36, borderRadius: '10px',
151
+ backgroundColor: `${color.accent}22`, display: 'flex', alignItems: 'center', justifyContent: 'center',
152
+ border: `1px solid ${color.accent}55`
153
+ }}>
154
+ <ClusterIcon style={{ color: color.accent }} />
155
+ </div>
156
+ <div style={{ textAlign: 'left' }}>
157
+ <h3 style={{ fontSize: '1.05rem', fontWeight: '700', color: '#fff', margin: 0 }}>{label}</h3>
158
+ <div style={{ display: 'flex', alignItems: 'center', gap: '4px', color: '#94a3b8', fontSize: '0.8rem', marginTop: 2 }}>
159
+ <UsersIcon />
160
+ <span>{filtered.length} {filtered.length === 1 ? 'profile' : 'profiles'}</span>
161
+ </div>
162
+ </div>
163
+ </div>
164
+ <div style={{ color: color.accent }}>
165
+ <ChevronDown open={expanded} />
166
+ </div>
167
+ </button>
168
+
169
+ {/* Body */}
170
+ <AnimatePresence initial={false}>
171
+ {expanded && (
172
+ <motion.div
173
+ key="body"
174
+ initial={{ height: 0, opacity: 0 }}
175
+ animate={{ height: 'auto', opacity: 1 }}
176
+ exit={{ height: 0, opacity: 0 }}
177
+ transition={{ duration: 0.3 }}
178
+ style={{ overflow: 'hidden' }}
179
+ >
180
+ <div style={{
181
+ padding: '0 1.5rem 1.5rem',
182
+ display: 'grid',
183
+ gridTemplateColumns: 'repeat(auto-fill, minmax(220px, 1fr))',
184
+ gap: '0.75rem'
185
+ }}>
186
+ {filtered.map(p => (
187
+ <ProfileCard
188
+ key={p.id}
189
+ profile={p}
190
+ accent={color.accent}
191
+ onView={onViewProfile}
192
+ />
193
+ ))}
194
+ </div>
195
+ </motion.div>
196
+ )}
197
+ </AnimatePresence>
198
+ </motion.div>
199
+ );
200
+ };
201
+
202
// ─── Profile Detail Modal ─────────────────────────────────────────────────────
// Full-detail overlay for a single profile. Clicking the backdrop or the X
// button calls `onClose`; clicks inside the card are swallowed.
// NOTE(review): the caller (TalentClusters) already wraps this component in
// its own <AnimatePresence>, so the inner one below is likely redundant —
// confirm which layer should own the exit animation.
const ProfileModal = ({ profile, onClose }) => {
  if (!profile) return null;
  // `technical_skills` may be an array or a comma-separated string.
  const skills = Array.isArray(profile.technical_skills)
    ? profile.technical_skills
    : typeof profile.technical_skills === 'string'
      ? profile.technical_skills.split(',').map(s => s.trim())
      : [];

  return (
    <AnimatePresence>
      {/* Backdrop — dims/blur the page; click anywhere outside to dismiss. */}
      <motion.div
        initial={{ opacity: 0 }}
        animate={{ opacity: 1 }}
        exit={{ opacity: 0 }}
        onClick={onClose}
        style={{
          position: 'fixed', inset: 0, backgroundColor: 'rgba(0,0,0,0.7)',
          backdropFilter: 'blur(6px)', zIndex: 100, display: 'flex',
          alignItems: 'center', justifyContent: 'center', padding: '1rem'
        }}
      >
        {/* Card — stopPropagation keeps inside-clicks from closing the modal. */}
        <motion.div
          initial={{ scale: 0.9, opacity: 0 }}
          animate={{ scale: 1, opacity: 1 }}
          exit={{ scale: 0.9, opacity: 0 }}
          onClick={e => e.stopPropagation()}
          style={{
            backgroundColor: '#0f172a',
            backgroundImage: `
              radial-gradient(at 0% 0%, rgba(139,92,246,0.2) 0px, transparent 50%),
              radial-gradient(at 100% 100%, rgba(239,68,68,0.2) 0px, transparent 50%)
            `,
            border: '1px solid rgba(255,255,255,0.1)',
            borderRadius: '20px',
            width: '100%', maxWidth: '540px',
            maxHeight: '80vh', overflowY: 'auto',
            boxShadow: '0 25px 50px rgba(0,0,0,0.5)',
            padding: '2rem',
          }}
        >
          {/* Header row: avatar, name/headline, close (X) button. */}
          <div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '1.5rem' }}>
            <div style={{ display: 'flex', alignItems: 'center', gap: '1rem' }}>
              <img
                src={profile.avatar_url || `https://ui-avatars.com/api/?name=${encodeURIComponent(profile.full_name || 'User')}&background=random&size=80`}
                alt={profile.full_name}
                style={{ width: 56, height: 56, borderRadius: '50%', objectFit: 'cover', border: '2px solid rgba(239,68,68,0.4)' }}
              />
              <div>
                <h2 style={{ fontSize: '1.4rem', fontWeight: '800', color: '#fff', margin: 0 }}>{profile.full_name}</h2>
                <p style={{ color: '#94a3b8', fontSize: '0.85rem', margin: 0 }}>{profile.headline || profile.role || '—'}</p>
              </div>
            </div>
            <button onClick={onClose} style={{ background: 'none', border: 'none', color: '#64748b', cursor: 'pointer' }}>
              <XIcon />
            </button>
          </div>

          {/* Stats Row — note "Email" deliberately shows only the local part
              (text before the @) of the address. */}
          <div style={{ display: 'grid', gridTemplateColumns: 'repeat(3, 1fr)', gap: '0.75rem', marginBottom: '1.5rem' }}>
            {[
              { label: 'Experience', value: profile.experience_years ? `${profile.experience_years} yrs` : '—' },
              { label: 'Cluster', value: profile.cluster_label || '—' },
              { label: 'Email', value: profile.email ? profile.email.split('@')[0] : '—' },
            ].map(({ label, value }) => (
              <div key={label} style={{
                backgroundColor: 'rgba(255,255,255,0.05)', borderRadius: '10px',
                padding: '0.75rem', border: '1px solid rgba(255,255,255,0.08)'
              }}>
                <p style={{ fontSize: '0.7rem', color: '#64748b', textTransform: 'uppercase', letterSpacing: '0.05em', marginBottom: 4 }}>{label}</p>
                <p style={{ fontSize: '0.85rem', fontWeight: '600', color: '#e2e8f0', wordBreak: 'break-all' }}>{value}</p>
              </div>
            ))}
          </div>

          {/* Summary — only rendered when present. */}
          {profile.summary && (
            <div style={{ marginBottom: '1.5rem' }}>
              <h4 style={{ fontSize: '0.85rem', color: '#94a3b8', fontWeight: '600', marginBottom: '0.5rem', textTransform: 'uppercase', letterSpacing: '0.05em' }}>Summary</h4>
              <p style={{ fontSize: '0.9rem', lineHeight: '1.6', color: '#cbd5e1', backgroundColor: 'rgba(255,255,255,0.04)', padding: '0.75rem', borderRadius: '8px', border: '1px solid rgba(255,255,255,0.05)' }}>
                {profile.summary}
              </p>
            </div>
          )}

          {/* Skills — full (uncapped) list, unlike the 4-chip card preview. */}
          {skills.length > 0 && (
            <div style={{ marginBottom: '1.5rem' }}>
              <h4 style={{ fontSize: '0.85rem', color: '#94a3b8', fontWeight: '600', marginBottom: '0.5rem', textTransform: 'uppercase', letterSpacing: '0.05em' }}>Technical Skills</h4>
              <div style={{ display: 'flex', flexWrap: 'wrap', gap: '0.4rem' }}>
                {skills.map((s, i) => (
                  <span key={i} style={{
                    fontSize: '0.8rem', padding: '4px 10px', borderRadius: '6px',
                    backgroundColor: 'rgba(239,68,68,0.1)', color: '#EF4444',
                    border: '1px solid rgba(239,68,68,0.2)'
                  }}>{s}</span>
                ))}
              </div>
            </div>
          )}

          {/* Education — may be a string or structured data; non-strings are
              JSON.stringify'd as-is (raw JSON will be shown to the admin). */}
          {profile.education && (
            <div>
              <h4 style={{ fontSize: '0.85rem', color: '#94a3b8', fontWeight: '600', marginBottom: '0.5rem', textTransform: 'uppercase', letterSpacing: '0.05em' }}>Education</h4>
              <p style={{ fontSize: '0.85rem', color: '#cbd5e1' }}>
                {typeof profile.education === 'string' ? profile.education : JSON.stringify(profile.education)}
              </p>
            </div>
          )}
        </motion.div>
      </motion.div>
    </AnimatePresence>
  );
};
318
+
319
+ // ─── MAIN PAGE ────────────────────────────────────────────────────────────────
320
+ export default function TalentClusters() {
321
+ const [clusters, setClusters] = useState({}); // { labelName: [profiles] }
322
+ const [isLoading, setIsLoading] = useState(true);
323
+ const [searchQuery, setSearchQuery] = useState('');
324
+ const [selectedProfile, setSelectedProfile] = useState(null);
325
+ const [error, setError] = useState(null);
326
+
327
+ useEffect(() => {
328
+ fetchClusters();
329
+ }, []);
330
+
331
+ const fetchClusters = async () => {
332
+ setIsLoading(true);
333
+ setError(null);
334
+ try {
335
+ const { data, error } = await supabase
336
+ .from('profiles')
337
+ .select('id, full_name, email, avatar_url, headline, role, experience_years, technical_skills, summary, education, cluster_label')
338
+ .not('cluster_label', 'is', null);
339
+
340
+ if (error) throw error;
341
+
342
+ // Group by cluster_label
343
+ const grouped = {};
344
+ data.forEach(profile => {
345
+ const label = profile.cluster_label || 'Uncategorized';
346
+ if (!grouped[label]) grouped[label] = [];
347
+ grouped[label].push(profile);
348
+ });
349
+
350
+ setClusters(grouped);
351
+ } catch (err) {
352
+ console.error('Failed to fetch clusters:', err);
353
+ setError('Failed to load talent clusters. Please try again.');
354
+ } finally {
355
+ setIsLoading(false);
356
+ }
357
+ };
358
+
359
+ const clusterEntries = Object.entries(clusters).sort((a, b) => b[1].length - a[1].length);
360
+ const totalProfiles = Object.values(clusters).reduce((s, arr) => s + arr.length, 0);
361
+
362
+ return (
363
+ <div style={{ paddingBottom: '4rem' }}>
364
+ <style>{`
365
+ .hide-scrollbar::-webkit-scrollbar { display: none; }
366
+ .hide-scrollbar { -ms-overflow-style: none; scrollbar-width: none; }
367
+ @keyframes spin { 100% { transform: rotate(360deg); } }
368
+ @keyframes pulse-dot { 0%,100% { opacity: 1; } 50% { opacity: 0.3; } }
369
+ `}</style>
370
+
371
+ {/* Header */}
372
+ <header style={{ marginBottom: '2rem' }}>
373
+ <div style={{ display: 'flex', alignItems: 'center', gap: '0.75rem', marginBottom: '0.5rem' }}>
374
+ <div style={{ color: '#EF4444' }}><ClusterIcon /></div>
375
+ <h1 style={{ fontSize: '1.875rem', fontWeight: 'bold', margin: 0 }}>Talent Clusters</h1>
376
+ </div>
377
+ <p style={{ color: '#64748b', fontSize: '0.9rem' }}>
378
+ AI-grouped candidate profiles based on skills and experience similarity.
379
+ </p>
380
+ </header>
381
+
382
+ {/* Stats Bar */}
383
+ <div style={{ display: 'flex', gap: '1rem', marginBottom: '2rem', flexWrap: 'wrap' }}>
384
+ {[
385
+ { label: 'Total Clusters', value: clusterEntries.length, color: '#EF4444' },
386
+ { label: 'Total Profiles', value: totalProfiles, color: '#8B5CF6' },
387
+ { label: 'Avg. Cluster Size', value: clusterEntries.length ? Math.round(totalProfiles / clusterEntries.length) : 0, color: '#06B6D4' },
388
+ ].map(({ label, value, color }) => (
389
+ <div key={label} style={{
390
+ flex: 1, minWidth: 140,
391
+ backgroundColor: 'rgba(255,255,255,0.03)',
392
+ border: '1px solid rgba(255,255,255,0.08)',
393
+ borderRadius: '12px', padding: '1rem 1.25rem',
394
+ }}>
395
+ <p style={{ fontSize: '0.75rem', color: '#64748b', textTransform: 'uppercase', letterSpacing: '0.05em', marginBottom: 4 }}>{label}</p>
396
+ <p style={{ fontSize: '1.8rem', fontWeight: '800', color, margin: 0, lineHeight: 1 }}>{isLoading ? '—' : value}</p>
397
+ </div>
398
+ ))}
399
+ </div>
400
+
401
+ {/* Search + Refresh */}
402
+ <div style={{ display: 'flex', gap: '0.75rem', marginBottom: '2rem', alignItems: 'center' }}>
403
+ <div style={{ position: 'relative', flexGrow: 1 }}>
404
+ <div style={{ position: 'absolute', left: 12, top: '50%', transform: 'translateY(-50%)', color: '#64748b' }}>
405
+ <SearchIcon />
406
+ </div>
407
+ <input
408
+ type="text"
409
+ placeholder="Search by name, role, or headline..."
410
+ value={searchQuery}
411
+ onChange={e => setSearchQuery(e.target.value)}
412
+ style={{
413
+ width: '100%', padding: '0.75rem 0.75rem 0.75rem 2.25rem',
414
+ borderRadius: '0.5rem', border: '1px solid rgba(239,68,68,0.3)',
415
+ backgroundColor: 'rgba(255,255,255,0.04)', color: 'white',
416
+ fontSize: '0.9rem', outline: 'none', boxSizing: 'border-box'
417
+ }}
418
+ />
419
+ </div>
420
+ <motion.button
421
+ onClick={fetchClusters}
422
+ whileHover={{ scale: 1.04 }}
423
+ whileTap={{ scale: 0.96 }}
424
+ style={{
425
+ backgroundColor: 'rgba(239,68,68,0.15)', border: '1px solid rgba(239,68,68,0.4)',
426
+ color: '#EF4444', padding: '0.75rem 1.25rem', borderRadius: '0.5rem',
427
+ cursor: 'pointer', fontWeight: '600', fontSize: '0.85rem', whiteSpace: 'nowrap'
428
+ }}
429
+ >
430
+ ↻ Refresh
431
+ </motion.button>
432
+ </div>
433
+
434
+ {/* Content */}
435
+ {isLoading ? (
436
+ <div style={{ display: 'flex', flexDirection: 'column', alignItems: 'center', justifyContent: 'center', height: '300px', gap: '1rem' }}>
437
+ <div style={{
438
+ width: 40, height: 40, border: '3px solid rgba(239,68,68,0.2)',
439
+ borderTopColor: '#EF4444', borderRadius: '50%',
440
+ animation: 'spin 0.8s linear infinite'
441
+ }} />
442
+ <p style={{ color: '#64748b' }}>Loading talent clusters…</p>
443
+ </div>
444
+ ) : error ? (
445
+ <div style={{ textAlign: 'center', padding: '3rem', color: '#EF4444' }}>
446
+ <p>{error}</p>
447
+ <button onClick={fetchClusters} style={{ marginTop: '1rem', backgroundColor: '#EF4444', color: 'white', border: 'none', padding: '0.5rem 1.5rem', borderRadius: '6px', cursor: 'pointer', fontWeight: '600' }}>
448
+ Retry
449
+ </button>
450
+ </div>
451
+ ) : clusterEntries.length === 0 ? (
452
+ <div style={{ textAlign: 'center', padding: '4rem', color: '#64748b' }}>
453
+ <ClusterIcon />
454
+ <p style={{ marginTop: '1rem' }}>No clusters found. Run the clustering pipeline first.</p>
455
+ </div>
456
+ ) : (
457
+ <>
458
+ {/* Cluster grid legend */}
459
+ <div style={{ display: 'flex', flexWrap: 'wrap', gap: '0.5rem', marginBottom: '1.5rem' }}>
460
+ {clusterEntries.map(([label, profiles], idx) => {
461
+ const color = getColor(idx);
462
+ return (
463
+ <span key={label} style={{
464
+ fontSize: '0.78rem', padding: '4px 12px', borderRadius: '99px',
465
+ backgroundColor: `${color.accent}18`, color: color.accent,
466
+ border: `1px solid ${color.accent}44`, fontWeight: '600'
467
+ }}>
468
+ {label} ({profiles.length})
469
+ </span>
470
+ );
471
+ })}
472
+ </div>
473
+
474
+ {/* Cluster cards */}
475
+ {clusterEntries.map(([label, profiles], idx) => (
476
+ <ClusterCard
477
+ key={label}
478
+ label={label}
479
+ profiles={profiles}
480
+ colorIdx={idx}
481
+ searchQuery={searchQuery}
482
+ onViewProfile={setSelectedProfile}
483
+ />
484
+ ))}
485
+ </>
486
+ )}
487
+
488
+ {/* Profile modal */}
489
+ <AnimatePresence>
490
+ {selectedProfile && (
491
+ <ProfileModal profile={selectedProfile} onClose={() => setSelectedProfile(null)} />
492
+ )}
493
+ </AnimatePresence>
494
+ </div>
495
+ );
496
+ }
src/components/JobListings.jsx CHANGED
@@ -1,18 +1,18 @@
1
  import React, { useState, useEffect } from 'react';
2
  import { motion, AnimatePresence } from 'framer-motion';
3
- import { supabase } from '../supabaseClient';
4
- import { SearchIcon } from './Icons';
5
- import JobDetail from './JobDetail';
6
- import ApplyModel from './ApplyModel';
7
- import JobCard from './JobCard';
8
  import VerificationModal from './VerificationModal'; // ✅ Import the new modal
9
 
10
  export default function JobListings({ searchQuery, setSearchQuery, isSearching, filteredJobListings }) {
11
-
12
  const [selectedJob, setSelectedJob] = useState(null);
13
  const [appliedJobIds, setAppliedJobIds] = useState(new Set());
14
- const [applying, setApplying] = useState(null);
15
-
16
  // State for the Apply Modal
17
  const [jobToApply, setJobToApply] = useState(null);
18
 
@@ -28,7 +28,7 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
28
  .from('applications')
29
  .select('job_id')
30
  .eq('user_id', user.id);
31
-
32
  if (data) {
33
  setAppliedJobIds(new Set(data.map(app => app.job_id)));
34
  }
@@ -40,7 +40,7 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
40
  // 2. Open Apply Modal
41
  const initiateApply = (jobId) => {
42
  const job = filteredJobListings.find(j => j.id === jobId);
43
- if(job) {
44
  setJobToApply(job);
45
  }
46
  };
@@ -48,12 +48,12 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
48
  // 3. Submit Application (With Verification Gatekeeper)
49
  const handleFinalSubmit = async (formData) => {
50
  if (!jobToApply) return;
51
-
52
  setApplying(jobToApply.id);
53
-
54
  try {
55
  const { data: { user } } = await supabase.auth.getUser();
56
-
57
  if (!user) {
58
  alert("Please log in to apply.");
59
  return;
@@ -69,30 +69,30 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
69
  if (profileError) throw profileError;
70
 
71
  // If NOT verified, stop the application and show modal
72
- if (!profile.is_phone_verified) {
73
- setApplying(null); // Stop loading spinner
74
- setJobToApply(null); // Close application form
75
- setShowVerificationModal(true); // Open Verification Modal
76
- return; // 🛑 Stop execution here
77
- }
78
 
79
  // --- ✅ IF VERIFIED: Proceed with Application ---
80
  const { error } = await supabase
81
  .from('applications')
82
- .insert([{
83
- job_id: jobToApply.id,
84
  user_id: user.id,
85
  status: 'Pending',
86
- resume_url: formData.resume_url,
87
- cover_letter: formData.cover_letter
88
  }]);
89
 
90
  if (error) throw error;
91
 
92
  setAppliedJobIds(prev => new Set(prev).add(jobToApply.id));
93
- alert("Application submitted successfully!");
94
-
95
- setJobToApply(null);
96
 
97
  } catch (error) {
98
  console.error("Error applying:", error.message);
@@ -144,16 +144,16 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
144
  <input type="text" value={searchQuery} onChange={(e) => setSearchQuery(e.target.value)} placeholder="Search by job title..." style={{ width: '100%', padding: '0.75rem 1rem 0.75rem 2.5rem', borderRadius: '0.5rem', border: '1px solid rgba(251, 191, 36, 0.3)', backgroundColor: 'rgba(255,255,255,0.1)', color: 'white' }} />
145
  </div>
146
  </div>
147
-
148
  {/* Job Grid */}
149
  <motion.main layout style={{ display: 'grid', gridTemplateColumns: 'repeat(auto-fit, minmax(300px, 1fr))', gap: '2rem' }}>
150
  <AnimatePresence>
151
  {filteredJobListings.length > 0 ? (
152
  filteredJobListings.map((job) => (
153
  <motion.div key={job.id} layout initial={{ opacity: 0, scale: 0.8 }} animate={{ opacity: 1, scale: 1 }} exit={{ opacity: 0, scale: 0.8 }} transition={{ duration: 0.2 }}>
154
- <JobCard
155
- {...job}
156
- onViewDetails={() => setSelectedJob(job)}
157
  onApply={() => initiateApply(job.id)}
158
  onWithdraw={() => handleWithdraw(job.id)}
159
  isApplied={appliedJobIds.has(job.id)}
@@ -162,16 +162,16 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
162
  </motion.div>
163
  ))
164
  ) : (
165
- <motion.p initial={{opacity: 0}} animate={{opacity: 1}} style={{ color: '#d1d5db' }}>No jobs found.</motion.p>
166
  )}
167
  </AnimatePresence>
168
  </motion.main>
169
 
170
  {/* Job Detail Modal */}
171
  {selectedJob && (
172
- <JobDetail
173
- job={selectedJob}
174
- onClose={() => setSelectedJob(null)}
175
  onApply={() => initiateApply(selectedJob.id)}
176
  isApplied={appliedJobIds.has(selectedJob.id)}
177
  isApplying={applying === selectedJob.id}
@@ -180,7 +180,7 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
180
 
181
  {/* Apply Form Modal */}
182
  {jobToApply && (
183
- <ApplyModel
184
  job={jobToApply}
185
  isSubmitting={applying === jobToApply.id}
186
  onClose={() => setJobToApply(null)}
@@ -190,7 +190,7 @@ export default function JobListings({ searchQuery, setSearchQuery, isSearching,
190
 
191
  {/* ✅ OTP Verification Modal */}
192
  {showVerificationModal && (
193
- <VerificationModal
194
  onClose={() => setShowVerificationModal(false)}
195
  onVerified={() => {
196
  setShowVerificationModal(false);
 
1
  import React, { useState, useEffect } from 'react';
2
  import { motion, AnimatePresence } from 'framer-motion';
3
+ import { supabase } from '../supabaseClient';
4
+ import { SearchIcon } from './Icons';
5
+ import JobDetail from './JobDetail';
6
+ import ApplyModel from './ApplyModel';
7
+ import JobCard from './JobCard';
8
  import VerificationModal from './VerificationModal'; // ✅ Import the new modal
9
 
10
  export default function JobListings({ searchQuery, setSearchQuery, isSearching, filteredJobListings }) {
11
+
12
  const [selectedJob, setSelectedJob] = useState(null);
13
  const [appliedJobIds, setAppliedJobIds] = useState(new Set());
14
+ const [applying, setApplying] = useState(null);
15
+
16
  // State for the Apply Modal
17
  const [jobToApply, setJobToApply] = useState(null);
18
 
 
28
  .from('applications')
29
  .select('job_id')
30
  .eq('user_id', user.id);
31
+
32
  if (data) {
33
  setAppliedJobIds(new Set(data.map(app => app.job_id)));
34
  }
 
40
  // 2. Open Apply Modal
41
  const initiateApply = (jobId) => {
42
  const job = filteredJobListings.find(j => j.id === jobId);
43
+ if (job) {
44
  setJobToApply(job);
45
  }
46
  };
 
48
  // 3. Submit Application (With Verification Gatekeeper)
49
  const handleFinalSubmit = async (formData) => {
50
  if (!jobToApply) return;
51
+
52
  setApplying(jobToApply.id);
53
+
54
  try {
55
  const { data: { user } } = await supabase.auth.getUser();
56
+
57
  if (!user) {
58
  alert("Please log in to apply.");
59
  return;
 
69
  if (profileError) throw profileError;
70
 
71
  // If NOT verified, stop the application and show modal
72
+ /** if (!profile.is_phone_verified) {
73
+ setApplying(null); // Stop loading spinner
74
+ setJobToApply(null); // Close application form
75
+ setShowVerificationModal(true); // Open Verification Modal
76
+ return; // 🛑 Stop execution here
77
+ } **/
78
 
79
  // --- ✅ IF VERIFIED: Proceed with Application ---
80
  const { error } = await supabase
81
  .from('applications')
82
+ .insert([{
83
+ job_id: jobToApply.id,
84
  user_id: user.id,
85
  status: 'Pending',
86
+ resume_url: formData.resume_url,
87
+ cover_letter: formData.cover_letter
88
  }]);
89
 
90
  if (error) throw error;
91
 
92
  setAppliedJobIds(prev => new Set(prev).add(jobToApply.id));
93
+ alert("Application submitted successfully!");
94
+
95
+ setJobToApply(null);
96
 
97
  } catch (error) {
98
  console.error("Error applying:", error.message);
 
144
  <input type="text" value={searchQuery} onChange={(e) => setSearchQuery(e.target.value)} placeholder="Search by job title..." style={{ width: '100%', padding: '0.75rem 1rem 0.75rem 2.5rem', borderRadius: '0.5rem', border: '1px solid rgba(251, 191, 36, 0.3)', backgroundColor: 'rgba(255,255,255,0.1)', color: 'white' }} />
145
  </div>
146
  </div>
147
+
148
  {/* Job Grid */}
149
  <motion.main layout style={{ display: 'grid', gridTemplateColumns: 'repeat(auto-fit, minmax(300px, 1fr))', gap: '2rem' }}>
150
  <AnimatePresence>
151
  {filteredJobListings.length > 0 ? (
152
  filteredJobListings.map((job) => (
153
  <motion.div key={job.id} layout initial={{ opacity: 0, scale: 0.8 }} animate={{ opacity: 1, scale: 1 }} exit={{ opacity: 0, scale: 0.8 }} transition={{ duration: 0.2 }}>
154
+ <JobCard
155
+ {...job}
156
+ onViewDetails={() => setSelectedJob(job)}
157
  onApply={() => initiateApply(job.id)}
158
  onWithdraw={() => handleWithdraw(job.id)}
159
  isApplied={appliedJobIds.has(job.id)}
 
162
  </motion.div>
163
  ))
164
  ) : (
165
+ <motion.p initial={{ opacity: 0 }} animate={{ opacity: 1 }} style={{ color: '#d1d5db' }}>No jobs found.</motion.p>
166
  )}
167
  </AnimatePresence>
168
  </motion.main>
169
 
170
  {/* Job Detail Modal */}
171
  {selectedJob && (
172
+ <JobDetail
173
+ job={selectedJob}
174
+ onClose={() => setSelectedJob(null)}
175
  onApply={() => initiateApply(selectedJob.id)}
176
  isApplied={appliedJobIds.has(selectedJob.id)}
177
  isApplying={applying === selectedJob.id}
 
180
 
181
  {/* Apply Form Modal */}
182
  {jobToApply && (
183
+ <ApplyModel
184
  job={jobToApply}
185
  isSubmitting={applying === jobToApply.id}
186
  onClose={() => setJobToApply(null)}
 
190
 
191
  {/* ✅ OTP Verification Modal */}
192
  {showVerificationModal && (
193
+ <VerificationModal
194
  onClose={() => setShowVerificationModal(false)}
195
  onVerified={() => {
196
  setShowVerificationModal(false);
src/pages/Admindashboard.jsx CHANGED
@@ -1,6 +1,6 @@
1
  import React, { useState } from 'react';
2
  import { motion, AnimatePresence } from 'framer-motion';
3
- import { supabase } from '../supabaseClient';
4
 
5
  // Import the new split modules
6
  import AdminLayout from '../components/admin/AdminLayout';
@@ -9,6 +9,7 @@ import AdminSortingPage from '../components/admin/AdminSortingPage';
9
  import AdminInterviewManagement from '../components/admin/AdminInterviewManagement';
10
  import AdminProfile from '../components/admin/AdminProfile';
11
  import JobPosting from './JobPosting'; // Import your existing JobPosting component
 
12
 
13
  export default function AdminDashboard({ onNavigate }) {
14
  const [activeTab, setActiveTab] = useState('dashboard');
@@ -18,23 +19,25 @@ export default function AdminDashboard({ onNavigate }) {
18
  switch (activeTab) {
19
  case 'dashboard':
20
  return <AdminSummary onNavigate={onNavigate} setIsModalOpen={setIsModalOpen} />;
21
- case 'jobs':
22
  return <AdminSortingPage />;
23
- case 'messages':
24
  return <AdminInterviewManagement />;
25
- case 'job-management':
26
  return <JobPosting />;
27
- case 'settings':
 
 
28
  return <AdminProfile onNavigate={onNavigate} />;
29
- default:
30
  return null;
31
  }
32
  };
33
 
34
- const contentVariants = {
35
- hidden: { opacity: 0, y: 10 },
36
- visible: { opacity: 1, y: 0 },
37
- exit: { opacity: 0, y: -10 }
38
  };
39
 
40
  return (
 
1
  import React, { useState } from 'react';
2
  import { motion, AnimatePresence } from 'framer-motion';
3
+ import { supabase } from '../supabaseClient';
4
 
5
  // Import the new split modules
6
  import AdminLayout from '../components/admin/AdminLayout';
 
9
  import AdminInterviewManagement from '../components/admin/AdminInterviewManagement';
10
  import AdminProfile from '../components/admin/AdminProfile';
11
  import JobPosting from './JobPosting'; // Import your existing JobPosting component
12
+ import TalentClusters from '../components/Admin/TalentClusters';
13
 
14
  export default function AdminDashboard({ onNavigate }) {
15
  const [activeTab, setActiveTab] = useState('dashboard');
 
19
  switch (activeTab) {
20
  case 'dashboard':
21
  return <AdminSummary onNavigate={onNavigate} setIsModalOpen={setIsModalOpen} />;
22
+ case 'jobs':
23
  return <AdminSortingPage />;
24
+ case 'messages':
25
  return <AdminInterviewManagement />;
26
+ case 'job-management':
27
  return <JobPosting />;
28
+ case 'clusters':
29
+ return <TalentClusters />;
30
+ case 'settings':
31
  return <AdminProfile onNavigate={onNavigate} />;
32
+ default:
33
  return null;
34
  }
35
  };
36
 
37
+ const contentVariants = {
38
+ hidden: { opacity: 0, y: 10 },
39
+ visible: { opacity: 1, y: 0 },
40
+ exit: { opacity: 0, y: -10 }
41
  };
42
 
43
  return (
src/pages/ApplicantProfile.jsx CHANGED
@@ -25,7 +25,7 @@ export default function ApplicantProfile({ onNavigate }) {
25
  try {
26
  // Get current user
27
  const { data: { user } } = await supabase.auth.getUser();
28
-
29
  if (user) {
30
  // Fetch Profile using maybeSingle() to avoid errors if empty
31
  const { data: profile, error } = await supabase
@@ -44,7 +44,7 @@ export default function ApplicantProfile({ onNavigate }) {
44
  setFormData(combinedData);
45
  setOriginalFormData(combinedData);
46
  if (profile.avatar_url) {
47
- setAvatarUrl(profile.avatar_url);
48
  }
49
  } else {
50
  // New user - Initialize with just email
@@ -90,7 +90,7 @@ export default function ApplicantProfile({ onNavigate }) {
90
  const newValue = type === 'checkbox' ? checked : value;
91
  setFormData(prev => ({ ...prev, [name]: newValue }));
92
  };
93
-
94
  const handleAddExperience = () => {
95
  const newExperience = { id: Date.now(), company: '', role: '', years: '' };
96
  setFormData(prev => ({
@@ -98,7 +98,7 @@ export default function ApplicantProfile({ onNavigate }) {
98
  work_experience: [...(prev.work_experience || []), newExperience]
99
  }));
100
  };
101
-
102
  const handleExperienceChange = (index, e) => {
103
  const { name, value } = e.target;
104
  const updatedExperience = [...(formData.work_experience || [])];
@@ -110,7 +110,7 @@ export default function ApplicantProfile({ onNavigate }) {
110
  if (!isEditing || !e.target.files || e.target.files.length === 0) return;
111
  setResumeFile(e.target.files[0]);
112
  };
113
-
114
  const handleAvatarFileChange = (e) => {
115
  if (!isEditing || !e.target.files || e.target.files.length === 0) return;
116
  const file = e.target.files[0];
@@ -135,12 +135,23 @@ export default function ApplicantProfile({ onNavigate }) {
135
  }
136
 
137
  if (resumeFile) {
 
 
 
 
 
 
 
 
 
 
 
138
  const filePath = `${user.id}/${Date.now()}_${resumeFile.name}`;
139
  // Make sure your bucket is named 'resumes' (plural) or 'resume' (singular) to match your Supabase Storage
140
  await supabase.storage.from('resume').upload(filePath, resumeFile, { upsert: true });
141
  updates.resume_url = filePath;
142
  }
143
-
144
  const { error } = await supabase.from('profiles').upsert(updates);
145
  if (error) throw error;
146
 
 
25
  try {
26
  // Get current user
27
  const { data: { user } } = await supabase.auth.getUser();
28
+
29
  if (user) {
30
  // Fetch Profile using maybeSingle() to avoid errors if empty
31
  const { data: profile, error } = await supabase
 
44
  setFormData(combinedData);
45
  setOriginalFormData(combinedData);
46
  if (profile.avatar_url) {
47
+ setAvatarUrl(profile.avatar_url);
48
  }
49
  } else {
50
  // New user - Initialize with just email
 
90
  const newValue = type === 'checkbox' ? checked : value;
91
  setFormData(prev => ({ ...prev, [name]: newValue }));
92
  };
93
+
94
  const handleAddExperience = () => {
95
  const newExperience = { id: Date.now(), company: '', role: '', years: '' };
96
  setFormData(prev => ({
 
98
  work_experience: [...(prev.work_experience || []), newExperience]
99
  }));
100
  };
101
+
102
  const handleExperienceChange = (index, e) => {
103
  const { name, value } = e.target;
104
  const updatedExperience = [...(formData.work_experience || [])];
 
110
  if (!isEditing || !e.target.files || e.target.files.length === 0) return;
111
  setResumeFile(e.target.files[0]);
112
  };
113
+
114
  const handleAvatarFileChange = (e) => {
115
  if (!isEditing || !e.target.files || e.target.files.length === 0) return;
116
  const file = e.target.files[0];
 
135
  }
136
 
137
  if (resumeFile) {
138
+ // Delete old resume if it exists to prevent duplication
139
+ if (originalFormData?.resume_url) {
140
+ try {
141
+ const oldPath = originalFormData.resume_url;
142
+ const { error: removeError } = await supabase.storage.from('resume').remove([oldPath]);
143
+ if (removeError) console.warn("Could not delete old resume:", removeError.message);
144
+ } catch (e) {
145
+ console.warn("Exception during old resume removal:", e);
146
+ }
147
+ }
148
+
149
  const filePath = `${user.id}/${Date.now()}_${resumeFile.name}`;
150
  // Make sure your bucket is named 'resumes' (plural) or 'resume' (singular) to match your Supabase Storage
151
  await supabase.storage.from('resume').upload(filePath, resumeFile, { upsert: true });
152
  updates.resume_url = filePath;
153
  }
154
+
155
  const { error } = await supabase.from('profiles').upsert(updates);
156
  if (error) throw error;
157
 
system_architecture.txt ADDED
@@ -0,0 +1,66 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # IRIS Detailed System Architecture
2
+
3
+ This document provides a comprehensive look at the IRIS architecture, broken down by functional layers and individual process steps.
4
+
5
+ ## Overall System Flow
6
+
7
+ This tiered diagram shows how data flows through the three main layers of the system.
8
+
9
+ ```mermaid
10
+ graph TD
11
+ subgraph "1. Ingestion & Preprocessing"
12
+ UC[User/Admin] -->|Upload| SS[Supabase Storage]
13
+ SS -->|Webhook| BE[FastAPI Backend]
14
+ BE -->|Download| PC[Text Cleaning]
15
+ PC -->|Anonymize| PA[PII Removal]
16
+ end
17
+
18
+ subgraph "2. NLP Processing Layer"
19
+ PA -->|Raw Text| EX[Gemini Extraction]
20
+ EX -->|JSON| DB[(Supabase DB)]
21
+ DB -->|Text Fields| EM[BGE-M3 Embedding]
22
+ EM -->|Vectors| DB
23
+ end
24
+
25
+ subgraph "3. Matching & AI Analysis"
26
+ DB -->|Job vs Resume| MS[Semantic Matching]
27
+ MS -->|Score| MG[Skill Gap Analysis]
28
+ MG -->|Insights| AI[Gemini Analysis]
29
+ AI -->|Final Report| UI[Admin Dashboard]
30
+ end
31
+ ```
32
+
33
+ ---
34
+
35
+ ## 1. Data Ingestion & Preprocessing
36
+ This layer ensures that incoming data is clean, secure, and ready for AI processing.
37
+
38
+ * **File Upload**: Resumes and Job Descriptions are stored securely in Supabase buckets.
39
+ * **Event Trigger**: Database Webhooks instantly notify the backend when a new file arrives.
40
+ * **Text Cleaning**: Standardizes encoding, removes special characters, and normalizes whitespace.
41
+ * **PII Anonymization**: Uses Regex and NLP patterns to detect and redact sensitive personal information (e.g., phone numbers and addresses) before deep processing.
42
+
43
+ ## 2. NLP Processing Pipeline
44
+ The "Intelligence" layer that understands the meaning behind the text.
45
+
46
+ * **Structured Extraction**: Google Gemini parses unstructured text into logical objects (Skills, Experience, Education).
47
+ * **Relational Storage**: Structured data is saved into dedicated PostgreSQL tables for rapid querying.
48
+ * **Vector Embedding**: The BGE-M3 model converts the candidate's profile and the job requirements into dense numerical vectors (embeddings) that capture their meaning.
49
+ * **Vector Search Index**: These vectors allow the system to find matches based on *meaning* rather than just keywords (e.g., matching "Software Engineer" with "Full Stack Developer").
50
+
51
+ ## 3. Matching & AI Analysis Layer
52
+ The decision-making layer that provides final value to the recruiter.
53
+
54
+ * **Semantic Scoring**: Scores a candidate against a job by measuring how close their embedding vectors are — the closer the vectors, the stronger the match.
55
+ * **Skill Gap Analysis**: Compares the extracted skill sets to identify exactly what is missing or where the candidate excels.
56
+ * **AI Insight Generation**: A second pass with Gemini generates a human-readable summary, custom strengths, and potential weaknesses.
57
+ * **Final Ranking**: Aggregates all scores into a prioritized list for the Admin dashboard.
58
+
59
+ ## Technology Stack
60
+
61
+ | Layer | Technologies |
62
+ | :--- | :--- |
63
+ | **Frontend** | React, Vite, Framer Motion, Lucide Icons |
64
+ | **Backend** | FastAPI, Python, SQLAlchemy/Supabase-py |
65
+ | **Data** | Supabase (Postgres), pgvector, Supabase Storage |
66
+ | **AI/ML** | Google Gemini (LLM), BGE-M3 (Embeddings), Sentence Transformers |