galihboy committed on
Commit
61ccccb
·
verified ·
1 Parent(s): 6b644a1

Upload 3 files

Files changed (3)
  1. README.md +82 -14
  2. app.py +412 -8
  3. requirements.txt +2 -0
README.md CHANGED
@@ -1,14 +1,82 @@
- ---
- title: Semantic Embedding Api
- emoji: 🏆
- colorFrom: red
- colorTo: blue
- sdk: gradio
- sdk_version: 6.0.1
- app_file: app.py
- pinned: false
- license: mit
- short_description: Semantic similarity API for thesis proposal data
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: Semantic Embedding API
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Embedding + LLM analysis for detecting proposal similarity
+ ---
+
+ # 🤖 Semantic Embedding & LLM Analysis API
+
+ API for detecting similarity between undergraduate thesis (skripsi) proposals using AI embeddings and Google Gemini.
+
+ ## Features
+
+ ### Embedding (Sentence Transformers)
+ - **Single/Batch Embedding** - Generate 384-dimensional embedding vectors
+ - **Similarity Check** - Compute semantic similarity
+ - **Supabase Cache** - Shared cache for performance
+
+ ### LLM Analysis (Google Gemini)
+ - **In-Depth Analysis** - Reasoning like a human reviewer
+ - **Verdict** - AMAN / PERLU_REVIEW / BERMASALAH (safe / needs review / problematic)
+ - **Concrete Suggestions** - Recommendations for students
+ - **Auto Cache** - Results are saved to Supabase
+
+ ## Model & Tech
+
+ | Component | Technology |
+ |-----------|------------|
+ | Embedding | `paraphrase-multilingual-MiniLM-L12-v2` (384 dim) |
+ | LLM | Google Gemini 2.5 Pro |
+ | Cache | Supabase (PostgreSQL) |
+ | API | Gradio |
+
+ ## Required Secrets
+
+ Set these under **Settings > Repository secrets**:
+
+ ```
+ SUPABASE_URL - Supabase project URL
+ SUPABASE_KEY - Supabase anon/service key
+ GEMINI_API_KEY_1 - Gemini API key #1
+ GEMINI_API_KEY_2 - Gemini API key #2 (optional)
+ GEMINI_API_KEY_3 - Gemini API key #3 (optional)
+ GEMINI_API_KEY_4 - Gemini API key #4 (optional)
+ ```
+
+ ## API Endpoints
+
+ | Endpoint | Function |
+ |----------|----------|
+ | `/get_embedding` | Single text embedding |
+ | `/get_embeddings_batch` | Batch embeddings |
+ | `/calculate_similarity` | Cosine similarity |
+ | `/db_get_all_embeddings` | Get cached embeddings |
+ | `/db_save_embedding` | Save embedding (API only) |
+ | `/llm_check_status` | Check Gemini status |
+ | `/llm_analyze_pair` | Full LLM analysis |
+
+ ## Built For
+
+ **Monitoring Proposal Skripsi**
+ KK E (Ilmu Komputer) - Prodi Teknik Informatika
+ Universitas Komputer Indonesia (UNIKOM)
+
+ 🔗 [Website](https://galih-hermawan-unikom.github.io/monitoring-proksi/)
+
+ ## Developer
+
+ **Galih Hermawan**
+ 🌐 [galih.eu](https://galih.eu) • 🐙 [github.com/galihboy](https://github.com/galihboy) • 🐙 [github.com/Galih-Hermawan-Unikom](https://github.com/Galih-Hermawan-Unikom)
+
+ 📅 Last updated: 30 November 2025
+
+ ## License
+
+ MIT License
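
The README lists `/calculate_similarity` as a cosine similarity over the embedding vectors. A minimal standard-library sketch of that formula follows; the function name `cosine_similarity`, the zero-vector handling, and the toy vectors are illustrative assumptions, not code taken from app.py.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a zero vector is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real 384-dimensional embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors that point in the same direction score close to 1 regardless of their length, which is why the measure suits comparing embeddings of texts with different lengths.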
app.py CHANGED
@@ -4,6 +4,15 @@ import json
  import numpy as np
  import os
  import httpx

  # ==================== CONFIGURATION ====================

@@ -19,6 +28,83 @@ LOCAL_MODEL_PATH = r"E:\huggingface_models\hub\models--sentence-transformers--pa
  SUPABASE_URL = os.environ.get("SUPABASE_URL", "")
  SUPABASE_KEY = os.environ.get("SUPABASE_KEY", "")

  def get_model_path():
      """Detect the environment and return the matching model path"""
      # Check whether the local folder exists
@@ -212,6 +298,269 @@ def db_check_connection():
          return {"connected": False, "error": str(e)}


  # Gradio Interface
  with gr.Blocks(title="Semantic Embedding API") as demo:
      gr.Markdown("# 🔤 Semantic Embedding API")
@@ -252,6 +601,7 @@ with gr.Blocks(title="Semantic Embedding API") as demo:
      with gr.Tab("💾 Database (Supabase)"):
          gr.Markdown("### Supabase Cache Operations")
          gr.Markdown("Proxy untuk akses Supabase (API key aman di server)")

          with gr.Row():
              db_check_btn = gr.Button("🔌 Check Connection", variant="secondary")
@@ -274,18 +624,72 @@ with gr.Blocks(title="Semantic Embedding API") as demo:
          db_get_btn = gr.Button("🔍 Get Embedding", variant="primary")
          db_get_output = gr.JSON(label="Embedding Result")
          db_get_btn.click(fn=db_get_embedding, inputs=[db_nim_input, db_hash_input], outputs=db_get_output)

          gr.Markdown("---")

-         gr.Markdown("#### Save Embedding")
-         db_save_input = gr.Textbox(
-             label="Embedding Data (JSON)",
-             placeholder='{"nim": "123", "content_hash": "abc", "embedding_combined": [...], ...}',
-             lines=4
          )
-         db_save_btn = gr.Button("💾 Save Embedding", variant="primary")
-         db_save_output = gr.JSON(label="Save Result")
-         db_save_btn.click(fn=db_save_embedding, inputs=db_save_input, outputs=db_save_output)

      with gr.Accordion("📡 API Usage (untuk Developer)", open=False):
          gr.Markdown("""
 
  import numpy as np
  import os
  import httpx
+ import hashlib
+ from dotenv import load_dotenv
+
+ # Load environment variables from .env file
+ load_dotenv()
+
+ # Google GenAI SDK (new library)
+ from google import genai
+ from google.genai import types

  # ==================== CONFIGURATION ====================

 
  SUPABASE_URL = os.environ.get("SUPABASE_URL", "")
  SUPABASE_KEY = os.environ.get("SUPABASE_KEY", "")

+ # Gemini API configuration with key rotation
+ GEMINI_MODEL = os.environ.get("GEMINI_MODEL", "gemini-2.5-pro")  # or gemini-2.5-flash, gemini-2.5-flash-lite
+
+ # Load multiple API keys for rotation
+ GEMINI_API_KEYS = []
+ for i in range(1, 10):  # Support up to 9 keys
+     key = os.environ.get(f"GEMINI_API_KEY_{i}", "")
+     if key:
+         GEMINI_API_KEYS.append(key)
+
+ # Fallback to single key if no numbered keys found
+ if not GEMINI_API_KEYS:
+     single_key = os.environ.get("GEMINI_API_KEY", "")
+     if single_key:
+         GEMINI_API_KEYS.append(single_key)
+
+ # Track current key index for rotation
+ current_key_index = 0
+
+ def get_gemini_client():
+     """Get Gemini client with current API key"""
+     global current_key_index
+     if not GEMINI_API_KEYS:
+         return None
+     return genai.Client(api_key=GEMINI_API_KEYS[current_key_index])
+
+ def rotate_api_key():
+     """Rotate to next API key"""
+     global current_key_index
+     if len(GEMINI_API_KEYS) > 1:
+         current_key_index = (current_key_index + 1) % len(GEMINI_API_KEYS)
+         print(f"🔄 Rotated to API key #{current_key_index + 1}")
+     return current_key_index
+
+ def call_gemini_with_retry(prompt: str, max_retries: int = None):
+     """Call Gemini API with automatic key rotation on rate limit"""
+     global current_key_index
+
+     if not GEMINI_API_KEYS:
+         return None, "No API keys configured"
+
+     if max_retries is None:
+         max_retries = len(GEMINI_API_KEYS)
+
+     last_error = None
+
+     for attempt in range(max_retries):
+         try:
+             client = get_gemini_client()
+             response = client.models.generate_content(
+                 model=GEMINI_MODEL,
+                 contents=prompt
+             )
+             return response, None
+
+         except Exception as e:
+             error_str = str(e).lower()
+             last_error = str(e)
+
+             # Check if rate limit error
+             if "429" in error_str or "rate" in error_str or "quota" in error_str or "resource" in error_str:
+                 print(f"⚠️ Rate limit hit on key #{current_key_index + 1}: {e}")
+                 rotate_api_key()
+                 continue
+             else:
+                 # Non-rate-limit error, don't retry
+                 return None, str(e)
+
+     return None, f"All API keys exhausted. Last error: {last_error}"
+
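
The rotation logic in this hunk can be exercised without calling Gemini. The sketch below restates it with an injected `send` callable; `KeyRotator`, `is_rate_limit_error`, and `fake_send` are hypothetical names introduced for illustration, and only the substring markers and the "one attempt per key" default mirror the diff.

```python
from typing import Callable, List, Optional, Tuple

# Same substrings the diff checks for in the lowercased error message.
RATE_LIMIT_MARKERS = ("429", "rate", "quota", "resource")


def is_rate_limit_error(exc: Exception) -> bool:
    """Classify an exception as a rate-limit error by its message text."""
    text = str(exc).lower()
    return any(marker in text for marker in RATE_LIMIT_MARKERS)


class KeyRotator:
    """Try each key at most once, moving on only after rate-limit errors."""

    def __init__(self, keys: List[str]):
        self.keys = list(keys)
        self.index = 0

    def call(
        self, send: Callable[[str, str], str], prompt: str
    ) -> Tuple[Optional[str], Optional[str]]:
        if not self.keys:
            return None, "No API keys configured"
        last_error = None
        for _ in range(len(self.keys)):
            try:
                return send(self.keys[self.index], prompt), None
            except Exception as exc:
                last_error = str(exc)
                if not is_rate_limit_error(exc):
                    # Any other failure is returned immediately, as in the diff.
                    return None, last_error
                self.index = (self.index + 1) % len(self.keys)
        return None, f"All API keys exhausted. Last error: {last_error}"


def fake_send(key: str, prompt: str) -> str:
    """Stand-in for the real API call: the first key is rate limited."""
    if key == "key-1":
        raise RuntimeError("429 Too Many Requests")
    return f"{key} answered: {prompt}"


rotator = KeyRotator(["key-1", "key-2"])
text, error = rotator.call(fake_send, "ping")
```

Keeping the index on an object rather than in a module-level global also makes it easier to guard with a lock if the Space serves concurrent requests; whether that matters here depends on how Gradio is configured.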
+ # Initialize and print status
+ if GEMINI_API_KEYS:
+     print(f"✅ Gemini configured with {len(GEMINI_API_KEYS)} API key(s)")
+     print(f" Model: {GEMINI_MODEL}")
+ else:
+     print("⚠️ No Gemini API keys found")
+
  def get_model_path():
      """Detect the environment and return the matching model path"""
      # Check whether the local folder exists
 
          return {"connected": False, "error": str(e)}


+ # ==================== LLM CACHE FUNCTIONS (SUPABASE) ====================
+
+ def db_get_llm_analysis(pair_hash: str):
+     """Fetch a cached LLM analysis from Supabase by pair_hash"""
+     if not SUPABASE_URL or not SUPABASE_KEY:
+         return None
+
+     try:
+         url = f"{SUPABASE_URL}/rest/v1/llm_analysis?pair_hash=eq.{pair_hash}&select=*"
+
+         with httpx.Client(timeout=10.0) as client:
+             response = client.get(url, headers=get_supabase_headers())
+
+             if response.status_code == 200:
+                 data = response.json()
+                 if data and len(data) > 0:
+                     result = data[0]
+                     # Parse similar_aspects from JSONB
+                     if isinstance(result.get('similar_aspects'), str):
+                         result['similar_aspects'] = json.loads(result['similar_aspects'])
+                     result['from_cache'] = True
+                     return result
+         return None
+     except Exception as e:
+         print(f"Error getting cached LLM analysis: {e}")
+         return None
+
+
+ def db_save_llm_analysis(pair_hash: str, proposal1_judul: str, proposal2_judul: str, result: dict):
+     """Save an LLM analysis result to Supabase"""
+     if not SUPABASE_URL or not SUPABASE_KEY:
+         return False
+
+     try:
+         url = f"{SUPABASE_URL}/rest/v1/llm_analysis"
+         headers = get_supabase_headers()
+         headers["Prefer"] = "resolution=merge-duplicates"  # Upsert
+
+         payload = {
+             "pair_hash": pair_hash,
+             "proposal1_judul": proposal1_judul[:500] if proposal1_judul else "",
+             "proposal2_judul": proposal2_judul[:500] if proposal2_judul else "",
+             "similarity_score": result.get("similarity_score"),
+             "verdict": result.get("verdict"),
+             "reasoning": result.get("reasoning"),
+             "saran": result.get("saran"),
+             "similar_aspects": json.dumps(result.get("similar_aspects", {})),
+             "differentiator": result.get("differentiator"),
+             "model_used": result.get("model_used", GEMINI_MODEL)
+         }
+
+         with httpx.Client(timeout=10.0) as client:
+             response = client.post(url, headers=headers, json=payload)
+
+             if response.status_code in [200, 201]:
+                 print(f"✅ LLM result cached: {pair_hash[:8]}...")
+                 return True
+             else:
+                 print(f"⚠️ Failed to cache LLM result: {response.status_code}")
+                 return False
+     except Exception as e:
+         print(f"Error saving LLM analysis: {e}")
+         return False
+
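
The two cache functions in this hunk serialize `similar_aspects` with `json.dumps` on write and decode it again on read when it arrives as a string. The sketch below isolates that round trip as pure functions, without Supabase; `build_cache_row` and `read_cache_row` are hypothetical helper names, and the row is reduced to a few fields for brevity.

```python
import json

TITLE_LIMIT = 500  # same truncation length the diff applies to both titles


def build_cache_row(pair_hash, title1, title2, result):
    """Build a reduced version of the payload sent to the cache table."""
    return {
        "pair_hash": pair_hash,
        "proposal1_judul": (title1 or "")[:TITLE_LIMIT],
        "proposal2_judul": (title2 or "")[:TITLE_LIMIT],
        "verdict": result.get("verdict"),
        # Stored as a JSON string, as in the diff.
        "similar_aspects": json.dumps(result.get("similar_aspects", {})),
    }


def read_cache_row(row):
    """Decode a cached row back into the shape the caller expects."""
    restored = dict(row)
    aspects = restored.get("similar_aspects")
    if isinstance(aspects, str):
        restored["similar_aspects"] = json.loads(aspects)
    restored["from_cache"] = True
    return restored


analysis = {"verdict": "AMAN", "similar_aspects": {"topik": True, "metode": False}}
row = build_cache_row("abc123", "T" * 600, None, analysis)
restored = read_cache_row(row)
```

The `isinstance(..., str)` guard matters because a JSONB column may hand the value back either as an already-decoded object or as a string, depending on how it was written.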
+
+ # ==================== LLM FUNCTIONS (GEMINI) ====================
+
+ def generate_pair_hash(proposal1: dict, proposal2: dict) -> str:
+     """Generate a unique hash for a pair of proposals"""
+     def proposal_hash(p):
+         content = f"{p.get('nim', '')}|{p.get('judul', '')}|{p.get('deskripsi', '')}|{p.get('problem', '')}|{p.get('metode', '')}"
+         return hashlib.md5(content.encode()).hexdigest()[:16]
+
+     h1 = proposal_hash(proposal1)
+     h2 = proposal_hash(proposal2)
+     # Sort for consistency (A,B = B,A)
+     sorted_hashes = sorted([h1, h2])
+     return hashlib.md5(f"{sorted_hashes[0]}|{sorted_hashes[1]}".encode()).hexdigest()[:32]
+
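
Sorting the two per-proposal digests before hashing them together is what makes the cache key independent of argument order. A small self-contained restatement of that idea, with made-up proposal data and the helper renamed to `pair_hash` for illustration:

```python
import hashlib

FIELDS = ("nim", "judul", "deskripsi", "problem", "metode")


def proposal_hash(proposal: dict) -> str:
    """Digest of the fields that identify one proposal's content."""
    content = "|".join(str(proposal.get(field, "")) for field in FIELDS)
    return hashlib.md5(content.encode()).hexdigest()[:16]


def pair_hash(proposal1: dict, proposal2: dict) -> str:
    """Order-independent key: sort the two digests before combining them."""
    first, second = sorted([proposal_hash(proposal1), proposal_hash(proposal2)])
    return hashlib.md5(f"{first}|{second}".encode()).hexdigest()[:32]


# Made-up proposals for illustration only.
a = {"nim": "123", "judul": "Analisis Sentimen dengan SVM", "metode": "SVM"}
b = {"nim": "456", "judul": "Klasifikasi Sentimen dengan SVM", "metode": "SVM"}
b_edited = dict(b, metode="Naive Bayes")

key_ab = pair_hash(a, b)
key_ba = pair_hash(b, a)
key_changed = pair_hash(a, b_edited)
```

Because the key is derived from the proposal content, editing any hashed field produces a new key, so a stale analysis is not served for a revised proposal. MD5 is used here only as a cache key, not for security, and its hex digest is already 32 characters long, so the final `[:32]` slice keeps all of it.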
+
+ def llm_analyze_pair(proposal1_json: str, proposal2_json: str, use_cache: bool = True):
+     """Analyze the similarity of two proposals using the Gemini LLM"""
+     if not GEMINI_API_KEYS:
+         return {"error": "Gemini API key not configured. Set GEMINI_API_KEY_1, GEMINI_API_KEY_2, etc in .env file"}
+
+     try:
+         proposal1 = json.loads(proposal1_json)
+         proposal2 = json.loads(proposal2_json)
+     except json.JSONDecodeError:
+         return {"error": "Invalid JSON format for proposals"}
+
+     # Generate pair hash for caching
+     pair_hash = generate_pair_hash(proposal1, proposal2)
+
+     # Check cache first
+     if use_cache:
+         cached_result = db_get_llm_analysis(pair_hash)
+         if cached_result:
+             print(f"📦 Using cached LLM result: {pair_hash[:8]}...")
+             return cached_result
+
+     # Build prompt
+     prompt = f"""Anda adalah penilai kemiripan proposal skripsi yang ahli dan berpengalaman. Analisis dua proposal berikut dengan KRITERIA AKADEMIK yang benar.
+
+ ATURAN PENILAIAN PENTING:
+ 1. Proposal skripsi dianggap BERMASALAH hanya jika KETIGA aspek ini SAMA: Topik/Domain + Dataset/Objek Penelitian + Metode/Algoritma
+ 2. Jika METODE BERBEDA (walaupun topik & dataset sama) → AMAN, karena memberikan kontribusi ilmiah berbeda
+ 3. Jika DATASET/OBJEK BERBEDA (walaupun topik & metode sama) → AMAN, karena studi kasus berbeda
+ 4. Jika TOPIK/DOMAIN BERBEDA → AMAN
+ 5. Penelitian replikasi dengan variasi adalah HAL YANG WAJAR dalam dunia akademik
+
+ PROPOSAL 1:
+ - NIM: {proposal1.get('nim', 'N/A')}
+ - Nama: {proposal1.get('nama', 'N/A')}
+ - Judul: {proposal1.get('judul', 'N/A')}
+ - Deskripsi: {proposal1.get('deskripsi', 'N/A')[:500] if proposal1.get('deskripsi') else 'N/A'}
+ - Problem Statement: {proposal1.get('problem', 'N/A')[:500] if proposal1.get('problem') else 'N/A'}
+ - Metode: {proposal1.get('metode', 'N/A')}
+
+ PROPOSAL 2:
+ - NIM: {proposal2.get('nim', 'N/A')}
+ - Nama: {proposal2.get('nama', 'N/A')}
+ - Judul: {proposal2.get('judul', 'N/A')}
+ - Deskripsi: {proposal2.get('deskripsi', 'N/A')[:500] if proposal2.get('deskripsi') else 'N/A'}
+ - Problem Statement: {proposal2.get('problem', 'N/A')[:500] if proposal2.get('problem') else 'N/A'}
+ - Metode: {proposal2.get('metode', 'N/A')}
+
+ ANALISIS dengan cermat, lalu berikan output JSON (HANYA JSON, tanpa markdown):
+ {{
+ "similarity_score": <0-100, tinggi HANYA jika topik+dataset+metode SEMUA sama>,
+ "verdict": "<BERMASALAH jika score>=80, PERLU_REVIEW jika 50-79, AMAN jika <50>",
+ "similar_aspects": {{
+ "topik": <true/false - apakah tema/domain penelitian sama>,
+ "dataset": <true/false - apakah objek/data penelitian sama>,
+ "metode": <true/false - apakah algoritma/metode sama>,
+ "pendekatan": <true/false - apakah framework/pendekatan sama>
+ }},
+ "differentiator": "<aspek pembeda utama: metode/dataset/domain/tidak_ada>",
+ "reasoning": "<analisis mendalam 4-5 kalimat: jelaskan persamaan dan perbedaan dari aspek topik, dataset, dan metode. Jelaskan mengapa proposal ini aman/bermasalah berdasarkan kriteria akademik>",
+ "saran": "<nasihat konstruktif 2-3 kalimat untuk mahasiswa: jika aman, beri saran penguatan diferensiasi. Jika bermasalah, beri warning dan alternatif arah penelitian>"
+ }}"""
+
+     # Call Gemini API with retry/rotation
+     response, error = call_gemini_with_retry(prompt)
+
+     if error:
+         return {"error": f"Gemini API error: {error}"}
+
+     try:
+         # Parse response
+         response_text = response.text.strip()
+
+         # Clean response (remove markdown code blocks if present)
+         if response_text.startswith("```"):
+             lines = response_text.split("\n")
+             response_text = "\n".join(lines[1:-1])  # Remove first and last lines
+
+         result = json.loads(response_text)
+         result["pair_hash"] = pair_hash
+         result["model_used"] = GEMINI_MODEL
+         result["api_key_used"] = current_key_index + 1
+         result["from_cache"] = False
+
+         # Save to cache
+         db_save_llm_analysis(
+             pair_hash=pair_hash,
+             proposal1_judul=proposal1.get('judul', ''),
+             proposal2_judul=proposal2.get('judul', ''),
+             result=result
+         )
+
+         return result
+
+     except json.JSONDecodeError as e:
+         return {
+             "error": "Failed to parse LLM response as JSON",
+             "raw_response": response_text if 'response_text' in dir() else "No response",
+             "parse_error": str(e)
+         }
+
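
The fence cleanup in this hunk drops the first and last lines whenever the response starts with a code fence, so a response whose closing fence is missing would lose its last line of JSON. A slightly more defensive variant is sketched below; `strip_code_fence` and `parse_llm_json` are hypothetical helpers, not functions from app.py.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence, if there is one."""
    text = text.strip()
    if not text.startswith(FENCE):
        return text
    lines = text.split("\n")[1:]  # drop the opening fence line
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]  # drop the closing fence only when it is present
    return "\n".join(lines).strip()


def parse_llm_json(text: str):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(strip_code_fence(text)), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


fenced = FENCE + 'json\n{"verdict": "AMAN"}\n' + FENCE
unclosed = FENCE + 'json\n{"verdict": "AMAN"}'
parsed_fenced, _ = parse_llm_json(fenced)
parsed_unclosed, _ = parse_llm_json(unclosed)
parsed_plain, _ = parse_llm_json('{"verdict": "AMAN"}')
broken, broken_error = parse_llm_json("not json at all")
```

Returning the error as a value keeps the caller's control flow flat and mirrors the `(response, error)` convention already used by `call_gemini_with_retry`.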
+
+ def llm_check_status():
+     """Check Gemini API status"""
+     if not GEMINI_API_KEYS:
+         return {
+             "configured": False,
+             "error": "No GEMINI_API_KEY found in environment"
+         }
+
+     response, error = call_gemini_with_retry("Respond with only: OK")
+
+     if error:
+         return {
+             "configured": True,
+             "total_keys": len(GEMINI_API_KEYS),
+             "model": GEMINI_MODEL,
+             "status": "error",
+             "error": error
+         }
+
+     return {
+         "configured": True,
+         "total_keys": len(GEMINI_API_KEYS),
+         "current_key": current_key_index + 1,
+         "model": GEMINI_MODEL,
+         "status": "connected",
+         "test_response": response.text.strip()[:50]
+     }
+
+
+ def llm_analyze_simple(judul1: str, judul2: str, metode1: str, metode2: str):
+     """Simplified analysis - title and method only (for quick testing)"""
+     if not GEMINI_API_KEYS:
+         return {"error": "Gemini API key not configured"}
+
+     prompt = f"""Anda adalah penilai kemiripan proposal skripsi yang ahli. Bandingkan dua proposal berikut dengan KRITERIA AKADEMIK yang benar.
+
+ ATURAN PENILAIAN PENTING:
+ 1. Proposal skripsi dianggap BERMASALAH hanya jika KETIGA aspek ini SAMA: Topik/Domain + Dataset + Metode
+ 2. Jika METODE BERBEDA (walaupun topik sama) → AMAN, karena kontribusi berbeda
+ 3. Jika DATASET BERBEDA (walaupun topik & metode sama) → AMAN, karena studi kasus berbeda
+ 4. Jika TOPIK/DOMAIN BERBEDA → AMAN
+
+ Proposal 1:
+ - Judul: {judul1}
+ - Metode: {metode1}
+
+ Proposal 2:
+ - Judul: {judul2}
+ - Metode: {metode2}
+
+ ANALISIS dengan cermat, lalu berikan output JSON (HANYA JSON, tanpa markdown):
+ {{
+ "similarity_score": <0-100, tinggi HANYA jika topik+dataset+metode SEMUA sama>,
+ "verdict": "<BERMASALAH jika score>=80, PERLU_REVIEW jika 50-79, AMAN jika <50>",
+ "topik_sama": <true/false>,
+ "metode_sama": <true/false>,
+ "differentiator": "<aspek pembeda utama: metode/dataset/domain/tidak_ada>",
+ "reasoning": "<analisis mendalam 3-4 kalimat: jelaskan persamaan, perbedaan, dan mengapa aman/bermasalah>",
+ "saran": "<nasihat konstruktif untuk mahasiswa, misal: cara memperkuat diferensiasi, atau warning jika terlalu mirip>"
+ }}"""
+
+     response, error = call_gemini_with_retry(prompt)
+
+     if error:
+         return {"error": error}
+
+     try:
+         response_text = response.text.strip()
+
+         if response_text.startswith("```"):
+             lines = response_text.split("\n")
+             response_text = "\n".join(lines[1:-1])
+
+         result = json.loads(response_text)
+         result["model_used"] = GEMINI_MODEL
+         result["api_key_used"] = current_key_index + 1
+         return result
+
+     except json.JSONDecodeError as e:
+         return {"error": f"Failed to parse response: {e}", "raw": response_text}
+
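
Both prompts ask the model to follow fixed rules: BERMASALAH for a score of 80 or more, PERLU_REVIEW for 50 to 79, AMAN below 50, and BERMASALAH only when topic, dataset, and method all match. Since the model's answer is not validated in the diff, a post-check along the following lines could flag inconsistent replies. The helper names are hypothetical; only the thresholds and field names come from the prompts.

```python
def verdict_from_score(score: float) -> str:
    """Map a 0-100 similarity score to the verdict the prompt prescribes."""
    if score >= 80:
        return "BERMASALAH"
    if score >= 50:
        return "PERLU_REVIEW"
    return "AMAN"


def check_consistency(result: dict) -> list:
    """Return a list of human-readable problems found in an LLM result."""
    problems = []
    score = result.get("similarity_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0 <= score <= 100:
        return [f"similarity_score out of range or not a number: {score!r}"]

    expected = verdict_from_score(score)
    if result.get("verdict") != expected:
        problems.append(f"verdict {result.get('verdict')!r} does not match score {score}")

    aspects = result.get("similar_aspects") or {}
    all_three_same = all(aspects.get(k) is True for k in ("topik", "dataset", "metode"))
    if result.get("verdict") == "BERMASALAH" and aspects and not all_three_same:
        problems.append("BERMASALAH requires topik, dataset, and metode to all match")
    return problems


consistent = check_consistency({
    "similarity_score": 85,
    "verdict": "BERMASALAH",
    "similar_aspects": {"topik": True, "dataset": True, "metode": True},
})
inconsistent = check_consistency({
    "similarity_score": 85,
    "verdict": "BERMASALAH",
    "similar_aspects": {"topik": True, "dataset": True, "metode": False},
})
```

A result that fails the check could be routed to PERLU_REVIEW or re-requested with `use_cache=False`; which policy fits best is a design decision outside this diff.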
+
  # Gradio Interface
  with gr.Blocks(title="Semantic Embedding API") as demo:
      gr.Markdown("# 🔤 Semantic Embedding API")
 
      with gr.Tab("💾 Database (Supabase)"):
          gr.Markdown("### Supabase Cache Operations")
          gr.Markdown("Proxy untuk akses Supabase (API key aman di server)")
+         gr.Markdown("*Note: Operasi write (save) hanya tersedia melalui API untuk keamanan.*")

          with gr.Row():
              db_check_btn = gr.Button("🔌 Check Connection", variant="secondary")
 
          db_get_btn = gr.Button("🔍 Get Embedding", variant="primary")
          db_get_output = gr.JSON(label="Embedding Result")
          db_get_btn.click(fn=db_get_embedding, inputs=[db_nim_input, db_hash_input], outputs=db_get_output)
+
+     with gr.Tab("🤖 LLM Analysis (Gemini)"):
+         gr.Markdown("### Analisis Kemiripan dengan LLM")
+         gr.Markdown("Menggunakan Google Gemini untuk analisis mendalam dengan penjelasan")
+
+         with gr.Row():
+             llm_check_btn = gr.Button("🔌 Check Gemini Status", variant="secondary")
+             llm_check_output = gr.JSON(label="Gemini Status")
+         llm_check_btn.click(fn=llm_check_status, outputs=llm_check_output)

          gr.Markdown("---")

+         gr.Markdown("#### Quick Analysis (Judul + Metode saja)")
+         with gr.Row():
+             with gr.Column():
+                 llm_judul1 = gr.Textbox(label="Judul Proposal 1", placeholder="Analisis Sentimen dengan SVM...", lines=2)
+                 llm_metode1 = gr.Textbox(label="Metode 1", placeholder="Support Vector Machine")
+             with gr.Column():
+                 llm_judul2 = gr.Textbox(label="Judul Proposal 2", placeholder="Klasifikasi Sentimen dengan SVM...", lines=2)
+                 llm_metode2 = gr.Textbox(label="Metode 2", placeholder="Support Vector Machine")
+
+         llm_simple_btn = gr.Button("🚀 Analyze (Quick)", variant="primary")
+         llm_simple_output = gr.JSON(label="Quick Analysis Result")
+         llm_simple_btn.click(
+             fn=llm_analyze_simple,
+             inputs=[llm_judul1, llm_judul2, llm_metode1, llm_metode2],
+             outputs=llm_simple_output
+         )
+
+         gr.Markdown("---")
+
+         gr.Markdown("#### Full Analysis (Complete Proposal Data)")
+         gr.Markdown("*Hasil di-cache ke Supabase. Request yang sama akan menggunakan cache.*")
+         with gr.Row():
+             llm_proposal1 = gr.Textbox(
+                 label="Proposal 1 (JSON)",
+                 placeholder='{"nim": "123", "nama": "Ahmad", "judul": "...", "deskripsi": "...", "problem": "...", "metode": "..."}',
+                 lines=5
+             )
+             llm_proposal2 = gr.Textbox(
+                 label="Proposal 2 (JSON)",
+                 placeholder='{"nim": "456", "nama": "Budi", "judul": "...", "deskripsi": "...", "problem": "...", "metode": "..."}',
+                 lines=5
+             )
+
+         with gr.Row():
+             llm_use_cache = gr.Checkbox(label="Gunakan Cache", value=True, info="Uncheck untuk force refresh dari Gemini")
+             llm_full_btn = gr.Button("🔍 Analyze (Full)", variant="primary")
+
+         llm_full_output = gr.JSON(label="Full Analysis Result")
+         llm_full_btn.click(
+             fn=llm_analyze_pair,
+             inputs=[llm_proposal1, llm_proposal2, llm_use_cache],
+             outputs=llm_full_output
          )
+
+         gr.Markdown("""
+         **Output mencakup:**
+         - `similarity_score`: Skor 0-100 (tinggi hanya jika topik+dataset+metode sama)
+         - `verdict`: BERMASALAH / PERLU_REVIEW / AMAN
+         - `reasoning`: Analisis mendalam dari AI
+         - `similar_aspects`: Aspek yang mirip (topik/dataset/metode/pendekatan)
+         - `differentiator`: Pembeda utama
+         - `saran`: Nasihat untuk mahasiswa
+         - `from_cache`: true jika hasil dari cache
+         """)

      with gr.Accordion("📡 API Usage (untuk Developer)", open=False):
          gr.Markdown("""
requirements.txt CHANGED
@@ -3,3 +3,5 @@ sentence-transformers>=2.2.0
  torch
  numpy
  httpx>=0.24.0
+ google-genai>=1.0.0
+ python-dotenv>=1.0.0