vietexob committed on
Commit 5bfc72c · 1 Parent(s): ee7f635

Adding LightRAG KG

Files changed (6)
  1. CLAUDE.md +79 -0
  2. app.py +30 -15
  3. app_old.py +0 -280
  4. llm_graph.py +125 -16
  5. main.py +0 -392
  6. requirements.txt +4 -2
CLAUDE.md ADDED
@@ -0,0 +1,79 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Application Overview
+
+ This is a Text2Graph application that extracts knowledge graphs from natural language text. It is a Gradio web app that uses either OpenAI GPT-4.1-mini (via Azure) or Phi-3-mini-128k-instruct-graph (via a Hugging Face inference endpoint) to extract entities and relationships from text, then visualizes them as interactive graphs.
+
+ ## Architecture
+
+ - **app.py**: Main Gradio application with UI components, visualization logic, and caching
+ - **llm_graph.py**: Core LLMGraph class that handles model selection and knowledge graph extraction
+ - **cache/**: Directory for caching visualization data (the first example is pre-cached for performance)
+
+ ## Key Components
+
+ ### LLMGraph Class (llm_graph.py)
+ - Supports two model backends: Azure OpenAI (GPT-4.1-mini) and Hugging Face (Phi-3-mini-128k-instruct-graph)
+ - Uses LightRAG for the Azure OpenAI integration
+ - Calls the inference API directly for Hugging Face models
+ - Extracts structured JSON with nodes (entities) and edges (relationships)
+
+ ### Visualization Pipeline (app.py)
+ - Entity recognition visualization using spaCy's displacy
+ - Interactive knowledge graph using pyvis and NetworkX
+ - Caching system for performance optimization
+ - Color-coded entity types with random light colors
+
+ ## Environment Setup
+
+ Required environment variables:
+ ```
+ HF_TOKEN=<huggingface_token>
+ HF_API_ENDPOINT=<huggingface_inference_endpoint>
+ AZURE_OPENAI_API_KEY=<azure_openai_key>
+ AZURE_OPENAI_ENDPOINT=<azure_endpoint>
+ AZURE_OPENAI_API_VERSION=<api_version>
+ AZURE_OPENAI_DEPLOYMENT=<deployment_name>
+ AZURE_EMBEDDING_DEPLOYMENT=<embedding_deployment>
+ AZURE_EMBEDDING_API_VERSION=<embedding_api_version>
+ ```
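The app reads these variables with `os.environ[...]` after `load_dotenv()`, so a missing key surfaces as a bare `KeyError` at import time. A pre-flight check can report all missing names at once; this is an illustrative sketch (not code from this repo), using only the variable names listed above:

```python
import os

# Names taken from the environment list above.
REQUIRED_VARS = [
    "HF_TOKEN", "HF_API_ENDPOINT",
    "AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION", "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_EMBEDDING_DEPLOYMENT", "AZURE_EMBEDDING_API_VERSION",
]

def missing_vars(environ=None):
    """Return the required variable names absent from the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if name not in environ]

# Passing a dict instead of os.environ makes the check easy to exercise:
print(missing_vars({"HF_TOKEN": "t"}))  # every required name except HF_TOKEN
```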
+
+ ## Running the Application
+
+ ```bash
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the Gradio app
+ python app.py
+ ```
+
+ ## Key Dependencies
+
+ - **gradio**: Web interface framework
+ - **lightrag-hku**: RAG framework for the Azure OpenAI integration
+ - **transformers**: Hugging Face model integration
+ - **pyvis**: Interactive network visualization
+ - **networkx**: Graph data structures and algorithms
+ - **spacy**: Natural language processing and entity visualization
+ - **openai**: Azure OpenAI client
+
+ ## Data Flow
+
+ 1. User inputs text and selects a model
+ 2. LLMGraph.extract() processes the text using the selected model backend
+ 3. The JSON response contains nodes (entities) and edges (relationships)
+ 4. Visualization functions create entity highlighting and an interactive graph
+ 5. Results are cached for performance (first example only)
+
+ ## Model Behavior
+
+ The application expects JSON output with this schema:
+ ```json
+ {
+   "nodes": [{"id": "entity", "type": "broad_type", "detailed_type": "specific_type"}],
+   "edges": [{"from": "entity1", "to": "entity2", "label": "relationship"}]
+ }
+ ```
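Downstream code (`create_graph`, `create_custom_entity_viz`) indexes these keys directly, so a payload that drifts from this schema fails deep inside visualization. A structural check can fail fast instead; this is a standard-library sketch, and the sample payload is made up for illustration:

```python
def validate_graph(data):
    """Check a nodes/edges payload against the schema above and
    build an id -> [(target, label), ...] adjacency mapping."""
    node_ids = {n["id"] for n in data.get("nodes", [])}
    adjacency = {node_id: [] for node_id in node_ids}
    for edge in data.get("edges", []):
        # Every edge endpoint must name a declared node.
        if edge["from"] not in node_ids or edge["to"] not in node_ids:
            raise ValueError(f"Edge references unknown node: {edge}")
        adjacency[edge["from"]].append((edge["to"], edge["label"]))
    return adjacency

sample = {
    "nodes": [
        {"id": "Aerosmith", "type": "organization", "detailed_type": "rock band"},
        {"id": "Steven Tyler", "type": "person", "detailed_type": "singer"},
    ],
    "edges": [
        {"from": "Steven Tyler", "to": "Aerosmith", "label": "lead singer of"},
    ],
}
print(validate_graph(sample))
```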
app.py CHANGED
@@ -3,7 +3,10 @@ import os
  import spacy
  import pickle
  import random
+ import logging
  import rapidjson
+ import asyncio
+
  import gradio as gr
  import networkx as nx
 
@@ -12,6 +15,8 @@ from pyvis.network import Network
  from spacy import displacy
  from spacy.tokens import Span
 
+ logging.basicConfig(level=logging.INFO)
+
  # Constants
  TITLE = "🌐 Text2Graph: Extract Knowledge Graphs from Natural Language"
  SUBTITLE = "✨ Extract and visualize knowledge graphs from texts in any language!"
@@ -53,7 +58,7 @@ def handle_text(text=""):
  return " ".join(text.split())
 
  # @spaces.GPU
- def extract_kg(text="", model=None):
+ async def extract_kg(text="", model=None):
  """
  Extract knowledge graph from text
  """
@@ -62,8 +67,9 @@ def extract_kg(text="", model=None):
  if not text or not model:
  raise gr.Error("⚠️ Both text and model must be provided!")
  try:
- model = LLMGraph(model=model)
- result = model.extract(text)
+ model_instance = LLMGraph(model=model)
+ result = await model_instance.extract(text)
+
  return rapidjson.loads(result)
  except Exception as e:
  raise gr.Error(f"❌ Extraction error: {str(e)}")
@@ -217,7 +223,7 @@ def create_graph(json_data):
  allow-top-navigation-by-user-activation allow-downloads" allowfullscreen=""
  allowpaymentrequest="" frameborder="0" srcdoc='{html}'></iframe>"""
 
- def process_and_visualize(text, model, progress=gr.Progress()):
+ async def process_and_visualize(text, model, progress=gr.Progress()):
  """
  Process text and visualize knowledge graph and entities
  """
@@ -238,11 +244,12 @@ def process_and_visualize(text, model, progress=gr.Progress()):
  progress(1.0, desc="Loaded from cache!")
  return cache_data["graph_html"], cache_data["entities_viz"], cache_data["json_data"], cache_data["stats"]
  except Exception as e:
- print(f"Cache loading error: {str(e)}")
+ # print(f"Cache loading error: {str(e)}")
+ logging.error(f"Cache loading error: {str(e)}")
 
  # Continue with normal processing if cache fails
  progress(0, desc="Starting extraction...")
- json_data = extract_kg(text, model)
+ json_data = await extract_kg(text, model)
 
  progress(0.5, desc="Creating entity visualization...")
  entities_viz = create_custom_entity_viz(json_data, text)
@@ -266,7 +273,8 @@ def process_and_visualize(text, model, progress=gr.Progress()):
  with open(EXAMPLE_CACHE_FILE, 'wb') as f:
  pickle.dump(cache_data, f)
  except Exception as e:
- print(f"Cache saving error: {str(e)}")
+ # print(f"Cache saving error: {str(e)}")
+ logging.error(f"Cache saving error: {str(e)}")
 
  progress(1.0, desc="Complete!")
  return graph_html, entities_viz, json_data, stats
@@ -293,20 +301,21 @@ EXAMPLES = [
  les buis et à arroser les rosiers, perpétuant ainsi une tradition d'excellence horticole qui fait la fierté de la capitale française.""")],
  ]
 
- def generate_first_example_cache():
+ async def generate_first_example_cache():
  """
  Generate cache for the first example if it doesn't exist when the app starts
  """
 
  if not os.path.exists(EXAMPLE_CACHE_FILE):
- print("Generating cache for first example...")
+ # print("Generating cache for first example...")
+ logging.info("Generating cache for first example...")
 
  try:
  text = EXAMPLES[0][0]
  model = MODEL_LIST[0] if MODEL_LIST else None
 
  # Extract data
- json_data = extract_kg(text, model)
+ json_data = await extract_kg(text, model)
  entities_viz = create_custom_entity_viz(json_data, text)
  graph_html = create_graph(json_data)
@@ -324,18 +333,24 @@ def generate_first_example_cache():
 
  with open(EXAMPLE_CACHE_FILE, 'wb') as f:
  pickle.dump(cached_data, f)
- print("First example cache generated successfully")
+ # print("First example cache generated successfully")
+ logging.info("First example cache generated successfully")
 
  return cached_data
  except Exception as e:
- print(f"Error generating first example cache: {str(e)}")
+ # print(f"Error generating first example cache: {str(e)}")
+ logging.error(f"Error generating first example cache: {str(e)}")
  else:
- print("First example cache already exists")
+ # print("First example cache already exists")
+ logging.info("First example cache already exists")
+
+ # Load existing cache
  try:
  with open(EXAMPLE_CACHE_FILE, 'rb') as f:
  return pickle.load(f)
  except Exception as e:
- print(f"Error loading existing cache: {str(e)}")
+ # print(f"Error loading existing cache: {str(e)}")
+ logging.error(f"Error loading existing cache: {str(e)}")
 
  return None
 
@@ -345,7 +360,7 @@ def create_ui():
  """
 
  # Try to generate/load the first example cache
- first_example_cache = generate_first_example_cache()
+ first_example_cache = asyncio.run(generate_first_example_cache())
 
  with gr.Blocks(css=CUSTOM_CSS, title=TITLE) as demo:
  # Header
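The app.py changes above convert the extraction pipeline to `async def` and bridge back to synchronous startup code with `asyncio.run(...)` in `create_ui()`. The shape of that change can be shown standalone, with a hypothetical stub in place of the real `LLMGraph` so nothing here touches a model endpoint:

```python
import asyncio
import json

class StubLLMGraph:
    """Hypothetical stand-in for LLMGraph: extract() is a coroutine,
    as in the edited llm_graph.py."""
    def __init__(self, model="stub"):
        self.model = model

    async def extract(self, text):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return json.dumps({"nodes": [], "edges": []})

async def extract_kg(text, model="stub"):
    # Mirrors the new extract_kg: build the instance, await it, parse JSON.
    model_instance = StubLLMGraph(model=model)
    result = await model_instance.extract(text)
    return json.loads(result)

# Synchronous entry point, as in create_ui(): run the coroutine to completion.
data = asyncio.run(extract_kg("some input text"))
print(data)  # {'nodes': [], 'edges': []}
```

Note that `asyncio.run` starts and tears down an event loop, so it belongs at a top-level entry point like app startup, not inside an already-running coroutine.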
app_old.py DELETED
@@ -1,280 +0,0 @@
- # import spaces
- import gradio as gr
- from llm_graph import MODEL_LIST, LLMGraph
- import rapidjson
- from pyvis.network import Network
- import networkx as nx
- import spacy
- from spacy import displacy
- from spacy.tokens import Span
- import random
- from tqdm import tqdm
-
- # Constants
- TITLE = "🌐 GraphMind: Phi-3 Instruct Graph Explorer"
- SUBTITLE = "✨ Extract and visualize knowledge graphs from any text in multiple languages"
-
- # Custom CSS for styling
- CUSTOM_CSS = """
- .gradio-container {
- font-family: 'Inter', 'Segoe UI', Roboto, sans-serif;
- }
- .gr-button-primary {
- background-color: #6366f1 !important;
- }
- .gr-button-secondary {
- border-color: #6366f1 !important;
- color: #6366f1 !important;
- }
- """
-
- # Color utilities
- def get_random_light_color():
- r = random.randint(140, 255)
- g = random.randint(140, 255)
- b = random.randint(140, 255)
- return f"#{r:02x}{g:02x}{b:02x}"
-
- # Text preprocessing
- def handle_text(text):
- return " ".join(text.split())
-
- # Main processing functions
- # @spaces.GPU
- def extract(text, model):
- try:
- model = LLMGraph(model=model)
- result = model.extract(text)
- return rapidjson.loads(result)
- except Exception as e:
- raise gr.Error(f"Extraction error: {str(e)}")
-
- def find_token_indices(doc, substring, text):
- result = []
- start_index = text.find(substring)
-
- while start_index != -1:
- end_index = start_index + len(substring)
- start_token = None
- end_token = None
-
- for token in doc:
- if token.idx == start_index:
- start_token = token.i
- if token.idx + len(token) == end_index:
- end_token = token.i + 1
-
- if start_token is not None and end_token is not None:
- result.append({
- "start": start_token,
- "end": end_token
- })
-
- # Search for next occurrence
- start_index = text.find(substring, end_index)
-
- return result
-
- def create_custom_entity_viz(data, full_text):
- nlp = spacy.blank("xx")
- doc = nlp(full_text)
-
- spans = []
- colors = {}
- for node in data["nodes"]:
- entity_spans = find_token_indices(doc, node["id"], full_text)
- for dataentity in entity_spans:
- start = dataentity["start"]
- end = dataentity["end"]
-
- if start < len(doc) and end <= len(doc):
- # Check for overlapping spans
- overlapping = any(s.start < end and start < s.end for s in spans)
- if not overlapping:
- span = Span(doc, start, end, label=node["type"])
- spans.append(span)
- if node["type"] not in colors:
- colors[node["type"]] = get_random_light_color()
-
- doc.set_ents(spans, default="unmodified")
- doc.spans["sc"] = spans
-
- options = {
- "colors": colors,
- "ents": list(colors.keys()),
- "style": "ent",
- "manual": True
- }
-
- html = displacy.render(doc, style="span", options=options)
- return html
-
- def create_graph(json_data):
- G = nx.Graph()
-
- # Add nodes with tooltips
- for node in json_data['nodes']:
- G.add_node(node['id'], title=f"{node['type']}: {node['detailed_type']}")
-
- # Add edges with labels
- for edge in json_data['edges']:
- G.add_edge(edge['from'], edge['to'], title=edge['label'], label=edge['label'])
-
- # Create network visualization
- nt = Network(
- width="720px",
- height="600px",
- directed=True,
- notebook=False,
- bgcolor="#f8fafc",
- font_color="#1e293b"
- )
-
- # Configure network display
- nt.from_nx(G)
- nt.barnes_hut(
- gravity=-3000,
- central_gravity=0.3,
- spring_length=50,
- spring_strength=0.001,
- damping=0.09,
- overlap=0,
- )
-
- # Customize edge appearance
- for edge in nt.edges:
- edge['width'] = 2
- edge['arrows'] = {'to': {'enabled': True, 'type': 'arrow'}}
- edge['color'] = {'color': '#6366f1', 'highlight': '#4f46e5'}
- edge['font'] = {'size': 12, 'color': '#4b5563', 'face': 'Arial'}
-
- # Customize node appearance
- for node in nt.nodes:
- node['color'] = {'background': '#e0e7ff', 'border': '#6366f1', 'highlight': {'background': '#c7d2fe', 'border': '#4f46e5'}}
- node['font'] = {'size': 14, 'color': '#1e293b'}
- node['shape'] = 'dot'
- node['size'] = 25
-
- # Generate HTML with iframe to isolate styles
- html = nt.generate_html()
- html = html.replace("'", '"')
-
- return f"""<iframe style="width: 100%; height: 620px; margin: 0 auto; border-radius: 8px; box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);"
- name="result" allow="midi; geolocation; microphone; camera; display-capture; encrypted-media;"
- sandbox="allow-modals allow-forms allow-scripts allow-same-origin allow-popups
- allow-top-navigation-by-user-activation allow-downloads" allowfullscreen=""
- allowpaymentrequest="" frameborder="0" srcdoc='{html}'></iframe>"""
-
- def process_and_visualize(text, model, progress=gr.Progress()):
- if not text or not model:
- raise gr.Error("⚠️ Both text and model must be provided.")
-
- progress(0, desc="Starting extraction...")
- json_data = extract(text, model)
-
- progress(0.5, desc="Creating entity visualization...")
- entities_viz = create_custom_entity_viz(json_data, text)
-
- progress(0.8, desc="Building knowledge graph...")
- graph_html = create_graph(json_data)
-
- node_count = len(json_data["nodes"])
- edge_count = len(json_data["edges"])
- stats = f"📊 Extracted {node_count} entities and {edge_count} relationships"
-
- progress(1.0, desc="Complete!")
- return graph_html, entities_viz, json_data, stats
-
- # Example texts in different languages
- EXAMPLES = [
- [handle_text("""Legendary rock band Aerosmith has officially announced their retirement from touring after 54 years, citing
- lead singer Steven Tyler's unrecoverable vocal cord injury.
- The decision comes after months of unsuccessful treatment for Tyler's fractured larynx,
- which he suffered in September 2023.""")],
-
- [handle_text("""Pop star Justin Timberlake, 43, had his driver's license suspended by a New York judge during a virtual
- court hearing on August 2, 2024. The suspension follows Timberlake's arrest for driving while intoxicated (DWI)
- in Sag Harbor on June 18. Timberlake, who is currently on tour in Europe,
- pleaded not guilty to the charges.""")],
-
- [handle_text("""세계적인 기술 기업 삼성전자는 새로운 인공지능 기반 스마트폰을 올해 하반기에 출시할 예정이라고 발표했다.
- 이 스마트폰은 현재 개발 중인 갤럭시 시리즈의 최신작으로, 강력한 AI 기능과 혁신적인 카메라 시스템을 탑재할 것으로 알려졌다.
- 삼성전자의 CEO는 이번 신제품이 스마트폰 시장에 새로운 혁신을 가져올 것이라고 전망했다.""")],
-
- [handle_text("""한국 영화 '기생충'은 2020년 아카데미 시상식에서 작품상, 감독상, 각본상, 국제영화상 등 4개 부문을 수상하며 역사를 새로 썼다.
- 봉준호 감독이 연출한 이 영화는 한국 영화 최초로 칸 영화제 황금종려상도 수상했으며, 전 세계적으로 엄청난 흥행과
- 평단의 호평을 받았다.""")]
- ]
-
- def create_ui():
- with gr.Blocks(css=CUSTOM_CSS, title=TITLE) as demo:
- # Header
- gr.Markdown(f"# {TITLE}")
- gr.Markdown(f"{SUBTITLE}")
-
- with gr.Row():
- gr.Markdown("🌍 **Multilingual Support Available** 🔤")
-
- # Main interface
- with gr.Row():
- # Input column
- with gr.Column(scale=1):
- input_model = gr.Dropdown(
- MODEL_LIST,
- label="🤖 Select Model",
- info="Choose a model to process your text",
- value=MODEL_LIST[0] if MODEL_LIST else None
- )
-
- input_text = gr.TextArea(
- label="📝 Input Text",
- info="Enter text in any language to extract a knowledge graph",
- placeholder="Enter text here...",
- lines=10
- )
-
- with gr.Row():
- submit_button = gr.Button("🚀 Extract & Visualize", variant="primary", scale=2)
- clear_button = gr.Button("🔄 Clear", variant="secondary", scale=1)
-
- gr.Examples(
- examples=EXAMPLES,
- inputs=input_text,
- label="📚 Example Texts (English & Korean)"
- )
-
- stats_output = gr.Markdown("", label="🔍 Analysis Results")
-
- # Output column
- with gr.Column(scale=1):
- with gr.Tab("🧩 Knowledge Graph"):
- output_graph = gr.HTML(label="")
-
- with gr.Tab("🏷️ Entities"):
- output_entity_viz = gr.HTML(label="")
-
- with gr.Tab("📊 JSON Data"):
- output_json = gr.JSON(label="")
-
- # Functionality
- submit_button.click(
- fn=process_and_visualize,
- inputs=[input_text, input_model],
- outputs=[output_graph, output_entity_viz, output_json, stats_output]
- )
-
- clear_button.click(
- fn=lambda: [None, None, None, ""],
- inputs=[],
- outputs=[output_graph, output_entity_viz, output_json, stats_output]
- )
-
- # Footer
- gr.Markdown("---")
- gr.Markdown("📋 **Instructions:** Enter text in any language, select a model, and click 'Extract & Visualize' to generate a knowledge graph.")
- gr.Markdown("🛠️ Powered by Phi-3 Instruct Graph | Emergent Methods")
-
- return demo
-
- demo = create_ui()
- demo.launch(share=False)
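The deleted `find_token_indices` maps every character-level occurrence of an entity string onto token index ranges. The same logic works over a plain whitespace tokenizer; this self-contained sketch substitutes `(offset, token)` pairs for spaCy's `token.idx`/`token.i`, which the real code relies on:

```python
def whitespace_tokens(text):
    """Tokenize on whitespace, keeping each token's character offset
    (a stand-in for spaCy's token.idx / token.i)."""
    tokens, idx = [], 0
    for tok in text.split():
        idx = text.index(tok, idx)
        tokens.append((idx, tok))
        idx += len(tok)
    return tokens

def find_token_indices(tokens, substring, text):
    """Mirror of the deleted helper: map every occurrence of `substring`
    to [start_token, end_token) index ranges."""
    result = []
    start_index = text.find(substring)
    while start_index != -1:
        end_index = start_index + len(substring)
        start_token = end_token = None
        # An occurrence only counts if it aligns with token boundaries.
        for i, (idx, tok) in enumerate(tokens):
            if idx == start_index:
                start_token = i
            if idx + len(tok) == end_index:
                end_token = i + 1
        if start_token is not None and end_token is not None:
            result.append({"start": start_token, "end": end_token})
        start_index = text.find(substring, end_index)  # next occurrence
    return result

text = "Steven Tyler fronted Aerosmith and Steven Tyler retired"
toks = whitespace_tokens(text)
print(find_token_indices(toks, "Steven Tyler", text))
# → [{'start': 0, 'end': 2}, {'start': 5, 'end': 7}]
```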
llm_graph.py CHANGED
@@ -1,18 +1,31 @@
  import os
- from textwrap import dedent
+ import asyncio
+ import numpy as np
 
- from huggingface_hub import InferenceClient
+ from textwrap import dedent
  from dotenv import load_dotenv
+ from openai import AzureOpenAI
+ from huggingface_hub import InferenceClient
+
+ from lightrag import LightRAG
+ from lightrag.utils import EmbeddingFunc
+ from lightrag.kg.shared_storage import initialize_pipeline_status
 
  load_dotenv()
+
+ # Load the environment variables
  api_token = os.environ["HF_TOKEN"]
  endpoint_url = os.environ["HF_API_ENDPOINT"]
 
- # Initialize the client with your endpoint URL and token.
- client = InferenceClient(
- model=endpoint_url,
- token=api_token
- )
+ AZURE_OPENAI_API_VERSION = os.environ["AZURE_OPENAI_API_VERSION"]
+ AZURE_OPENAI_DEPLOYMENT = os.environ["AZURE_OPENAI_DEPLOYMENT"]
+ AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_API_KEY"]
+ AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
+
+ AZURE_EMBEDDING_DEPLOYMENT = os.environ["AZURE_EMBEDDING_DEPLOYMENT"]
+ AZURE_EMBEDDING_API_VERSION = os.environ["AZURE_EMBEDDING_API_VERSION"]
+
+ WORKING_DIR = "./cache"
 
  MODEL_LIST = [
  "OpenAI/GPT-4.1-mini",
@@ -20,15 +33,71 @@ MODEL_LIST = [
  ]
 
  class LLMGraph:
+ """
+ A class to interact with LLMs for knowledge graph extraction.
+ """
+
+ async def _initialize_rag(self, embedding_dimension=3072):
+ """
+ Initialize the LightRAG instance with the specified embedding dimension.
+ """
+
+ rag = LightRAG(
+ working_dir=WORKING_DIR,
+ llm_model_func=self._llm_model_func,
+ embedding_func=EmbeddingFunc(
+ embedding_dim=embedding_dimension,
+ max_token_size=8192,
+ func=self._embedding_func,
+ ),
+ )
+
+ await rag.initialize_storages()
+ await initialize_pipeline_status()
+
+ return rag
+
+ async def _get_rag(self):
+ """
+ Get or initialize the RAG instance (lazy loading).
+ """
+
+ if self.rag is None:
+ self.rag = await self._initialize_rag()
+
+ return self.rag
+
  def __init__(self, model="OpenAI/GPT-4.1-mini"):
  """
  Initialize the Phi3InstructGraph with a specified model.
  """
 
  if model not in MODEL_LIST:
  raise ValueError(f"Model must be one of {MODEL_LIST}")
 
- self.model_path = model
+ self.model_name = model
+
+ if model == MODEL_LIST[0]:
+ # Use Azure OpenAI for GPT-4.1-mini
+ self.llm_client = AzureOpenAI(
+ api_key=AZURE_OPENAI_API_KEY,
+ api_version=AZURE_OPENAI_API_VERSION,
+ azure_endpoint=AZURE_OPENAI_ENDPOINT,
+ )
+
+ self.emb_client = AzureOpenAI(
+ api_key=AZURE_OPENAI_API_KEY,
+ api_version=AZURE_EMBEDDING_API_VERSION,
+ azure_endpoint=AZURE_OPENAI_ENDPOINT,
+ )
+
+ self.rag = None  # Initialize as None for lazy loading
+ else:
+ # Use Hugging Face Inference API for Phi-3-mini-128k-instruct-graph
+ self.hf_client = InferenceClient(
+ model=endpoint_url,
+ token=api_token
+ )
 
  def _generate(self, messages):
  """
@@ -36,7 +105,7 @@ class LLMGraph:
  """
 
  # Use the chat_completion method
- response = client.chat_completion(
+ response = self.hf_client.chat_completion(
  messages=messages,
  max_tokens=1024,
  )
@@ -85,7 +154,6 @@ class LLMGraph:
  -------Text end-------
  """)
 
- # if self.model_path == "EmergentMethods/Phi-3-medium-128k-instruct-graph":
  messages = [
  {
  "role": "system",
@@ -96,17 +164,58 @@ class LLMGraph:
  "content": user_message
  }
  ]
- # else:
- #     # TODO: update for other models
 
  return messages
 
- def extract(self, text):
+ async def extract(self, text):
  """
  Extract knowledge graph from text
  """
 
- messages = self._get_messages(text)
- generated_text = self._generate(messages)
+ generated_text = ""
+
+ if self.model_name == MODEL_LIST[0]:
+ # Use LightRAG with Azure OpenAI
+ rag = await self._get_rag()
+ rag.insert(text)
+ else:
+ # Use Hugging Face Inference API with Phi-3-mini-128k-instruct-graph
+ messages = self._get_messages(text)
+ generated_text = self._generate(messages)
 
  return generated_text
+
+ async def _llm_model_func(self, prompt, system_prompt=None, history_messages=[], **kwargs) -> str:
+ """
+ Call the Azure OpenAI chat completion endpoint with the given prompt and optional system prompt and history messages.
+ """
+
+ messages = []
+
+ if system_prompt:
+ messages.append({"role": "system", "content": system_prompt})
+
+ if history_messages:
+ messages.extend(history_messages)
+
+ messages.append({"role": "user", "content": prompt})
+
+ chat_completion = self.llm_client.chat.completions.create(
+ model=AZURE_OPENAI_DEPLOYMENT,
+ messages=messages,
+ temperature=kwargs.get("temperature", 0),
+ top_p=kwargs.get("top_p", 1),
+ n=kwargs.get("n", 1),
+ )
+
+ return chat_completion.choices[0].message.content
+
+ async def _embedding_func(self, texts: list[str]) -> np.ndarray:
+ """
+ Call the Azure OpenAI embeddings endpoint with the given texts.
+ """
+
+ embedding = self.emb_client.embeddings.create(model=AZURE_EMBEDDING_DEPLOYMENT, input=texts)
+ embeddings = [item.embedding for item in embedding.data]
+
+ return np.array(embeddings)
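The new `_embedding_func` makes one embeddings call and stacks the per-input vectors into a 2-D `np.ndarray`, the shape LightRAG's `EmbeddingFunc` consumes. The stacking step in isolation, with a hypothetical stub in place of the Azure client (4-dim vectors here versus the 3072 of the real deployment):

```python
import numpy as np

class _Item:
    def __init__(self, embedding):
        self.embedding = embedding

class _Response:
    def __init__(self, data):
        self.data = data

class StubEmbeddings:
    """Hypothetical stand-in for the client's embeddings endpoint:
    returns one fixed-size vector per input string."""
    def create(self, model, input):
        return _Response([_Item([float(len(t)), 0.0, 0.0, 1.0]) for t in input])

def embed(client, texts):
    # Mirrors _embedding_func: one call, then stack rows into an ndarray.
    response = client.create(model="stub-deployment", input=texts)
    return np.array([item.embedding for item in response.data])

vectors = embed(StubEmbeddings(), ["hello", "knowledge graph"])
print(vectors.shape)  # (2, 4): one row per input text
```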
main.py DELETED
@@ -1,392 +0,0 @@
1
- # import spaces
2
- import gradio as gr
3
- from llm_graph import MODEL_LIST, LLMGraph
4
- import rapidjson
5
- from pyvis.network import Network
6
- import networkx as nx
7
- import spacy
8
- from spacy import displacy
9
- from spacy.tokens import Span
10
- import random
11
- import time
12
-
13
- # Set up the theme and styling
14
- CUSTOM_CSS = """
15
- .gradio-container {
16
- font-family: 'Inter', 'Segoe UI', Roboto, sans-serif;
17
- }
18
- .gr-prose h1 {
19
- font-size: 2.5rem !important;
20
- margin-bottom: 0.5rem !important;
21
- background: linear-gradient(90deg, #4338ca, #a855f7);
22
- -webkit-background-clip: text;
23
- -webkit-text-fill-color: transparent;
24
- }
25
- .gr-prose h2 {
26
- font-size: 1.8rem !important;
27
- margin-top: 1rem !important;
28
- }
29
- .info-box {
30
- padding: 1rem;
31
- border-radius: 0.5rem;
32
- background-color: #f3f4f6;
33
- margin-bottom: 1rem;
34
- border-left: 4px solid #6366f1;
35
- }
36
- .language-badge {
37
- display: inline-block;
38
- padding: 0.25rem 0.5rem;
39
- border-radius: 9999px;
40
- font-size: 0.75rem;
41
- font-weight: 600;
42
- background-color: #e0e7ff;
43
- color: #4338ca;
44
- margin-right: 0.5rem;
45
- margin-bottom: 0.5rem;
46
- }
47
- .footer {
48
- text-align: center;
49
- margin-top: 2rem;
50
- padding-top: 1rem;
51
- border-top: 1px solid #e2e8f0;
52
- font-size: 0.875rem;
53
- color: #64748b;
54
- }
55
- """
56
-
57
- # Color utilities
58
- def get_random_light_color():
59
- r = random.randint(150, 255)
60
- g = random.randint(150, 255)
61
- b = random.randint(150, 255)
62
- return f"#{r:02x}{g:02x}{b:02x}"
63
-
64
- # Text processing helper
65
- def handle_text(text):
66
- return " ".join(text.split())
67
-
68
- # Core extraction function
69
- # @spaces.GPU
70
- def extract(text, model):
71
- model = LLMGraph(model=model)
72
- try:
73
- result = model.extract(text)
74
- return rapidjson.loads(result)
75
- except Exception as e:
76
- raise gr.Error(f"🚨 Extraction failed: {str(e)}")
77
-
78
- def find_token_indices(doc, substring, text):
79
- result = []
80
- start_index = text.find(substring)
81
-
82
- while start_index != -1:
83
- end_index = start_index + len(substring)
84
- start_token = None
85
- end_token = None
86
-
87
- for token in doc:
88
- if token.idx == start_index:
89
- start_token = token.i
90
- if token.idx + len(token) == end_index:
91
- end_token = token.i + 1
92
-
93
- if start_token is not None and end_token is not None:
94
- result.append({
95
- "start": start_token,
96
- "end": end_token
97
- })
98
-
99
- # Search for next occurrence
100
- start_index = text.find(substring, end_index)
101
-
102
- return result
103
-
- def create_custom_entity_viz(data, full_text):
-     nlp = spacy.blank("xx")
-     doc = nlp(full_text)
-
-     spans = []
-     colors = {}
-
-     for node in data["nodes"]:
-         entity_spans = find_token_indices(doc, node["id"], full_text)
-         for dataentity in entity_spans:
-             start = dataentity["start"]
-             end = dataentity["end"]
-
-             if start < len(doc) and end <= len(doc):
-                 # Check for overlapping spans
-                 overlapping = any(s.start < end and start < s.end for s in spans)
-                 if not overlapping:
-                     span = Span(doc, start, end, label=node["type"])
-                     spans.append(span)
-                     if node["type"] not in colors:
-                         colors[node["type"]] = get_random_light_color()
-
-     doc.set_ents(spans, default="unmodified")
-     doc.spans["sc"] = spans
-
-     options = {
-         "colors": colors,
-         "ents": list(colors.keys()),
-         "style": "ent",
-         "manual": True
-     }
-
-     html = displacy.render(doc, style="span", options=options)
-
-     # Add custom styling to the entity visualization
-     styled_html = f"""
-     <div style="border-radius: 0.5rem; padding: 1rem; background-color: white;
-          border: 1px solid #e2e8f0; box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1);">
-         <div style="margin-bottom: 0.75rem; font-weight: 500; color: #4b5563;">
-             Entity types found:
-             {' '.join([f'<span style="display: inline-block; margin-right: 0.5rem; margin-bottom: 0.5rem; padding: 0.25rem 0.5rem; border-radius: 9999px; font-size: 0.75rem; background-color: {colors[entity_type]}; color: #1e293b;">{entity_type}</span>' for entity_type in colors.keys()])}
-         </div>
-         {html}
-     </div>
-     """
-
-     return styled_html
-
- def create_graph(json_data):
-     G = nx.DiGraph()  # Using DiGraph for directed graph
-
-     # Add nodes
-     for node in json_data['nodes']:
-         G.add_node(node['id'],
-                    title=f"{node['type']}: {node['detailed_type']}",
-                    group=node['type'])  # Group nodes by type
-
-     # Add edges
-     for edge in json_data['edges']:
-         G.add_edge(edge['from'], edge['to'], title=edge['label'], label=edge['label'])
-
-     # Create network visualization
-     nt = Network(
-         width="100%",
-         height="600px",
-         directed=True,
-         notebook=False,
-         bgcolor="#fafafa",
-         font_color="#1e293b"
-     )
-
-     # Configure network
-     nt.from_nx(G)
-     nt.barnes_hut(
-         gravity=-3000,
-         central_gravity=0.3,
-         spring_length=150,
-         spring_strength=0.001,
-         damping=0.09,
-         overlap=0,
-     )
-
-     # Create color groups for node types
-     node_types = {node['type'] for node in json_data['nodes']}
-     colors = {}
-     for i, node_type in enumerate(node_types):
-         hue = (i * 137) % 360  # Golden ratio to distribute colors
-         colors[node_type] = f"hsl({hue}, 70%, 70%)"
-
-     # Customize nodes
-     for node in nt.nodes:
-         node_data = next((n for n in json_data['nodes'] if n['id'] == node['id']), None)
-         if node_data:
-             node_type = node_data['type']
-             node['color'] = colors.get(node_type, "#bfdbfe")
-             node['shape'] = 'dot'
-             node['size'] = 20
-             node['borderWidth'] = 2
-             node['borderWidthSelected'] = 4
-             node['font'] = {'size': 14, 'color': '#1e293b', 'face': 'Inter, Arial'}
-
-     # Customize edges
-     for edge in nt.edges:
-         edge['color'] = {'color': '#94a3b8', 'highlight': '#6366f1', 'hover': '#818cf8'}
-         edge['width'] = 1.5
-         edge['selectionWidth'] = 2
-         edge['hoverWidth'] = 2
-         edge['arrows'] = {'to': {'enabled': True, 'type': 'arrow'}}
-         edge['smooth'] = {'type': 'continuous', 'roundness': 0.2}
-         edge['font'] = {'size': 12, 'color': '#4b5563', 'face': 'Inter, Arial', 'strokeWidth': 2, 'strokeColor': '#ffffff'}
-
-     # Generate HTML
-     html = nt.generate_html()
-     html = html.replace("'", '"')
-     html = html.replace('height: 600px;', 'height: 600px; border-radius: 8px;')
-
-     return f"""<iframe style="width: 100%; height: 620px; margin: 0 auto; border-radius: 8px; box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);"
-         name="result" allow="midi; geolocation; microphone; camera; display-capture; encrypted-media;"
-         sandbox="allow-modals allow-forms allow-scripts allow-same-origin allow-popups
-         allow-top-navigation-by-user-activation allow-downloads" allowfullscreen=""
-         allowpaymentrequest="" frameborder="0" srcdoc='{html}'></iframe>"""
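The deleted `create_graph` helper steps the hue by 137° per entity type (close to the golden angle) so consecutive types land far apart on the color wheel. A minimal sketch of that scheme, with the helper name `type_colors` invented for illustration:

```python
def type_colors(node_types):
    """Assign each entity type an HSL color, stepping hue by 137 degrees.

    137° is close to the golden angle (~137.5°), so consecutive types
    get well-separated hues even for small numbers of types.
    """
    return {t: f"hsl({(i * 137) % 360}, 70%, 70%)" for i, t in enumerate(node_types)}


print(type_colors(["person", "organization", "location"]))
# {'person': 'hsl(0, 70%, 70%)', 'organization': 'hsl(137, 70%, 70%)', 'location': 'hsl(274, 70%, 70%)'}
```

Note that the deleted code iterates over a Python `set` of types, so the type-to-hue pairing was not stable across runs; passing an ordered list, as here, makes it deterministic.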
-
- def process_and_visualize(text, model, progress=gr.Progress()):
-     if not text or not model:
-         raise gr.Error("⚠️ Please provide both text and model")
-
-     # Progress updates
-     progress(0.1, "Initializing...")
-     time.sleep(0.2)  # Small delay for UI feedback
-
-     # Extract graph
-     progress(0.2, "Extracting knowledge graph...")
-     json_data = extract(text, model)
-
-     # Entity visualization
-     progress(0.6, "Identifying entities...")
-     entities_viz = create_custom_entity_viz(json_data, text)
-
-     # Graph visualization
-     progress(0.8, "Building graph visualization...")
-     graph_html = create_graph(json_data)
-
-     # Statistics
-     entity_types = {}
-     for node in json_data['nodes']:
-         entity_type = node['type']
-         if entity_type in entity_types:
-             entity_types[entity_type] += 1
-         else:
-             entity_types[entity_type] = 1
-
-     stats_html = f"""
-     <div class="info-box">
-         <h3 style="margin-top: 0;">📊 Extraction Results</h3>
-         <p>✅ Successfully extracted <b>{len(json_data['nodes'])}</b> entities and <b>{len(json_data['edges'])}</b> relationships.</p>
-
-         <div>
-             <h4>Entity Types:</h4>
-             <div>
-                 {''.join([f'<span class="language-badge">{entity_type}: {count}</span>' for entity_type, count in entity_types.items()])}
-             </div>
-         </div>
-     </div>
-     """
-
-     progress(1.0, "Done!")
-     return graph_html, entities_viz, json_data, stats_html
-
- def language_info():
-     return """
-     <div class="info-box">
-         <h3 style="margin-top: 0;">🌍 Multilingual Support</h3>
-         <p>This application supports text analysis in multiple languages, including:</p>
-         <div>
-             <span class="language-badge">English 🇬🇧</span>
-             <span class="language-badge">Korean 🇰🇷</span>
-             <span class="language-badge">Spanish 🇪🇸</span>
-             <span class="language-badge">French 🇫🇷</span>
-             <span class="language-badge">German 🇩🇪</span>
-             <span class="language-badge">Japanese 🇯🇵</span>
-             <span class="language-badge">Chinese 🇨🇳</span>
-             <span class="language-badge">And more...</span>
-         </div>
-     </div>
-     """
-
- def tips_html():
-     return """
-     <div class="info-box">
-         <h3 style="margin-top: 0;">💡 Tips for Best Results</h3>
-         <ul>
-             <li>Use clear, descriptive sentences with well-defined relationships</li>
-             <li>Include specific entities, events, dates, and locations for better extraction</li>
-             <li>Longer texts provide more context for relationship identification</li>
-             <li>Try different models to compare extraction results</li>
-         </ul>
-     </div>
-     """
-
- # Examples in multiple languages
- EXAMPLES = [
-     [handle_text("""Legendary rock band Aerosmith has officially announced their retirement from touring after 54 years, citing
-     lead singer Steven Tyler's unrecoverable vocal cord injury.
-     The decision comes after months of unsuccessful treatment for Tyler's fractured larynx,
-     which he suffered in September 2023.""")],
-
-     [handle_text("""Pop star Justin Timberlake, 43, had his driver's license suspended by a New York judge during a virtual
-     court hearing on August 2, 2024. The suspension follows Timberlake's arrest for driving while intoxicated (DWI)
-     in Sag Harbor on June 18. Timberlake, who is currently on tour in Europe,
-     pleaded not guilty to the charges.""")],
- ]
-
- # Main UI
- with gr.Blocks(css=CUSTOM_CSS, title="🧠 Phi-3 Knowledge Graph Explorer") as demo:
-     # Header
-     gr.Markdown("# 🧠 Phi-3 Knowledge Graph Explorer")
-     gr.Markdown("### ✨ Extract and visualize knowledge graphs from text in any language")
-
-     with gr.Row():
-         with gr.Column(scale=2):
-             input_text = gr.TextArea(
-                 label="📝 Enter your text",
-                 placeholder="Paste or type your text here...",
-                 lines=10
-             )
-
-             with gr.Row():
-                 input_model = gr.Dropdown(
-                     MODEL_LIST,
-                     label="🤖 Model",
-                     value=MODEL_LIST[0] if MODEL_LIST else None,
-                     info="Select the model to use for extraction"
-                 )
-
-                 with gr.Column():
-                     submit_button = gr.Button("🔍 Extract & Visualize", variant="primary")
-                     clear_button = gr.Button("🔄 Clear", variant="secondary")
-
-             # Multilingual support info
-             gr.HTML(language_info())
-
-             # Examples section
-             gr.Examples(
-                 examples=EXAMPLES,
-                 inputs=input_text,
-                 label="📚 Example Texts (English & Korean)"
-             )
-
-             # Tips
-             gr.HTML(tips_html())
-
-         with gr.Column(scale=3):
-             # Stats output
-             stats_output = gr.HTML(label="")
-
-             # Tabs for different visualizations
-             with gr.Tabs():
-                 with gr.TabItem("🔄 Knowledge Graph"):
-                     output_graph = gr.HTML()
-
-                 with gr.TabItem("🏷️ Entity Recognition"):
-                     output_entity_viz = gr.HTML()
-
-                 with gr.TabItem("📊 JSON Data"):
-                     output_json = gr.JSON()
-
-     # Footer
-     gr.HTML("""
-     <div class="footer">
-         <p>🌐 Powered by Phi-3 Instruct Graph | Created by Emergent Methods</p>
-         <p>© 2025 | Knowledge Graph Explorer</p>
-     </div>
-     """)
-
-     # Set up event handlers
-     submit_button.click(
-         fn=process_and_visualize,
-         inputs=[input_text, input_model],
-         outputs=[output_graph, output_entity_viz, output_json, stats_output]
-     )
-
-     clear_button.click(
-         fn=lambda: [None, None, None, ""],
-         inputs=[],
-         outputs=[output_graph, output_entity_viz, output_json, stats_output]
-     )
-
- # Launch the app
- demo.launch(share=False)
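All of the deleted helpers above consume the same extraction payload: `nodes` entries with `id`, `type`, and `detailed_type` keys, and `edges` entries with `from`, `to`, and `label` keys. A stdlib-only sketch of that contract (field names taken from the code above; sample values and the `validate_payload` helper are invented for illustration):

```python
# Shape of the extraction JSON consumed by create_graph() and
# create_custom_entity_viz(). Sample values are illustrative only.
sample = {
    "nodes": [
        {"id": "Steven Tyler", "type": "person", "detailed_type": "singer"},
        {"id": "Aerosmith", "type": "organization", "detailed_type": "rock band"},
    ],
    "edges": [
        {"from": "Steven Tyler", "to": "Aerosmith", "label": "lead singer of"},
    ],
}


def validate_payload(data):
    """Check the minimal contract the visualization helpers rely on."""
    node_ids = {n["id"] for n in data["nodes"]}
    for n in data["nodes"]:
        # create_graph() reads all three fields when building node titles.
        assert {"id", "type", "detailed_type"} <= n.keys()
    for e in data["edges"]:
        assert {"from", "to", "label"} <= e.keys()
        # Edges reference nodes by id; an unknown id would make
        # nx.DiGraph.add_edge() silently create an untyped node.
        assert e["from"] in node_ids and e["to"] in node_ids
    return True


print(validate_payload(sample))  # True
```

Validating the payload up front, rather than letting a malformed model response fail deep inside pyvis, is why the deleted `extract()` wrapped parsing in a `try`/`except` that surfaced a `gr.Error`.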
requirements.txt CHANGED
@@ -1,10 +1,12 @@
  python-dotenv
  gradio
- transformers==4.45.2
- python-dotenv
+ transformers
  accelerate
  python-rapidjson
  spaces
  pyvis
  networkx
  spacy
+ numpy
+ lightrag-hku
+ openai