jackkuo committed on
Commit 0bd5c8c · verified · 1 Parent(s): 71d5a29

Update app.py

Files changed (1)
  1. app.py +0 -40
app.py CHANGED
@@ -413,46 +413,6 @@ with gr.Blocks(title="Automated Enzyme Kinetics Extractor") as demo:
          search_output.value = initial_output  # assign the error message directly
      else:
          search_output.value = initial_output.to_html(classes='data', index=False, header=True)
- with gr.Tab("Paper"):
-     gr.Markdown(
-         '''<h1 align="center"> Leveraging Large Language Models for Automated Extraction of Enzyme Kinetics Data from Scientific Literature </h1>
-         <p><strong>Abstract:</strong>
-         <br>Enzyme kinetics data reported in the literature are essential for guiding biomedical research, yet they are traditionally extracted manually, a process that is both time-consuming and error-prone, and no automated extraction pipeline for enzyme kinetics data has been available. Although Large Language Models (LLMs) have driven significant advances in information extraction in recent years, their inherent capabilities for processing comprehensive scientific data, covering both precise extraction and objective evaluation, remain under-investigated. Achieving fully automated extraction with satisfactory accuracy, and establishing a comprehensive standard for evaluating it, therefore remains challenging. This research introduces a novel framework that leverages LLMs for automatic information extraction from the academic literature on enzyme kinetics. It integrates OCR conversion, content extraction, and output formatting through prompt engineering, marking a significant advance in automated data extraction for scientific research. We contribute a meticulously curated golden benchmark of 156 research articles, which serves both as an accurate validation tool and as a valuable resource for evaluating LLM capabilities in extraction tasks. The benchmark enables rigorous assessment of LLMs in scientific language comprehension, biomedical concept understanding, and tabular data interpretation. The best-performing model achieved a recall of 92% and a precision of 88%. Our approach culminates in the LLM Enzyme Kinetics Archive (LLENKA), a comprehensive dataset derived from 3,435 articles that offers the research community a structured, high-quality resource for enzyme kinetics data and facilitates future research. Our work harnesses the inherent capabilities of LLMs to build an automated information extraction pipeline that enhances productivity, surpasses manual curation, and can serve as a paradigm in other fields.
-         <br>Figure 1: Pipeline for Enzyme Kinetics Data Extraction
-         </p>'''
-     )
-     gr.Image("static/img.png", label="Pipeline for Enzyme Kinetics Data Extraction")
-     gr.Markdown(
-         '''
-         <p align="center">Figure 1: Pipeline for Enzyme Kinetics Data Extraction
-         </p>'''
-     )
-     gr.Markdown(
-         '''
-
-         | Model | Overall Entries Extracted | Overall Correct Entries | Overall Recall | Overall Precision | Mean Recall by Paper | Mean Precision by Paper | Km Entries Extracted | Km Correct Entries | Km Recall | Km Precision | Kcat Entries Extracted | Kcat Correct Entries | Kcat Recall | Kcat Precision | Kcat/Km Entries Extracted | Kcat/Km Correct Entries | Kcat/Km Recall | Kcat/Km Precision |
-         |---------------------------|--------------------------|-------------------------|----------------|-------------------|----------------------|-------------------------|----------------------|--------------------|-----------|--------------|------------------------|----------------------|-------------|----------------|---------------------------|-------------------------|----------------|-------------------|
-         | llama 3.1-405B | 8700 | 7839 | 0.72 | 0.90 | 0.80 | 0.89 | 2870 | 2648 | 0.74 | 0.92 | 2849 | 2594 | 0.73 | 0.91 | 2981 | 2597 | 0.69 | 0.87 |
-         | claude-3.5-sonnet-20240620 | 11348 | 9967 | 0.92 | 0.88 | 0.93 | 0.90 | 3840 | 3314 | 0.93 | 0.86 | 3732 | 3310 | 0.94 | 0.89 | 3776 | 3343 | 0.89 | 0.89 |
-         | GPT-4o | 9810 | 8703 | 0.80 | 0.89 | 0.85 | 0.90 | 3294 | 2932 | 0.82 | 0.89 | 3188 | 2892 | 0.82 | 0.91 | 3328 | 2879 | 0.77 | 0.87 |
-         | qwen-plus-0806 | 8673 | 7763 | 0.72 | 0.90 | 0.77 | 0.90 | 2932 | 2665 | 0.75 | 0.91 | 2914 | 2638 | 0.75 | 0.91 | 2827 | 2460 | 0.66 | 0.87 |
-
-         '''
-     )
-     gr.Markdown(
-         '''
-         <p align="center">
-         Table 1: Overall Performance of Various Models Examined on 156 Papers
-         </p>
-         <p><strong>Please note:</strong>
-         <br>1. Model versions: all models were tested in September 2024. The GPT-4o interface was tested on September 23, 2024; the other model versions are identified by name.
-         <br>2. Llama 3.1 was deployed locally, while the other models were accessed through online interfaces.
-         <br>3. The temperature for all models during testing was 0.3.
-         <br>4. The maximum output lengths of the models also vary, as discussed in our paper: GPT-4o allows 4096 tokens, Claude 3.5 allows 8192 tokens, Qwen-Plus allows 8000 tokens, and Llama 3.1 allows 4096 tokens.
-         <br>5. Due to local GPU resource limitations, Llama 3.1 was run with a maximum input of 32k tokens.
-         </p>
-         '''
-     )
 
      extract_button.click(extract_pdf_pypdf, inputs=file_input, outputs=text_output)
      exp.click(update_input, outputs=model_input)
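For context, the abstract in the removed tab describes a three-stage pipeline: OCR conversion, LLM content extraction, and output formatting via prompt engineering. Below is a minimal Python sketch of that flow; the helper names (`ocr_to_text`, `call_llm`, `EXTRACTION_PROMPT`) and the pipe-delimited output format are hypothetical illustrations, not functions or prompts taken from app.py.

```python
# Minimal sketch of the pipeline the removed abstract describes:
# OCR conversion -> LLM content extraction -> output formatting.
# ocr_to_text and call_llm are hypothetical stubs, not code from app.py.

EXTRACTION_PROMPT = (
    "Extract every Km, Kcat, and Kcat/Km entry from the text below. "
    "Return one entry per line as: enzyme | substrate | parameter | value | unit."
)

def ocr_to_text(pdf_path: str) -> str:
    """Stub for the OCR step that converts a PDF into plain text."""
    return ""  # a real implementation would call an OCR tool or PDF parser

def call_llm(prompt: str, text: str, temperature: float = 0.3) -> str:
    """Stub for the model call; the notes above report temperature 0.3 in testing."""
    return ""  # a real implementation would call an LLM API

def extract_kinetics(pdf_path: str) -> list[dict]:
    """Run OCR, prompt the model, and format its output into structured rows."""
    raw = call_llm(EXTRACTION_PROMPT, ocr_to_text(pdf_path))
    entries = []
    for line in raw.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == 5:  # keep only well-formed rows
            entries.append(dict(zip(
                ("enzyme", "substrate", "parameter", "value", "unit"), fields)))
    return entries
```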
 
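The recall and precision columns in the removed Table 1 follow the standard definitions: precision is correct entries over entries extracted, and recall is correct entries over gold-standard entries. A small sketch using the claude-3.5-sonnet row; the benchmark's total gold-entry count does not appear in the table, so it is back-calculated from correct / recall as an approximation.

```python
def precision_recall(extracted: int, correct: int, gold: int) -> tuple[float, float]:
    """Precision = correct / extracted; recall = correct / gold-standard entries."""
    return correct / extracted, correct / gold

# claude-3.5-sonnet-20240620 row from Table 1; gold total is an approximation
# inferred from correct / recall, since the table does not list it.
p, r = precision_recall(extracted=11348, correct=9967, gold=round(9967 / 0.92))
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.88, recall=0.92
```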