nielsr (HF Staff) committed on
Commit 448894f · verified · 1 parent: a7dde2b

Add pipeline tag and improve model card


Hi! I'm Niels from the community science team at Hugging Face. I've opened this PR to improve the model card for your data curation models.

This update adds structured metadata, including the `image-classification` pipeline tag and domain-specific tags (`medical`, `surgical`, `endoscopy`). This will help researchers find these artifacts more easily on the Hugging Face Hub. I have also cleaned up the Markdown structure to make the documentation clearer while preserving all existing images, links, and usage examples.
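
Concretely, the metadata block this PR adds at the top of `README.md` is:

```yaml
---
license: apache-2.0
pipeline_tag: image-classification
tags:
- medical
- surgical
- endoscopy
---
```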

Files changed (1): README.md (+23 −17)
README.md CHANGED (old version, removed lines marked `-`):

@@ -1,26 +1,30 @@
---
license: apache-2.0
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/cE7UgFfJJ2gUHJr0SSEhc.png"> </img>
</div>

-
-
-
[📚 Paper](https://arxiv.org/abs/2503.19740) - [🤖 GitHub](https://github.com/visurg-ai/LEMON)

- We provide the models used in our data curation pipeline in [📚 LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings](https://arxiv.org/abs/2503.19740) to assist with constructing the LEMON dataset (for more details about the LEMON dataset and our
- LemonFM foundation model, please visit our github repository at [🤖 GitHub](https://github.com/visurg-ai/LEMON)) .

If you use our dataset, model, or code in your research, please cite our paper:

- ```
@misc{che2025lemonlargeendoscopicmonocular,
title={LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings},
- author={Chengan Che and Chao Wang and Tom Vercauteren and Sophia Tsoka and Luis C. Garcia-Peraza-Herrera},
year={2025},
eprint={2503.19740},
archivePrefix={arXiv},
@@ -29,10 +33,9 @@ If you use our dataset, model, or code in your research, please cite our paper:
}
```

-
- This Hugging Face repository includes video storyboard classification models, frame classification models, and non-surgical object detection models. The model loader file can be found at [model_loader.py](https://huggingface.co/visurg/Surg3M_curation_models/blob/main/model_loader.py)
-

<div align="center">
<table style="margin-left: auto; margin-right: auto;">
@@ -59,15 +62,16 @@ This Hugging Face repository includes video storyboard classification models, fr
</table>
</div>

-
The data curation pipeline leading to the clean videos in the LEMON dataset is as follows:
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/jzw36jlPT-V_I-Vm01OzO.png"> </img>
</div>

- Usage
- --------
- **Video classification models** are employed in the step **2** of the data curation pipeline to classify a video storyboard as either surgical or non-surgical, the models usage is as follows:

```python
import torch
import torchvision
@@ -102,7 +106,8 @@ Usage
outputs = net(img_tensor)
```

- **Frame classification models** are used in the step **3** of the data curation pipeline to classify a frame as either surgical or non-surgical, the models usage is as follows:

```python
import torch
@@ -137,7 +142,8 @@ Usage
outputs = net(img_tensor)
```

- **Non-surgical object detection models** are used to obliterate the non-surgical region in the surgical frames (e.g. user interface information), the models usage is as follows:

```python
import torch
@@ -170,4 +176,4 @@ Usage

# Extract features from the image
outputs = net(img_tensor)
- ```
 
README.md (new version, added lines marked `+`):

---
license: apache-2.0
+ pipeline_tag: image-classification
+ tags:
+ - medical
+ - surgical
+ - endoscopy
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/cE7UgFfJJ2gUHJr0SSEhc.png"> </img>
</div>

[📚 Paper](https://arxiv.org/abs/2503.19740) - [🤖 GitHub](https://github.com/visurg-ai/LEMON)

+ This repository provides the models used in the data curation pipeline for the paper [LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings](https://arxiv.org/abs/2503.19740). These models assist in constructing the LEMON dataset by filtering and processing surgical video content.
+
+ For more details about the LEMON dataset and our LemonFM foundation model, please visit our [GitHub repository](https://github.com/visurg-ai/LEMON).
+
+ ## Citation

If you use our dataset, model, or code in your research, please cite our paper:

+ ```bibtex
@misc{che2025lemonlargeendoscopicmonocular,
title={LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings},
+ author={Chengan Che and Chao Wang and Tom Vercauteren and Sophia Tsoka and Luis C. Garcia-Peraza-Herrera},
year={2025},
eprint={2503.19740},
archivePrefix={arXiv},
[... lines collapsed in the diff view ...]
}
```

+ ## Model Overview

+ This Hugging Face repository includes video storyboard classification models, frame classification models, and non-surgical object detection models. The model loader file can be found at [model_loader.py](https://huggingface.co/visurg/Surg3M_curation_models/blob/main/model_loader.py).

<div align="center">
<table style="margin-left: auto; margin-right: auto;">
[... table rows collapsed in the diff view ...]
</table>
</div>

The data curation pipeline leading to the clean videos in the LEMON dataset is as follows:
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/jzw36jlPT-V_I-Vm01OzO.png"> </img>
</div>

+ ## Usage
+
+ ### Video classification models
+ **Video classification models** are employed in step **2** of the data curation pipeline to classify a video storyboard as either surgical or non-surgical:
+
```python
import torch
import torchvision
# ... (lines collapsed in the diff view) ...
outputs = net(img_tensor)
```
+ ### Frame classification models
+ **Frame classification models** are used in step **3** of the data curation pipeline to classify a frame as either surgical or non-surgical:

```python
import torch
# ... (lines collapsed in the diff view) ...
outputs = net(img_tensor)
```
+ ### Non-surgical object detection models
+ **Non-surgical object detection models** are used to obliterate the non-surgical region in the surgical frames (e.g. user interface information):

```python
import torch
# ... (lines collapsed in the diff view) ...

# Extract features from the image
outputs = net(img_tensor)
+ ```