ydin0771 committed on
Commit
e808535
•
1 Parent(s): d102e7c

Upload descrip.md

  1. descrip.md +35 -0
descrip.md ADDED
@@ -0,0 +1,35 @@
1
+ # V-Doc : Visual questions answers with Documents
2
+ This repository contains the code for the paper [V-Doc : Visual questions answers with Documents](https://arxiv.org/pdf/2205.13724.pdf). The demo videos are available at this [link](https://drive.google.com/file/d/1Ztp9LBcrEcJA3NlbFWn1RfNyfwt8Y6Qk/view).
3
+
4
+ <h4 align="center">
5
+ <b>Ding, Y.*, Huang, Z.*, Wang, R., Zhang, Y., Chen, X., Ma, Y., Chung, H., & Han, C. (CVPR 2022) <br/><a href="https://arxiv.org/pdf/2205.13724.pdf">V-Doc : Visual questions answers with Documents</a><br/></b>
6
+ </h4>
7
+
8
+ <p align="center">
9
+ <img src="https://github.com/usydnlp/vdoc/blob/main/images/system_architecture.png">
10
+ </p>
11
+
12
+ ### Dataset in Dataset Storage Module
13
+
14
+ The datasets used to train the models are provided at the following links:
15
+
16
+
17
+ [PubVQA Dataset](https://drive.google.com/drive/folders/1YMuctGPJbsy45Iz23ygcN1VGHWQp3aaU?ths=true) for training Mac-Network.
18
+
19
+ [FUNSD-QA](https://drive.google.com/file/d/1Ev_sLTx3U9nAr2TGgUT5BXB1rpfLMlcq/view?usp=sharing) dataset for training LayoutLMv2.
20
+
21
+ ### Dataset Generation
22
+ To run the scene-based question generation code, first fetch the JSON files from the source.
23
+
24
+ #### Extract OCR information
25
+ ```bash
26
+ python3 ./document_collection.py
27
+ ```
28
+ This step generates a new folder called <code>./input_ocr</code>.
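
Before moving on, it can help to confirm that the OCR step actually produced files. The sketch below counts the files under a folder by extension. The helper name `summarize_folder` is hypothetical and not part of this repository, and the README does not say what `./input_ocr` contains, so the script makes no assumption about file types.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count the files under `folder` (recursively), grouped by extension."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} not found; run document_collection.py first")
    # Files without an extension are grouped under the label "<none>".
    return Counter(
        p.suffix.lower() or "<none>" for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    try:
        for ext, count in sorted(summarize_folder("./input_ocr").items()):
            print(f"{ext}: {count}")
    except FileNotFoundError as err:
        print(err)
```

An empty result means the folder exists but holds no files, which suggests the OCR extraction did not complete.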
29
+ #### Generate questions
30
+ ```bash
31
+ python3 ./scene_based/pdf_generate_question.py
32
+ ```
33
+ To limit the number of generated questions, change the code in <code>pdf_generate_question.py</code> at line 575 and lines 591-596.
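
The lines referenced above are not reproduced in this README, so the following is not the script's own logic. As an alternative that leaves the script untouched, a generated list of questions could be capped afterwards. This is a minimal sketch assuming the questions are already loaded into a Python list; the helper `cap_questions` and its `seed` parameter are hypothetical.

```python
import random


def cap_questions(questions, limit, seed=0):
    """Return at most `limit` questions, sampled reproducibly, in original order."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    if len(questions) <= limit:
        return list(questions)
    rng = random.Random(seed)  # fixed seed so repeated runs keep the same subset
    keep = sorted(rng.sample(range(len(questions)), limit))
    return [questions[i] for i in keep]
```

Sampling indices rather than truncating the list avoids keeping only the questions from the first few documents.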
34
+
35
+ After the steps above, a JSON file appears under <code>./output_qa_dataset</code>.
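
To check the generated output, the JSON file can be loaded and its top-level structure reported. The schema of the file is not documented here, so this sketch only reports the top-level type and entry count without assuming any field names. The helper `describe_json` is hypothetical.

```python
import json
from pathlib import Path


def describe_json(path):
    """Return (top-level type name, entry count or None) for a JSON file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size


if __name__ == "__main__":
    out_dir = Path("./output_qa_dataset")
    if not out_dir.is_dir():
        print(f"{out_dir} not found; run pdf_generate_question.py first")
    else:
        for path in sorted(out_dir.glob("*.json")):
            kind, size = describe_json(path)
            print(f"{path.name}: top-level type={kind}, entries={size}")
```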