---
annotations_creators:
- expert-generated
- machine-generated
language_creators:
- expert-generated
- found
languages:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- text-generation
- text-classification
task_ids:
- future-work-generation
- scientific-section-classification
pretty_name: ACL Future Work Dataset (2023–2024)
tags:
- scientific-articles
- future-work
- NLP
- ACL
- NeurIPS
- LLM-evaluation
language:
- en
---

# 🧠 ACL Future Work Dataset (2023–2024)

This dataset contains structured data from papers in the ACL 2023 and ACL 2024 proceedings. Each paper is parsed into sections (e.g., Introduction, Related Work, Conclusion). A **"Future Work"** section is then extracted, automatically or manually, by scanning the parsed sections in reverse order for future-oriented sentences.
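
The reverse-order search described above could look roughly like the sketch below. This is only an illustration, not the pipeline used to build the dataset: the cue phrases in `FUTURE_CUES`, the function name `find_future_sentences`, and the naive sentence splitter are all assumptions.

```python
import re

# Hypothetical cue phrases; the dataset card does not list the ones actually used.
FUTURE_CUES = ("future work", "in the future", "we plan to", "we intend to", "we leave")


def find_future_sentences(sections):
    """Scan sections from last to first and return (heading, sentences) for the
    first section that contains at least one future-oriented sentence."""
    for section in reversed(sections):
        # Naive split: break after '.', '!' or '?' when followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", section.get("text", ""))
        hits = [s for s in sentences if any(cue in s.lower() for cue in FUTURE_CUES)]
        if hits:
            return section.get("heading", ""), hits
    return None, []


# Toy input in the same shape as a paper's "sections" list.
example_sections = [
    {"heading": "1 Introduction", "text": "We study summarization. Prior work is limited."},
    {"heading": "6 Conclusion", "text": "We proposed a new model. In the future, we plan to add more languages."},
]
heading, sentences = find_future_sentences(example_sections)
```

Scanning from the end reflects the fact that future-work statements usually sit in the Conclusion or Limitations sections, so the first match tends to be the most relevant one.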

## 📁 Dataset Structure

Each JSON file (`acl23_future_cleaned_final.json` and `acl24_future_cleaned_final.json`) has the following format:

```json
{
  "ACL23_1.pdf": {
    "abstractText": "Abstract of the paper...",
    "sections": [
      {
        "heading": "1 Introduction",
        "text": "..."
      },
      ...
      {
        "heading": "Future Work",
        "text": "We plan to extend this method by..."
      }
    ],
    "title": "Paper Title",
    "year": 2023
  },
  ...
}
```
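
As a minimal sketch of how the format above might be consumed, the helper below collects the title, year, and "Future Work" text for each paper. The function name `extract_future_work` and the output layout are illustrative assumptions; only the field names (`sections`, `heading`, `text`, `title`, `year`) come from the format shown above.

```python
import json


def extract_future_work(papers):
    """Map each paper key to its title, year, and 'Future Work' text.

    `papers` is the dict obtained by parsing one of the JSON files.
    Papers without a 'Future Work' section get future_work=None.
    """
    result = {}
    for key, paper in papers.items():
        future = None
        for section in paper.get("sections", []):
            # Match the heading case-insensitively, ignoring surrounding spaces.
            if section.get("heading", "").strip().lower() == "future work":
                future = section.get("text")
        result[key] = {
            "title": paper.get("title"),
            "year": paper.get("year"),
            "future_work": future,
        }
    return result


# Toy record in the documented shape, parsed from a JSON string.
sample = json.loads("""
{
  "ACL23_1.pdf": {
    "abstractText": "Abstract of the paper...",
    "sections": [
      {"heading": "1 Introduction", "text": "..."},
      {"heading": "Future Work", "text": "We plan to extend this method by..."}
    ],
    "title": "Paper Title",
    "year": 2023
  }
}
""")
records = extract_future_work(sample)
```

To run this on the real data, the same function can be applied to the result of `json.load` on `acl23_future_cleaned_final.json` or `acl24_future_cleaned_final.json`, opened with `encoding="utf-8"`.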

## 📜 License

This dataset is licensed under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).  
You are free to use, share, and adapt the dataset as long as you give appropriate credit.

### ✍️ Curated by 
Ibrahim Al Azher, Northern Illinois University, DATALab