jgs-430 committed on
Commit
8749573
·
1 Parent(s): c41475b

updated readme

  1. README.md +72 -0
README.md CHANGED
@@ -9,3 +9,75 @@ short_description: Predicts JIRA Story Point positional Increments
9
  ---
10
 
11
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
12
+
13
+
14
+ # **JIRA Story Point Increment Predictor**
15
+
16
+
17
+ The model uses JIRA **summary** and **description** text fields, converted into embeddings using the Hugging Face model
18
+ [`sentence-transformers/all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
19
+
20
+ Using **Databricks AutoML**, a dataset of these embeddings was analyzed and an **XGBoostRegressor** model was generated to predict story point increments.
21
+
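The README does not say how the two text fields are combined before they reach the regressor. The sketch below assumes one possible layout: each field is embedded separately and the two vectors are concatenated. The names `build_features` and `toy_encoder` are hypothetical, and the encoder is passed in as a parameter so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence

# Any callable that turns a string into a vector of floats.
Encoder = Callable[[str], Sequence[float]]


def build_features(summary: str, description: str, encode: Encoder) -> List[float]:
    """Embed each field separately and concatenate the two vectors.

    Assumption: the real pipeline may instead embed one joined string
    or pool the vectors differently; the README does not specify this.
    """
    summary_vec = list(encode(summary.strip()))
    description_vec = list(encode(description.strip()))
    return summary_vec + description_vec


def toy_encoder(text: str) -> List[float]:
    # Stand-in for illustration only, not a real embedding:
    # returns [character count, word count].
    return [float(len(text)), float(len(text.split()))]


features = build_features(
    "Fix login bug",
    "Users cannot log in after reset.",
    toy_encoder,
)
```

With the `sentence-transformers` package installed, `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").encode` could be passed as `encode` in place of the toy encoder; that model produces 384-dimensional vectors, so this layout would yield 768 features per issue.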
22
+ ---
23
+
24
+ ## **Why**
25
+
26
+ For any JIRA project with limited time for sizing issues, this solution provides a **generalized prediction strategy** using a **complexity scale of 8 increments**.
27
+
28
+ While mapping story points to hours is up to each team, this model offers a way to assign **complexity** using a **positional index** on that scale.
29
+
30
+ ---
31
+
32
+ ## **Integration**
33
+
34
+ Hosting this model as an **API** allows for seamless integration with client services.
35
+
36
+ A client can send issue **summary** and **description** text, which the API embeds and passes to the model.
37
+ The model returns a **predicted increment** for story points.
38
+
39
+ Clients can then map this increment to their internal story point scale as desired.
40
+
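A minimal client-side sketch of that mapping step follows. It assumes the API returns a float and that the increment is a 0-based position on the 8-step scale; the README confirms neither, so adjust the rounding and offset to match the deployed model. `FIBONACCI_SCALE` and `increment_to_points` are hypothetical names chosen for illustration.

```python
from typing import Sequence

# Hypothetical team scale with 8 positions, one per increment.
FIBONACCI_SCALE = (1, 2, 3, 5, 8, 13, 21, 34)


def increment_to_points(predicted: float, scale: Sequence[int] = FIBONACCI_SCALE) -> int:
    """Map a predicted increment to a story point value on the team's scale.

    Assumption: increments are 0-based positions; a regressor can return
    values outside the valid range, so the index is clamped to the scale.
    """
    index = int(round(predicted))
    index = max(0, min(index, len(scale) - 1))
    return scale[index]


typical = increment_to_points(3.2)      # position 3 on the scale
too_high = increment_to_points(9.7)     # clamped to the last position
too_low = increment_to_points(-0.4)     # clamped to the first position
```

Keeping the scale on the client side means each team can swap in its own values (for example, T-shirt sizes mapped to integers) without retraining or redeploying the model.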
41
+ ---
42
+
43
+ ## **Attribution**
44
+
45
+ This model was built using data provided by:
46
+
47
+ ### **The Public Jira Dataset**
48
+
49
+ **Creators**
50
+ - Montgomery, Lloyd
51
+ - Lüders, Clara
52
+ - Maalej, Prof. Dr. Walid
53
+
54
+ **Description**
55
+ Jira is an issue tracking system that helps software companies (among others) manage their projects, communities, and processes.
56
+ This dataset is a collection of **public Jira repositories** downloaded using the Jira API V2.
57
+
58
+ It includes data from:
59
+ - **16 public Jira repositories**
60
+ - **1822 projects**
61
+ - **2.7 million issues**
62
+ - **32 million changes**
63
+ - **9 million comments**
64
+ - **1 million issue links**
65
+
66
+ The repository contains:
67
+ - MongoDB data dumps
68
+ - Scripts to download and interpret the data
69
+ - Qualitative analyses to make the data more approachable
70
+
71
+ > **Note:**
72
+ > All personal information (e.g., assignee, creator, reporter, comment authors) has been anonymized using UUID4 masks, maintaining uniqueness while protecting privacy.
73
+
74
+ ---
75
+
76
+ ### **Citation**
77
+
78
+ Montgomery L, Lüders C, Maalej W.
79
+ *An Alternative Issue Tracking Dataset of Public Jira Repositories.*
80
+ In: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR).
81
+ 2022 May 23. pp. 73–77. IEEE.
82
+
83
+ 🔗 [https://zenodo.org/records/15719919](https://zenodo.org/records/15719919)