cjber committed on
Commit
82bbfd1
·
1 Parent(s): 73fcee0

improve documentation

planning_ai/chains/prompts/extract.txt DELETED
@@ -1 +0,0 @@
1
- Extract the relevant text **verbatim** relating to the following themes:
 
 
planning_ai/chains/prompts/map.txt CHANGED
@@ -1,20 +1,10 @@
1
- This is a list of themes proposed the South Cambridgeshire Council:
2
 
3
- - Climate change: Help Greater Cambridge transition to net zero carbon by 2050, by ensuring that development is sited in places that help to limit carbon emissions, is designed to the highest achievable standards for energy and water use, and is resilient to current and future climate risks.
4
- - Biodiversity and green spaces: Increase and improve our network of habitats for wildlife, and green spaces for people, ensuring that development leaves the natural environment better than it was before.
5
- - Wellbeing and social inclusion: Help people in Greater Cambridge to lead healthier and happier lives, ensuring that everyone benefits from the development of new homes and jobs.
6
- - Great places: Sustain the unique character of Cambridge and South Cambridgeshire, and complement it with beautiful and distinctive development, creating a place where people want to live, work and play.
7
- - Jobs: Encourage a flourishing and mixed economy in Greater Cambridge which includes a wide range of jobs, while maintaining our area's global reputation for innovation.
8
- - Homes: Plan for enough housing to meet our needs, including significant quantities of housing that is affordable to rent and buy, and different kinds of homes to suit our diverse communities.
9
- - Infrastructure: Plan for transport, water, energy and digital networks; and health, education and cultural facilities; in the right places and built at the right times to serve our growing communities.
10
-
11
- Summarise the following response to a planning application that was proposed with these themes in mind, following these steps:
12
-
13
- 1. **Summary:** Provide a brief, neutral summary that captures the key points of the response with respect to the list of themes proposed by the council.
14
- 2. **Stance:** Indicate the author’s overall stance as one of the following: 'SUPPORT', 'OPPOSE', 'MIXED' or 'NEUTRAL'.
15
- 3. **Themes:** Identify what themes proposed by the council are discussed.
16
- 4. **Places:** Identify the places the author has considered for discussion.
17
- 4. **Constructiveness Rating:** Rate how constructive the response is on a scale from 1 to 10, with 1 being unconstructive and 10 being highly constructive.
18
 
19
  **Few-shot examples for reference:**
20
 
@@ -26,10 +16,9 @@ Response:
26
  "I am in favour of this new park development as it will provide much-needed green space for families. However, the parking situation needs to be reconsidered."
27
 
28
  - **Summary:** The author supports the park development for its benefit to families but expresses concern about parking.
29
- - **Stance:** SUPPORT
30
  - **Themes:** Biodiversity and green spaces, Infrastructure
31
  - **Places:** None
32
- - **Constructiveness Rating:** 8
33
 
34
  ---
35
 
@@ -39,10 +28,9 @@ Response:
39
  "This development in Cambridge will destroy local wildlife and create traffic chaos. It should not go ahead."
40
 
41
  - **Summary:** The author opposes the development due to concerns about wildlife and traffic congestion.
42
- - **Stance:** OPPOSE
43
  - **Themes:** Biodiversity and green spaces, Infrastructure
44
  - **Places:** Cambridge
45
- - **Constructiveness Rating:** 3
46
 
47
  ---
48
 
 
1
+ Summarise the following response to a planning application, focusing on the themes and policies proposed by the council. Follow these steps:
2
 
3
+ 1. **Summary:** Provide a concise, neutral summary that captures the key points of the response, particularly in relation to the council's proposed themes.
4
+ 2. **Themes:** List the council's themes discussed in the response.
5
+ 3. **Policies:** Identify relevant policies associated with the extracted themes.
6
+ 4. **Places:** Mention any geographical locations considered by the author.
7
+ 5. **Constructiveness:** Indicate whether the response is constructive. A response is constructive if it provides any feedback or commentary on the plan, regardless of its depth or specificity.
8
 
9
  **Few-shot examples for reference:**
10
 
 
16
  "I am in favour of this new park development as it will provide much-needed green space for families. However, the parking situation needs to be reconsidered."
17
 
18
  - **Summary:** The author supports the park development for its benefit to families but expresses concern about parking.
 
19
  - **Themes:** Biodiversity and green spaces, Infrastructure
20
  - **Places:** None
21
+ - **Constructiveness:** True
22
 
23
  ---
24
 
 
28
  "This development in Cambridge will destroy local wildlife and create traffic chaos. It should not go ahead."
29
 
30
  - **Summary:** The author opposes the development due to concerns about wildlife and traffic congestion.
 
31
  - **Themes:** Biodiversity and green spaces, Infrastructure
32
  - **Places:** Cambridge
33
+ - **Constructiveness:** True
34
 
35
  ---
36
 
planning_ai/chains/prompts/ocr.txt ADDED
@@ -0,0 +1,10 @@
1
+ The images provided are from a planning response form filled out by a member of the public, containing free-form responses related to a planning application. These responses may be handwritten or typed.
2
+
3
+ Please follow these instructions to process the images:
4
+
5
+ 1. **Extract Free-Form Information Only**: Focus on extracting and outputting the free-form written content from the images. Do not include single-word answers, brief responses, or any extra content that is not part of the detailed responses.
6
+ 2. **Verbatim Output**: Ensure that the extracted information is output exactly as it appears in the images. Keep a heading before a section of free-form text if it helps with organisation, but only when that heading appears in the images; do not add headings of your own. Ignore blank sections entirely; do not generate or include any additional thoughts or content.
7
+ 3. **Sequential Processing**: The images are sequentially ordered. A response might continue from one image to the next, so capture the full context across multiple images if necessary.
8
+ 4. **Ignore Non-Relevant Content**: Exclude any content that does not fit the criteria of free-form, detailed responses.
9
+
10
+ Thank you for your attention to these details.
planning_ai/chains/prompts/themes.txt ADDED
@@ -0,0 +1,82 @@
1
+ The following themes are proposed by the South Cambridgeshire Council with each of their associated policies.
2
+
3
+ # Climate change
4
+
5
+ Net zero carbon new buildings
6
+ Water efficiency in new developments
7
+ Designing for a changing climate
8
+ Flooding and integrated water management
9
+ Renewable energy projects and infrastructure
10
+ Reducing waste and supporting the circular economy
11
+ Supporting land-based carbon sequestration
12
+
13
+ # Biodiversity and green spaces
14
+
15
+ Biodiversity and geodiversity
16
+ Green infrastructure
17
+ Improving Tree Canopy Cover and the Tree Population
18
+ River corridors
19
+ Protecting open spaces
20
+ Providing and enhancing open spaces
21
+
22
+ # Wellbeing and social inclusion
23
+
24
+ Creating healthy new developments
25
+ Community, sports and leisure facilities
26
+ Meanwhile uses during long term redevelopments
27
+ Creating inclusive employment and business opportunities through new developments
28
+ Pollution, health and safety
29
+
30
+ # Great places
31
+
32
+ People and place responsive design
33
+ Protection and enhancement of landscape character
34
+ Protection and enhancement of the Cambridge Green Belt
35
+ Achieving high quality development
36
+ Establishing high quality landscape and public realm
37
+ Conservation and enhancement of heritage assets
38
+ Adapting heritage assets to climate change
39
+ Protection of public houses
40
+
41
+ # Jobs
42
+
43
+ New employment and development proposals
44
+ Supporting the rural economy
45
+ Protecting the best agricultural land
46
+ Protecting existing business space
47
+ Enabling remote working
48
+ Affordable workspace and creative industries
49
+ Supporting a range of facilities in employment parks
50
+ Retail and centres
51
+ Visitor accommodation, attractions and facilities
52
+ Faculty development and specialist / language schools
53
+
54
+ # Homes
55
+
56
+ Affordable housing
57
+ Exception sites for affordable housing
58
+ Housing mix
59
+ Housing density
60
+ Garden land and subdivision of existing plots
61
+ Residential space standards and accessible homes
62
+ Specialist housing and homes for older people
63
+ Self and custom build homes
64
+ Build to rent homes
65
+ Houses in multiple occupation (HMOs)
66
+ Student accommodation
67
+ Dwellings in the countryside
68
+ Residential moorings
69
+ Residential caravan sites
70
+ Gypsy and Traveller and Travelling Showpeople sites
71
+ Community-led housing
72
+
73
+ # Infrastructure
74
+
75
+ Sustainable transport and connectivity
76
+ Parking and electric vehicles
77
+ Freight and delivery consolidation
78
+ Safeguarding important infrastructure
79
+ Aviation development
80
+ Energy infrastructure masterplanning
81
+ Infrastructure and delivery
82
+ Digital infrastructure
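
The layout of `themes.txt` above (a `# ` heading per theme, followed by one policy per line) can be read back into a mapping. The following is a minimal sketch only: `parse_themes` is a hypothetical helper that does not appear in this commit, and the assumption that the file is consumed this way is ours, not the author's.

```python
def parse_themes(text: str) -> dict[str, list[str]]:
    """Group policy lines under the most recent '# ' heading (hypothetical helper)."""
    themes: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith("# "):
            current = line[2:].strip()
            themes[current] = []
        elif current is not None:
            themes[current].append(line)
        # lines before the first heading (the preamble) are ignored
    return themes


# A shortened sample in the same layout as themes.txt.
sample = """The following themes are proposed by the South Cambridgeshire Council with each of their associated policies.

# Climate change

Net zero carbon new buildings
Water efficiency in new developments

# Jobs

Supporting the rural economy
"""

parsed = parse_themes(sample)
for theme, policies in parsed.items():
    print(theme, "->", policies)
```

Keeping the file as plain text with headings means the same content can be pasted into a prompt verbatim and also parsed for validation.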
planning_ai/nodes/hallucination_node.py CHANGED
@@ -9,6 +9,20 @@ from planning_ai.states import DocumentState, OverallState
9
 
10
 
11
  def check_hallucination(state: DocumentState):
12
  if state["iteration"] > 5:
13
  state["iteration"] = -99
14
  return {"summaries_fixed": [state]}
@@ -33,10 +47,36 @@ def check_hallucination(state: DocumentState):
33
 
34
 
35
  def map_hallucinations(state: OverallState):
36
  return [Send("check_hallucination", summary) for summary in state["summaries"]]
37
 
38
 
39
  def fix_hallucination(state: DocumentState):
40
  response = fix_chain.invoke(
41
  {
42
  "context": state["document"],
@@ -58,6 +98,19 @@ def fix_hallucination(state: DocumentState):
58
 
59
 
60
  def map_fix_hallucinations(state: OverallState):
61
  hallucinations = []
62
  if "hallucinations" in state:
63
  hallucinations = [
 
9
 
10
 
11
  def check_hallucination(state: DocumentState):
12
+ """Checks for hallucinations in the summary of a document.
13
+
14
+ This function uses the `hallucination_chain` to evaluate the summary of a document.
15
+ If the hallucination score is 1, it indicates no hallucination, and the summary is
16
+ considered fixed. If the iteration count exceeds 5, the process is terminated.
17
+
18
+ Args:
19
+ state (DocumentState): The current state of the document, including its summary
20
+ and iteration count.
21
+
22
+ Returns:
23
+ dict: A dictionary containing either a list of fixed summaries or hallucinations
24
+ that need to be addressed.
25
+ """
26
  if state["iteration"] > 5:
27
  state["iteration"] = -99
28
  return {"summaries_fixed": [state]}
 
47
 
48
 
49
  def map_hallucinations(state: OverallState):
50
+ """Maps summaries to the `check_hallucination` function.
51
+
52
+ This function prepares a list of summaries to be checked for hallucinations by
53
+ sending them to the `check_hallucination` function. Allows summaries to be checked
54
+ in parallel.
55
+
56
+ Args:
57
+ state (OverallState): The overall state containing all summaries.
58
+
59
+ Returns:
60
+ list: A list of Send objects directing each summary to the check_hallucination
61
+ function.
62
+ """
63
  return [Send("check_hallucination", summary) for summary in state["summaries"]]
64
 
65
 
66
  def fix_hallucination(state: DocumentState):
67
+ """Attempts to fix hallucinations in a document's summary.
68
+
69
+ This function uses the `fix_chain` to correct hallucinations identified in a summary.
70
+ The corrected summary is then updated in the document state.
71
+
72
+ Args:
73
+ state (DocumentState): The current state of the document, including its summary
74
+ and hallucination details.
75
+
76
+ Returns:
77
+ dict: A dictionary containing the updated summaries after attempting to fix
78
+ hallucinations.
79
+ """
80
  response = fix_chain.invoke(
81
  {
82
  "context": state["document"],
 
98
 
99
 
100
  def map_fix_hallucinations(state: OverallState):
101
+ """Maps hallucinations to the `fix_hallucination` function.
102
+
103
+ This function filters out hallucinations that need fixing and prepares them to be
104
+ sent to the `fix_hallucination` function. Allows hallucinations to be fixed in
105
+ parallel.
106
+
107
+ Args:
108
+ state (OverallState): The overall state containing all hallucinations.
109
+
110
+ Returns:
111
+ list: A list of Send objects directing each hallucination to the
112
+ fix_hallucination function.
113
+ """
114
  hallucinations = []
115
  if "hallucinations" in state:
116
  hallucinations = [
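
The retry cap described in the `check_hallucination` docstring can be sketched without LangGraph. This is an illustration under stated assumptions: `check` and `score_fn` are hypothetical stand-ins (the real function calls `hallucination_chain`), and the `"hallucinations"` return key is inferred from the docstring because the diff elides that part of the body.

```python
def check(state: dict, score_fn) -> dict:
    """Route a summary to 'summaries_fixed' or 'hallucinations' (hypothetical sketch)."""
    if state["iteration"] > 5:
        # Give up after too many attempts; -99 marks the summary as abandoned.
        state["iteration"] = -99
        return {"summaries_fixed": [state]}
    if score_fn(state) == 1:
        # A score of 1 means no hallucination was detected.
        return {"summaries_fixed": [state]}
    # Assumed key: anything else is queued for another fix attempt.
    return {"hallucinations": [state]}


# score_fn is a stub standing in for the hallucination chain.
clean = check({"iteration": 1}, lambda s: 1)
flagged = check({"iteration": 2}, lambda s: 0)
capped = check({"iteration": 6}, lambda s: 0)
print(clean, flagged, capped)
```

The sentinel value lets downstream code tell abandoned summaries apart from ones that actually passed the check.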
planning_ai/nodes/map_node.py CHANGED
@@ -1,15 +1,30 @@
1
  from langgraph.constants import Send
2
  from presidio_analyzer import AnalyzerEngine
3
  from presidio_anonymizer import AnonymizerEngine
4
 
5
  from planning_ai.chains.map_chain import map_chain
 
6
  from planning_ai.states import DocumentState, OverallState
7
 
8
  anonymizer = AnonymizerEngine()
9
  analyzer = AnalyzerEngine()
10
 
11
 
12
- def remove_pii(document: str):
13
  results = analyzer.analyze(
14
  text=document,
15
  entities=["PERSON", "PHONE_NUMBER", "EMAIL_ADDRESS"],
@@ -19,22 +34,71 @@ def remove_pii(document: str):
19
  return document
20
 
21
 
22
- def generate_summary(state: DocumentState):
23
  state["document"] = remove_pii(state["document"])
24
  response = map_chain.invoke({"context": state["document"]})
25
- return {
26
- "summaries": [
27
- {
28
- "summary": response,
29
- "document": state["document"],
30
- "filename": state["filename"],
31
- "iteration": 1,
32
- }
33
- ]
34
  }
35
 
36
 
37
- def map_summaries(state: OverallState):
38
  return [
39
  Send(
40
  "generate_summary",
 
1
+ import json
2
+ from pathlib import Path
3
+
4
  from langgraph.constants import Send
5
  from presidio_analyzer import AnalyzerEngine
6
  from presidio_anonymizer import AnonymizerEngine
7
 
8
  from planning_ai.chains.map_chain import map_chain
9
+ from planning_ai.common.utils import Paths
10
  from planning_ai.states import DocumentState, OverallState
11
 
12
  anonymizer = AnonymizerEngine()
13
  analyzer = AnalyzerEngine()
14
 
15
 
16
+ def remove_pii(document: str) -> str:
17
+ """Removes personally identifiable information (PII) from a document.
18
+
19
+ This function uses the Presidio Analyzer and Anonymizer to detect and anonymize
20
+ PII such as names, phone numbers, and email addresses in the given document.
21
+
22
+ Args:
23
+ document (str): The document text from which PII should be removed.
24
+
25
+ Returns:
26
+ str: The document text with PII anonymized.
27
+ """
28
  results = analyzer.analyze(
29
  text=document,
30
  entities=["PERSON", "PHONE_NUMBER", "EMAIL_ADDRESS"],
 
34
  return document
35
 
36
 
37
+ def generate_summary(state: DocumentState) -> dict:
38
+ """Generates a summary for a document after removing PII.
39
+
40
+ This function first anonymizes the document to remove PII, then generates a summary
41
+ using the `map_chain`. The summary is added to the document state.
42
+
43
+ Args:
44
+ state (DocumentState): The current state of the document, including its text
45
+ and filename.
46
+
47
+ Returns:
48
+ dict: A dictionary containing the generated summary and updated document state.
49
+ """
50
  state["document"] = remove_pii(state["document"])
51
  response = map_chain.invoke({"context": state["document"]})
52
+ summary = response.summary
53
+ themes = [theme.value for theme in response.themes]
54
+ policies = [policy.dict() for policy in response.policies]
55
+
56
+ out_policies = []
57
+ for theme in policies:
58
+ name = theme["theme"].value
59
+ policy_list = theme["policies"]
60
+ out_policies.append({"theme": name, "policies": policy_list})
61
+
62
+ out_places = []
63
+ for place in response.places:
64
+ name = place.place
65
+ sentiment = place.sentiment.value
66
+ out_places.append({"place": name, "sentiment": sentiment})
67
+
68
+ save_output = {
69
+ "summary": summary,
70
+ "themes": themes,
71
+ "policies": out_policies,
72
+ "places": out_places,
73
+ }
74
+
75
+ outfile = f"{Path(state['filename']).stem}_summary.json"
76
+ with open(Paths.SUMMARIES / outfile, "w") as file:
77
+ json.dump(save_output, file, indent=4)
78
+
79
+ output = {
80
+ "summary": response,
81
+ "document": state["document"],
82
+ "filename": str(state["filename"]),
83
+ "iteration": 1,
84
  }
85
 
86
+ return {"summaries": [output]}
87
+
88
+
89
+ def map_summaries(state: OverallState) -> list[Send]:
90
+ """Maps documents to the `generate_summary` function for processing.
91
+
92
+ This function prepares a list of documents to be summarized by sending them to the
93
+ `generate_summary` function. It allows for parallel processing of document summaries.
94
+
95
+ Args:
96
+ state (OverallState): The overall state containing all documents and their filenames.
97
 
98
+ Returns:
99
+ list: A list of Send objects directing each document to the `generate_summary`
100
+ function.
101
+ """
102
  return [
103
  Send(
104
  "generate_summary",
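
The per-document JSON output added to `generate_summary` can be tried in isolation. The sketch below assumes a temporary directory in place of `Paths.SUMMARIES`; `save_summary` and the sample payload are hypothetical and only mirror the shape of `save_output` in the diff.

```python
import json
import tempfile
from pathlib import Path


def save_summary(filename: str, payload: dict, out_dir: Path) -> Path:
    """Write payload to '<stem>_summary.json' inside out_dir (hypothetical helper)."""
    # Single quotes inside the f-string keep this valid on Python versions before 3.12.
    outfile = f"{Path(filename).stem}_summary.json"
    out_path = Path(out_dir) / outfile
    with open(out_path, "w") as file:
        json.dump(payload, file, indent=4)
    return out_path


# Illustrative payload with the same keys as save_output.
payload = {
    "summary": "The author supports the park development.",
    "themes": ["Biodiversity and green spaces"],
    "policies": [
        {"theme": "Biodiversity and green spaces", "policies": ["Green infrastructure"]}
    ],
    "places": [],
}

with tempfile.TemporaryDirectory() as tmp:
    path = save_summary("data/response_001.pdf", payload, Path(tmp))
    saved_name = path.name
    loaded = json.loads(path.read_text())

print(saved_name)
```

Deriving the output name from the file stem keeps each summary traceable to its source document regardless of the input extension.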
planning_ai/nodes/reduce_node.py CHANGED
@@ -3,6 +3,21 @@ from planning_ai.states import OverallState
3
 
4
 
5
  def generate_final_summary(state: OverallState):
6
  if len(state["documents"]) == len(state["summaries_fixed"]):
7
  summaries = [
8
  str(summary["summary"])
 
3
 
4
 
5
  def generate_final_summary(state: OverallState):
6
+ """Generates a final summary from fixed summaries.
7
+
8
+ This function checks if the number of documents matches the number of fixed summaries.
9
+ It then filters the summaries to include only those with a non-neutral stance and a
10
+ rating of 5 or higher (constructiveness). These filtered summaries are then combined
11
+ into a final summary using the `reduce_chain`.
12
+
13
+ Args:
14
+ state (OverallState): The overall state containing documents, summaries, and
15
+ other related information.
16
+
17
+ Returns:
18
+ dict: A dictionary containing the final summary, along with the original
19
+ documents, summaries, fixed summaries, and hallucinations.
20
+ """
21
  if len(state["documents"]) == len(state["summaries_fixed"]):
22
  summaries = [
23
  str(summary["summary"])