HUANG1993/GreedRL-VRP-pretrained-v1

Reinforcement Learning

Deep Reinforcement Learning

Combinatorial Optimization

Vehicle Routing Problem


先坤 committed on Apr 20, 2023

Commit 0dcfc8f • 1 Parent(s): f407e0e

update

Files changed (3)

README.md +82 -13
images/GREEDRL-Logo-Original-640.png +0 -0
images/GREEDRL-Network.png +0 -0

README.md CHANGED

@@ -7,9 +7,9 @@ tags:
 - Reinforcement Learning
 - Vehicle Routing Problem
 ---
-<div align="center">
-    <img src="./images/GREEDRL-Logo-Original-640.png" width = "515" height = "380"/>
-</div>
 # ✊‍GreedRL
@@ -37,16 +37,12 @@ Currently, various VRP variants such as CVRP, VRPTW and PDPTW, as well as proble
     To maximize performance, the framework implements high-performance operators specifically for Combinatorial Optimization (CO) problems to replace PyTorch operators, such as Masked Addition Attention and Masked Softmax Sampling.
-<div align="center">
-    <img src="./images/GREEDRL-Framwork.png" width = "515" height = "380"/>
-</div>
 ## Network design
 The neural network adopts the Seq2Seq architecture commonly used in Natural Language Processing (NLP), with a Transformer as the encoder and an RNN as the decoder, as shown in the diagram below.
-<div align="center">
-    <img src="./images/GREEDRL-Network.png" width = "515" height = "380"/>
-</div>
 ## Modeling examples
@@ -73,11 +69,7 @@ features = [continuous_feature('worker_weight_limit'),
             continuous_feature('task_due_time'),
             continuous_feature('task_service_time'),
             continuous_feature('distance_matrix')]
-```
-</details>
-<details>
-```python
 variables = [task_demand_now('task_demand_now', feature='task_demand'),
              task_demand_now('task_demand_this', feature='task_demand', only_this=True),
              feature_variable('task_weight'),
@@ -137,7 +129,84 @@ class Objective:
 </details>
 ### Pickup and Delivery Problem with Time Windows(PDPTW)
 ## 🏆Award

 - Reinforcement Learning
 - Vehicle Routing Problem
 ---
+![](./images/GREEDRL-Logo-Original-640.png)
 # ✊‍GreedRL
     To maximize performance, the framework implements high-performance operators specifically for Combinatorial Optimization (CO) problems to replace PyTorch operators, such as Masked Addition Attention and Masked Softmax Sampling.
+![](./images/GREEDRL-Framwork.png)
 ## Network design
 The neural network adopts the Seq2Seq architecture commonly used in Natural Language Processing (NLP), with a Transformer as the encoder and an RNN as the decoder, as shown in the diagram below.
+![](./images/GREEDRL-Network.png)
 ## Modeling examples
             continuous_feature('task_due_time'),
             continuous_feature('task_service_time'),
             continuous_feature('distance_matrix')]
 variables = [task_demand_now('task_demand_now', feature='task_demand'),
              task_demand_now('task_demand_this', feature='task_demand', only_this=True),
              feature_variable('task_weight'),
 </details>
 ### Pickup and Delivery Problem with Time Windows(PDPTW)
+<details>
+    <summary>PDPTW</summary>
+```python
+from greedrl.model import runner
+from greedrl.feature import *
+from greedrl.variable import *
+from greedrl.function import *
+from greedrl import Problem, Solution, Solver
+features = [local_category('task_group'),
+            global_category('task_priority', 2),
+            variable_feature('distance_this_to_task'),
+            variable_feature('distance_task_to_end')]
+variables = [task_demand_now('task_demand_now', feature='task_demand'),
+             task_demand_now('task_demand_this', feature='task_demand', only_this=True),
+             feature_variable('task_weight'),
+             feature_variable('task_group'),
+             feature_variable('task_priority'),
+             feature_variable('task_due_time2', feature='task_due_time'),
+             task_variable('task_due_time'),
+             task_variable('task_service_time'),
+             task_variable('task_due_time_penalty'),
+             worker_variable('worker_basic_cost'),
+             worker_variable('worker_distance_cost'),
+             worker_variable('worker_due_time'),
+             worker_variable('worker_weight_limit'),
+             worker_used_resource('worker_used_weight', task_require='task_weight'),
+             worker_used_resource('worker_used_time', 'distance_matrix', 'task_service_time', 'task_ready_time',
+                                  'worker_ready_time'),
+             edge_variable('distance_last_to_this', feature='distance_matrix', last_to_this=True),
+             edge_variable('distance_this_to_task', feature='distance_matrix', this_to_task=True),
+             edge_variable('distance_task_to_end', feature='distance_matrix', task_to_end=True)]
+class Constraint:
+    def do_task(self):
+        return self.task_demand_this
+    def mask_worker_end(self):
+        return task_group_split(self.task_group, self.task_demand_now <= 0)
+    def mask_task(self):
+        mask = self.task_demand_now <= 0
+        mask |= task_group_priority(self.task_group, self.task_priority, mask)
+        worker_used_time = self.worker_used_time[:, None] + self.distance_this_to_task
+        mask |= (worker_used_time > self.task_due_time2) & (self.task_priority == 0)
+        # capacity constraint
+        worker_weight_limit = self.worker_weight_limit - self.worker_used_weight
+        mask |= self.task_demand_now * self.task_weight > worker_weight_limit[:, None]
+        return mask
+    def finished(self):
+        return torch.all(self.task_demand_now <= 0, 1)
+class Objective:
+    def step_worker_start(self):
+        return self.worker_basic_cost
+    def step_worker_end(self):
+        feasible = self.worker_used_time <= self.worker_due_time
+        return self.distance_last_to_this * self.worker_distance_cost, feasible
+    def step_task(self):
+        worker_used_time = self.worker_used_time - self.task_service_time
+        feasible = worker_used_time <= self.task_due_time
+        feasible &= worker_used_time <= self.worker_due_time
+        cost = self.distance_last_to_this * self.worker_distance_cost
+        return torch.where(feasible, cost, cost + self.task_due_time_penalty), feasible
+```
+</details>
 ## 🏆Award
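
Note on the masks in the PDPTW example: `mask_task` appears to return a boolean mask in which `True` marks a task that must not be selected (no remaining demand, past its due time, or over the worker's remaining capacity). The README also names Masked Softmax Sampling as one of the framework's custom operators. The sketch below illustrates the general idea in plain Python. It is not GreedRL's operator; the function names `masked_softmax` and `masked_sample` and the example scores are hypothetical and chosen only for illustration.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over logits; mask[i] is True for entries that must not be chosen."""
    if all(mask):
        raise ValueError("every entry is masked; nothing to sample")
    # Subtract the largest unmasked logit for numerical stability.
    top = max(x for x, m in zip(logits, mask) if not m)
    # Masked entries get zero weight, so their probability is exactly 0.
    weights = [0.0 if m else math.exp(x - top) for x, m in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def masked_sample(logits, mask, rng=random):
    """Draw one index according to the masked softmax distribution."""
    probs = masked_softmax(logits, mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for four tasks; tasks 1 and 3 are infeasible.
logits = [2.0, 5.0, 1.0, 3.0]
mask = [False, True, False, True]
probs = masked_softmax(logits, mask)
choice = masked_sample(logits, mask, random.Random(0))
```

Masked entries receive probability zero, so even the highest-scoring task (index 1 here) can never be drawn while it is infeasible. A fused implementation would typically do the masking, normalization and sampling in a single pass over batched tensors rather than with Python lists.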

images/GREEDRL-Logo-Original-640.png CHANGED

images/GREEDRL-Network.png CHANGED