Jarvis-K committed on
Commit
8c8cf65
1 Parent(s): 2a33798

update readme

Files changed (1)
  1. README.md +0 -39
README.md CHANGED
@@ -56,42 +56,3 @@ Here is an example of how to run the script:
  ```
  ./test.sh
  ```
- The commands in test.sh are structured as follows:
-
- ```
- python main.py --env_name ENV_NAME --init_summarizer INIT_SUMMARIZER --curr_summarizer CURR_SUMMARIZER [--future_summarizer FUTURE_SUMMARIZER --future_horizon FUTURE_HORIZON]
- ```
- Where:
-
- * ENV_NAME: The name of the Gym environment to use (e.g., CartPole-v0).
- * INIT_SUMMARIZER: The initial summarizer to use (e.g., cart_init_translator).
- * CURR_SUMMARIZER: The current summarizer to use (e.g., cart_basic_translator).
- * FUTURE_SUMMARIZER (optional): The future summarizer to use (e.g., cart_basic_translator).
- * FUTURE_HORIZON (optional): The horizon, in steps, that each policy looks ahead to (e.g., 3).
-
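The flag layout above could be parsed roughly as follows. This is a minimal sketch using `argparse`, not the actual contents of `main.py`: the flag names come from the README text, while the `required`/`default` choices, the `int` type for `--future_horizon`, and the helper name `parse_args` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags listed in the README (assumed layout)."""
    parser = argparse.ArgumentParser(description="Run an LLM decider on a Gym environment.")
    # The three flags shown outside the square brackets are treated as required.
    parser.add_argument("--env_name", required=True, help="Gym environment name, e.g. CartPole-v0")
    parser.add_argument("--init_summarizer", required=True, help="e.g. cart_init_translator")
    parser.add_argument("--curr_summarizer", required=True, help="e.g. cart_basic_translator")
    # The bracketed flags are optional; None means "not used" (assumption).
    parser.add_argument("--future_summarizer", default=None, help="e.g. cart_basic_translator")
    parser.add_argument("--future_horizon", type=int, default=None, help="look-ahead steps, e.g. 3")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse an explicit argument list, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)


# Example invocation with the values used as examples in the README.
example = parse_args([
    "--env_name", "CartPole-v0",
    "--init_summarizer", "cart_init_translator",
    "--curr_summarizer", "cart_basic_translator",
    "--future_horizon", "3",
])
print(example.env_name, example.future_summarizer, example.future_horizon)
```

Passing an explicit list keeps the sketch testable without touching the real command line.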
- ## Supported Environment Translators and LLM Deciders
-
- | | Acrobot | Cart Pole | Mountain Car | Pendulum | Lunar Lander | Blackjack | Taxi | Cliff Walking | Frozen Lake |
- |------------------------------|:------------------------:|:----------------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|
- | Translator | :heavy_multiplication_x: | :white_check_mark: | :heavy_multiplication_x: | :heavy_multiplication_x: | :white_check_mark: | :heavy_multiplication_x: | :heavy_multiplication_x: | :heavy_multiplication_x: | :heavy_multiplication_x: |
- | Chain-of-Thought | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:<sup>[1]</sup>(~30) | :heavy_minus_sign: | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:<sup>[1]</sup>(-367) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
- | Program-aided Language Model | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(168) | :heavy_minus_sign: | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(-68) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
- | Self-ask Prompting | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(~10) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_multiplication_x: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
- | Self-consistency Prompting | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(~30) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_multiplication_x: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
- | Reflexion | :heavy_minus_sign: | :heavy_multiplication_x: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_multiplication_x: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
- | Solo Performance Prompting | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(43) | :heavy_minus_sign: | :heavy_minus_sign: | :white_check_mark:(L1)<br>:gift:(-583) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
-
- <sup>[1]: Cumulative reward.</sup>
- ![Classic Control](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Classic%20Control.png)
- ![Box 2D](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Box%202D.png)
- ![Toy Text](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Toy%20Text.png)
-
- >
- > 1. Except for the Reflexion L3 decider, no L3 decider in this task has memory.
- > 2. Reflexion L1 and L3 both have memory.
- > 3. Reflexion L1 runs 5 trials.
- > 4. Blackjack, MountainCar, CliffWalking (PAL), CartPole (PAL), Taxi (SPP, PAL), and Frozen Lake use deciders modified at 15:29 on 09.18.
- > 5. The Frozen Lake translator was updated to add prior knowledge.
- # Remarks
- 1. How to use future info
- We provide future info in the env_info part. It is a dict, and you can convert it to text so that your agent is aware of the world model.
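One way to turn such a dict into text is sketched below. This is only an illustration under stated assumptions: the README says `env_info` is a dict but does not list its keys, so the keys in the sample (`future_horizon`, `future_summary`) and the helper name `future_info_to_text` are hypothetical.

```python
from typing import Any, Mapping


def future_info_to_text(env_info: Mapping[str, Any]) -> str:
    """Render each key/value pair of env_info as one 'label: value' line."""
    lines = []
    for key, value in env_info.items():
        # Turn snake_case keys into a more readable label for the prompt.
        label = str(key).replace("_", " ")
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


# Hypothetical env_info; the real keys provided by the benchmark may differ.
env_info = {
    "future_horizon": 3,
    "future_summary": "The pole tilts right over the next 3 steps.",
}
prompt_fragment = future_info_to_text(env_info)
print(prompt_fragment)
```

The resulting string could then be appended to the prompt given to the decider, so the agent sees the predicted future alongside the current state description.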
 