MAIL-CS-ECNU/Text-Gym-Agents

Jarvis-K committed on Nov 22, 2023

Commit

8c8cf65

1 Parent(s): 2a33798

update readme

README.md +0 -39

README.md CHANGED

@@ -56,42 +56,3 @@ Here is an example of how to run the script:
 ```
 ./test.sh
 ```
-The commands in test.sh are structured as follows:
-```
-python main.py --env_name ENV_NAME --init_summarizer INIT_SUMMARIZER --curr_summarizer CURR_SUMMARIZER [--future_summarizer FUTURE_SUMMARIZER --future_horizon FUTURE_HORIZON]
-```
-Where:
-* ENV_NAME: The name of the Gym environment to be used (e.g., CartPole-v0).
-* INIT_SUMMARIZER: The initial summarizer to be used (e.g., cart_init_translator).
-* CURR_SUMMARIZER: The current summarizer to be used (e.g., cart_basic_translator).
-* FUTURE_SUMMARIZER (optional): The future summarizer to be used (e.g., cart_basic_translator).
-* FUTURE_HORIZON (optional): The horizon that each policy will look to (e.g., 3).
-## Supported Environment Translators and LLM Deciders
-|                              |          Acrobot         |              Cart Pole             |       Mountain Car       |         Pendulum         |       Lunar Lander       |         Blackjack        |           Taxi           |       Cliff Walking      |        Frozen Lake       |
-|------------------------------|:------------------------:|:----------------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|
-| Translator                   | :heavy_multiplication_x: |         :white_check_mark:         | :heavy_multiplication_x: | :heavy_multiplication_x: |    :white_check_mark:    | :heavy_multiplication_x: | :heavy_multiplication_x: | :heavy_multiplication_x: | :heavy_multiplication_x: |
-| Chain-of-Thought             |    :heavy_minus_sign:    | :white_check_mark:(L1)<br>:gift:<sup>[1]</sup>(~30) |    :heavy_minus_sign:    |    :heavy_minus_sign:    | :white_check_mark:(L1)<br/>:gift:<sup>[1]</sup>(-367) |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |
-| Program-aided Language Model |    :heavy_minus_sign:    | :white_check_mark:(L1)<br>:gift:(168) |    :heavy_minus_sign:    |    :heavy_minus_sign:    |        :white_check_mark:(L1)<br/>:gift:(-68)         |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |
-| Self-ask Prompting           |    :heavy_minus_sign:    | :white_check_mark:(L1)<br>:gift:(~10) |    :heavy_minus_sign:    |    :heavy_minus_sign:    | :heavy_multiplication_x: |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |
-| Self-consistency Prompting   |    :heavy_minus_sign:    |      :white_check_mark:(L1)<br>:gift:(~30)      |    :heavy_minus_sign:    |    :heavy_minus_sign:    | :heavy_multiplication_x: |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |
-| Reflexion                    |    :heavy_minus_sign:    |      :heavy_multiplication_x:      |    :heavy_minus_sign:    |    :heavy_minus_sign:    | :heavy_multiplication_x: |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |    :heavy_minus_sign:    |
-| Solo Performance Prompting | :heavy_minus_sign: | :white_check_mark:(L1)<br/>:gift:(43) | :heavy_minus_sign: | :heavy_minus_sign: | :white_check_mark:(L1)<br/>:gift:(-583) | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: |
-<sup>[1]: Cumulative reward.</sup>
-![Image text](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Classic%20Control.png)
-![Image text](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Box%202D.png)
-![Image text](https://github.com/mail-ecnu/LLM-Decider-Bench/blob/master/vis/Toy%20Text.png)
->
-> 1. Except for the Reflexion L3 decider, the L3 deciders in this task have no memory.
-> 2. Reflexion L1 and L3 both have memory.
-> 3. Reflexion L1 runs 5 trials.
-> 4. Blackjack, MountainCar, Cliffwalking (PAL), CartPole (PAL), Taxi (SPP, PAL), and Frozen Lake use the deciders modified at 15:29 on 09.18.
-> 5. The Frozen Lake translator was updated to add prior knowledge.
-# Remarks
-1. How to use future info
-We provide future info in the `env_info` part. It is a dict, and you can convert it to text so that your agent is aware of the world model.

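
The removed remark on future info says that `env_info` carries a dict that can be converted to text for the agent. A minimal sketch of that conversion follows. The function name `future_info_to_text` and the example keys are hypothetical, since the README does not show the actual structure of the dict.

```python
def future_info_to_text(future_info: dict) -> str:
    """Render a dict as one 'key: value' line per entry.

    Hypothetical helper: the README only states that the future info
    is a dict, so no particular keys or value types are assumed here.
    """
    lines = []
    for key, value in future_info.items():
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


# Illustrative input only; these keys are not taken from the repository.
example = {"step_1": "cart moves right", "step_2": "pole tilts left"}
print(future_info_to_text(example))
```

The resulting string could then be appended to the prompt built by a translator, though how the repository itself does this is not shown in the diff.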