---
license: bsd-3-clause
tags:
- InvertedDoublePendulum-v2
- reinforcement-learning
- decisions
- TLA
- deep-reinforcement-learning
model-index:
- name: TLA
  results:
  - metrics:
    - type: mean_reward
      value: 9356.67
      name: mean_reward
    - type: Action Repetition
      value: .7522
      name: Action Repetition
    - type: Average Decisions
      value: 247.76
      name: Average Decisions
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: InvertedDoublePendulum-v2
      type: InvertedDoublePendulum-v2
Paper: https://arxiv.org/abs/2305.18701
Code: https://github.com/dee0512/Temporally-Layered-Architecture
---

# Temporally Layered Architecture: InvertedDoublePendulum-v2

These are 10 trained models, one per **seed (0-9)**, of a **[Temporally Layered Architecture (TLA)](https://github.com/dee0512/Temporally-Layered-Architecture)** agent playing **InvertedDoublePendulum-v2**.

## Model Sources

**Repository:** [https://github.com/dee0512/Temporally-Layered-Architecture](https://github.com/dee0512/Temporally-Layered-Architecture)

**Paper:** [https://doi.org/10.1162/neco_a_01718](https://doi.org/10.1162/neco_a_01718)

**arXiv:** [arxiv.org/abs/2305.18701](https://arxiv.org/abs/2305.18701)

# Training Details:

To train an agent, run the following from the cloned repository:

```
python main.py --env_name <environment> --seed <seed>
```

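Because the released models cover seeds 0-9, the training command above can be repeated once per seed. The following is a minimal sketch, assuming `main.py` accepts the `--env_name` and `--seed` flags exactly as shown; `build_train_commands` is a hypothetical helper for illustration and is not part of the repository.

```python
import shlex


def build_train_commands(env_name, seeds):
    """Return one `python main.py` argument list per seed (hypothetical helper)."""
    return [
        ["python", "main.py", "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


if __name__ == "__main__":
    # Print the ten commands. To launch them, run each one from the cloned
    # repository, for example with subprocess.run(cmd, check=True).
    for cmd in build_train_commands("InvertedDoublePendulum-v2", range(10)):
        print(shlex.join(cmd))
```

Building the argument lists separately from running them makes it easy to inspect the commands first, or to hand them to a job scheduler instead of running them sequentially.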
# Evaluation:

Download the models folder and place it in the same directory as the cloned repository.
Then, from the repository, run:

```
python eval.py --env_name <environment>
```

## Metrics:

**mean_reward:** Mean reward over the 10 seeds.

**action_repetition:** Percentage of actions that are equal to the previous action.

**mean_decisions:** Number of decisions required, where each decision is one neural network (model) forward pass.

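To make the metric definitions concrete, here is a minimal sketch of how action repetition and the mean reward over seeds could be computed from logged data. It is an illustration only, not the repository's evaluation code: the function names are hypothetical, and it assumes exact equality between consecutive actions and that the first action of an episode has no predecessor to compare against.

```python
from statistics import mean


def action_repetition(actions):
    """Fraction of actions equal to the previous action (assumed definition)."""
    # Normalise scalar and vector actions to tuples so they compare the same way.
    normalised = [tuple(a) if isinstance(a, (list, tuple)) else (a,) for a in actions]
    if len(normalised) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(normalised, normalised[1:]) if cur == prev)
    # The first action has no predecessor, so there are len - 1 comparisons.
    return repeats / (len(normalised) - 1)


def mean_reward(per_seed_rewards):
    """Average the per-seed rewards into a single number."""
    return mean(per_seed_rewards)


if __name__ == "__main__":
    # Toy data, not results from the trained models.
    print(action_repetition([0.5, 0.5, 0.5, -1.0, -1.0]))  # 0.75
    print(mean_reward([9000.0, 9400.0]))  # 9200.0
```

The function returns a fraction; multiply by 100 to express it as a percentage.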
# Citation

The paper can be cited with the following BibTeX entry:

## BibTeX:

```
@article{10.1162/neco_a_01718,
    author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
    title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
    journal = {Neural Computation},
    pages = {1-30},
    year = {2024},
    month = {10},
    issn = {0899-7667},
    doi = {10.1162/neco_a_01718},
    url = {https://doi.org/10.1162/neco\_a\_01718},
    eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco\_a\_01718/2474695/neco\_a\_01718.pdf},
}
```

## APA:
```
Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.
```